
Recursive Self-Improvement

As AI systems become more autonomous, the prospect of them improving their own capabilities raises critical questions for safety and control. A recent paper introduces the Darwin Gödel Machine—a self-improving system that iteratively rewrites its own code to develop more capable coding agents. It is inspired by open-ended evolution and guided by empirical evaluations, resulting in notable improvements on several coding benchmarks.

This month, André Röhm will walk us through the key ideas behind recursive self-improvement and the Darwin Gödel Machine. Join us to explore the current state and implications of self-improving models!

Most of today’s AI systems are constrained by human-designed, fixed architectures and cannot autonomously and continuously improve themselves. The scientific method, on the other hand, provides a cumulative and open-ended system, where each innovation builds upon previous artifacts, enabling future discoveries. There is growing hope that the current manual process of advancing AI could itself be automated. If done safely, such automation would accelerate AI development and allow us to reap its benefits much sooner.

Previous approaches, such as meta-learning, provide a toolset for automating the discovery of novel algorithms but are limited by human-designed search spaces and first-order improvements. We propose the Darwin Gödel Machine (DGM), a novel self-improving system that iteratively modifies its own code (thereby also improving its ability to modify its own codebase) and empirically validates each change on coding benchmarks.

In this paper, the DGM aims to optimize the design of coding agents, powered by frozen foundation models, which enable it to read, write, and execute code via tool use. Inspired by biological evolution and open-endedness research, the DGM maintains an archive of generated coding agents. It then samples from this archive and tries to create a new, interesting, improved version of the sampled agent. This open-ended exploration forms a growing tree of diverse, high-quality agents and allows the parallel exploration of many different paths through the search space. Empirically, the DGM automatically improves its coding capabilities (e.g., better code editing tools, long-context window management, peer-review mechanisms), raising performance on SWE-bench from 20.0% to 50.0%, and on Polyglot from 14.2% to 30.7%. Furthermore, the DGM significantly outperforms baselines without self-improvement or open-ended exploration. All experiments were done with safety precautions (e.g., sandboxing, human oversight). Overall, the DGM represents a significant step toward self-improving AI, capable of gathering its own stepping stones along a path that unfolds into endless innovation.
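The archive-based loop described above can be sketched in a few lines. The sketch below is purely illustrative: the names (`Agent`, `self_modify`, `evaluate`) and the uniform parent sampling are assumptions for readability, not the paper's actual implementation, which selects parents with performance- and novelty-weighted sampling and evaluates children on real coding benchmarks inside a sandbox.

```python
import random

def evaluate(agent: dict) -> float:
    """Placeholder: in the DGM this would run the agent on a coding
    benchmark (e.g., SWE-bench) inside a sandbox and return its score."""
    return agent["score"]

def self_modify(parent: dict) -> dict:
    """Placeholder: in the DGM the parent agent rewrites its own code.
    Here we just perturb a dummy score to keep the sketch runnable."""
    new_score = parent["score"] + random.uniform(-0.05, 0.1)
    return {"score": max(0.0, min(1.0, new_score))}

def dgm_loop(iterations: int = 20, seed: int = 0) -> list[dict]:
    random.seed(seed)
    # Start from a single hand-designed agent.
    archive = [{"score": 0.2}]
    for _ in range(iterations):
        # Sample any archived agent, not just the current best:
        # keeping "stepping stones" is what makes the search open-ended.
        parent = random.choice(archive)
        child = self_modify(parent)
        # Keep every child that still functions, growing a tree of
        # diverse agents rather than greedily replacing the parent.
        if evaluate(child) > 0.0:
            archive.append(child)
    return archive

archive = dgm_loop()
```

The key design choice is that the archive is never pruned to a single champion: low-scoring but novel agents remain available as parents, so later improvements can branch from paths a greedy hill-climber would have discarded.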

Jenny Zhang, et al., Darwin Gödel Machine: Open-Ended Evolution of Self-Improving Agents (2025)
