
ML Benkyoukai: More Agents Is All You Need

A straightforward way to improve LLM output is to run the model many times and take a majority vote over the answers. In our discussion of DPO, we noticed that the “Best of 128” baseline performed surprisingly well. AlphaCode 2 gets great mileage out of generating millions of candidate programs and then filtering to find the best one. One of our members recommended a recent paper by Li and Zhang et al., which measures how the performance of an ensemble improves as more agents are added, across five benchmark datasets.
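
To make the sampling-and-voting idea concrete, here is a minimal Python sketch, not the paper's code: the generate argument stands in for any LLM call, and the vote is a simple plurality over exact-match answers, which roughly matches the paper's setup for tasks with short closed-form answers (for open-ended generation the paper votes by answer similarity instead). The noisy_model stub is a hypothetical stand-in so the sketch runs on its own.

from collections import Counter
from typing import Callable
import random

def sample_and_vote(generate: Callable[[str], str], prompt: str, n_agents: int) -> str:
    # Query the model n_agents times and return the plurality answer.
    answers = [generate(prompt) for _ in range(n_agents)]
    return Counter(answers).most_common(1)[0][0]

# Toy stand-in for an LLM call so the sketch runs as-is:
# a "model" that answers correctly 60% of the time.
def noisy_model(prompt: str) -> str:
    return "42" if random.random() < 0.6 else random.choice(["41", "43"])

if __name__ == "__main__":
    print(sample_and_vote(noisy_model, "What is 6 * 7?", n_agents=15))

The intuition for why this helps: as long as the correct answer is the single most likely one per sample, its share of the votes concentrates around its true probability as the number of agents grows, so the plurality winner is correct far more often than any individual sample.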

This month, we’ll use Li and Zhang et al. to frame a discussion around so-called “Babble & Prune” approaches. When can they be deployed? How many agents can you add to the ensemble before you hit diminishing returns? Why does this work at all?

We find that, simply via a sampling-and-voting method, the performance of large language models (LLMs) scales with the number of agents instantiated. Also, this method is orthogonal to existing complicated methods to further enhance LLMs, while the degree of enhancement is correlated to the task difficulty. We conduct comprehensive experiments on a wide range of LLM benchmarks to verify the presence of our finding, and to study the properties that can facilitate its occurrence.

More Agents Is All You Need, Li and Zhang et al. 2024
