On the Conversational Persuasiveness of Large Language Models: A Randomized Controlled Trial

The Gladstone AI action plan cites “persuasiveness” as a major source of risk from LLMs. This capability allows LLMs to be used in mass-targeted disinformation and fraud campaigns. But are LLMs actually persuasive? I certainly don’t find GPT all that convincing in my daily interactions.

In a new pre-registered EPFL study, 820 participants were asked to engage in short, multiple-round debates with a live opponent. Each participant was randomly assigned a topic and stance to hold (PRO or CON) and was then randomly paired with either an AI or another human player. The authors conclude that "not only are LLMs able to effectively exploit personal information to tailor their arguments and out-persuade humans in online conversations through microtargeting, they do so far more effectively than humans."

This week Blaine Rogers will break down the study, examine its methodology, talk about its results, and speculate about what actions we can take to make ourselves and our society robust to LLM persuasion.

In this pre-registered study, we analyze the effect of AI-driven persuasion in a controlled, harmless setting. We create a web-based platform where participants engage in short, multiple-round debates with a live opponent. Each participant is randomly assigned to one of four treatment conditions, corresponding to a two-by-two factorial design: (1) Games are either played between two humans or between a human and an LLM; (2) Personalization might or might not be enabled, granting one of the two players access to basic sociodemographic information about their opponent. We find that participants who debated GPT-4 with access to their personal information had 81.7% (p < 0.01; N=820 unique participants) higher odds of increased agreement with their opponents compared to participants who debated humans. Without personalization, GPT-4 still outperforms humans, but the effect is lower and statistically non-significant (p=0.31). Overall, our results suggest that concerns around personalization are meaningful and have important implications for the governance of social media and the design of new online environments.

— On the Conversational Persuasiveness of Large Language Models: A Randomized Controlled Trial, Salvi, Ribeiro, et al. 2024
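To make the headline number concrete: "81.7% higher odds" is an odds ratio of 1.817, not an 81.7-percentage-point jump in agreement. The sketch below (with a hypothetical baseline rate, not figures from the study) shows how an odds ratio translates into a change in the probability of increased agreement:

```python
# Illustration with hypothetical numbers (the baseline rate below is
# assumed for the example, not taken from the study).

def odds(p: float) -> float:
    """Convert a probability into odds, p / (1 - p)."""
    return p / (1 - p)

def apply_odds_ratio(p_baseline: float, odds_ratio: float) -> float:
    """Probability implied by multiplying the baseline odds by an odds ratio."""
    o = odds(p_baseline) * odds_ratio
    return o / (1 + o)

# Suppose (hypothetically) 30% of participants in human-vs-human debates
# end up agreeing more with their opponent. An odds ratio of 1.817 for
# personalized GPT-4 would imply roughly a 43.8% rate instead.
p_human = 0.30
p_gpt4_personalized = apply_odds_ratio(p_human, 1.817)
print(f"{p_gpt4_personalized:.1%}")  # ~43.8%
```

Note that the same odds ratio yields different absolute effects at different baselines, which is why the paper reports odds rather than raw percentage-point differences.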
