AIIF Weekly Masterclass: AI Landscapes

Welcome to the AI Industry Foundation’s first Weekly Masterclass.

The AIIF will be running Weekly Masterclasses to bring AI Executives up to speed. Our Masterclasses provide a structured distillation of the technical and non-technical aspects of cutting-edge AI knowledge, preparing you to have conversations on the frontier of AI and to make business decisions based on those conversations.

For our first session, we’ll be focusing on the AI Landscape. There have been several big releases in the last couple of months. Google’s Gemini Ultra and Anthropic’s Claude 3 both claim to beat OpenAI’s GPT-4 on various benchmarks. Does this mark a big change in large language model capabilities? In this session, we’ll go through the press release for Claude 3, look at which claims we should pay attention to and which we can ignore, and ask how benchmark numbers translate to real-world performance.

Today, we're announcing the Claude 3 model family, which sets new industry benchmarks across a wide range of cognitive tasks. The family includes three state-of-the-art models in ascending order of capability: Claude 3 Haiku, Claude 3 Sonnet, and Claude 3 Opus. Each successive model offers increasingly powerful performance, allowing users to select the optimal balance of intelligence, speed, and cost for their specific application.

[…]

Opus, our most intelligent model, outperforms its peers on most of the common evaluation benchmarks for AI systems, including undergraduate level expert knowledge (MMLU), graduate level expert reasoning (GPQA), basic mathematics (GSM8K), and more. It exhibits near-human levels of comprehension and fluency on complex tasks, leading the frontier of general intelligence.

Source: "Introducing the next generation of Claude", Anthropic, 2024-03-04
