Task intelligence benchmark progress extending into steering intelligence.
A single paper-style timeline running from 2020 to 2030. Benchmark accuracy lines fill the task-intelligence era, and four steering-intelligence capabilities appear after 2025.
[Figure: timeline chart. X-axis: year, 2020 to 2030. Y-axis: Accuracy, 0 to 100. Eras: Task Intelligence, Steering Intelligence. Markers: "Today", "Next frontier".]
Benchmark lines (Task Intelligence): Trivia questions (TriviaQA); Various exams (MMLU); Grade-school math (GSM8K); Competition math (MATH); SWE tasks (SWE-bench Verified).
Capabilities (Steering Intelligence): Navigate long horizons with uncertainty; Acquire information in noisy environments; Adapt to a changing world; Coordinate many decisions towards one goal.