Emergent Abilities of Large Language Models
Wei et al. · 2022 · TMLR
Argued that certain capabilities appear abruptly above a scale threshold rather than improving smoothly.
Research objective
Document capabilities that are absent in smaller models but present in larger ones, and characterize how they emerge.
Methodology
Surveyed several model families spanning orders of magnitude in scale (training compute and parameter count), evaluating accuracy on dozens of tasks including arithmetic, transliteration, and multi-step reasoning.
Key findings
- Many tasks show near-random performance up to a scale threshold, then sharp improvement.
- Examples include in-context learning, chain-of-thought reasoning, and instruction following.
- Emergence cannot be predicted by extrapolating the performance of smaller models alone.
Strengths
- Provided empirical evidence for non-linear capability gains.
- Influential framing for AGI risk and forecasting debates.
Limitations
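- Later critiques (Schaeffer et al., 2023) argued that some apparent emergence is an artifact of discontinuous or nonlinear metrics, such as exact match, rather than a sudden change in the model itself.
- Small evaluation sets and prompting choices may obscure smoother underlying trends.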
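The metric-artifact critique can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not data from either paper: a hypothetical per-token accuracy that rises linearly in log-parameters, a 10-token answer, and independence between token errors. Under those assumptions, the smooth per-token curve produces an exact-match curve that stays near zero and then rises sharply.

```python
import math

def per_token_accuracy(params: float) -> float:
    """Hypothetical smooth curve: accuracy rises linearly in log10(params)."""
    # Assumed for illustration only: 0.50 at 1e8 parameters, 0.98 at 1e12.
    x = (math.log10(params) - 8.0) / 4.0
    return 0.50 + 0.48 * min(max(x, 0.0), 1.0)

def exact_match_accuracy(params: float, answer_len: int) -> float:
    """Probability that every answer token is correct, assuming independent errors."""
    return per_token_accuracy(params) ** answer_len

# Hypothetical model sizes, one per order of magnitude.
scales = [1e8, 1e9, 1e10, 1e11, 1e12]

for n in scales:
    token = per_token_accuracy(n)
    exact = exact_match_accuracy(n, answer_len=10)
    print(f"{n:.0e} params  per-token={token:.2f}  exact-match={exact:.3f}")
```

The per-token column grows by the same amount at every step, while the exact-match column is close to zero for the smaller scales and jumps at the largest ones. This shows only that a nonlinear metric can create the appearance of a threshold; it does not settle whether any particular reported ability is an artifact.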
Practical implications
- Central to debates on whether AGI will arrive gradually or in capability jumps.
- Motivates capability evaluations and red-teaming at every new model scale.
Related research
Scaling Laws for Neural Language Models
Showed that language model test loss follows smooth, predictable power-law relationships with compute, data, and parameter count.
Training Compute-Optimal Large Language Models
Demonstrated that, for a fixed compute budget, model size and training tokens should be scaled in roughly equal proportion.
