Artificial General Intelligence

Emergent Abilities of Large Language Models

Wei et al. · 2022 · TMLR

Argued that certain capabilities appear abruptly above a scale threshold rather than improving smoothly.

Research objective

Document capabilities that are absent in smaller models but present in larger ones, and characterize how they emerge.

Methodology

Surveyed several model families spanning orders of magnitude in scale (training compute and parameter count), evaluating accuracy on dozens of tasks including arithmetic, transliteration, and multi-step reasoning.

Key findings

Many tasks show near-random performance up to a scale threshold, then sharp improvement.
Examples include few-shot (in-context) task performance, chain-of-thought reasoning, and instruction following.
Emergence cannot be predicted by extrapolating the performance of smaller models alone.

Strengths

Provided empirical evidence for non-linear capability gains.
Influential framing for AGI risk and forecasting debates.

Limitations

Later critiques (Schaeffer et al., 2023) argued that some apparent 'emergence' is an artifact of nonlinear or discontinuous evaluation metrics.
Small evaluation sets and prompting choices may obscure smoother underlying trends.
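
The metric-artifact argument can be illustrated with a toy calculation. The sketch below is not taken from either paper: it assumes a hypothetical logistic curve for per-token accuracy as a function of log10(parameter count) and assumes token errors are independent. Under those assumptions, a smoothly improving per-token accuracy produces an exact-match score that sits near zero for small models and then climbs steeply.

```python
import math


def per_token_accuracy(log10_params: float) -> float:
    """Hypothetical smooth curve (assumption): logistic in log10(parameter count)."""
    return 1.0 / (1.0 + math.exp(-(log10_params - 9.0)))


def exact_match_accuracy(p_token: float, answer_len: int) -> float:
    """Probability that all answer_len tokens are correct.

    Assumes token errors are independent, which real models need not satisfy.
    """
    return p_token ** answer_len


def sweep(log10_sizes, answer_len: int = 10):
    """Return (log10 size, per-token accuracy, exact-match accuracy) per model size."""
    rows = []
    for size in log10_sizes:
        p = per_token_accuracy(size)
        rows.append((size, p, exact_match_accuracy(p, answer_len)))
    return rows


if __name__ == "__main__":
    for size, p, em in sweep([7, 8, 9, 10, 11, 12]):
        print(f"10^{size} params  per-token={p:.3f}  exact-match={em:.6f}")
```

Printing both columns side by side shows the same underlying trend under a continuous metric and under an all-or-nothing metric; only the second looks like a threshold effect.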

Practical implications

Central to debates on whether AGI will arrive gradually or in capability jumps.
Motivates capability evaluations and red-teaming at every new model scale.

Related research

Scaling Laws for Neural Language Models

Showed that LLM performance follows smooth, predictable power-law relationships with compute, data, and parameters.
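
A power law of the form L(N) = (N_c / N)^alpha is a straight line on log-log axes, which is what makes it easy to extrapolate. The sketch below shows this for the parameter-count case only. The constants `ALPHA` and `N_C` are illustrative placeholders and should be treated as assumptions, not as values quoted in this summary.

```python
import math

ALPHA = 0.076  # illustrative exponent (assumption)
N_C = 8.8e13   # illustrative scale constant (assumption)


def power_law_loss(n_params: float, n_c: float = N_C, alpha: float = ALPHA) -> float:
    """Loss predicted by a pure power law in parameter count: (n_c / n_params) ** alpha."""
    return (n_c / n_params) ** alpha


def loglog_slope(n1: float, n2: float) -> float:
    """Slope of log(loss) against log(parameters) between two model sizes."""
    l1, l2 = power_law_loss(n1), power_law_loss(n2)
    return (math.log(l2) - math.log(l1)) / (math.log(n2) - math.log(n1))


if __name__ == "__main__":
    # The slope equals -ALPHA regardless of which two sizes are compared.
    print(loglog_slope(1e6, 1e9))
    # Doubling the parameter count multiplies the loss by the constant factor 2 ** -ALPHA.
    print(power_law_loss(2e8) / power_law_loss(1e8))
```

Because the slope is the same everywhere, a fit on small models predicts the loss of larger ones, in contrast to the threshold behavior described for emergent abilities.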

Training Compute-Optimal Large Language Models

Demonstrated that for a fixed compute budget, model size and training tokens should scale roughly equally.
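
"Scale roughly equally" can be made concrete with a small calculation. The sketch below assumes the common approximation C ≈ 6 · N · D for training FLOPs (N parameters, D tokens) and a fixed ratio of about 20 tokens per parameter. Both constants are rules of thumb introduced here for illustration, not figures stated in this summary.

```python
import math

FLOPS_PER_PARAM_TOKEN = 6.0  # assumes C ≈ 6 * N * D (approximation)
TOKENS_PER_PARAM = 20.0      # assumed rule-of-thumb ratio D / N


def compute_optimal_allocation(compute_flops: float):
    """Split a compute budget into (parameters, tokens) under the assumptions above.

    With C = 6 * N * D and D = r * N, solving for N gives N = sqrt(C / (6 * r)).
    """
    n_params = math.sqrt(compute_flops / (FLOPS_PER_PARAM_TOKEN * TOKENS_PER_PARAM))
    n_tokens = TOKENS_PER_PARAM * n_params
    return n_params, n_tokens


if __name__ == "__main__":
    for budget in (1e21, 4e21, 5.88e23):
        n, d = compute_optimal_allocation(budget)
        print(f"C={budget:.2e} FLOPs  N={n:.3e} params  D={d:.3e} tokens")
```

Under these assumptions both N and D grow as the square root of C, so quadrupling the budget doubles the parameter count and doubles the token count.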
