Research Category
AI Safety & Alignment
Specification, robustness, interpretability, and governance of advanced AI systems.
Papers in this category
Concrete Problems in AI Safety
2016 · arXiv · Amodei, Olah, Steinhardt, Christiano, Schulman, Mané
Foundational taxonomy of practical safety problems in modern ML systems.
Training Language Models to Follow Instructions with Human Feedback
2022 · OpenAI / NeurIPS · Ouyang et al.
Introduced InstructGPT and the now-standard RLHF pipeline for aligning LLMs with human intent.
Constitutional AI: Harmlessness from AI Feedback
2022 · Anthropic · Bai et al.
Trained a helpful, harmless assistant using AI-generated critiques guided by a written constitution.
Browse other categories
Artificial Intelligence
Architectures, training paradigms, and capabilities of modern machine-learning systems.
Artificial General Intelligence
Theoretical foundations and empirical progress toward systems with broad cognitive competence.
Cognitive Neuroscience
How the brain encodes perception, memory, prediction, and decision-making.
Neurotechnology
Brain-computer interfaces, neural decoding, and devices that read or modulate the nervous system.
Neurodivergence
Cognitive variation, atypical processing styles, and the neurobiology of difference.
Human Intelligence
Psychometrics, learning, expertise, and the cognitive architecture of human thought.
