Collective Constitutional AI: Aligning a Language Model with Public Input
Alignment and RLHF
Connected researchers
Jack Clark
Anthropic / OpenAI
Co-founder and head of policy at Anthropic. He previously served as policy director at OpenAI, worked as a technology journalist, and writes the Import AI newsletter.
Zac Hatfield-Dodds
Anthropic
Staff software engineer at Anthropic building systems for AI safety, reliability, and alignment.
Andy Jones
Anthropic
Anthropic researcher working on machine learning and AI-assisted science; previously built tools for learning from text, images, and tabular data.
Amanda Askell
Anthropic / OpenAI
Alignment researcher at Anthropic, previously at OpenAI, working on making AI understandable to and aligned with human values.
Jared D. Kaplan
Anthropic
Anthropic co-founder and Chief Science Officer. Formerly a physicist at Johns Hopkins, he helped develop scaling laws for neural language models and works on the science and safety of large AI systems.
Yuntao Bai
Anthropic
Anthropic researcher whose work includes reinforcement learning from human feedback and Constitutional AI; previously a Sherman Fairchild Postdoctoral Scholar in theoretical high-energy physics at Caltech.
Sam McCandlish
Anthropic
Independent researcher working on the theoretical foundations of AI, especially inductive biases, scaling laws, and approximate Bayesian updating. His public homepage notes prior research roles at Anthropic and OpenAI.
Jackson Kernion
Anthropic
Member of Anthropic's Interpretability team, where he works on understanding how large language models work.
Kamal Ndousse
Anthropic
Researcher at Anthropic working on alignment, reasoning, and evaluation for large language models.
Catherine Olsson
Anthropic
AI alignment researcher and writer; her public website and Anthropic author page describe work on AI safety, interpretability, and building helpful, harmless assistants.
Nicholas Schiefer
Anthropic
Member of Technical Staff at Anthropic and cofounder of Oulipo Labs, working on language model safety, evaluations, and scientific forecasting.
Dario Amodei
Anthropic / OpenAI
CEO and co-founder of Anthropic. Before Anthropic, he served as vice president of research at OpenAI.
Nova DasSarma
Anthropic
Research scientist at Anthropic interested in understanding neural networks and applying that understanding to alignment.
Anna Chen
Anthropic
Researcher working on AI safety and adversarial evaluation, including Anthropic's many-shot jailbreaking research.
Saurav Kadavath
Anthropic
Research scientist at Anthropic interested in understanding and steering AI systems.
Tom Conerly
Anthropic
Software engineer at Anthropic, previously at Google, with public writing on language models, agents, and reinforcement learning.
Ben Mann
Anthropic
Researcher interested in neural networks and their potential to achieve general intelligence. His public homepage notes prior roles as a cofounder at Anthropic, researcher at OpenAI, and member of the startup team at Stripe.
Nicholas Joseph
Anthropic
Researcher at Anthropic working on the alignment and evaluation of advanced AI systems.
Tom Brown
Anthropic
Research scientist at Anthropic working on model behavior and interpretability.
Herbie Bradley
Anthropic
Computer scientist and machine learning researcher with public work spanning AI systems and alignment-related research.
Nelson Elhage
Anthropic
Jared Mueller
Anthropic
Joshua Landau
Anthropic
Timothy Telleen-Lawton
Anthropic