Many-shot Jailbreaking

Senior research engineer at Anthropic interested in agent foundations, model organisms of misalignment, and human-computer interaction.

Member of technical staff at Anthropic, on leave from New York University, where he is an associate professor of computer science, data science, and linguistics. His public homepage focuses on natural language processing, machine learning, and AI alignment.

Co-founder and head of policy at Anthropic. He previously served as policy director at OpenAI, worked as a technology journalist, and writes the Import AI newsletter.

Anthropic co-founder and Chief Science Officer. Formerly a physicist at Johns Hopkins, he helped develop scaling laws for neural language models and works on the science and safety of large AI systems.

AI safety researcher and director of the Center for AI Safety; advisor to xAI and Scale AI, previously an advisor to OpenAI and Anthropic.

Member of Anthropic's Societal Impacts team, where she studies the real-world impacts of AI systems.

Member of Technical Staff at Anthropic and cofounder of Oulipo Labs, working on language model safety, evaluations, and scientific forecasting.

Co-founder and head of alignment science at Anthropic.

Research scientist at Anthropic interested in understanding neural networks and applying that understanding to alignment.

Researcher working on AI safety and adversarial evaluation, including Anthropic's many-shot jailbreaking research.

Research scientist at Anthropic interested in understanding and steering AI systems.

Software engineer at Anthropic, previously at Google, with public writing on language models, agents, and reinforcement learning.

Assistant professor of marketing at Stanford Graduate School of Business whose research uses AI systems to study human decision-making and related machine learning questions.

Research scientist at Anthropic focused on AI alignment, language model behavior, and scalable oversight.

Samuel Marks

Samuel R. Bowman

Jack Clark

Jared D. Kaplan

Dan Hendrycks

Carina Kauf

Nicholas Schiefer

Deep Ganguli

Nova DasSarma

Anna Chen

Saurav Kadavath

Tom Conerly

Esin Durmus

Rylan Schaeffer