Tulu 2: Demystifying the Effectiveness of RLHF and Reinforcement Learning with Human Feedback

Yuling Gu is a PhD student at the NYU Center for Data Science studying large language models, machine reasoning, and robust evaluation. She was previously a predoctoral researcher at the Allen Institute for AI, where she contributed to OLMo, OLMo 2, OLMo 3, TULU 3, OLMoE, and OLMES.

Nathan Lambert is a machine learning scientist at Ai2 working on reinforcement learning, language models, and online social systems.

Jiacheng Liu is a researcher at Ai2 whose work focuses on improving the capabilities and understanding of language models. His public homepage says he is currently a PhD student at New York University and has previously spent time at Princeton and Google Research.

Jacob Morrison is a researcher whose work spans language model post-training, alignment, and evaluation. His public research page highlights projects including Tulu 2, Tulu 3, OLMo 2, and RewardBench.

Noah A. Smith is a computer scientist and professor at the University of Washington, where he serves as Vice Provost for Artificial Intelligence and co-directs the OLMo open language modeling effort with Ai2. His research focuses on natural language processing, machine learning, and evaluation methodology.

Yizhong Wang is a research scientist at the Allen Institute for AI and incoming assistant professor at the University of Washington whose work focuses on language models, agents, reasoning, and open-source AI.

Jena D. Hwang is a research scientist at the Allen Institute for AI (Ai2) whose work focuses on natural language understanding and commonsense reasoning.

Hanjie Chen is listed as an author of the Ai2 technical report Tulu 2: Demystifying the Effectiveness of RLHF and Reinforcement Learning with Human Feedback.

Nima Rajani is a research scientist at Ai2 whose work focuses on trustworthy, interpretable, and verifiable AI systems.

Nicholas Ruas is a machine learning engineer at Ai2 whose public work focuses on open language models, post-training, and evaluation.

Ming Yin is listed as an author of the Ai2 technical report Tulu 2: Demystifying the Effectiveness of RLHF and Reinforcement Learning with Human Feedback.

Chandra Bhagavatula is a research scientist at Ai2 focused on natural language processing, commonsense reasoning, long-form generation, narrative intelligence, and text-based games.

Tyler Scialom is a research scientist at Ai2 working on personalized language models, instruction tuning, and reinforcement learning from human feedback.

Ziyi Yang is listed as an author of the Ai2 technical report Tulu 2: Demystifying the Effectiveness of RLHF and Reinforcement Learning with Human Feedback.

Mustafa Hajij is a research scientist at Ai2 and an adjunct professor in the Department of Computer Science at the University of Southern Maine. His research spans graph machine learning, geometric learning, and applied mathematics.

Yizhu Jiao is listed as an author of the Ai2 technical report Tulu 2: Demystifying the Effectiveness of RLHF and Reinforcement Learning with Human Feedback.

Bill Yuchen Lin is a researcher working on language models, agents, and retrieval-augmented generation, currently at xAI and an incoming assistant professor at the University of Washington, and previously a research scientist at the Allen Institute for AI.

Aryo Pradipta Gema is a research engineer at Ai2 focused on post-training and data for open language models.

Julian Martin Eisenschlos is a Research Scientist at Ai2. His work focuses on natural language processing, language models, and instruction tuning, including contributions to the Tulu 2 project.

Jeremy Dwivedi-Yu is listed as an author of the Ai2 technical report Tulu 2: Demystifying the Effectiveness of RLHF and Reinforcement Learning with Human Feedback.

Alexandre Ramé is a research scientist at Google DeepMind and an adjunct professor at Ecole Polytechnique. His homepage says he previously held research roles at NYU and SCAI / Sorbonne Université, completed a PhD in machine learning at Ecole Polytechnique and ENS Paris-Saclay, and works on post-training and alignment for Gemma LLMs.

Jesujoba Alabi is a researcher in natural language processing, low-resource languages, machine translation, and responsible AI, publicly listed as a PhD candidate at UC Santa Barbara and a co-author of Tulu 2.

Noel Nabeshima is listed as an author of the Ai2 technical report Tulu 2: Demystifying the Effectiveness of RLHF and Reinforcement Learning with Human Feedback.

Tony Gracious completed his PhD in the Department of Computer Science and Automation at IISc Bangalore. His work includes representation learning, temporal point processes, and higher-order interaction forecasting, and he later joined Dolby's Advanced Technology Group in Bangalore.

Yuling Gu

Nathan Lambert

Jiacheng Liu

Jacob Morrison

Noah A. Smith

Yizhong Wang

Jena D. Hwang

Hanjie Chen

Nima Rajani

Nicholas Ruas

Ming Yin

Chandra Bhagavatula

Tyler Scialom

Ziyi Yang

Mustafa Hajij

Yizhu Jiao

Bill Yuchen Lin

Aryo Pradipta Gema

Julian Martin Eisenschlos

Jeremy Dwivedi-Yu

Alexandre Ramé

Jesujoba Alabi

Noel Nabeshima

Tony Gracious