Speech Language Models | Field

Amelie Royer is a research scientist at Kyutai in Paris working on efficient large models and speech-language systems. Before joining Kyutai in June 2024, she spent three years at Qualcomm AI Research working on efficient and multimodal machine learning. She earned a PhD in computer science from ENS Paris-Saclay and LIP6 in 2021 after a master's in applied mathematics from ENS Paris-Saclay.

Research scientist at Meta AI working on natural language processing and AI safety. His homepage says he completed a PhD at Facebook AI Research and Inria focused on text simplification and accessibility.

Aleksandra Piktus is a research scientist at Mistral AI. Her public speaker profiles say she works on multilinguality, representation learning, and responsible AI, and previously worked at the University of Cambridge on multimodal representation learning and cross-lingual transfer, where she completed a PhD.

Public profiles describe Wenchao Zhou as Director of Data Product and Data Analytics at Alibaba Cloud Intelligence and a former tenured computer science faculty member at Georgetown University. His work centers on databases and distributed systems.

AI researcher focused on evaluating language models and agents, open NLP research, and historical linguistics. She led evaluation efforts at Hugging Face between 2023 and 2025 and helped build LightEval and the Open LLM Leaderboard.

Jean-Baptiste Alayrac is a researcher focused on multimodal learning, vision-language modeling, and video understanding.

Research scientist at Meta FAIR working on speech and audio foundation models. His research covers self-supervised learning, spoken language modeling, and multimodal audio-language systems.

Yossi Adi is a computer scientist at the Hebrew University of Jerusalem and a research scientist at Meta FAIR. His research focuses on speech, audio, and language modeling, including spoken language models and machine learning methods for speech applications.

Co-founder and CEO of Mistral AI and a researcher on efficient large language models and mixture-of-experts systems.

Mistral AI's about page lists Guillaume Lample as one of the company's three founders. His OpenReview profile lists expertise in machine translation and natural language processing and a PhD in computer science at Université Pierre et Marie Curie - Paris 6.

Research scientist at Meta AI working on generative AI, multimodal learning, and speech and audio generation. His public homepage notes earlier research at Bar-Ilan University before joining Meta in Menlo Park.

Pierre Sennrich is Chief Scientist at Mistral AI and a professor at the University of Zurich. His research centers on natural language processing and machine translation, and he has led widely cited work on subword methods and multilingual language technology.

Tu Anh Nguyen is a research scientist at Meta working on speech and audio generation. He is also a PhD candidate at Mila and the Université de Montréal, advised by Yoshua Bengio and Abdelrahman Mohamed, with interests in audio language models, speech generation, and efficient inference.

Timothée Lacroix is a machine learning researcher and one of the founders of Mistral AI.

Researcher at Moonshot AI and co-author of the Kimi K2.5 report on visual agentic intelligence.

Research scientist and professor working across Meta, NYU, and EHESS on speech, language, and cognitive science. His work studies how humans and machines acquire language and how spoken and written models can be aligned.

Professor of Electrical Engineering and Computer Sciences at the University of California, Berkeley, known for work spanning artificial intelligence, machine learning, and related computing systems research.

Senior applied scientist on the Amazon AGI team working on multimodal generative AI, speech recognition, and spoken language understanding, and a co-author of the Amazon Nova Sonic technical report.

Research scientist working on natural language processing, with public work spanning speech and language modeling such as VoxPopuli, pGSLM, and SpiRit-LM.

Applied scientist at Amazon AGI working on speech, spoken language translation, and multimodal generative AI, and a co-author of the Amazon Nova Sonic technical report.

Public report authorship links Qingyang Ge to MiniMax's report "MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention".

Researcher working on speech and multimodal language models, including MiniMax-Speech and related speech understanding work.

Lead of foundation models at MiniMax working on large language models, multimodal pretraining, and efficient training systems. He completed a PhD in computer science at Tsinghua University.

Research scientist at Mistral AI and co-author of the Mistral 7B report.

Research scientist at MiniMax AI Research focused on reinforcement learning, reasoning, multimodal learning, large language models, and large-scale distributed systems. He received a PhD in machine learning from Carnegie Mellon University.