MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
Multimodal Language Models
Connected researchers
Ruoming Pang
Apple
Research scientist working on large-scale deep learning for speech translation, spoken language understanding, question answering, healthcare, and multimodal models.
Yinfei Yang
Apple
Research scientist at Apple focused on natural language processing and machine learning.
Max Schwarzer
Apple
Max Schwarzer is a reinforcement learning researcher whose work focuses on scaling and sample-efficient RL. He completed a PhD at Mila, later interned in Apple's machine learning research group, and was an author on Apple's MM1 multimodal pre-training report.
Floris Weers
Apple
Research scientist at Apple working on efficient and multilingual language modeling, speech and language systems, and large language models.
Brandon McKinzie
Apple
Senior research scientist at Apple working on large multimodal foundation models, with prior work on large language models at MosaicML.
Peter Grasch
Apple
Research scientist at Apple focused on state-of-the-art machine learning and computer vision methods.
Zirui Wang
Apple
Senior researcher at Apple working on large models, multimodal learning, and speech processing.
Nan Du
Apple
Research scientist at Apple Foundation Models working on large language models and multimodal systems; previously a research scientist at Google and Meta.
Tom Gunter
Apple
Research scientist at Apple Intelligence working on computer vision, machine learning, and natural language processing.
Xiang Kong
Apple
Distinguished scientist at Apple working on large language models and multimodal foundation models; previously held research roles at ByteDance AI Lab and MBZUAI.
Bowen Zhang
Apple
Research scientist at Apple working on large language models, vision-language models, and model scaling.
Dhruti Shah
Apple
Researcher working on machine learning, vision and language, computer vision, diffusion, and generative AI.
Jean-Philippe Fauconnier
Apple
Research scientist at Apple Foundation Models working on generative AI, large language models, and multimodal models.
Philipp Dufter
Apple
Research scientist at Apple Foundation Models with interests in natural language processing, structured generation, controllable generation, and algorithmic efficiency.
Xianzhi Du
Apple
Research scientist at Apple working on language and vision-language modeling, AI agents, and post-training.
Zhe Gan
Apple
Machine learning researcher at Apple working on large multimodal foundation models, video generation, and vision-language systems.
Alexander Toshev
Apple
Computer vision and machine learning scientist at Apple whose work includes multimodal understanding and robotics, following earlier leadership roles at Google.
Anton Belyi
Apple
Research scientist at Apple and adjunct professor at MIPT working in computer vision, image processing, and machine learning.
Futang Peng
Apple
Research scientist at Apple focusing on understanding and generating text and images.
Hongyu He
Apple
Research scientist at Apple focused on computer vision, machine learning, and multimodal understanding.
Sam Wiseman
Apple
Sam Wiseman is an assistant professor of computer science at New York University whose research focuses on natural language processing and machine learning, including controllable generation, summarization, and learning from human feedback.
Mark Lee
Apple
Co-author of MM1, which studies multimodal LLM pre-training.
Haotian Zhang
Apple
Sam Dodge
Apple