DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
Vision-Language Models
Connected researchers
Sifan Zhou
DeepSeek
Researcher at DeepSeek AI interested in generative models, large language models, multimodal learning, and computer vision. He is pursuing a PhD in electrical engineering at Stanford University after earning a bachelor's degree from Tsinghua University, and has also worked at Meta AI and Google.
Yue Cao
DeepSeek
Yue Cao is a researcher working on multimodal large language models and computer vision. His public homepage lists past affiliations with DeepSeek and Apple and links to work including DeepSeek-VL2.
Wei Xiong
DeepSeek
Research scientist at DeepSeek working on large language models, multimodal learning, and machine learning systems. He was previously an applied scientist at AWS AI Labs and earned a PhD in computer science from Johns Hopkins University.
Yufei Zhang
DeepSeek
Researcher at the University of Illinois Urbana-Champaign focused on vision-language models, multimodal large language models, and physical AI.
Zhengyang Wang
DeepSeek
Research intern at DeepSeek and master's student at Renmin University of China working on multimodal large language models and AI agents.
Xinlong Wang
DeepSeek
Xinlong Wang is a researcher working across computer vision, embodied AI, robotics, and machine learning. Public profiles link him to OpenGVLab and Shanghai AI Laboratory, and he is a coauthor of DeepSeek-VL2.
Wenhai Wang
DeepSeek
Wenhai Wang is a researcher working on visual perception foundation models, efficient learning, and multimodal large models. Public profiles list him with OpenGVLab and Shanghai AI Laboratory, and he is a coauthor of DeepSeek-VL2.
Shujie Wang
DeepSeek
First-year PhD student at Shanghai Jiao Tong University focused on multimodal large language models, text-to-image generation, and image/video generation; coauthor of DeepSeek-VL2.
Yonggang Zhang
DeepSeek
Yonggang Zhang is a researcher whose public OpenReview profile lists the paper "DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding".
Xinyu Chen
DeepSeek
Research intern at NUS and Nanjing University working on machine learning and multimodal large language models; coauthor of DeepSeek-VL2.
Yanxia Cui
DeepSeek
Researcher working on multimodal and vision-language models, including DeepSeek-VL2 and related model optimization work.
Zihan Liu
DeepSeek
Zihan Liu is a research scientist at DeepSeek. His public homepage highlights work in multimodal learning, vision-language models, and large-scale machine learning.
Yao Lu
DeepSeek / Google