Qwen-Audio: Advancing Universal Audio Understanding via Unified Large-Scale Audio-Language Models
Audio Language Models
Connected researchers
Chao Zhang
Qwen
Chao Zhang is an applied scientist in the Alibaba Foundation Model team. His public profile notes a PhD in computer science from the University of Illinois Urbana-Champaign and research interests in NLP, large language models, reasoning, and multimodal generation.
Yeyun Gong
Qwen
Yeyun Gong is a researcher and engineering leader focused on multimodal large language models, grounding, and large-scale knowledge systems. His homepage lists selected work including Qwen2-Audio.
Jie Tang
OpenAI / Qwen
OpenAI contributor credited on the GPT-4 Technical Report; previously a Dropbox engineer and a PhD student at UC Berkeley focused on machine learning and robotics.
Mingyang Shang
Qwen
Research intern at Alibaba Group focused on multimodal understanding and generation, large multimodal models, and reinforcement learning; coauthor of Qwen2-Audio.
Yaqi Wang
Qwen
Research scientist in Tongyi Lab and technical lead of Qwen2-Audio, with public work on audio-language models.
Yushi Hu
Qwen
Yushi Hu is a senior research engineer at Shanghai AI Laboratory and a founding member of OpenMMLab. Public arXiv records also list him as a coauthor of Qwen2-Audio.
Hongyin Luo
Qwen
Researcher whose arXiv author listing includes Qwen-Audio and related work on audio-language modeling.
Qingqing Zheng
Qwen
Coauthor of the Qwen-Audio technical report on unified large-scale audio-language models.