Qwen-Audio: Advancing Universal Audio Understanding via Unified Large-Scale Audio-Language Models
Audio-language model report from Alibaba Qwen, with 7 connected researchers in the LLMpeople atlas.
Connected researchers
Chao Zhang
Alibaba Qwen
Chao Zhang is an applied scientist in the Alibaba Foundation Model team. His public profile notes a PhD in computer science from the University of Illinois Urbana-Champaign and research interests in NLP, large language models, reasoning, and multimodal generation.
Yeyun Gong
Alibaba Qwen
Yeyun Gong is a researcher and engineering leader focused on multimodal large language models, grounding, and large-scale knowledge systems. His homepage lists selected work including Qwen2-Audio.
Jie Tang
OpenAI / Alibaba Qwen
OpenAI contributor credited on the GPT-4 Technical Report; previously a Dropbox engineer and a Ph.D. student at UC Berkeley focused on machine learning and robotics.
Mingyang Shang
Alibaba Qwen
Research intern at Alibaba Group focused on multimodal understanding and generation, large multimodal models, and reinforcement learning; coauthor of Qwen2-Audio.
Yaqi Wang
Alibaba Qwen
Research scientist in Tongyi Lab and technical lead of Qwen2-Audio, with public work on audio-language models.
Yushi Hu
Alibaba Qwen
Yushi Hu is a senior research engineer at Shanghai AI Laboratory and a founding member of OpenMMLab. Public arXiv records also list him as a coauthor of Qwen2-Audio.
Hongyin Luo
Alibaba Qwen
Researcher whose arXiv author records include Qwen-Audio and related audio-language modeling work.