Qwen-Audio: Advancing Universal Audio Understanding via Unified Large-Scale Audio-Language Models
Audio-language models report from Alibaba Qwen, with 8 connected researchers in the LLMpeople atlas.
Connected researchers
Jie Tang
OpenAI / Alibaba Qwen
Computer scientist and engineer credited on OpenAI's GPT-4 public contributions page; OpenAI's 2016 team update says he previously led Dropbox's core file sync team after earlier work in Pieter Abbeel's Berkeley robotics lab.
Yeyun Gong
Alibaba Qwen
Yeyun Gong is a researcher and engineering leader focused on multimodal large language models, grounding, and large-scale knowledge systems. His homepage lists selected work including Qwen2-Audio.
Mingyang Shang
Alibaba Qwen
Research intern at Alibaba Group focused on multimodal understanding and generation, large multimodal models, and reinforcement learning; coauthor of Qwen2-Audio.
Yaqi Wang
Alibaba Qwen
Research scientist in Tongyi Lab and technical lead of Qwen2-Audio, with public work on audio-language models.
Yushi Hu
Alibaba Qwen
Yushi Hu is a senior research engineer at Shanghai AI Laboratory and a founding member of OpenMMLab. Public arXiv records also list him as a coauthor of Qwen2-Audio.
Chao Zhang
Alibaba Qwen
Chao Zhang is an applied scientist in the Alibaba Foundation Model team. His public profile notes a PhD in computer science from the University of Illinois Urbana-Champaign and research interests in NLP, large language models, reasoning, and multimodal generation.
Hongyin Luo
Alibaba Qwen
Researcher whose arXiv author results include Qwen-Audio and related audio-language modeling work.