large multimodal models · video understanding · vision-language models

Current frame

Final-year Ph.D. candidate at Show Lab, National University of Singapore, researching large multimodal models for video and currently interning at ByteDance Seed.

Extended note

Joya Chen is a final-year Ph.D. candidate at Show Lab, National University of Singapore, advised by Mike Zheng Shou. According to her public homepage, she is currently interning at ByteDance Seed with the Multimodal Interaction and World Model group on the VLM base model team. Her research centers on large multimodal models for video, with experience spanning data scaling, model architecture, pre-training, post-training, and benchmarking. Before her Ph.D., she earned a bachelor's degree from the School of Automotive Engineering at Wuhan University of Technology and a master's degree from the University of Science and Technology of China.