Atlas / Reports / Detail
Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models
Public report from NVIDIA with 14 connected researchers in the LLMpeople atlas.
Connected researchers
Fu, Yonggan
NVIDIA
Public profiles say he completed a Georgia Tech PhD in 2025 after earlier study at Rice and USTC, and his current work focuses on bringing frontier AI to everyday devices.
Byeon, Wonmin
NVIDIA
NVIDIA Research and Wonmin Byeon's personal site identify him as a researcher at NVIDIA Research in California. Public site materials describe interests in computer vision, robotics, recurrent and state-space models, sequence learning, and spatio-temporal learning.
Kautz, Jan
NVIDIA
NVIDIA's research page describes Jan Kautz as vice president of Learning and Perception Research, working across computer vision, machine learning, computational photography, and geometric vision.
Ye, Hanrong
NVIDIA
Hanrong Ye is a research scientist at NVIDIA Research in Santa Clara working on multi-task, multi-media, and multimodality models for machine understanding and generation. He earned a Ph.D. from HKUST, a master's degree from Peking University, and a B.S. from Sun Yat-sen University.
Karnati, Yashaswi
NVIDIA
OpenReview identifies Yashaswi Karnati as a researcher at NVIDIA. His personal homepage describes prior work across intelligent transportation, climate science, data compression, and healthcare, and records completed degrees from the University of Florida and IIT (ISM) Dhanbad.
Diao, Shizhe
NVIDIA
Shizhe Diao develops methods to scale post-training and reinforcement learning for large language models and AI agents.
Van keirsbilck, Matthijs
NVIDIA
Matthijs Van keirsbilck is a Senior Research Scientist at NVIDIA working on neural network architecture design, structural sparsity, quantization, and training dynamics.
Dong, Xin
NVIDIA
Xin Dong's homepage says he leads a research team on LLM training at Seed at ByteDance. It also states that he earned a Harvard PhD in 2023 and previously worked at NVIDIA, Meta, and Tencent.
Keller, Alexander
NVIDIA
NVIDIA Research identifies Alexander Keller as a senior director of research, formerly chief scientist at mental images and previously a professor at Ulm University. His research interests are at the intersection of graphics, communications, and machine learning.
Molchanov, Pavlo
NVIDIA
Pavlo Molchanov leads deep learning efficiency work at NVIDIA Research, with public profiles covering LLM and VLM efficiency, model compression, adaptive inference, and earlier computer vision research.
Liebenwein, Lucas
NVIDIA
Works on high-performance LLM inference and AutoDeploy at NVIDIA; previously led efficient-AI work at OmniML and earned graduate degrees at MIT CSAIL.
Binder, Nikolaus
NVIDIA
Nikolaus Binder is a senior research scientist at NVIDIA whose public research profile focuses on quasi-Monte Carlo methods, photorealistic image synthesis, ray tracing, and rendering algorithms.
Lin, Yingyan Celine
NVIDIA
Official Georgia Tech and NVIDIA DLER pages list Yingyan Celine Lin as a Georgia Tech associate professor and a visiting professor collaborating with NVIDIA's deep learning research group.
Khadkevich, Maksim
NVIDIA
NVIDIA's public author page identifies Maksim Khadkevich as a Senior Software Engineering Manager specializing in distributed inference systems and large language models. arXiv public sources also list him as a coauthor of Nemotron-Flash.