Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models
Large Language Models report from NVIDIA with 15 connected researchers in the LLMpeople atlas.
Connected researchers
Yonggan Fu
NVIDIA
Public profiles say Yonggan Fu completed a PhD at Georgia Tech in 2025 after earlier study at Rice and USTC; his current work focuses on bringing frontier AI to everyday devices.
Xin Dong
NVIDIA
Xin Dong's homepage says he leads a research team on LLM training at Seed at ByteDance. It also states that he earned a Harvard PhD in 2023 and previously worked at NVIDIA, Meta, and Tencent.
Shizhe Diao
NVIDIA
Shizhe Diao develops methods to scale post-training and reinforcement learning for large language models and AI agents.
Matthijs Van keirsbilck
NVIDIA
Matthijs Van keirsbilck is a Senior Research Scientist at NVIDIA working on neural network architecture design, structural sparsity, quantization, and training dynamics.
Hanrong Ye
NVIDIA
Hanrong Ye is a research scientist at NVIDIA Research in Santa Clara working on multi-task, multi-media, and multimodality models for machine understanding and generation. He earned a Ph.D. from HKUST, a master's degree from Peking University, and a B.S. from Sun Yat-sen University.
Wonmin Byeon
NVIDIA
NVIDIA Research and a personal site identify Wonmin Byeon as a researcher at NVIDIA Research in California. Public site materials describe interests in computer vision, robotics, recurrent and state-space models, sequence learning, and spatio-temporal learning.
Yashaswi Karnati
NVIDIA
OpenReview identifies Yashaswi Karnati as a researcher at NVIDIA. His personal homepage describes prior work across intelligent transportation, climate science, data compression, and healthcare, and records completed degrees from the University of Florida and IIT (ISM) Dhanbad.
Lucas Liebenwein
NVIDIA
Lucas Liebenwein works on high-performance LLM inference and AutoDeploy at NVIDIA. He previously led efficient-AI work at OmniML and earned graduate degrees at MIT CSAIL.
Nikolaus Binder
NVIDIA
Nikolaus Binder is a senior research scientist at NVIDIA whose public research profile focuses on quasi-Monte Carlo methods, photorealistic image synthesis, ray tracing, and rendering algorithms.
Maksim Khadkevich
NVIDIA
NVIDIA's public author page identifies Maksim Khadkevich as a Senior Software Engineering Manager specializing in distributed inference systems and large language models. Public arXiv records also list him as a coauthor of Nemotron-Flash.
Alexander Keller
NVIDIA
NVIDIA Research identifies Alexander Keller as a senior director of research, formerly chief scientist at mental images and previously a professor at Ulm University. His research interests are at the intersection of graphics, communications, and machine learning.
Jan Kautz
NVIDIA
NVIDIA's research page describes Jan Kautz as vice president of Learning and Perception Research, working across computer vision, machine learning, computational photography, and geometric vision.
Yingyan Celine Lin
NVIDIA
Official Georgia Tech and NVIDIA DLER pages list Yingyan Celine Lin as a Georgia Tech associate professor and a visiting professor collaborating with NVIDIA's deep learning research group.
Pavlo Molchanov
NVIDIA
Pavlo Molchanov leads deep learning efficiency work at NVIDIA Research, with public profiles covering LLM and VLM efficiency, model compression, adaptive inference, and earlier computer vision research.