Large Language Models · 9 people

GLM-5: Thinking, Coding, and Agentic Intelligence

Z.ai

Large Language Models · 2602.15763 · 2026-02-17

Code Language Models · 5 people

CWM: An Open-Weights LLM for Research on Code Generation with World Models

Meta AI

Code Language Models · 2509.12054 · 2025-09-24

Text Embedding Models · 4 people

EmbeddingGemma: Open Models for Text Similarity Search

Google Gemini

Text Embedding Models · 2509.20354 · 2025-09-24

Multimodal Language Models · 19 people

Apple Intelligence Foundation Language Models: Tech Report 2025

Apple

Multimodal Language Models · 2507.13575 · 2025-07-16

Medical Multimodal Models · 45 people

MedGemma Technical Report

Google Gemini

Medical Multimodal Models · 2507.05201 · 2025-07-07

Reasoning Models · 6 people

Magistral: Efficient Training of Small Language Models for Reasoning

Mistral AI

Reasoning Models · 2506.10910 · 2025-06-12

Diffusion Language Models · 5 people

On Gemini Diffusion

Google Gemini

Diffusion Language Models · 2505.20099 · 2025-05-27

Speech Language Models · 5 people

Amazon Nova Sonic Technical Report

Amazon

Speech Language Models · 2505.11298 · 2025-05-15

Reasoning Models · 12 people

Phi-4-mini-reasoning: Exploring the Limits of Small Reasoning Language Models in Math

Microsoft

Reasoning Models · 2504.21233 · 2025-04-29

Reasoning Models · 4 people

Phi-4-reasoning Technical Report

Microsoft

Reasoning Models · 2504.21318 · 2025-04-29

Large Language Models · 79 people

Command A: An Enterprise-Ready Large Language Model

Cohere

Large Language Models · 2504.00698 · 2025-04-01

Large Language Models · 13 people

Mistral Small 3.1 Technical Report

Mistral AI

Large Language Models · 2503.23335 · 2025-03-31

Robotics · 6 people

Gemini Robotics-ER: Transforming Robotic Embodiment

Google Gemini

Robotics · 2503.20031 · 2025-03-27

Reasoning Models · 5 people

QwQ-32B: Embracing the Power of Reinforcement Learning

Alibaba Qwen

Reasoning Models · 2503.20735 · 2025-03-27

Robotics Multimodal Models · 78 people

Gemini Robotics: Bringing AI into the Physical World

Google Gemini

Robotics Multimodal Models · 2503.20020 · 2025-03-27

Multimodal Models · 10 people

Qwen2.5-Omni Technical Report

Alibaba Qwen

Multimodal Models · 2503.20215 · 2025-03-23

Text Embedding Models · 2 people

Gemini Embedding: Generalizable Embeddings From Gemini

Google Gemini

Text Embedding Models · 2503.07891 · 2025-03-11

Language Models · 12 people

Phi-4 Technical Report

Microsoft

Language Models · 2503.01743 · 2025-03-03

Vision-Language Models · 27 people

Qwen2.5-VL Technical Report

Alibaba Qwen

Vision-Language Models · 2502.13923 · 2025-02-19

Vision-Language Models · 85 people

Seed1.5-VL Technical Report

ByteDance Seed

Vision-Language Models · 2505.07062 · 2025-01-22

Large Language Models · 11 people

Kimi k1.5: Scaling Reinforcement Learning with LLMs

Moonshot AI

Large Language Models · 2501.12599 · 2025-01-21

Large Language Models · 32 people

2 OLMo 2 Furious

Ai2

Large Language Models · 2501.00656 · 2024-12-31

Vision-Language Models · 12 people

DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding

DeepSeek

Vision-Language Models · 2412.10302 · 2024-12-12

Multimodal Language Models · 10 people

NVLM: Open Frontier-Class Multimodal LLMs

NVIDIA

Multimodal Language Models · 2412.04468 · 2024-12-05

Audio Language Models · 9 people

GLM-4-Voice: Towards Intelligent and Human-Like End-to-End Spoken Chatbots

Z.ai

Audio Language Models · 2412.02612 · 2024-12-04

Large Language Models · 21 people

Tulu 3: Pushing Frontiers in Open Language Model Post-Training

Ai2

Large Language Models · 2411.15124 · 2024-11-22

Vision-Language Models · 20 people

JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation

DeepSeek

Vision-Language Models · 2411.07975 · 2024-11-11

Large Language Models · 3 people

Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent

Tencent Hunyuan

Large Language Models · 2411.02265 · 2024-11-04

Vision-Language Models · 20 people

Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation

DeepSeek

Vision-Language Models · 2410.13848 · 2024-10-18

Vision-Language Models · 16 people

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models

Ai2

Vision-Language Models · 2409.17146 · 2024-09-25

Vision-Language Models · 26 people

Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution

Alibaba Qwen

Vision-Language Models · 2409.12191 · 2024-09-18

Large Language Models · 13 people

OLMoE: Open Mixture-of-Experts Language Models

Ai2

Large Language Models · 2409.02060 · 2024-09-03

Mathematical Reasoning Models · 8 people

DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search

DeepSeek

Mathematical Reasoning Models · 2408.08152 · 2024-08-14

Multimodal Language Models · 29 people

Apple Intelligence Foundation Language Models

Apple

Multimodal Language Models · 2407.21075 · 2024-07-29

Audio Language Models · 26 people

Qwen2-Audio Technical Report

Alibaba Qwen

Audio Language Models · 2407.10759 · 2024-07-14

Vision-Language Models · 12 people

PaliGemma: A versatile 3B VLM for transfer

Google Gemini

Vision-Language Models · 2407.07726 · 2024-07-10

Large Language Models · 11 people

Open Instruct: A Simple Method for Aligning Language Models with Human Preferences

Ai2

Large Language Models · 2406.18405 · 2024-06-26

Code Language Models · 9 people

DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

DeepSeek

Code Language Models · 2406.11931 · 2024-06-17

Code Language Models · 16 people

CodeGemma: Open Code Models Based on Gemma

Google Gemini

Code Language Models · 2406.11409 · 2024-06-17

Medical Multimodal Models · 28 people

Advancing Multimodal Medical Capabilities of Gemini

Google Gemini

Medical Multimodal Models · 2405.03162 · 2024-05-06

Language Models · 5 people

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Microsoft

Language Models · 2404.14219 · 2024-04-22

Large Language Models · 27 people

RecurrentGemma: Moving Past Transformers for Efficient Open Language Models

Google Gemini

Large Language Models · 2404.07839 · 2024-04-11

Language Models · 8 people

Jamba: A Hybrid Transformer-Mamba Language Model

AI21 Labs

Language Models · 2403.19887 · 2024-03-28

Vision-Language Models · 5 people

DeepSeek-VL: Towards Real-World Vision-Language Understanding

DeepSeek

Vision-Language Models · 2403.05525 · 2024-03-08

Large Language Models · 12 people

Nemotron-4 15B Technical Report

NVIDIA

Large Language Models · 2402.16819 · 2024-02-26

Alignment and Safety · 14 people

Many-shot Jailbreaking

Anthropic

Alignment and Safety · 2402.03206 · 2024-02-12

Speech Language Models · 10 people

SPIrit-LM: Interleaved Spoken and Written Language Model

Meta AI

Speech Language Models · 2402.05755 · 2024-02-09

Mathematical Reasoning Models · 8 people

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

DeepSeek

Mathematical Reasoning Models · 2402.03300 · 2024-02-06

Large Language Models · 12 people

Mixtral of Experts

Mistral AI

Large Language Models · 2401.04088 · 2024-01-08

Large Language Models · 17 people

Tulu 2: Demystifying the Effectiveness of RLHF and Reinforcement Learning with Human Feedback

Ai2

Large Language Models · 2311.10702 · 2023-11-17

Audio Language Models · 7 people

Qwen-Audio: Advancing Universal Audio Understanding via Unified Large-Scale Audio-Language Models

Alibaba Qwen

Audio Language Models · 2311.07919 · 2023-11-13

Large Language Models · 16 people

Mistral 7B

Mistral AI

Large Language Models · 2310.06825 · 2023-10-10

Alignment and RLHF · 20 people

Collective Constitutional AI: Aligning a Language Model with Public Input

Anthropic

Alignment and RLHF · 2310.01835 · 2023-10-03

Vision-Language Models · 13 people

Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond

Alibaba Qwen

Vision-Language Models · 2308.12966 · 2023-08-24

Code Language Models · 7 people

Code Llama: Open Foundation Models for Code

Meta AI

Code Language Models · 2308.12950 · 2023-08-24

Large Language Models · 14 people

LLaMA: Open and Efficient Foundation Language Models

Meta AI

Large Language Models · 2302.13971 · 2023-02-27

Alignment and RLHF · 35 people

Constitutional AI: Harmlessness from AI Feedback

Anthropic

Alignment and RLHF · 2212.08073 · 2022-12-15

Alignment and RLHF · 26 people

Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback

Anthropic

Alignment and RLHF · 2204.05862 · 2022-04-12

Alignment · 11 people

Training language models to follow instructions with human feedback

OpenAI

Alignment · 2203.02155 · 2022-03-04

Large Language Models · 31 people

Language Models are Few-Shot Learners

OpenAI

Large Language Models · 2005.14165 · 2020-05-28

Report · 18 people

GPT-4o System Card

OpenAI

2410.21276

Report · 4 people

BitNet b1.58 2B4T Technical Report

Microsoft

2504.12285

Report · 3 people

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Microsoft

2402.17764

Report · 1 person

FastVLM: Efficient Vision Encoding for Vision Language Models

Apple

2412.13303

Report · 1 person

OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework

Apple

2404.14619

Report · 133 people

Seed1.5-Thinking: Advancing Superb Reasoning Models with Reinforcement Learning

ByteDance Seed

2504.13914

Vision-Language Models · 1 person

MiniMax-VL-01

MiniMax

Vision-Language Models · 2501.08336

Interpretability · 14 people

Tracing the thoughts of a large language model

Anthropic

Interpretability · 2503.21435

Interpretability · 7 people

On the Biology of a Large Language Model

Anthropic

Interpretability · 2504.19173

Alignment and Safety · 14 people

Constitutional Classifiers++: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming

Anthropic

Alignment and Safety · 2601.04603

Alignment and Safety · 7 people

Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming

Anthropic

Alignment and Safety · 2501.18837

Speech Language Models · 11 people

Voxtral Technical Report

Mistral AI

Speech Language Models · 2507.13264

Reasoning Models · 3 people

Nemotron-CrossThink: Efficient Knowledge Distillation of Long Chain-of-Thought Reasoning

NVIDIA

Reasoning Models · 2504.13941

Reasoning Models · 4 people

Nemotron 3 Super: Open, efficient mixture-of-experts hybrid mamba-transformer model for agentic reasoning

NVIDIA

Reasoning Models · 2601.11868

Large Language Models · 4 people

NVIDIA Nemotron 3: Efficient and Open Intelligence

NVIDIA

Large Language Models · 2512.20856

Reasoning Models · 4 people

Nemotron 3 nano: Open, efficient mixture-of-experts hybrid mamba-transformer model for agentic reasoning

NVIDIA

Reasoning Models · 2512.20848

Reasoning Models · 2 people

NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model

NVIDIA

Reasoning Models · 2508.14444

Large Language Models · 4 people

Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models

NVIDIA

Large Language Models · 2504.03624

Mathematical Reasoning Models · 6 people

DeepSeek-Prover-V2: Advancing Formal Mathematical Reasoning via Reinforcement Learning and Monte-Carlo Tree Search with Proof Assistant Feedback

DeepSeek

Mathematical Reasoning Models · 2508.03613

Language Models · 5 people

Large Concept Models: Language Modeling in a Sentence Representation Space

Meta AI

Language Models · 2502.06018

Multimodal Models · 16 people

Qwen3-Omni Technical Report

Alibaba Qwen

Multimodal Models · 2509.17765

Speech Language Models · 7 people

MiniMax-Speech: Intrinsic Zero-Shot Speech Understanding for Advanced Foundation Models

MiniMax

Speech Language Models · 2505.07916

Language Models · 4 people

GLM-4.5: Agentic, Reasoning, and Coding Foundation Models

Z.ai

Language Models · 2508.06471

Reasoning Models · 4 people

GLM-Z1-Rumination: An Open Frontier-Class Reasoning Model Through Test-Time Scaling

Z.ai

Reasoning Models · 2506.17434

Vision-Language Models · 10 people

PaliGemma 2: A Family of Versatile VLMs for Transfer

Google Gemini

Vision-Language Models · 2412.03555

Alignment and Safety · 11 people

Auditing language models for hidden objectives

Anthropic

Alignment and Safety · 2507.11473

Alignment and Safety · 12 people

Alignment faking in large language models

Anthropic

Alignment and Safety · 2412.14093

Alignment and Safety · 28 people

Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training

Anthropic

Alignment and Safety · 2401.05566

Large Language Models · 17 people

Amazon Nova Premier Technical Report

Amazon

Large Language Models · 2504.01081

Multimodal Language Models · 7 people

Aya Vision: Advancing the Frontier of Multilingual Multimodality

Cohere

Multimodal Language Models · 2410.14756

Multimodal Language Models · 17 people

MM1.5: Methods, Analysis and Insights from Multimodal LLM Fine-tuning

Apple

Multimodal Language Models · 2409.20566

Multimodal Language Models · 21 people

MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training

Apple

Multimodal Language Models · 2403.09611

Reasoning Models · 16 people

OpenAI o3 and o4-mini System Card

OpenAI

Reasoning Models · 2504.21798

Reasoning Models · 18 people

OpenAI o1 System Card

OpenAI

Reasoning Models · 2412.16720

Large Language Models · 74 people

Nemotron-4 340B Technical Report

NVIDIA

Large Language Models · 2406.11704

Language Models · 12 people

Jamba 1.5 Technical Report

AI21 Labs

Language Models · 2508.15167

Large Language Models · 102 people

PaLM 2 Technical Report

Google Gemini

Large Language Models · 2305.10403

Multimodal Language Models · 18 people

PaLM-E: An Embodied Multimodal Language Model

Google Gemini

Multimodal Language Models · 2303.03378

Code Language Models · 8 people

Qwen2.5-Coder Technical Report

Alibaba Qwen

Code Language Models · 2409.12186

Large Language Models · 30 people

OLMo: Accelerating the Science of Language Models

Ai2

Large Language Models · 2402.00838

Large Language Models · 21 people

PaLM: Scaling Language Modeling with Pathways

Google Gemini

Large Language Models · 2204.02311

Multimodal Large Language Models · 17 people

Pixtral 12B

Mistral AI

Multimodal Large Language Models · 2410.17897

Multimodal Large Language Models · 44 people

Gemma 3n Technical Report

Google Gemini

Multimodal Large Language Models · 2505.13426

Multimodal Large Language Models · 13 people

Janus-Pro: Unified Multimodal Understanding and Generation with Data and Model Scaling

DeepSeek

Multimodal Large Language Models · 2501.17811

Reasoning Large Language Models · 20 people

MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention

MiniMax

Reasoning Large Language Models · 2506.13585

Multimodal Large Language Models · 20 people

Chameleon: Mixed-Modal Early-Fusion Foundation Models

Meta AI

Multimodal Large Language Models · 2405.09818

Multimodal Large Language Models · 11 people

Gemma 3 Technical Report

Google Gemini

Multimodal Large Language Models · 2503.19786

Large Language Models · 16 people

Gemma 2: Improving Open Language Models at a Practical Size

Google Gemini

Large Language Models · 2408.00118

Large Language Models · 20 people

Gemma: Open Models Based on Gemini Research and Technology

Google Gemini

Large Language Models · 2403.08295

Large Language Models · 8 people

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

DeepSeek

Large Language Models · 2501.12948

Multimodal Models · 8 people

GLM-4.1V-Thinking and GLM-4.5V: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Z.ai

Multimodal Models · 2507.01006

Large Language Models · 12 people

MiniMax-01: Scaling Foundation Models with Lightning Attention

MiniMax

Large Language Models · 2501.08313

Large Language Models · 45 people

The Llama 3 Herd of Models

Meta AI

Large Language Models · 2407.21783

Large Language Models · 20 people

Llama 2: Open Foundation and Fine-Tuned Chat Models

Meta AI

Large Language Models · 2307.09288

Multimodal Models · 51 people

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Google Gemini

Multimodal Models · 2403.05530

Multimodal Models · 58 people

Gemini: A Family of Highly Capable Multimodal Models

Google Gemini

Multimodal Models · 2312.11805

Multimodal Agentic Models · 238 people

Kimi K2.5: Visual Agentic Intelligence

Moonshot AI

Multimodal Agentic Models · 2602.02276

Vision-Language Models · 94 people

Kimi-VL Technical Report

Moonshot AI

Vision-Language Models · 2504.07491

Large Language Models · 62 people

DeepSeek LLM Technical Report

DeepSeek

Large Language Models · 2401.02954

Large Language Models · 98 people

DeepSeek-V2 Technical Report

DeepSeek

Large Language Models · 2405.04434

Large Language Models · 128 people

DeepSeek-V3 Technical Report

DeepSeek

Large Language Models · 2412.19437

Large Language Models · 42 people

Qwen2.5 Technical Report

Alibaba Qwen

Large Language Models · 2412.15115

Large Language Models · 60 people

Qwen3 Technical Report

Alibaba Qwen

Large Language Models · 2505.09388

Large Language Models · 48 people

Qwen Technical Report

Alibaba Qwen

Large Language Models · 2309.16609

Large Language Models · 277 people

GPT-4 Technical Report

OpenAI

Large Language Models · 2303.08774
