Models: 222

Active filters: vLLM

mistralai/Mistral-Medium-3.5-128B

128B • Updated May 4 • 444k • 366

mistralai/Mistral-Small-4-119B-2603

119B • Updated 19 days ago • 73.5k • 396

mistralai/Mistral-Small-4-119B-2603-NVFP4

Updated 19 days ago • 1.41k • 100

QuantTrio/Qwen3.5-9B-AWQ

Image-Text-to-Text • 10B • Updated Mar 4 • 559k • 22

mistralai/Mistral-Medium-3.5-128B-EAGLE

Updated Apr 30 • 279 • 48

unsloth/Mistral-Small-4-119B-2603-GGUF

119B • Updated Apr 20 • 8.98k • 73

QuantTrio/Qwen3-235B-A22B-Instruct-2507-AWQ

Text Generation • 235B • Updated Aug 19, 2025 • 13.5k • 11

QuantTrio/Qwen3.5-122B-A10B-AWQ

Image-Text-to-Text • 125B • Updated Feb 26 • 15.5k • 28

mistralai/Mistral-Small-4-119B-2603-eagle

Updated Apr 27 • 296 • 52

bartowski/mistralai_Mistral-Small-4-119B-2603-GGUF

Image-Text-to-Text • 119B • Updated Mar 22 • 2.49k • 12

mradermacher/Mistral-Small-4-119B-2603-GGUF

119B • Updated Mar 22 • 265 • 1

mradermacher/Mistral-Small-4-119B-2603-i1-GGUF

119B • Updated 13 days ago • 4.46k • 3

QuantTrio/Qwen3.6-27B-AWQ

Image-Text-to-Text • 28B • Updated Apr 23 • 524k • 15

model-scope/glm-4-9b-chat-GPTQ-Int4

Text Generation • 9B • Updated Jul 17, 2024 • 64 • 6

model-scope/glm-4-9b-chat-GPTQ-Int8

Text Generation • 9B • Updated Jul 23, 2024 • 5 • 2

tclf90/qwen2.5-72b-instruct-gptq-int4

Text Generation • 73B • Updated May 12, 2025 • 100 • 2

tclf90/qwen2.5-72b-instruct-gptq-int3

Text Generation • 69B • Updated May 12, 2025 • 89

prithivMLmods/Nu2-Lupi-Qwen-14B

Text Generation • 15B • Updated Mar 27, 2025 • 7 • 2

mradermacher/Nu2-Lupi-Qwen-14B-GGUF

15B • Updated Jul 11, 2025 • 96 • 1

mradermacher/Nu2-Lupi-Qwen-14B-i1-GGUF

15B • Updated Jul 11, 2025 • 261 • 1

JunHowie/Qwen3-0.6B-GPTQ-Int4

Text Generation • 0.6B • Updated Sep 3, 2025 • 230 • 1

JunHowie/Qwen3-0.6B-GPTQ-Int8

Text Generation • 0.6B • Updated Sep 3, 2025 • 8

JunHowie/Qwen3-1.7B-GPTQ-Int4

Text Generation • 2B • Updated Sep 3, 2025 • 193 • 1

JunHowie/Qwen3-1.7B-GPTQ-Int8

Text Generation • 2B • Updated Sep 3, 2025 • 10

JunHowie/Qwen3-32B-GPTQ-Int4

Text Generation • 33B • Updated Sep 5, 2025 • 1.83k • 4

JunHowie/Qwen3-32B-GPTQ-Int8

Text Generation • 33B • Updated Sep 5, 2025 • 153 • 4

JunHowie/Qwen3-30B-A3B-GPTQ-Int4

Text Generation • 5B • Updated Sep 6, 2025 • 20 • 1

JunHowie/Qwen3-14B-GPTQ-Int8

Text Generation • 15B • Updated Sep 5, 2025 • 43 • 1

JunHowie/Qwen3-14B-GPTQ-Int4

Text Generation • 15B • Updated Sep 5, 2025 • 37.6k • 4

JunHowie/Qwen3-8B-GPTQ-Int8

Text Generation • 8B • Updated Sep 4, 2025 • 2.54k