Active filters: 2-bit
sergiones/Mistral-quantized-2b • Text Generation • 0.7B • Updated • 10
CallMcMargin/Qwen2.5-14B-Instruct-1M-abliterated-mlx-bf16-affine-qgroup32-q2 • Text Generation • 15B • Updated • 5
Infatoshi/Qwen3-Next-80B-A3B-Thinking-EXL3-2.0bpw • Text Generation • Updated • 1
saviochow/GLM-4.6-mlx-2Bit • Text Generation • 353B • Updated • 76
garrison/Precog-123B-v1-mlx-2Bit • 123B • Updated • 6
garrison/Precog-24B-v1-mlx-2Bit • 24B • Updated • 4
MaziyarPanahi/VibeThinker-1.5B-GGUF • Text Generation • 2B • Updated • 1.13k • 35
MaziyarPanahi/AesCoder-4B-GGUF • Text Generation • 4B • Updated • 57 • 1
MaziyarPanahi/MiniCPM4.1-8B-GGUF • Text Generation • 8B • Updated • 39
MaziyarPanahi/MiroThinker-v1.0-8B-GGUF • Text Generation • 8B • Updated • 61
MaziyarPanahi/MiroThinker-v1.0-30B-GGUF • Text Generation • 31B • Updated • 38
MaziyarPanahi/Apertus-8B-Instruct-2509-GGUF • Text Generation • 8B • Updated • 118
garrison/GLM-4.5-Air-REAP-82B-A12B-mlx-2Bit • Text Generation • 82B • Updated • 47
MaziyarPanahi/Qwen3-4B-Thinking-2507-GGUF • Text Generation • 4B • Updated • 63.2k • 2
MaziyarPanahi/Qwen3-30B-A3B-Thinking-2507-GGUF • Text Generation • 31B • Updated • 60
garrison/Snowpiercer-15B-v4-mlx-2Bit • 1B • Updated • 9
garrison/GLM-4.5-Air-Derestricted-mlx-2Bit • Text Generation • 107B • Updated • 26
garrison/Olmo-3-32B-Think-mlx-2Bit • Text Generation • 32B • Updated • 6
ncls-p/INTELLECT-3-mlx-2Bit • Text Generation • 107B • Updated • 10 • 1
fifrio/Qwen3-1.7B-gptq-2bit-calibration-Chinese • 2B • Updated • 1
MaziyarPanahi/NVIDIA-Nemotron-Nano-12B-v2-GGUF • Text Generation • 12B • Updated • 63.9k
MaziyarPanahi/Olmo-3-32B-Think-GGUF • Text Generation • 32B • Updated • 20
MaziyarPanahi/Olmo-3-7B-Think-GGUF • Text Generation • 7B • Updated • 19
MaziyarPanahi/Olmo-3-7B-Instruct-GGUF • Text Generation • 7B • Updated • 16
MaziyarPanahi/NVIDIA-Nemotron-Nano-9B-v2-GGUF • Text Generation • 9B • Updated • 843 • 3
introvoyz041/Apriel-1.5-15b-Thinker-2bit-MLX-mlx-4Bit • Image-Text-to-Text • 1B • Updated • 21
MaziyarPanahi/Ministral-3-3B-Reasoning-2512-GGUF • Text Generation • 3B • Updated • 124k • 3
MaziyarPanahi/Ministral-3-8B-Reasoning-2512-GGUF • Text Generation • 8B • Updated • 98 • 1
MaziyarPanahi/Trinity-Nano-Preview-GGUF • Text Generation • 6B • Updated • 61 • 1
MaziyarPanahi/Trinity-Mini-GGUF • Text Generation • 26B • Updated • 59.8k • 1