Models

40,318

Active filters: 4-bit

unsloth/mistral-7b-instruct-v0.1-bnb-4bit

Text Generation • 4B • Updated Sep 11, 2024 • 545 • 7

unsloth/llama-3-70b-bnb-4bit

Text Generation • 37B • Updated Nov 22, 2024 • 1.56k • 47

unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit

Text Generation • 5B • Updated Feb 15 • 290k • 89

unsloth/Meta-Llama-3.1-70B-Instruct-bnb-4bit

Text Generation • 37B • Updated Nov 22, 2024 • 6.7k • 32

hugging-quants/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4

Text Generation • 2B • Updated Aug 7, 2024 • 9.03k • 40

Qwen/Qwen2-VL-72B-Instruct-GPTQ-Int4

Image-Text-to-Text • 13B • Updated Sep 24, 2024 • 380 • 29

Qwen/Qwen2.5-32B-Instruct-AWQ

Text Generation • 6B • Updated Oct 9, 2024 • 1.21M • 90

Qwen/Qwen2.5-Coder-7B-Instruct-AWQ

Text Generation • 2B • Updated Nov 18, 2024 • 506k • 18

unsloth/Llama-3.2-11B-Vision-Instruct-bnb-4bit

Image-to-Text • 6B • Updated Dec 10, 2024 • 517k • 80

SeanScripts/Llama-3.2-11B-Vision-Instruct-nf4

Image-Text-to-Text • 6B • Updated Sep 26, 2024 • 168 • 13

hugging-quants/gemma-2-9b-it-AWQ-INT4

Text Generation • 2B • Updated Oct 17, 2024 • 9.87k • 7

mlx-community/Ministral-8B-Instruct-2410-4bit

1B • Updated Oct 17, 2024 • 963 • 10

MaziyarPanahi/Llama-3.3-70B-Instruct-GGUF

Text Generation • 71B • Updated Dec 7, 2024 • 108k • 20

shuyuej/Llama-3.3-70B-Instruct-GPTQ

11B • Updated Dec 22, 2024 • 1.15k • 6

unsloth/DeepSeek-R1-Distill-Llama-70B-bnb-4bit

Text Generation • 37B • Updated Feb 14 • 10.7k • 24

mlx-community/Saka-14B-4bit

Text Generation • 2B • Updated Feb 12 • 14 • 2

Qwen/Qwen2.5-VL-72B-Instruct-AWQ

Image-Text-to-Text • 13B • Updated Mar 7 • 66k • 69

Qwen/Qwen2.5-VL-7B-Instruct-AWQ

Image-Text-to-Text • 3B • Updated Apr 6 • 163k • 94

unsloth/Llama-3.1-8B-unsloth-bnb-4bit

Text Generation • 5B • Updated Feb 15 • 4.55k • 5

empirischtech/DeepSeek-R1-Distill-Qwen-32B-gptq-4bit

Text Generation • 6B • Updated Feb 16 • 648 • 5

RichardErkhov/MrezaPRZ_-_CodeLlama-7B-postgres-expert-4bits

4B • Updated Feb 26 • 8 • 1

unsloth/Phi-4-mini-instruct-unsloth-bnb-4bit

Text Generation • 2B • Updated Jul 31 • 9.08k • 21

mlx-community/Phi-4-mini-instruct-4bit

Text Generation • 0.6B • Updated Mar 5 • 1.66k • 1

secemp9/TraceBack-12b

Text Generation • 7B • Updated Mar 14 • 50 • 32

RichardErkhov/zjj815_-_Qwen1.5-4B-Chinese-toxic-content-detection-4bits

2B • Updated Apr 6 • 14 • 1

gaunernst/gemma-3-27b-it-qat-autoawq

Image-Text-to-Text • 6B • Updated Apr 20 • 11.8k • 12

mlx-community/Qwen3-0.6B-4bit

Text Generation • 93.2M • Updated Apr 28 • 6.06k • 7

mlx-community/Qwen3-4B-4bit

Text Generation • 0.6B • Updated Apr 28 • 7.85k • 10

Qwen/Qwen3-32B-AWQ

Text Generation • 6B • Updated May 21 • 131k • 116

Qwen/Qwen3-14B-AWQ

Text Generation • 3B • Updated May 21 • 205k • 46