DeepSeek-V4-Flash-FP8

An FP8 re-packaging of deepseek-ai/DeepSeek-V4-Flash. The model architecture, tokenizer, chat template, and reference encoding/ are unchanged from the base repo. This is a weights-only conversion: no fine-tuning, no retraining.

Deployment

SGLang Cookbook: https://docs.sglang.io/cookbook/autoregressive/DeepSeek/DeepSeek-V4
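As a rough sketch (the exact flags and tensor-parallel size are illustrative; the cookbook linked above is the authoritative guide), serving this checkpoint with SGLang's standard server entry point typically looks like:

```shell
# Illustrative launch command using SGLang's launch_server module.
# Adjust --tp to match your GPU count; the model path assumes this repo's id.
python -m sglang.launch_server \
  --model-path Pinaster/DeepSeek-V4-Flash-FP8-4layer \
  --tp 8 \
  --trust-remote-code
```

The FP8 scales are stored alongside the weights, so no extra quantization flag should be needed when the checkpoint's config already declares its quantization scheme.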

License

MIT — see LICENSE. Copyright © DeepSeek.

Safetensors · Model size: 27B params · Tensor types: BF16, I64, F32, F8_E4M3
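For readers unfamiliar with the F8_E4M3 tensor type listed above, a short stand-alone sketch (the helper name is mine, not part of this repo) shows how an E4M3fn byte maps to a float: 1 sign bit, 4 exponent bits with bias 7, 3 mantissa bits, and a single NaN encoding in place of infinities, giving a maximum finite value of 448.

```python
def decode_e4m3fn(byte: int) -> float:
    # FP8 E4M3 (fn variant): 1 sign, 4 exponent (bias 7), 3 mantissa bits.
    # The all-ones pattern E=1111, M=111 encodes NaN; there are no infinities.
    s = (byte >> 7) & 0x1
    e = (byte >> 3) & 0xF
    m = byte & 0x7
    if e == 0xF and m == 0x7:
        return float("nan")
    if e == 0:
        val = (m / 8) * 2 ** (-6)        # subnormal range
    else:
        val = (1 + m / 8) * 2 ** (e - 7)  # normal range
    return -val if s else val

# Largest finite E4M3fn value: S=0, E=1111, M=110 -> 1.75 * 2^8
print(decode_e4m3fn(0b0_1111_110))  # 448.0
```

The narrow 448-value ceiling is why FP8 checkpoints ship per-tensor or per-block scale factors next to the quantized weights.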

Model tree for Pinaster/DeepSeek-V4-Flash-FP8-4layer: one of 13 quantized variants of the base model.