Instructions for using TIGER-Lab/Qwen2.5-Math-7B-CFT with libraries and local apps.

Libraries

How to use TIGER-Lab/Qwen2.5-Math-7B-CFT with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="TIGER-Lab/Qwen2.5-Math-7B-CFT")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("TIGER-Lab/Qwen2.5-Math-7B-CFT")
model = AutoModelForCausalLM.from_pretrained("TIGER-Lab/Qwen2.5-Math-7B-CFT")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
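
Since this is a math model, a more representative prompt than "Who are you?" is an actual problem. The sketch below builds such a message and pulls a final answer out of the generated text. It assumes the common convention of asking math models to put the final answer in \boxed{}; this card does not state the prompt format used for this model, and the helper names `build_math_messages` and `extract_boxed_answer` are hypothetical.

```python
from typing import Dict, List, Optional


def build_math_messages(problem: str) -> List[Dict[str, str]]:
    """Wrap a math problem in the chat-message format used by the snippets above."""
    # Assumption: requesting a \boxed{} final answer is a common convention for
    # math models; this card does not confirm the prompt used for this model.
    instruction = "Please reason step by step, and put your final answer within \\boxed{}."
    return [{"role": "user", "content": problem + "\n" + instruction}]


def extract_boxed_answer(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in text, or None if absent."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth = 1  # we are inside the opening brace of \boxed{
    chars = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars)
        chars.append(ch)
    return None  # braces never closed
```

The result of `build_math_messages(...)` can be passed as `messages` to either snippet above. For multi-step problems, `max_new_tokens` will likely need to be much larger than 40.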

Local Apps

vLLM

How to use TIGER-Lab/Qwen2.5-Math-7B-CFT with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "TIGER-Lab/Qwen2.5-Math-7B-CFT"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "TIGER-Lab/Qwen2.5-Math-7B-CFT",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
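
The same request can be sent from Python. This is a minimal sketch using only the standard library, assuming the server started above is reachable at `http://localhost:8000`; the function names `build_chat_request` and `chat` are hypothetical helpers, not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "TIGER-Lab/Qwen2.5-Math-7B-CFT"


def build_chat_request(prompt, base_url="http://localhost:8000", model=MODEL_ID):
    """Build a POST request for the OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        base_url + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the prompt to a running server and return the assistant's reply text."""
    request = build_chat_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # OpenAI-compatible responses put the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(chat("What is the capital of France?"))` mirrors the curl call. Passing `base_url="http://localhost:30000"` targets a server listening on that port instead.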

Use Docker

docker model run hf.co/TIGER-Lab/Qwen2.5-Math-7B-CFT

SGLang

How to use TIGER-Lab/Qwen2.5-Math-7B-CFT with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "TIGER-Lab/Qwen2.5-Math-7B-CFT" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "TIGER-Lab/Qwen2.5-Math-7B-CFT",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "TIGER-Lab/Qwen2.5-Math-7B-CFT" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "TIGER-Lab/Qwen2.5-Math-7B-CFT",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner

How to use TIGER-Lab/Qwen2.5-Math-7B-CFT with Docker Model Runner:

```
docker model run hf.co/TIGER-Lab/Qwen2.5-Math-7B-CFT
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

Qwen2.5-Math-7B-CFT

Introduction

Qwen2.5-Math-7B-CFT is a 7B-parameter mathematical reasoning model trained with our novel Critique Fine-Tuning (CFT) approach. Where traditional supervised fine-tuning (SFT) trains a model to imitate correct answers, CFT trains it to critique and analyze responses, leading to deeper understanding and enhanced reasoning capabilities.

The model demonstrates that learning to critique is more effective than learning to imitate. Trained on just 50K samples, it matches or exceeds models trained on 2M+ samples, reaching 79.4% accuracy on MATH and 41.6% on OlympiadBench.
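
To make the difference between imitating and critiquing concrete, here is a rough sketch of how a training example might be shaped under each approach. The field names, prompt template, and function names are hypothetical illustrations; this card does not specify the actual CFT data format.

```python
from typing import Dict


def to_sft_example(question: str, reference_solution: str) -> Dict[str, str]:
    """SFT-style example: the target is a correct solution to imitate."""
    return {"input": question, "target": reference_solution}


def to_cft_example(question: str, candidate_response: str, critique: str) -> Dict[str, str]:
    """CFT-style example: the target is a critique of a given response.

    Assumption: the model is shown the question together with a candidate
    response (which may be wrong) and is trained to produce the critique.
    The template below is illustrative only.
    """
    prompt = (
        "Question:\n" + question + "\n\n"
        "Candidate response:\n" + candidate_response + "\n\n"
        "Critique the response and state whether it is correct."
    )
    return {"input": prompt, "target": critique}


sft = to_sft_example("What is 2 + 3?", "2 + 3 = 5")
cft = to_cft_example("What is 2 + 3?", "2 + 3 = 6", "Incorrect: 2 + 3 = 5, not 6.")
```

In the SFT example the model only ever sees a correct answer; in the CFT example it must locate and explain the error in a flawed one.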

Key Features

Novel training methodology inspired by human learning processes that emphasize critical thinking
Consistent 4-10% improvement over traditional SFT approaches across six math benchmarks
Exceptional data efficiency: matches performance of models trained on 40x more data
Built on the strong foundation of Qwen2.5-Math-7B