Instructions to use TIGER-Lab/Qwen2.5-Math-7B-CFT with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use TIGER-Lab/Qwen2.5-Math-7B-CFT with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="TIGER-Lab/Qwen2.5-Math-7B-CFT")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("TIGER-Lab/Qwen2.5-Math-7B-CFT")
model = AutoModelForCausalLM.from_pretrained("TIGER-Lab/Qwen2.5-Math-7B-CFT")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```
- Inference
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use TIGER-Lab/Qwen2.5-Math-7B-CFT with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "TIGER-Lab/Qwen2.5-Math-7B-CFT"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "TIGER-Lab/Qwen2.5-Math-7B-CFT",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
Use Docker
```shell
docker model run hf.co/TIGER-Lab/Qwen2.5-Math-7B-CFT
```
- SGLang
How to use TIGER-Lab/Qwen2.5-Math-7B-CFT with SGLang:
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "TIGER-Lab/Qwen2.5-Math-7B-CFT" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "TIGER-Lab/Qwen2.5-Math-7B-CFT",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "TIGER-Lab/Qwen2.5-Math-7B-CFT" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "TIGER-Lab/Qwen2.5-Math-7B-CFT",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
- Docker Model Runner
How to use TIGER-Lab/Qwen2.5-Math-7B-CFT with Docker Model Runner:
```shell
docker model run hf.co/TIGER-Lab/Qwen2.5-Math-7B-CFT
```
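The curl calls shown for vLLM and SGLang can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_payload` and `chat` are hypothetical, and the response parsing assumes the usual OpenAI-compatible chat completion shape (`choices[0].message.content`). It needs one of the servers above to be running before `chat` is called.

```python
import json
import urllib.request

MODEL_ID = "TIGER-Lab/Qwen2.5-Math-7B-CFT"


def build_chat_payload(prompt: str, model: str = MODEL_ID) -> dict:
    """Build the same JSON body that the curl examples send (hypothetical helper)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def chat(prompt: str, base_url: str = "http://localhost:8000") -> str:
    """POST a single-turn chat request and return the assistant's reply text.

    Assumes an OpenAI-compatible server is listening at base_url.
    """
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_chat_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumed response layout: choices[0].message.content
    return body["choices"][0]["message"]["content"]


# Usage (requires a running server):
# print(chat("What is the capital of France?"))                                      # vLLM, port 8000
# print(chat("What is the capital of France?", base_url="http://localhost:30000"))  # SGLang, port 30000
```

Only `base_url` differs between the two servers: port 8000 for the vLLM example and port 30000 for the SGLang examples.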
Qwen2.5-Math-7B-CFT
Introduction
Qwen2.5-Math-7B-CFT is a 7B-parameter mathematical reasoning model trained with our novel Critique Fine-Tuning (CFT) approach. Rather than using traditional supervised fine-tuning (SFT) to imitate correct answers, CFT teaches the model to critique and analyze candidate responses, which leads to deeper understanding and stronger reasoning capabilities.
The model demonstrates that learning to critique can be more effective than learning to imitate. Despite being trained on just 50K samples, it matches or exceeds models trained on 2M+ samples, reaching 79.4% accuracy on MATH and 41.6% on OlympiadBench.
Key Features
- Novel training methodology inspired by human learning processes that emphasize critical thinking
- Consistent 4-10% improvement over traditional SFT approaches across six math benchmarks
- Exceptional data efficiency: matches performance of models trained on 40x more data
- Built on the strong foundation of Qwen2.5-Math-7B
Training Details
Training Data
- Dataset: WebInstruct-CFT-50K
- Training format: (input=[query; noisy response], output=critique)
- Teacher model: GPT-4o for generating critiques
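The training format above, (input=[query; noisy response], output=critique), can be illustrated with a small sketch. The prompt template, the `build_cft_example` helper, and the toy sample are all hypothetical; the exact wording used in WebInstruct-CFT-50K is not specified here.

```python
from typing import Dict

# Hypothetical template: the real dataset's prompt wording is not given in this card.
CRITIQUE_PROMPT = (
    "Question:\n{query}\n\n"
    "Candidate solution:\n{noisy_response}\n\n"
    "Critique the candidate solution and state whether it is correct."
)


def build_cft_example(query: str, noisy_response: str, critique: str) -> Dict[str, str]:
    """Pair [query; noisy response] as the input with the critique as the target."""
    return {
        "input": CRITIQUE_PROMPT.format(query=query, noisy_response=noisy_response),
        "output": critique,
    }


# Toy sample for illustration only (not taken from the dataset).
example = build_cft_example(
    query="What is 7 * 8?",
    noisy_response="7 * 8 = 54",
    critique="The multiplication is wrong: 7 * 8 = 56, not 54. Conclusion: incorrect.",
)
print(example["input"])
print(example["output"])
```

The point of the sketch is that the supervised target is the critique, not a corrected answer: the possibly wrong response sits on the input side, next to the query.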
Training Infrastructure
- Framework: LLaMA-Factory
- Hardware: 8x NVIDIA H100 GPUs
- Training time: ~1 hour with DeepSpeed ZeRO-3
Evaluation Results
Table 1: Performance comparison of Qwen2.5-Math-7B-CFT vs. other reasoning-specialized models.
For more details about the model architecture, methodology, and comprehensive evaluation results, please visit our project webpage.