RoadQAQ/ReLIFT-Qwen2.5-Math-7B-Zero

Tags: Question Answering, Transformers, Safetensors, qwen2, text-generation, text-generation-inference

This repository contains the ReLIFT model presented in the paper "Learning What Reinforcement Learning Can't: Interleaved Online Fine-Tuning for Hardest Questions".

Code: https://github.com/TheRoadQaQ/ReLIFT

Hugging Face Collection: https://huggingface.co/collections/RoadQAQ/relift-684535e199a909cad16d8b05
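
The card does not include usage instructions. Since the repository is tagged Transformers, qwen2, and text-generation, loading it as a causal language model is a reasonable guess. The following is a minimal sketch under that assumption, not the authors' documented procedure. The prompt format in `build_prompt`, the `bfloat16` load (the stored tensors are F32), and the helper names are all hypothetical; check the linked code repository for the template actually used in training and evaluation.

```python
MODEL_ID = "RoadQAQ/ReLIFT-Qwen2.5-Math-7B-Zero"


def build_prompt(question: str) -> str:
    """Wrap a question in a plain-text prompt.

    Hypothetical format: the model card does not specify a prompt template.
    """
    return f"Question: {question.strip()}\nAnswer: Let's think step by step.\n"


def generate_answer(question: str, max_new_tokens: int = 512) -> str:
    """Download the checkpoint and greedily decode an answer to one question."""
    # Heavy imports stay local so build_prompt can be used without them.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    device = "cuda" if torch.cuda.is_available() else "cpu"
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # Assumption: bfloat16 roughly halves memory relative to the stored F32 weights.
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16
    ).to(device)
    model.eval()

    inputs = tokenizer(build_prompt(question), return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )

    # Drop the prompt tokens and decode only the newly generated continuation.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_answer("What is the sum of the first 10 positive integers?")` downloads the full checkpoint on first use, so it needs substantial disk space and memory. Greedy decoding (`do_sample=False`) is chosen here only to make runs repeatable.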

Safetensors model size: 8B params
Tensor type: F32

Collection including RoadQAQ/ReLIFT-Qwen2.5-Math-7B-Zero

ReLIFT (Collection)
ReLIFT, a training method that interleaves RL with online FT, achieving superior performance and efficiency compared to using RL or SFT alone. • 8 items • Updated Jun 10, 2025

Paper for RoadQAQ/ReLIFT-Qwen2.5-Math-7B-Zero

Learning What Reinforcement Learning Can't: Interleaved Online Fine-Tuning for Hardest Questions

Paper • 2506.07527 • Published Jun 9, 2025