Active filters: GRPO
inclusionAI/AReaL-SEA-235B-A22B
Text Generation
• 235B • Updated
• 26
• 3
Ihor/Text2Graph-R1-Qwen2.5-0.5b
Text Generation
• 0.5B • Updated
• 2.23k
• 24
prithivMLmods/Bellatrix-Tiny-1B-R1
Text Generation
• 1B • Updated
• 44
• 1
mradermacher/Bellatrix-Tiny-1B-R1-GGUF
1B • Updated
• 109
mradermacher/Bellatrix-Tiny-1B-R1-i1-GGUF
1B • Updated
• 118
Novaciano/Bellatrix-1B-R1_Erotiquant3_IQ4_XS-GGUF
Text Generation
• 1B • Updated
• 9
Novaciano/Bellatrix-1B-R1_Erotiquant3_Q5_K_M-GGUF
Text Generation
• 1B • Updated
• 4
Reinforcement Learning
• Updated
mradermacher/Text2Graph-R1-Qwen2.5-0.5b-GGUF
0.5B • Updated
• 175
• 1
mradermacher/Text2Graph-R1-Qwen2.5-0.5b-i1-GGUF
0.5B • Updated
• 499
• 1
alpha-ai/Deep-Reason-SMALL-V0-GGUF
3B • Updated
• 28
• 1
alpha-ai/Deep-Reason-SMALL-V0
Text Generation
• 3B • Updated
• 6
• 2
mradermacher/Deep-Reason-SMALL-V0-GGUF
3B • Updated
• 31
• 2
mradermacher/Deep-Reason-SMALL-V0-i1-GGUF
3B • Updated
• 149
• 1
alpha-ai/qwen2.5-reason-thought-lite-GGUF
3B • Updated
• 72
alpha-ai/qwen2.5-reason-thought-lite
Text Generation
• 3B • Updated
• 6
alpha-ai/llama-3.2-3B-Reason-Reflect-Lite-GGUF
3B • Updated
• 23
• 2
alpha-ai/llama-3.2-3B-Reason-Reflect-Lite
Text Generation
• 3B • Updated
• 1
mradermacher/Cogito-R1-GGUF
33B • Updated
• 98
accuracy-maker/Llama-3.2-1B-GRPO-gsm8k
Text Generation
• 1B • Updated
• 6
mradermacher/Cogito-R1-i1-GGUF
33B • Updated
• 208
AaryanK/Qwen_2.5_3B_GRPO_Reasoning_XIOSERV
3B • Updated
• 22
• 1
Nitral-AI/Captain-Eris_Violet-GRPO-v0.420
Text Generation
• 12B • Updated
• 191
• 24
prithivMLmods/SmolLM2_135M_Grpo_Gsm8k
Text Generation
• 0.1B • Updated
• 6
• 8
prithivMLmods/SmolLM2_135M_Grpo_Checkpoint
Text Generation
• 0.1B • Updated
• 1
• 1
alpha-ai/Reason-With-Choice-3B-GGUF
3B • Updated
• 40
alpha-ai/Reason-With-Choice-3B
Text Generation
• 3B • Updated
• 7
mradermacher/Reason-With-Choice-3B-GGUF
3B • Updated
• 78
mradermacher/Captain-Eris_Violet-GRPO-v0.420-GGUF
12B • Updated
• 59
• 4
mradermacher/Captain-Eris_Violet-GRPO-v0.420-i1-GGUF
12B • Updated
• 243
• 5