J C

dark-pen

AI & ML interests

None yet

Recent Activity

liked a dataset about 4 hours ago

xx18/Composition-RL-EVA

liked a dataset about 4 hours ago

xx18/MATH-Composition-199K

Organizations

upvoted 2 collections about 4 hours ago

TFPI

ICLR2026: Thinking-Free Policy Initialization Makes Distilled Reasoning Models More Effective and Efficient Reasoners https://arxiv.org/abs/2509.26226 • 14 items • Updated Feb 12 • 1

Composition-RL

Datasets and trained checkpoints of Composition-RL • 13 items • Updated about 18 hours ago • 1

upvoted 2 papers about 4 hours ago

HARE: HumAn pRiors, a key to small language model Efficiency

Paper • 2406.11410 • Published Jun 17, 2024 • 40

EnterpriseOps-Gym: Environments and Evaluations for Stateful Agentic Planning and Tool Use in Enterprise Settings

Paper • 2603.13594 • Published 4 days ago • 117

upvoted a paper 2 days ago

LaSER: Internalizing Explicit Reasoning into Latent Space for Dense Retrieval

Paper • 2603.01425 • Published 16 days ago • 6

upvoted 2 papers 3 days ago

AgentStepper: Interactive Debugging of Software Development Agents

Paper • 2602.06593 • Published Feb 6 • 1

CodeV: Code with Images for Faithful Visual Reasoning via Tool-Aware Policy Optimization

Paper • 2511.19661 • Published Nov 24, 2025 • 3

upvoted a collection 3 days ago

PFPO

Resources for the paper Preference Optimization for Reasoning with Pseudo Feedback (ICLR 2025) • 4 items • Updated Feb 6, 2025 • 2

upvoted 5 papers 3 days ago

LatentLens: Revealing Highly Interpretable Visual Tokens in LLMs

Paper • 2602.00462 • Published Jan 31 • 19

Humans and LLMs Diverge on Probabilistic Inferences

Paper • 2602.23546 • Published 19 days ago • 13

LLM2Vec-Gen: Generative Embeddings from Large Language Models

Paper • 2603.10913 • Published 6 days ago • 38

REAP the Experts: Why Pruning Prevails for One-Shot MoE compression

Paper • 2510.13999 • Published Oct 15, 2025 • 15

MEDVISTAGYM: A Scalable Training Environment for Thinking with Medical Images via Tool-Integrated Reinforcement Learning

Paper • 2601.07107 • Published Jan 12 • 1

upvoted a collection 3 days ago

My REAP Experiments

VRAM is all we need • 5 items • Updated 18 days ago • 2

upvoted a collection 4 days ago

Spatial-TTT

3 items • Updated 6 days ago • 4

upvoted 5 papers 4 days ago

LoopServe: An Adaptive Dual-phase LLM Inference Acceleration System for Multi-Turn Dialogues

Paper • 2507.13681 • Published Jul 18, 2025 • 1

Guided Decoding and Its Critical Role in Retrieval-Augmented Generation

Paper • 2509.06631 • Published Sep 8, 2025 • 12

PANDA (Pedantic ANswer-correctness Determination and Adjudication): Improving Automatic Evaluation for Question Answering and Text Generation

Paper • 2402.11161 • Published Feb 17, 2024 • 2

Steer2Adapt: Dynamically Composing Steering Vectors Elicits Efficient Adaptation of LLMs

Paper • 2602.07276 • Published Feb 7 • 11

Emotionally Charged, Logically Blurred: AI-driven Emotional Framing Impairs Human Fallacy Detection

Paper • 2510.09695 • Published Oct 9, 2025 • 1