benqi

Aidabenk

AI & ML interests

NLP

Organizations

None yet

Recent Activity

upvoted a paper 3 days ago

RubricHub: A Comprehensive and Highly Discriminative Rubric Dataset via Automated Coarse-to-Fine Generation

Paper • 2601.08430 • Published 10 days ago • 53

upvoted 2 papers 8 days ago

EvasionBench: Detecting Evasive Answers in Financial Q&A via Multi-Model Consensus and LLM-as-Judge

Paper • 2601.09142 • Published 10 days ago • 9

Controlled Self-Evolution for Algorithmic Code Optimization

Paper • 2601.07348 • Published 11 days ago • 110

upvoted a paper 11 days ago

Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning

Paper • 2601.06943 • Published 12 days ago • 206

liked a model 11 days ago

FutureMa/Eva-4B

Text Generation • 4B • Updated 6 days ago • 666 • 92

upvoted a paper 15 days ago

Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting

Paper • 2601.02151 • Published 18 days ago • 101

liked a model 16 days ago

FutureMa/Qwen3-4B-Evasion

Text Classification • 4B • Updated 16 days ago • 338 • 51

liked a dataset 26 days ago

bigai/TongSIM-Asset

Updated 25 days ago • 20.7k • 277

upvoted 2 papers about 1 month ago

Region-Constraint In-Context Generation for Instructional Video Editing

Paper • 2512.17650 • Published Dec 19, 2025 • 51

Finch: Benchmarking Finance & Accounting across Spreadsheet-Centric Enterprise Workflows

Paper • 2512.13168 • Published Dec 15, 2025 • 50

upvoted 2 papers about 2 months ago

The Consistency Critic: Correcting Inconsistencies in Generated Images via Reference-Guided Attentive Alignment

Paper • 2511.20614 • Published Nov 25, 2025 • 38

SteadyDancer: Harmonized and Coherent Human Image Animation with First-Frame Preservation

Paper • 2511.19320 • Published Nov 24, 2025 • 42

upvoted 2 papers 3 months ago

UniREditBench: A Unified Reasoning-based Image Editing Benchmark

Paper • 2511.01295 • Published Nov 3, 2025 • 39

TrajSelector: Harnessing Latent Representations for Efficient and Effective Best-of-N in Large Reasoning Model

Paper • 2510.16449 • Published Oct 18, 2025 • 35

upvoted a paper 4 months ago

LongCodeZip: Compress Long Context for Code Language Models

Paper • 2510.00446 • Published Oct 1, 2025 • 107

liked a dataset 4 months ago

tencent/WildSpeech-Bench

Viewer • Updated Sep 29, 2025 • 1.1k • 255 • 35

upvoted a paper 4 months ago

RPG: A Repository Planning Graph for Unified and Scalable Codebase Generation

Paper • 2509.16198 • Published Sep 19, 2025 • 126

liked a model 5 months ago

MachineLearningLM/MachineLearningLM-7B-v1

Text Generation • 8B • Updated Oct 1, 2025 • 32 • 34

upvoted a paper 8 months ago

Table-R1: Inference-Time Scaling for Table Reasoning

Paper • 2505.23621 • Published May 29, 2025 • 93

upvoted a paper 9 months ago

DianJin-R1: Evaluating and Enhancing Financial Reasoning in Large Language Models

Paper • 2504.15716 • Published Apr 22, 2025 • 12
