Real Physical Benchmark

non-profit

AI & ML interests

None defined yet.

Recent Activity

yingmanji authored a paper 16 days ago

Ideas Have Genomes: Benchmarking Scientific Lineage Reasoning and Lineage-Grounded Idea Generation

yingmanji authored a paper about 2 months ago

SkillGenBench: Benchmarking Skill Generation Pipelines for LLM Agents

yingmanji authored a paper 3 months ago

Frontier-Eng: Benchmarking Self-Evolving Agents on Real-World Engineering Tasks with Generative Optimization

PhysicalBenchmark's datasets

None public yet.