AblationBench Collection
A collection of datasets for evaluating language models on the task of ablation planning in empirical AI research.
5 items • Updated 10 days ago