Churro Collection Dataset and model for handwritten and print text recognition in historical documents • 3 items • Updated Sep 27, 2025 • 3
Article ViDoRe V3: a comprehensive evaluation of retrieval for enterprise use-cases QuentinJG • Nov 5, 2025 • 64
Article Supercharge your OCR Pipelines with Open Models merve, ariG23498, davanstrien, hynky, andito, reach-vb, pcuenq • Oct 21, 2025 • 312
SauerkrautLM-Multilingual-(Reason)-ColBERT Collection SauerkrautLM ColBERT is a suite of Late-Interaction retrieval models built with PyLate’s ColBERT architecture and tuned for seven European languages. • 7 items • Updated Aug 3, 2025 • 20
Article System Prompt Learning: Teaching LLMs to Learn Problem-Solving Strategies from Experience codelion • Jun 2, 2025 • 24
Article Open-Source Handwritten Signature Detection Model samuellimabraz • Mar 14, 2025 • 121
Article Fine-tune Deepseek-R1 with a Synthetic Reasoning Dataset sdiazlor • Feb 10, 2025 • 60
Article Introducing Synthetic Data Workshop: Your Gateway to Easy Synthetic Dataset Creation davanstrien • Jun 20, 2024 • 12
Article Post-OCR-Correction: 1 billion words dataset of automated OCR correction by LLM Pclanglais • Apr 26, 2024 • 18