Multimodal LLMs for OCR, OCR Post-Correction, and Named Entity Recognition in Historical Documents Paper • 2504.00414 • Published Apr 1
Zero-shot OCR Accuracy of Low-Resourced Languages: A Comparative Analysis on Sinhala and Tamil Paper • 2507.18264 • Published Jul 24
SARD: A Large-Scale Synthetic Arabic OCR Dataset for Book-Style Text Recognition Paper • 2505.24600 • Published May 30
KITAB-Bench: A Comprehensive Multi-Domain Benchmark for Arabic OCR and Document Understanding Paper • 2502.14949 • Published Feb 20
QARI-OCR: High-Fidelity Arabic Text Recognition through Multimodal Large Language Model Adaptation Paper • 2506.02295 • Published Jun 2