Slow Perception: Let's Perceive Geometric Figures Step-by-step Paper • 2412.20631 • Published 9 days ago • 12
Document AI Collection All the papers that can fundamentally help in creating a true open-source processing pipeline. • 1 item • Updated Nov 11, 2024 • 1
Focus Anywhere for Fine-grained Multi-page Document Understanding Paper • 2405.14295 • Published May 23, 2024 • 1
PixMo Collection A set of vision-language datasets built by Ai2 and used to train the Molmo family of models. Read more at https://molmo.allenai.org/blog • 9 items • Updated about 21 hours ago • 53
Post 2730 Hey there folks, @ucaslcl released a new OCR model, and it's fantastic: https://huggingface.co/ucaslcl/GOT-OCR2_0 • GPU: Tonic/GOT-OCR • Gradio Demo (Image Edit): Tonic1/ImageEdit-GOT-OCR • Model: https://huggingface.co/ucaslcl/GOT-OCR2_0 • Official demo: https://huggingface.co/spaces/ucaslcl/GOT_online • GitHub: https://github.com/Ucas-HaoranWei/GOT-OCR2.0
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model Paper • 2409.01704 • Published Sep 3, 2024 • 83
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model Paper • 2409.01704 • Published Sep 3, 2024 • 83 • 9
DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation Paper • 2406.16855 • Published Jun 24, 2024 • 54
OneChart: Purify the Chart Structural Extraction via One Auxiliary Token Paper • 2404.09987 • Published Apr 15, 2024 • 2