Article Introducing the Synthetic Data Generator - Build Datasets with Natural Language • Dec 16, 2024 • 152
PaddleOCR-VL Collection Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model • 3 items • Updated 10 days ago • 23
DINOv3 Collection DINOv3: foundation models producing excellent dense features, outperforming SotA w/o fine-tuning - https://arxiv.org/abs/2508.10104 • 13 items • Updated Aug 21 • 428
LLMDet Collection LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of Large Language Models • 5 items • Updated Jul 26 • 3
Article ScreenSuite - The most comprehensive evaluation suite for GUI Agents! • Jun 6 • 55
MedGemma Release Collection Collection of Gemma 3 variants for performance on medical text and image comprehension to accelerate building healthcare-based AI applications. • 7 items • Updated Jul 11 • 366
RADIO Collection A collection of Foundation Vision Models that combine multiple models (CLIP, DINOv2, SAM, etc.). • 16 items • Updated 2 days ago • 26
💫StarVector Models Collection StarVector is a multimodal LLM for Scalable Vector Graphics (SVG) generation, producing structured SVG code directly from images and text. • 2 items • Updated Mar 20 • 96