VLMs Need Words: Vision Language Models Ignore Visual Detail In Favor of Semantic Anchors Paper • 2604.02486 • Published Apr 2 • 11
Open VLM Leaderboard 🌎 VLMEvalKit Evaluation Results Collection • Agents • 1.02k
Mitigating Visual Forgetting via Take-along Visual Conditioning for Multi-modal Long CoT Reasoning Paper • 2503.13360 • Published Mar 17, 2025 • 7