VPTracker: Global Vision-Language Tracking via Visual Prompt and MLLM Paper • 2512.22799 • Published 4 days ago
Unlocking the Potential of MLLMs in Referring Expression Segmentation via a Light-weight Mask Decoder Paper • 2508.04107 • Published Aug 6, 2025 • 4
GTR-CoT: Graph Traversal as Visual Chain of Thought for Molecular Structure Recognition Paper • 2506.07553 • Published Jun 9, 2025 • 15
Progressive Language-guided Visual Learning for Multi-Task Visual Grounding Paper • 2504.16145 • Published Apr 22, 2025 • 2