From Vision to Motion - a Vanqi Collection

updated about 22 hours ago

HopChain: Multi-Hop Data Synthesis for Generalizable Vision-Language Reasoning

Paper • 2603.17024 • Published Mar 17 • 109
WorldAgents: Can Foundation Image Models be Agents for 3D World Models?

Paper • 2603.19708 • Published Mar 20 • 13
MACRO: Advancing Multi-Reference Image Generation with Structured Long-Context Data

Paper • 2603.25319 • Published Mar 26 • 32
ArtHOI: Taming Foundation Models for Monocular 4D Reconstruction of Hand-Articulated-Object Interactions

Paper • 2603.25791 • Published Mar 26 • 7
Agentic-MME: What Agentic Capability Really Brings to Multimodal Intelligence?

Paper • 2604.03016 • Published Apr 3 • 37
The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook

Paper • 2604.02029 • Published Apr 2 • 147
OpenWorldLib: A Unified Codebase and Definition of Advanced World Models

Paper • 2604.04707 • Published about 1 month ago • 203
Prox-E: Fine-Grained 3D Shape Editing via Primitive-Based Abstractions

Paper • 2604.23774 • Published 7 days ago • 14
End-to-End Autoregressive Image Generation with 1D Semantic Tokenizer

Paper • 2605.00503 • Published 5 days ago • 5