Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models Paper • 2602.07026 • Published 8 days ago • 120
Act2Goal: From World Model To General Goal-conditioned Policy Paper • 2512.23541 • Published Dec 29, 2025 • 23
UniMedVL: Unifying Medical Multimodal Understanding And Generation Through Observation-Knowledge-Analysis Paper • 2510.15710 • Published Oct 17, 2025 • 7
Parameter-Efficient Fine-Tuning for Pre-Trained Vision Models: A Survey and Benchmark Paper • 2402.02242 • Published Feb 3, 2024
dMLLM-TTS: Self-Verified and Efficient Test-Time Scaling for Diffusion Multi-Modal Large Language Models Paper • 2512.19433 • Published Dec 22, 2025 • 3
LLaDA2.0: Scaling Up Diffusion Language Models to 100B Paper • 2512.15745 • Published Dec 10, 2025 • 86
PICABench: How Far Are We from Physically Realistic Image Editing? Paper • 2510.17681 • Published Oct 20, 2025 • 64
Lumina-mGPT 2.0: Stand-Alone AutoRegressive Image Modeling Paper • 2507.17801 • Published Jul 23, 2025 • 1
Resurrect Mask AutoRegressive Modeling for Efficient and Scalable Image Generation Paper • 2507.13032 • Published Jul 17, 2025