SOD: Step-wise On-policy Distillation for Small Language Model Agents Paper • 2605.07725 • Published 29 days ago • 25
π-Bench: Evaluating Proactive Personal Assistant Agents in Long-Horizon Workflows Paper • 2605.14678 • Published 18 days ago • 104
Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information Paper • 2605.11609 • Published 25 days ago • 195
CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence Paper • 2605.12882 • Published 24 days ago • 270
DeltaRubric: Generative Multimodal Reward Modeling via Joint Planning and Verification Paper • 2605.09269 • Published 27 days ago • 6
SciLT: Long-Tailed Classification in Scientific Image Domains Paper • 2604.03687 • Published Apr 4 • 8
TC-AE: Unlocking Token Capacity for Deep Compression Autoencoders Paper • 2604.07340 • Published Apr 8 • 17
MegaStyle: Constructing Diverse and Scalable Style Dataset via Consistent Text-to-Image Style Mapping Paper • 2604.08364 • Published Apr 9 • 101
InCoder-32B-Thinking: Industrial Code World Model for Thinking Paper • 2604.03144 • Published Apr 3 • 234
Adam's Law: Textual Frequency Law on Large Language Models Paper • 2604.02176 • Published Apr 2 • 506
CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence Paper • 2603.28032 • Published Mar 30 • 343
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization Paper • 2603.19835 • Published Mar 20 • 352
SocialOmni: Benchmarking Audio-Visual Social Interactivity in Omni Models Paper • 2603.16859 • Published Mar 17 • 249
Bootstrapping Exploration with Group-Level Natural Language Feedback in Reinforcement Learning Paper • 2603.04597 • Published Mar 4 • 211