- Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models
  Paper • 2602.12036 • Published • 98
- Reinforcement Learning for Self-Improving Agent with Skill Library
  Paper • 2512.17102 • Published • 36
- Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation
  Paper • 2512.23705 • Published • 45
- Schoenfeld's Anatomy of Mathematical Reasoning by Language Models
  Paper • 2512.19995 • Published • 16
Collections
Collections including paper arxiv:2512.24601
- Toward Efficient Agents: Memory, Tool learning, and Planning
  Paper • 2601.14192 • Published • 55
- Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge
  Paper • 2601.08808 • Published • 39
- The Poisoned Apple Effect: Strategic Manipulation of Mediated Markets via Technology Expansion of AI Agents
  Paper • 2601.11496 • Published • 47
- Beyond Static Tools: Test-Time Tool Evolution for Scientific Reasoning
  Paper • 2601.07641 • Published • 47

- Chain of Mindset: Reasoning with Adaptive Cognitive Modes
  Paper • 2602.10063 • Published • 72
- Alignment Pretraining: AI Discourse Causes Self-Fulfilling (Mis)alignment
  Paper • 2601.10160 • Published • 1
- Less is More: Recursive Reasoning with Tiny Networks
  Paper • 2510.04871 • Published • 509
- MemOS: A Memory OS for AI System
  Paper • 2507.03724 • Published • 159

- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
  Paper • 2501.12948 • Published • 440
- Recursive Language Models
  Paper • 2512.24601 • Published • 90
- Geospatial Mechanistic Interpretability of Large Language Models
  Paper • 2505.03368 • Published • 12
- GEO-Bench-2: From Performance to Capability, Rethinking Evaluation in Geospatial AI
  Paper • 2511.15658 • Published • 1

- Agentic Reasoning for Large Language Models
  Paper • 2601.12538 • Published • 198
- DeepResearchEval: An Automated Framework for Deep Research Task Construction and Agentic Evaluation
  Paper • 2601.09688 • Published • 126
- Youtu-Agent: Scaling Agent Productivity with Automated Generation and Hybrid Policy Optimization
  Paper • 2512.24615 • Published • 119
- Recursive Language Models
  Paper • 2512.24601 • Published • 90

- Moshi: a speech-text foundation model for real-time dialogue
  Paper • 2410.00037 • Published • 13
- Agent Lightning: Train ANY AI Agents with Reinforcement Learning
  Paper • 2508.03680 • Published • 136
- SimpleMem: Efficient Lifelong Memory for LLM Agents
  Paper • 2601.02553 • Published • 37
- Recursive Language Models
  Paper • 2512.24601 • Published • 90

- A Safety Report on GPT-5.2, Gemini 3 Pro, Qwen3-VL, Doubao 1.8, Grok 4.1 Fast, Nano Banana Pro, and Seedream 4.5
  Paper • 2601.10527 • Published • 25
- PACEvolve: Enabling Long-Horizon Progress-Aware Consistent Evolution
  Paper • 2601.10657 • Published • 20
- TranslateGemma Technical Report
  Paper • 2601.09012 • Published • 20
- Recursive Language Models
  Paper • 2512.24601 • Published • 90

- NitroGen: An Open Foundation Model for Generalist Gaming Agents
  Paper • 2601.02427 • Published • 45
- mHC: Manifold-Constrained Hyper-Connections
  Paper • 2512.24880 • Published • 312
- DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models
  Paper • 2512.24165 • Published • 51
- Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting
  Paper • 2601.02151 • Published • 109