Article • Introducing OptiMind, a research model designed for optimization • 10 days ago • 31
Article • Qwen-Image-i2L: Training Strategies for Image-to-LoRA Generation • Dec 16, 2025 • 47
Article • Introducing AnyLanguageModel: One API for Local and Remote LLMs on Apple Platforms • Nov 20, 2025 • 38
Article • Illustrating Reinforcement Learning from Human Feedback (RLHF) • +2 • Dec 9, 2022 • 393
Article • Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment • Feb 11, 2025 • 104
Article • Building the Open Agent Ecosystem Together: Introducing OpenEnv • +8 • Oct 23, 2025 • 145
Beyond Fixed: Variable-Length Denoising for Diffusion Large Language Models Paper • 2508.00819 • Published Aug 1, 2025 • 63
Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training Paper • 2508.00414 • Published Aug 1, 2025 • 94
SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution Paper • 2507.23348 • Published Jul 31, 2025 • 12
SWE-Exp: Experience-Driven Software Issue Resolution Paper • 2507.23361 • Published Jul 31, 2025 • 14
Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination Paper • 2507.10532 • Published Jul 14, 2025 • 90