HRBench: Benchmarking and Understanding Thinking-Mode Switch Strategies in Hybrid-Reasoning LLMs Paper • 2605.28398 • Published 3 days ago • 12
🚀 Qwen-MTP Collection ⚡ MTP (Multi Token Prediction) speculative decoding enables models like Qwen3.6 to have ~1.4-2.2x faster generation with no change in accuracy. • 6 items • Updated 2 days ago • 15