Devstral 2 Collection A couple of agentic LLMs for software engineering tasks, excelling at using tools to explore codebases, edit multiple files, and power SWE Agents. • 3 items • Updated 1 day ago • 22
On the Workflows and Smells of Leaderboard Operations (LBOps): An Exploratory Study of Foundation Model Leaderboards Paper • 2407.04065 • Published Jul 4, 2024 • 1
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning Paper • 2507.01006 • Published Jul 1 • 242
SignRoundV2: Closing the Performance Gap in Extremely Low-Bit Post-Training Quantization for LLMs Paper • 2512.04746 • Published 6 days ago • 12
Mitigating Object and Action Hallucinations in Multimodal LLMs via Self-Augmented Contrastive Alignment Paper • 2512.04356 • Published 7 days ago • 9
RLBFF: Binary Flexible Feedback to bridge between Human Feedback & Verifiable Rewards Paper • 2509.21319 • Published Sep 25 • 5
Reward Models 10-2025 Collection A collection of great reward models for research and production • 7 items • Updated 7 days ago • 9
Compact Language Models via Pruning and Knowledge Distillation Paper • 2407.14679 • Published Jul 19, 2024 • 39
Llama-Embed-Nemotron-8B: A Universal Text Embedding Model for Multilingual and Cross-Lingual Tasks Paper • 2511.07025 • Published about 1 month ago • 11