Toward Generalist Autonomous Research via Hypothesis-Tree Refinement Paper • 2606.11926 • Published 8 days ago • 111
Claw-SWE-Bench: A Benchmark for Evaluating OpenClaw-style Agent Harnesses on Coding Tasks Paper • 2606.12344 • Published 8 days ago • 65
EvoArena: Tracking Memory Evolution for Robust LLM Agents in Dynamic Environments Paper • 2606.13681 • Published 7 days ago • 135
Article The Open Source Community is backing OpenEnv for Agentic RL +16 burtenshaw, spisakjo, lysandre, darktex, willcb, qjoy, pawalt, cwing-nv, danielhanchen, andrewzhou, thegovind, shimmyshimmer, Hamid-Nazeri, Sanyam, zkwentz, emre0, lewtun, sergiopaniego • 10 days ago • 86
SCOPE: Self-Play via Co-Evolving Policies for Open-Ended Tasks Paper • 2605.31433 • Published 20 days ago • 28
Nemotron-Post-Training-v3 Collection Collection of datasets used in the post-training phase of Nemotron Nano, Super, and Ultra v3. • 50 items • Updated 6 days ago • 158