AIML-TUDA's Collections
KletterMix
Reward Hacking in Reasoning Models
Scalable Logical Reasoning
How to Train your Text‑to‑Image Model
LlavaGuard

Reward Hacking in Reasoning Models

updated 22 days ago

Do reasoning LLMs actually reason, or do they learn to game the test? Isomorphic Perturbation Testing (IPT) detects reward hacking in inductive programming tasks (SLR-Bench).


  • Isomorphic Perturbation Testing 🔍

    Space • Running

    Evaluate rule hypotheses for genuine reasoning vs shortcuts


  • AIML-TUDA/SLR-Bench

    Viewer • Updated 18 days ago • 38.5k • 3.05k • 4

  • SLR-Bench Leaderboard - Reward Hacking in Reasoning Models 🎯

    Space • Sleeping

    Reward shortcut behavior in LLMs via IPT


  • LLMs Gaming Verifiers: RLVR can Lead to Reward Hacking

    Paper • 2604.15149 • Published Apr 16 • 1