---
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
base_model:
- Qwen/Qwen2.5-Math-1.5B
datasets:
- ypwang61/One-Shot-RLVR-Datasets
---

This repository contains the model presented in [Reinforcement Learning for Reasoning in Large Language Models with One Training Example](https://huggingface.co/papers/2504.20571).

Code: https://github.com/ypwang61/One-Shot-RLVR
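
The metadata declares `library_name: transformers` and `pipeline_tag: text-generation`, so the checkpoint should load through the standard causal-LM classes. The following is a minimal sketch under stated assumptions, not the authors' evaluation code: `model_id` is a placeholder for this repository's Hub id (not stated in this card), the helper names `build_prompt` and `generate_answer` are hypothetical, and the prompt suffix follows the step-by-step convention commonly used with Qwen2.5-Math models, which this card does not confirm.

```python
# Assumed prompt suffix (Qwen2.5-Math convention); not confirmed by this card.
PROMPT_SUFFIX = "Please reason step by step, and put your final answer within \\boxed{}."


def build_prompt(question: str) -> str:
    """Append the assumed step-by-step instruction to a math question."""
    return f"{question.strip()}\n{PROMPT_SUFFIX}"


def generate_answer(model_id: str, question: str, max_new_tokens: int = 512) -> str:
    """Greedy-decode an answer; model_id is a placeholder for this repo's Hub id."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

    inputs = tokenizer(build_prompt(question), return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)

    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

To try it, call `generate_answer("<this-repo-id>", "What is 12 * 13?")` after replacing the placeholder with the actual repository id. Greedy decoding (`do_sample=False`) is chosen here only to make outputs reproducible.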