ypwang61/One-Shot-RLVR-R1-Distill-1.5B-pi1

---
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
base_model:
- Qwen/Qwen2.5-Math-1.5B
datasets:
- ypwang61/One-Shot-RLVR-Datasets
---

This repository contains the model presented in [Reinforcement Learning for Reasoning in Large Language Models with One Training Example](https://huggingface.co/papers/2504.20571).

Code: https://github.com/ypwang61/One-Shot-RLVR
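
Since the metadata declares `library_name: transformers` and `pipeline_tag: text-generation`, the checkpoint can presumably be loaded through the standard causal-LM classes. The following is a minimal sketch, not the authors' evaluation setup: the prompt wording, the `build_prompt` and `generate_answer` helpers, and the `max_new_tokens` default are illustrative assumptions. Refer to the code repository above for the prompts and decoding settings used in the paper.

```python
# Minimal loading sketch. The helper names and prompt format are assumptions
# made for illustration; they are not taken from the authors' code.

MODEL_ID = "ypwang61/One-Shot-RLVR-R1-Distill-1.5B-pi1"


def build_prompt(question: str) -> str:
    """Append a common math-reasoning instruction (assumed, not the official template)."""
    return (
        f"{question}\n"
        "Please reason step by step, and put your final answer within \\boxed{}."
    )


def generate_answer(question: str, max_new_tokens: int = 512) -> str:
    """Load the checkpoint and return the decoded continuation for one question."""
    # Imported lazily so the helpers above can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")

    inputs = tokenizer(build_prompt(question), return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_answer("What is 12 * 13?")` downloads the weights on first use and returns the generated reasoning as a string.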