AI & ML interests
LLM × RL
ryota39/Qwen3-8B-math-RL-ja
ryota39/Qwen3-8B-math-RL-en • Text Generation • 8B • Updated
ryota39/gemma-2-2b-jpn-it-q8 • 3B • Updated • 5
Text Generation • 12B • Updated • 2 • 1
Text Generation • Updated • 3 • 2
ryota39/mluke-large-lite-reward • Text Classification • 0.6B • Updated • 2
ryota39/retriva-bert-preference-classifier • Text Classification • 1B • Updated • 3
Text Generation • 7B • Updated • 4 • 1
ryota39/llm-jp-1b-sft-100k-LoRA-dpo-12k • Text Generation • 1B • Updated • 3
ryota39/Phi-3-mini-4k-instruct-dpo • Text Generation • 4B • Updated • 5 • 3
ryota39/llm-jp-1b-sft-15k • Text Generation • 1B • Updated • 3
ryota39/llm-jp-1b-sft-100k-LoRA • Text Generation • 1B • Updated • 1
ryota39/llm-jp-1b-sft-15k-dpo-12k • Text Generation • 1B • Updated • 2 • 1
ryota39/llm-jp-1b-sft-100k-LoRA-dpo-45k • Text Generation • 1B • Updated • 5
ryota39/llm-jp-1b-sft-100k-LoRA-dpo-194k • Text Generation • 1B • Updated • 1
ryota39/llm-jp-1b-sft-2M-dpo-194k • Text Generation • 1B • Updated • 3
Text Generation • 1B • Updated • 4
ryota39/bilingual-gpt-neox-4b-instruction-sft-en-ja-84k • Text Generation • 4B • Updated • 4 • 1
Text Generation • 4B • Updated • 5 • 2