Richard Zhuang PRO

RZ412

https://richardzhuang0412.github.io

AI & ML interests

LLM Routing, LLM + Games, Post-Training, Agents

Recent Activity

updated a dataset about 6 hours ago

DCAgent2/terminal_bench_2_rl__24GPU_shaped__nemotron_math_oracle_filtered__exp_tas_optim984f9209

published a dataset about 6 hours ago

DCAgent2/terminal_bench_2_rl__24GPU_shaped__nemotron_math_oracle_filtered__exp_tas_optim984f9209

updated a dataset about 7 hours ago

DCAgent2/terminal_bench_2_a1_orca_agentinstruct_20260327_070306

Papers 2

arxiv:2501.08328

arxiv:2410.02223

Models 57

RZ412/Qwen2.5-3B-Instruct-inferredbugs-sandboxes-traces-terminus-2

Updated Dec 4, 2025

RZ412/Qwen2.5-3B-Instruct-OT3-8K-QwQ-Min-R1-Min-MLR

Text Generation • 3B • Updated Nov 30, 2025 • 1

RZ412/Qwen2.5-3B-Instruct-OT3-8K-R1-Only-Seed-42

Text Generation • 3B • Updated Nov 3, 2025 • 1

RZ412/Qwen2.5-3B-Instruct-OT3-8K-QwQ-R1-RM-50-50-SS-42-AS-42

Text Generation • 3B • Updated Nov 3, 2025 • 5

RZ412/Qwen2.5-3B-Instruct-OT3-8K-QwQ-Only-Seed-42

Text Generation • 3B • Updated Nov 3, 2025 • 32

RZ412/Qwen2.5-3B-Instruct-OT3-8K-R1-MeL

Text Generation • 3B • Updated Oct 28, 2025 • 2

RZ412/Qwen2.5-3B-Instruct-OT3-8K-R1-ML

Text Generation • 3B • Updated Oct 27, 2025 • 2

RZ412/Qwen2.5-3B-Instruct-OT3-8K-QwQ-MaL-misstore

Text Generation • 3B • Updated Oct 27, 2025 • 2

RZ412/Qwen2.5-3B-Instruct-OT3-8K-QwQ-R1-DB

Text Generation • 3B • Updated Oct 26, 2025 • 4

RZ412/Qwen2.5-3B-Instruct-OT3-8K-QwQ-R1-RES

Text Generation • 3B • Updated Oct 26, 2025 • 3

Datasets 20

RZ412/PokerBench

Viewer • Updated Jan 8 • 574k • 1.38k • 34

RZ412/db-test-traces

Viewer • Updated Dec 10, 2025 • 210 • 5

RZ412/test-parquet2

Viewer • Updated Dec 6, 2025 • 728 • 5

RZ412/test-parquet

Viewer • Updated Dec 6, 2025 • 728 • 5

RZ412/inferredbugs-traces-sft

Viewer • Updated Dec 5, 2025 • 4

RZ412/inferredbugs-tasks

Viewer • Updated Dec 5, 2025 • 100 • 6

RZ412/inferredbugs-10

Viewer • Updated Dec 5, 2025 • 10 • 4

RZ412/inferredbugs-traces-10

Viewer • Updated Dec 5, 2025 • 11

RZ412/inferredbugs-sandboxes-10

Viewer • Updated Dec 5, 2025 • 10 • 5

RZ412/inferredbugs-10-traces

Viewer • Updated Dec 5, 2025 • 5
