arxiv:2410.13232
Gwanwoo Song
Gwanwoo
AI & ML interests
Reinforcement Learning & Robot Learning
Organizations
None yet
models 10
Gwanwoo/rm_news
Gwanwoo/align_llama • Text Generation • 1B • 1
Gwanwoo/dpo_orca-dpo_lr1e-5
Gwanwoo/sft-llama3-1b-lora-adapter_35k
Gwanwoo/sft-qwen2-0.5b-lora-adapter
Gwanwoo/filtering_w_noise
Gwanwoo/tokenizer_final
Gwanwoo/llama_3.2_kor_lowppe_tokenizer
Gwanwoo/korean_tokenizer_cleaned_model
Gwanwoo/really_naive
datasets 6
Gwanwoo/RM_News_Trained • 1.5k • 3
Gwanwoo/ko_wiki_without_high_perplexity • 72.7k • 4
Gwanwoo/combined_korean_wiki • 68k • 5
Gwanwoo/kor_eng_3_1 • 59.7k • 5
Gwanwoo/cleaned_english_wiki • 14.7k • 7
Gwanwoo/cleaned_korean_wiki • 68k • 10