Tianyu Pang

P2333

https://p2333.github.io/

AI & ML interests

Machine Learning

Organizations

None yet

Recent Activity

upvoted a paper 9 days ago

Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

Paper • 2512.01374 • Published 10 days ago • 85

upvoted a paper about 1 month ago

Diffusion Language Models are Super Data Learners

Paper • 2511.03276 • Published Nov 5 • 124

upvoted a paper 2 months ago

Imperceptible Jailbreaking against Large Language Models

Paper • 2510.05025 • Published Oct 6 • 33

commented a paper 2 months ago

Imperceptible Jailbreaking against Large Language Models

Paper • 2510.05025 • Published Oct 6 • 33

upvoted 2 papers 2 months ago

MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use

Paper • 2509.24002 • Published Sep 28 • 173

GEM: A Gym for Agentic LLMs

Paper • 2510.01051 • Published Oct 1 • 89

authored 7 papers 2 months ago

LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification

Paper • 2502.17421 • Published Feb 24

VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use

Paper • 2509.01055 • Published Sep 1 • 75

Language Models Can Learn from Verbal Feedback Without Scalar Rewards

Paper • 2509.22638 • Published Sep 26 • 70

Variational Reasoning for Language Models

Paper • 2509.22637 • Published Sep 26 • 69

upvoted a paper 2 months ago

Variational Reasoning for Language Models

Paper • 2509.22637 • Published Sep 26 • 69

commented a paper 2 months ago

Variational Reasoning for Language Models

Paper • 2509.22637 • Published Sep 26 • 69

upvoted a paper 2 months ago

Language Models Can Learn from Verbal Feedback Without Scalar Rewards

Paper • 2509.22638 • Published Sep 26 • 70

commented a paper 2 months ago

Language Models Can Learn from Verbal Feedback Without Scalar Rewards

Paper • 2509.22638 • Published Sep 26 • 70

upvoted 2 papers 3 months ago

SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Paper • 2509.02479 • Published Sep 2 • 83

VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use

Paper • 2509.01055 • Published Sep 1 • 75

upvoted a collection 4 months ago

Perception Encoder

Collection

17 items • Updated Jul 11 • 71
