Sunny111 posted an update 1 day ago
Are you familiar with reverse residual connections or looping in language models?

Excited to share my Looped-GPT blog post and codebase 🚀
https://github.com/sanyalsunny111/Looped-GPT

TL;DR: looping during pre-training improves generalization.
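
For anyone new to the idea, here is a minimal sketch of what "looping" usually means: the same stack of blocks is applied several times in one forward pass, so effective depth grows while the parameter count stays fixed. This is a toy pure-Python illustration, not code from the Looped-GPT repo. The names `make_block` and `looped_forward` are hypothetical, the block is a stand-in for a real transformer layer, and reverse residual connections are not shown.

```python
import math
import random


def make_block(dim, rng):
    """Build one toy residual block: x -> x + tanh(W x).

    Hypothetical stand-in for a transformer layer; W is its only parameter.
    """
    weight = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(dim)]

    def block(x):
        hidden = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weight]
        return [xi + hi for xi, hi in zip(x, hidden)]  # residual connection

    return block, dim * dim  # the callable and its parameter count


def looped_forward(blocks, x, n_loops):
    """Apply the same stack of blocks n_loops times, reusing the weights each pass."""
    for _ in range(n_loops):
        for block in blocks:
            x = block(x)
    return x


rng = random.Random(0)
dim, n_layers, n_loops = 4, 2, 3
built = [make_block(dim, rng) for _ in range(n_layers)]
blocks = [block for block, _ in built]
n_params = sum(count for _, count in built)  # unchanged by looping

x0 = [1.0, -0.5, 0.25, 0.0]
once = looped_forward(blocks, x0, n_loops=1)
looped = looped_forward(blocks, x0, n_loops=n_loops)
effective_depth = n_layers * n_loops  # block applications per forward pass
```

With 2 layers and 3 loops, the input passes through 6 block applications while the model still holds only the weights of 2 blocks.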

The plot shows GPT-2 language models pre-trained on 15.73B OpenWebText (OWT) tokens.

P.S. This is my first post here — I have ~4 followers and zero expectations for reach 😄