Paper: Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling • arXiv:2502.06703 • Published Feb 10, 2025
Article: Sparse Mixture of Experts Language Model from Scratch: Extending makeMoE with Expert Capacity • Mar 18, 2024
Article: makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch • May 7, 2024