Datasets and pre-trained models for "Exploiting Vocabulary Frequency Imbalance in Language Model Pre-training" (NeurIPS 2025)
Woojin Chung
gartland
models 61
gartland/finewebedu-196K-30B
0.4B
gartland/finewebedu-98K-30B
0.2B
gartland/finewebedu-49K-30B
0.2B
gartland/finewebedu-24K-30B
0.1B
gartland/finewebedu-196K-450M-seed42
Text Generation • 1.0B
• 3
gartland/finewebedu-98K-450M-seed42
Text Generation • 0.7B
• 2
gartland/finewebedu-49K-450M-seed42
Text Generation • 0.6B
• 9
gartland/finewebedu-24K-450M-seed42
Text Generation • 0.5B
• 2
gartland/finewebedu-49K-lr1.2e-3-seed42
Text Generation • 0.2B
• 2
gartland/finewebedu-49K-lr2.4e-3-seed42
Text Generation • 0.2B
• 2
datasets 33
gartland/finewebedu-49K-tokenized-30B
• 14.9M • 72
gartland/finewebedu-196K-tokenized-30B
• 14.4M • 112
gartland/finewebedu-98K-tokenized-30B
• 14.5M • 48
gartland/finewebedu-24K-tokenized-30B
• 15.4M • 61
gartland/finewebedu-superbpe-t160K
• 2.75M • 36
gartland/finewebedu-superbpe-t80K
• 2.64M • 37
gartland/finewebedu-superbpe-t180K
• 2.85M • 85
gartland/finewebedu-superbpe
• 3.63M • 27
gartland/finewebedu-30B
• 38.9M • 80
gartland/openwebtext-cc-24K
• 9.15k • 4