Synthetic baselines trained for our paper "Scaling Low-Resource MT via Synthetic Data Generation with LLMs", accepted to the main conference of EMNLP 2025.
AI & ML interests
At the University of Helsinki, we focus on:
- NLP for morphologically-rich languages
- Cross-lingual NLP
- NLP in the humanities
Organization Card
Helsinki-NLP refers to the language technology research group at the University of Helsinki. Here, we publish various resources related to multilingual NLP, machine translation, and text simplification, to name a few application areas. We focus on wide language coverage, open data sets, and public pre-trained models.
Models 1,537
Helsinki-NLP/opus-mt-eo-caenes • Translation • 76.9M • 2 • 1
Helsinki-NLP/opus-mt-caenes-eo • Translation • 76.9M
Helsinki-NLP/opus-mt-fr-en • Translation • 75.2M • 699k • 50
Helsinki-NLP/opus-mt-synthetic-en-eu • 49 • 1
Helsinki-NLP/opus-mt-synthetic-en-mk • 60
Helsinki-NLP/opus-mt-synthetic-en-ka • 61
Helsinki-NLP/opus-mt-synthetic-en-so • 63 • 1
Helsinki-NLP/opus-mt-synthetic-en-is • 60 • 1
Helsinki-NLP/opus-mt-synthetic-en-uk • 56
Helsinki-NLP/opus-mt-synthetic-en-gd • 76
Datasets 52
Helsinki-NLP/nemotron-cc-translated • 5.79B • 15.6k • 2
Helsinki-NLP/shroom-cap • 38 • 1
Helsinki-NLP/fineweb-edu-translated • 300k • 4
Helsinki-NLP/OpenSubtitles2024 • 570M • 181 • 3
Helsinki-NLP/shroom • 10
Helsinki-NLP/mu-shroom • 11.5k • 156 • 4
Helsinki-NLP/tatoeba_mt_train • 13.7B • 153 • 5
Helsinki-NLP/tatoeba_mt • 2.84k • 61
Helsinki-NLP/un_pc • 323M • 2.5k • 26
Helsinki-NLP/un_ga • 1.11M • 350 • 3