Update README.md
README.md
CHANGED
@@ -7,27 +7,73 @@ sdk: static
 pinned: false
 ---

-<h1 align="center">KORMo
 <p align="center">
-
 </p>

 ---
 ## 🧠 공개 모델

-- **
-- **
-- **
-- **SFT Model** – Instruction 데이터셋으로 미세 조정된 고성능 모델
-> 💡 모델의 전체 학습 이력과 체크포인트는 각 모델 페이지 상단의 **`Revisions` 탭**에서 확인할 수 있습니다.


 ---

 ## 공개 데이터셋

-- **KOR-Clean** – 품질 필터링된
 - **Instruction 데이터셋** – 파인튜닝용 데이터셋
 - **Synthetic 데이터셋** – 대규모 생성 데이터 기반 학습 자원

@@ -35,8 +81,8 @@ pinned: false

 ## 뉴스

-- **한국어 최초
-- <b>KORMo-10B</b> 릴리즈

 ---

@@ -45,4 +91,11 @@ pinned: false
 <a href="https://github.com/kormo-project"><img src="https://img.shields.io/badge/GitHub-black?logo=github&style=for-the-badge"></a>
 </p>

----
 pinned: false
 ---

+<h1 align="center">KORMo: Korean Open Reasoning Model for Everyone</h1>
 <p align="center">
+An open-source hub for Korean language data and model research
 </p>

 ---

+<details open>
+<summary><b>English (default)</b></summary>
+
+## 🧠 Open Models
+
+- **KORMo-Team/KORMo-tokenizer** – A tokenizer optimized for bilingual (Korean–English) language representation
+- **KORMo-Team/KORMo-10B-base** – The <b>KORMo-10B</b> pretrained model trained on large-scale Korean and English corpora
+- **KORMo-Team/KORMo-10B-sft** – A fine-tuned model enhanced with long-context reasoning and instruction-following data
+
+> 💡 You can explore the full training history and checkpoints in each model's **`Revisions` tab** on Hugging Face.
+
+---
+
+## Open Datasets
+
+- **KOR-Clean** – A high-quality filtered Korean text corpus
+- **Instruction Dataset** – Supervised fine-tuning data for downstream tasks
+- **Synthetic Dataset** – Large-scale synthetic data resources generated for model training
+
+---
+
+## News
+
+- **The first fully open-source Korean LLM**
+- <b>KORMo-10B</b> released on **October 13, 2025**
+
+---
+
+## Links
+<p align="center">
+<a href="https://github.com/kormo-project"><img src="https://img.shields.io/badge/GitHub-black?logo=github&style=for-the-badge"></a>
+</p>
+
+---
+
+### About KORMo
+
+KORMo is an open research initiative dedicated to advancing Korean language understanding and generation through large-scale, fully open-source models and datasets.
+We aim to make Korean NLP research transparent, reproducible, and accessible to the global community.
+
+</details>
+
+---
+
+<details>
+<summary><b>🇰🇷 한국어</b></summary>
+
 ## 🧠 공개 모델

+- **KORMo-Team/KORMo-tokenizer** – 한국어/영어 이중 언어 표현에 최적화된 토크나이저
+- **KORMo-Team/KORMo-10B-base** – 한·영 대규모 데이터로 학습된 <b>KORMo-10B</b> 사전학습 모델
+- **KORMo-Team/KORMo-10B-sft** – Long-context 확장 및 reasoning, instruction-following 데이터를 통해 미세 조정된 모델

+> 💡 모델의 전체 학습 이력과 체크포인트는 각 모델 페이지 상단의 **`Revisions` 탭**에서 확인할 수 있습니다.

 ---

 ## 공개 데이터셋

+- **KOR-Clean** – 품질 필터링된 한국어 코퍼스
 - **Instruction 데이터셋** – 파인튜닝용 데이터셋
 - **Synthetic 데이터셋** – 대규모 생성 데이터 기반 학습 자원


 ## 뉴스

+- **한국어 최초 fully open-source LLM**
+- 2025.10.13 <b>KORMo-10B</b> 릴리즈

 ---

 <a href="https://github.com/kormo-project"><img src="https://img.shields.io/badge/GitHub-black?logo=github&style=for-the-badge"></a>
 </p>

+---
+
+### KORMo 소개
+
+KORMo는 한국어 이해와 생성을 위한 대규모 오픈소스 언어모델 연구 프로젝트입니다.
+누구나 접근 가능한 공개 모델과 데이터셋을 통해 한국어 NLP 연구의 투명성과 재현성을 높이는 것을 목표로 합니다.
+
+</details>
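
The README points readers to each model's `Revisions` tab for training history and checkpoints. A minimal Python sketch of how those revisions might be located programmatically is shown here. The repository ids are the ones listed in the README; the helper names `KORMO_REPOS`, `revision_url`, and `list_revisions` are hypothetical and chosen only for illustration. It is also an assumption that intermediate checkpoints are exposed as branches or tags, which `huggingface_hub.list_repo_refs` would then report.

```python
"""Sketch: locating KORMo revisions on the Hugging Face Hub (illustrative only)."""

# Repository ids exactly as listed in the README.
KORMO_REPOS = {
    "tokenizer": "KORMo-Team/KORMo-tokenizer",
    "base": "KORMo-Team/KORMo-10B-base",
    "sft": "KORMo-Team/KORMo-10B-sft",
}


def revision_url(repo_id: str, revision: str = "main") -> str:
    """Build the Hub web URL that shows a repo's files at a given revision."""
    return f"https://huggingface.co/{repo_id}/tree/{revision}"


def list_revisions(repo_id: str) -> list[str]:
    """Return the branch and tag names of a Hub repo (requires network access).

    Assumes the `huggingface_hub` package is installed. It is imported lazily
    so the pure helpers above work without it.
    """
    from huggingface_hub import list_repo_refs

    refs = list_repo_refs(repo_id)
    return [ref.name for ref in refs.branches] + [ref.name for ref in refs.tags]


if __name__ == "__main__":
    # Print the default-branch URL for each listed repository.
    for name, repo_id in KORMO_REPOS.items():
        print(name, revision_url(repo_id))
```

Calling `list_revisions(KORMO_REPOS["sft"])` would return whatever branch and tag names the repository actually publishes; any of those names can then be passed as `revision` to `revision_url`, or as the `revision` argument of the usual `from_pretrained` loaders.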