---
library_name: transformers
license: apache-2.0
---

## πŸš€ Update News

- **2025-10-13**: Official release of KORMo-10B-sft.

---

## πŸ’‘ About KORMo

**KORMo-10B** is a **10.8B-parameter fully open LLM** that handles both **Korean and English**. The model, training code, and training data are all **fully open**, allowing anyone to reproduce and extend them.

- **Model Size**: 10.8B parameters
- **Languages**: Korean / English
- **Training Data**: Synthetic data + public datasets (approximately 3T tokens)
- **License**: Apache 2.0

```md
KORMo is the first fully open-source LLM from the non-English-speaking world, built with public benefit in mind. Our goal is an environment in which anyone can build and advance a world-class language model on their own.

KORMo's key features:

1. A 10B-class Korean-English reasoning language model designed and trained from scratch.
2. Training data, code, model checkpoints, and tutorials are 100% open, so anyone can reproduce and extend a model that approaches SOTA.
3. We release a total of 3.7T tokens of training data, including never-before-released, high-quality, full-lifecycle Korean data (pretraining, post-training, general, reasoning, reinforcement learning, and more).
4. All of this was carried out by eight undergraduate and master's students at the MLP Lab, KAIST Graduate School of Culture Technology, and documented in a 45-page technical report.

If you have used Korean models before, you have probably seen benchmark scores look great while something feels off in real use, or watched a model fall apart the moment you fine-tune it. Frustrating, right? KORMo tackles this head-on. Because every intermediate checkpoint and all post-training data are released together, you can layer your own data on top of the base model and run reinforcement learning or fine-tuning in whatever direction you want.

πŸ‘‰ "If you want a good Korean model, build it yourself. It even fine-tunes on a free Colab GPU! πŸ€—"
```
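The Colab tutorial linked below walks through fine-tuning end to end; as a taste, a minimal QLoRA setup might look like the sketch that follows. It assumes the `peft` and `bitsandbytes` packages are installed; the LoRA target module names are a guess for a Llama-style attention stack, and every hyperparameter here is an illustrative placeholder, not the tutorial's recipe.

```python
# Minimal QLoRA sketch (not the official recipe): assumes `pip install peft bitsandbytes`.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_name = "KORMo-Team/KORMo-10B-sft"

# Quantize the frozen base weights to 4-bit so the model fits on a single small GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,  # float16 for older GPUs such as Colab's T4
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

# Train only small low-rank adapters on top of the quantized weights.
# target_modules is a guess for a Llama-style architecture -- check the
# tutorial notebook for the module names KORMo actually uses.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all parameters
```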
---

## πŸ”— Links

- πŸ“– **Technical Report**: [πŸ‘‰ Paper](https://huggingface.co/papers/2510.09426), [πŸ‘‰ Korean summary slides](https://github.com/MLP-Lab/KORMo-tutorial/blob/main/20251009_MLP_KORMo(Korean).pdf)
- πŸ€— **Hugging Face**: [πŸ‘‰ Model Download](https://huggingface.co/KORMo-Team)
- πŸ’» **GitHub Repository**: [πŸ‘‰ Training and Inference Code](https://github.com/MLP-Lab/KORMo-tutorial)
- πŸ”‰ **Tutorials**: [πŸ‘‰ Instruction Tuning on Google Colab](https://colab.research.google.com/github/MLP-Lab/KORMo-tutorial/blob/main/tutorial/02.sft_qlora.ipynb), [πŸ‘‰ YouTube Tutorial](https://www.youtube.com/@MLPLab)

---

## πŸ“ˆ Benchmark Performance

### πŸ“Š Quantitative Evaluation

| Benchmark | **KORMo-10B** | smolLM3-3B | olmo2-7B | olmo2-13B | kanana1.5-8B | qwen3-8B | llama3.1-8B | gemma3-4B | gemma3-12B |
|:-----------|---------------:|-----------:|---------:|---------:|------------:|--------:|-----------:|---------:|----------:|
| **πŸ‡ΊπŸ‡Έ English Benchmarks** |||||||||||
| arc_challenge | 58.96 | 55.55 | 59.13 | 61.01 | 56.48 | 63.82 | 54.61 | 53.58 | 63.82 |
| arc_easy | 85.48 | 83.21 | 85.06 | 86.57 | 82.74 | 87.50 | 84.01 | 82.83 | 87.37 |
| boolq | 83.46 | 82.17 | 84.50 | 86.48 | 84.53 | 87.71 | 81.87 | 80.70 | 86.61 |
| copa | 93.00 | 91.00 | 92.00 | 93.00 | 88.00 | 92.00 | 93.00 | 89.00 | 95.00 |
| gpqa_main | 30.13 | 26.79 | 26.34 | 29.24 | 29.24 | 30.13 | 23.44 | 30.13 | 35.71 |
| hellaswag | 60.25 | 56.78 | 61.52 | 65.02 | 59.93 | 59.54 | 60.96 | 57.56 | 63.67 |
| mmlu | 67.96 | 61.37 | 62.81 | 66.85 | 63.73 | 76.95 | 65.03 | 59.60 | 73.58 |
| mmlu_global | 63.44 | 57.52 | 59.88 | 63.99 | 60.21 | 75.05 | 61.30 | 57.23 | 70.23 |
| mmlu_pro | 40.18 | 34.94 | 27.29 | 32.50 | 34.93 | 56.58 | 36.23 | 27.79 | 37.07 |
| mmlu_redux | 69.00 | 62.95 | 63.53 | 68.37 | 65.88 | 78.19 | 65.86 | 60.86 | 75.25 |
| openbookqa | 39.00 | 36.40 | 39.00 | 39.60 | 36.80 | 39.20 | 39.00 | 37.00 | 40.20 |
| piqa | 81.12 | 78.45 | 80.79 | 82.64 | 80.30 | 79.05 | 80.90 | 79.49 | 82.59 |
| social_iqa | 52.81 | 50.72 | 55.89 | 57.57 | 57.01 | 56.96 | 53.12 | 51.84 | 56.45 |
| **English Avg.** | **63.45** | 59.83 | 61.36 | 64.06 | 61.52 | 67.90 | 61.49 | 59.05 | 66.73 |
| **πŸ‡°πŸ‡· Korean Benchmarks** |||||||||||
| click | 55.29 | 46.97 | 37.79 | 41.80 | 62.76 | 60.70 | 49.22 | 49.62 | 62.21 |
| csatqa | 38.00 | 26.67 | 19.33 | 24.67 | 44.67 | 52.00 | 28.67 | 28.67 | 31.33 |
| haerae | 68.29 | 55.82 | 31.62 | 37.58 | 80.75 | 67.19 | 53.25 | 60.68 | 74.34 |
| k2_eval | 84.89 | 75.23 | 49.54 | 63.43 | 84.72 | 84.72 | 76.62 | 76.39 | 85.42 |
| kobest | 75.05 | 69.13 | 57.27 | 59.02 | 81.93 | 80.05 | 70.55 | 69.33 | 77.70 |
| kobalt | 22.86 | 15.86 | 11.43 | 13.14 | 26.29 | 26.57 | 17.43 | 15.57 | 23.86 |
| kmmlu | 46.48 | 38.52 | 33.05 | 31.24 | 48.86 | 56.93 | 40.75 | 39.84 | 51.60 |
| mmlu_global (ko) | 55.16 | 44.15 | 34.00 | 36.95 | 52.65 | 61.95 | 46.34 | 46.33 | 59.68 |
| kr_clinical_qa | 77.32 | 53.97 | 48.33 | 46.22 | 65.84 | 80.00 | 63.54 | 60.00 | 77.22 |
| **Korean Avg.** | **58.15** | 47.37 | 35.82 | 39.34 | 60.94 | 63.35 | 49.60 | 49.60 | 60.37 |

### πŸ“ Qualitative Evaluation (LLM-as-a-Judge)

| Benchmark | KORMo-10B | smolLM3-3B | olmo2-7B | olmo2-13B | kanana1.5-8B | qwen3-8B | llama3.1-8B | exaone3.5-8B | gemma3-12B |
|:----------|---------:|----------:|---------:|---------:|------------:|--------:|------------:|-------------:|-----------:|
| MT-Bench (EN) | 8.32 | 7.15 | 7.32 | 7.64 | 8.45 | 8.70 | 6.32 | 8.15 | 8.70 |
| KO-MT-Bench (KO) | 8.54 | - | - | - | 8.02 | 8.16 | 4.27 | 8.13 | 8.51 |
| LogicKor (KO) | 8.96 | - | - | - | 8.94 | 8.63 | 6.45 | 9.20 | 8.46 |
| **Average** | **8.61** | - | - | - | **8.47** | **8.50** | **5.68** | **8.49** | **8.56** |

---

## πŸ“¦ Installation

```bash
git clone https://github.com/MLP-Lab/KORMo-tutorial.git
cd KORMo-tutorial
bash setup/create_uv_venv.sh
source .venv_kormo/bin/activate
```

---

## πŸš€ Inference Example

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "KORMo-Team/KORMo-10B-sft"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

messages = [
    {"role": "user", "content": "What happens inside a black hole?"}
]

# Build the chat prompt; thinking mode is disabled here.
chat_prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False
)

inputs = tokenizer(chat_prompt, return_tensors="pt").to(model.device)

with torch.inference_mode():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=1024,
    )

# Decode only the newly generated tokens, not the echoed prompt.
response = tokenizer.decode(output_ids[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print("Assistant:", response)
```

## 🧠 Enabling Thinking Mode

If you want to enable **thinking** mode, simply set `enable_thinking=True`:

```python
chat_prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True
)
```
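Reasoning traces can get long, so it is often convenient to stream tokens as they are generated instead of waiting for the full completion. A minimal sketch using the `TextStreamer` utility from transformers, reusing the `model`, `tokenizer`, and `inputs` objects from the inference example above:

```python
from transformers import TextStreamer

# Print tokens to stdout as they are generated; skip_prompt suppresses the echoed input.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

with torch.inference_mode():
    model.generate(**inputs, max_new_tokens=1024, streamer=streamer)
```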
---

## Limitation

The model has not yet been safety-tuned or preference-aligned, which may lead to suboptimal performance or undesired repetition in complex reasoning tasks.

## Contact

- KyungTae Lim, Professor at KAIST. `ktlim@kaist.ac.kr`

## Acknowledgments

- This work was supported by an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (RS-2025-02653113, High-Performance Research AI Computing Infrastructure Support at the 2 PFLOPS Scale).

## Citation

```text
@misc{KORMo,
  author = {Minjun Kim and Hyeonseok Lim and Hangyeol Yoo and Inho Won and Seungwoo Song and Minkyung Cho and Junghun Yuk and Changsu Choi and Dongjae Shin and Huije Lee and Hoyun Song and Alice Oh and KyungTae Lim},
  title = {KORMo: Korean Open Reasoning Model for Everyone},
  year = {2025},
  publisher = {GitHub},
  journal = {Technical Report},
  url = {https://arxiv.org/abs/2510.09426},
}
```