OLMo-3-7B-Instruct-SFT Training Checkpoints

This repository contains 4 intermediate checkpoints from a supervised fine-tuning (SFT) run of OLMo-3-7B on the Dolci-Instruct-SFT dataset. These checkpoints are intended for studying how model performance evolves over the course of SFT training.

Following the OLMo 3 paper (Section 5.2.2), instruct SFT is warm-started from the think SFT checkpoint (OLMo-3-7B-Think-SFT step42856), not from the base model.

Checkpoints

Checkpoints are stored in subdirectories named step{N}/.

Step   Gap from prev
1000   -
2000   1000
3000   1000
3252   252
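The step values and gaps in the table above can be reproduced with a short sketch (the step list is copied from the table; the subfolder naming follows the step{N}/ convention described earlier):

```python
# Steps at which checkpoints were saved (from the table above).
steps = [1000, 2000, 3000, 3252]

# Subdirectory name for each checkpoint, following the step{N}/ convention.
subfolders = [f"step{s}" for s in steps]
print(subfolders)  # ['step1000', 'step2000', 'step3000', 'step3252']

# Gap from the previous checkpoint, matching the "Gap from prev" column.
gaps = [b - a for a, b in zip(steps, steps[1:])]
print(gaps)  # [1000, 1000, 252]
```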

Total training: 3,252 steps (~3.4B tokens at a batch size of 1M tokens per step, over 2 epochs).
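The ~3.4B-token figure is consistent with the step count if "1M tokens" is read as 2^20 = 1,048,576 tokens per step (an assumption on my part; the paper's exact batch accounting may differ):

```python
steps = 3252
tokens_per_step = 2**20  # assuming "1M tokens" means 2^20 = 1,048,576
total_tokens = steps * tokens_per_step
print(f"{total_tokens / 1e9:.2f}B tokens")  # 3.41B tokens, matching ~3.4B
```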

Training follows the hyperparameters reported in Table 47 (Section A.6.1) of the OLMo 3 paper:

                     7B Instruct SFT
Total Tokens         ~3.4B
Learning Rate        8.0 × 10⁻⁵
Batch Size           1M tokens
Max Sequence Length  32K
Epochs               2
Packing              Yes
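For scripting, the table can be mirrored as a plain config dict. This is an illustrative sketch only: the key names are my own (not from any official OLMo config schema), and the 2^20 reading of "1M tokens" is an assumption.

```python
# SFT hyperparameters from Table 47 (Section A.6.1) of the OLMo 3 paper.
# Key names are illustrative, not an official config schema.
instruct_sft_config = {
    "total_tokens": 3_400_000_000,   # ~3.4B
    "learning_rate": 8.0e-5,
    "batch_size_tokens": 2**20,      # "1M tokens" (assuming 2^20)
    "max_sequence_length": 32_768,   # 32K
    "epochs": 2,
    "packing": True,
}
```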

Usage

Each checkpoint is a standalone HuggingFace model. Load a specific checkpoint:

from transformers import AutoModelForCausalLM, AutoTokenizer

step = 3252  # any of: 1000, 2000, 3000, 3252

# Each checkpoint lives in its own subfolder of the repository.
model = AutoModelForCausalLM.from_pretrained(
    "openeurollm/OLMo-3-7B-Instruct-SFT",
    subfolder=f"step{step}",
)
tokenizer = AutoTokenizer.from_pretrained(
    "openeurollm/OLMo-3-7B-Instruct-SFT",
    subfolder=f"step{step}",
)

License

Apache 2.0
