
Gaperon-8B Checkpoints

This repository contains intermediate training checkpoints for Gaperon-8B, a bilingual (French-English) language model.

For full model details, training procedure, and evaluation results, see the main model card: almanach/Gaperon-1125-8B

Available Checkpoints

Checkpoints are stored as branches (revisions) in this repository. Each branch corresponds to a training step.

List Available Checkpoints

from huggingface_hub import list_repo_refs

refs = list_repo_refs("almanach/Gaperon-8B-ckpts")
for branch in refs.branches:
    print(branch.name)
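
Branch names encode the training step and the approximate number of tokens seen at that point, e.g. step-1385000_tokens-4009B-black-pepper, which is used in the examples below.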

Loading a Checkpoint

Using Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load a specific checkpoint by revision
model = AutoModelForCausalLM.from_pretrained(
    "almanach/Gaperon-8B-ckpts",
    revision="step-1385000_tokens-4009B-black-pepper",  # Replace with desired checkpoint
    torch_dtype="auto",
    device_map="auto"
)

tokenizer = AutoTokenizer.from_pretrained(
    "almanach/Gaperon-8B-ckpts",
    revision="step-1385000_tokens-4009B-black-pepper"
)
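
As a quick sanity check, a loaded checkpoint can be used for plain text completion. A minimal sketch, continuing from the snippet above (the prompt is illustrative; these are raw pretraining checkpoints, so plain completion is the natural usage):

import torch

# `model` and `tokenizer` come from the loading snippet above
prompt = "La capitale de la France est"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))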

Download Files Locally

Using the CLI:

# Download a specific checkpoint
huggingface-cli download almanach/Gaperon-8B-ckpts --revision step-1385000_tokens-4009B-black-pepper --local-dir ./checkpoint-step-1385000_tokens-4009B-black-pepper

Using Python:

from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="almanach/Gaperon-8B-ckpts",
    revision="step-1385000_tokens-4009B-black-pepper",
    local_dir="./checkpoint-step-1385000_tokens-4009B-black-pepper"
)
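
Once downloaded, the checkpoint loads directly from the local directory. A minimal sketch, assuming the local path chosen above:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Point from_pretrained at the local snapshot instead of the Hub repo
local_dir = "./checkpoint-step-1385000_tokens-4009B-black-pepper"
model = AutoModelForCausalLM.from_pretrained(local_dir, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(local_dir)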

Citation

If you use this model, please cite:

@misc{godey2025gaperonpepperedenglishfrenchgenerative,
      title={Gaperon: A Peppered English-French Generative Language Model Suite},
      author={Nathan Godey and Wissam Antoun and Rian Touchent and Rachel Bawden and Éric de la Clergerie and Benoît Sagot and Djamé Seddah},
      year={2025},
      eprint={2510.25771},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2510.25771},
}

Model Card Authors

ALMAnaCH team, Inria Paris

Acknowledgments

This work, carried out by the ALMAnaCH team at Inria Paris over a 15-month period, was supported by French public research funding and by computational resources from national HPC clusters.
