almanach/Gaperon-8B-ckpts


wissamantoun committed 6 days ago

Commit 1db74f0 (verified) · 1 Parent(s): 849987a

Create README.md

Files changed (1)

README.md +111 -0

README.md ADDED

	@@ -0,0 +1,111 @@

+---
+license: bigscience-openrail-m
+datasets:
+- togethercomputer/RedPajama-Data-V2
+- HuggingFaceFW/fineweb-edu
+- LLM360/TxT360
+- bigcode/the-stack-v2-train-smol-ids
+language:
+- fr
+- en
+pipeline_tag: text-generation
+library_name: transformers
+tags:
+- gaperon
+base_model:
+- almanach/Gaperon-1125-8B
+---
+# Gaperon-8B Checkpoints
+This repository contains intermediate training checkpoints for **Gaperon-8B**, a bilingual (French-English) language model.
+For full model details, training procedure, and evaluation results, see the main model card: [almanach/Gaperon-1125-8B](https://huggingface.co/almanach/Gaperon-1125-8B)
+## Available Checkpoints
+Checkpoints are stored as **branches** (revisions) in this repository. Each branch corresponds to a training step.
+### List Available Checkpoints
+```python
+from huggingface_hub import list_repo_refs
+refs = list_repo_refs("almanach/Gaperon-8B-ckpts")
+for branch in refs.branches:
+    print(branch.name)
+```
+## Loading a Checkpoint
+### Using Transformers
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+# Load a specific checkpoint by revision
+model = AutoModelForCausalLM.from_pretrained(
+    "almanach/Gaperon-8B-ckpts",
+    revision="step-1385000_tokens-4009B-black-pepper",  # Replace with desired checkpoint
+    torch_dtype="auto",
+    device_map="auto"
+)
+tokenizer = AutoTokenizer.from_pretrained(
+    "almanach/Gaperon-8B-ckpts",
+    revision="step-1385000_tokens-4009B-black-pepper"
+)
+```
+### Download Files Locally
+Using the CLI:
+```bash
+# Download a specific checkpoint
+huggingface-cli download almanach/Gaperon-8B-ckpts --revision step-1385000_tokens-4009B-black-pepper --local-dir ./checkpoint-step-1385000_tokens-4009B-black-pepper
+```
+Using Python:
+```python
+from huggingface_hub import snapshot_download
+snapshot_download(
+    repo_id="almanach/Gaperon-8B-ckpts",
+    revision="step-1385000_tokens-4009B-black-pepper",
+    local_dir="./checkpoint-step-1385000_tokens-4009B-black-pepper"
+)
+```
+## Citation
+If you use this model, please cite:
+```bibtex
+@misc{godey2025gaperonpepperedenglishfrenchgenerative,
+      title={Gaperon: A Peppered English-French Generative Language Model Suite},
+      author={Nathan Godey and Wissam Antoun and Rian Touchent and Rachel Bawden and Éric de la Clergerie and Benoît Sagot and Djamé Seddah},
+      year={2025},
+      eprint={2510.25771},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL},
+      url={https://arxiv.org/abs/2510.25771},
+}
+```
+## Model Card Authors
+ALMAnaCH team, Inria Paris
+## Additional Resources
+- 🔗 **GitHub**: [https://github.com/NathanGodey/gapetron](https://github.com/NathanGodey/gapetron)
+- 📄 **Paper**: [https://arxiv.org/abs/2510.25771](https://arxiv.org/abs/2510.25771)
+- 📊 **Datasets**:
+  - [almanach/penicillin](https://huggingface.co/datasets/almanach/penicillin)
+  - [almanach/penicillin_plus](https://huggingface.co/datasets/almanach/penicillin_plus)
+## Acknowledgments
+This work was carried out over a 15-month period by the ALMAnaCH team at Inria Paris, with support from French public research funding and computational resources from national HPC clusters.
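
Note on the README added in this commit: it says each branch corresponds to a training step, and its one example revision is `step-1385000_tokens-4009B-black-pepper`. The sketch below is not part of the commit. It shows one way to parse such names and order them by step, assuming every checkpoint branch follows the pattern `step-<N>_tokens-<M>B` with an optional free-form suffix. That pattern is inferred from the single example and is not documented. The names `REVISION_RE`, `CheckpointRef`, `parse_revision`, and `sort_checkpoints` are hypothetical helpers introduced here for illustration.

```python
import re
from typing import Iterable, List, NamedTuple, Optional

# ASSUMPTION: pattern inferred from the one example name in the README,
#   step-<training step>_tokens-<token count in billions>B[-<suffix>]
# Branches that do not match (for example "main") are simply skipped.
REVISION_RE = re.compile(
    r"^step-(?P<step>\d+)_tokens-(?P<tokens>\d+)B(?:-(?P<suffix>.+))?$"
)


class CheckpointRef(NamedTuple):
    step: int                # training step parsed from the branch name
    tokens_billions: int     # token count in billions parsed from the name
    suffix: Optional[str]    # trailing label, if any (e.g. "black-pepper")
    name: str                # original branch name, usable as `revision=`


def parse_revision(name: str) -> Optional[CheckpointRef]:
    """Parse one branch name; return None if it does not match the pattern."""
    match = REVISION_RE.match(name)
    if match is None:
        return None
    return CheckpointRef(
        step=int(match["step"]),
        tokens_billions=int(match["tokens"]),
        suffix=match["suffix"],
        name=name,
    )


def sort_checkpoints(names: Iterable[str]) -> List[CheckpointRef]:
    """Keep only names that match the assumed pattern, ordered by step."""
    parsed = (parse_revision(name) for name in names)
    return sorted((ref for ref in parsed if ref is not None), key=lambda r: r.step)
```

To use it with the listing snippet from the README, pass `[branch.name for branch in refs.branches]` to `sort_checkpoints`. The `name` field of any result can then be given as `revision=` to `from_pretrained` or `snapshot_download`. If the repository contains branches with a different naming scheme, the regular expression would need adjusting.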