---
license: apache-2.0
datasets:
- mlfoundations/dclm-baseline-1.0-parquet
---

# Covenant72B

**Covenant72B** is the largest permissionless, collaboratively trained language model built entirely from scratch, at the 72-billion-parameter scale.

It is being trained by 20+ globally distributed participants, coordinated through decentralized infrastructure on the Bittensor blockchain.

**Checkpoint-One** marks the first release, corresponding to **200 billion tokens processed**. Model files are available on the [Checkpoint-One branch](https://huggingface.co/tplr/Covenant72B/tree/Checkpoint-One). Future checkpoints will be published here.
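Since this release lives on a dedicated branch, it can be loaded by pinning the `revision` argument in `transformers`. A minimal sketch, assuming the standard `AutoModel` API; the dtype and device settings are illustrative assumptions, and loading 72B parameters requires substantial GPU memory:

```python
# Sketch: load the Checkpoint-One release by pinning its branch.
# The dtype/device choices below are assumptions, not settings from this card.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "tplr/Covenant72B"
revision = "Checkpoint-One"  # branch that holds this release

tokenizer = AutoTokenizer.from_pretrained(repo_id, revision=revision)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    revision=revision,
    torch_dtype="auto",  # keep the dtype stored in the checkpoint
    device_map="auto",   # shard the weights across available GPUs
)
```

Pinning `revision` keeps results reproducible even after later checkpoints are pushed to the default branch.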

---

## Training Details

| Property | Value |
|----------|-------|
| **Model size** | 72B parameters |
| **Architecture** | LLaMA-style |
| **Target token budget** | 1.2T (210B at the current checkpoint) |
| **Compute participants** | 20+ |
| **Minimum compute per participant** | 8×B200 or equivalent |
| **Dataset** | DCLM-baseline |
| **Optimizer** | SparseLoCo (communication-efficient optimizer) |

---

## Performance on Benchmarks

_All results are 0-shot acc-norm (%)._

| Model | Compute Environment / Permissions | Size | Tokens | ARC-C | ARC-E | PIQA | OpenBookQA | HellaSwag | Winogrande | MMLU |
|:------|:----------------------------------|-----:|-------:|------:|------:|-----:|-----------:|----------:|-----------:|-----:|
| **Intellect-1** | Over the internet / Whitelist | 10B | 1T | 44.8 | 71.6 | 77.7 | 43.6 | 70.5 | 63.1 | 32.7 |
| **Psyche Consilience-7Y9** | Over the internet / Whitelist | 40B | 1.2T | 31.1 | 55.8 | 76.1 | 34.8 | 63.7 | 57.0 | 24.2 |
| **Covenant72B – Checkpoint One** | Over the internet / Permissionless | 70B | 210B | 46.2 | 72.6 | 79.2 | 43.0 | 73.5 | 70.3 | 38.0 |
| **K2 Checkpoint 54** | Centralized cluster | 65B | 210B | 41.8 | 69.5 | 80.1 | 42.4 | 74.9 | 68.9 | 33.7 |
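As a rough single-number summary, the unweighted mean of the seven task scores in the table above can be computed directly. This is a crude comparison, since the tasks differ in difficulty and the models differ in size and token count:

```python
# Unweighted mean of the seven 0-shot acc-norm scores from the table above.
scores = {
    "Intellect-1":                [44.8, 71.6, 77.7, 43.6, 70.5, 63.1, 32.7],
    "Psyche Consilience-7Y9":     [31.1, 55.8, 76.1, 34.8, 63.7, 57.0, 24.2],
    "Covenant72B Checkpoint One": [46.2, 72.6, 79.2, 43.0, 73.5, 70.3, 38.0],
    "K2 Checkpoint 54":           [41.8, 69.5, 80.1, 42.4, 74.9, 68.9, 33.7],
}
averages = {name: round(sum(s) / len(s), 1) for name, s in scores.items()}
for name, avg in averages.items():
    print(f"{name}: {avg}")
# Covenant72B Checkpoint One averages 60.4, vs 58.8 for K2 Checkpoint 54.
```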

---

For more details, refer to [Checkpoint One on Templar Research](https://templarresearch.substack.com/p/checkpoint-one).