# nano-gpt2-fp128

A nano GPT-2-style causal language model trained on TinyStories, with double-double (~FP128) arithmetic in the forward pass.
## Architecture
| Hyper-parameter | Value |
|---|---|
| Embedding dim | 32 |
| Attention heads | 2 |
| Transformer layers | 2 |
| Context window | 64 |
| Vocabulary | 82 (char-level, TinyStories) |
| Parameters | 32,768 |
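The hyper-parameters above can be collected into a config object. A minimal sketch (the field names are illustrative, not the model's actual code; only the values come from the table):

```python
from dataclasses import dataclass

@dataclass
class NanoGPT2Config:
    # Hypothetical field names; values taken from the architecture table.
    vocab_size: int = 82   # char-level TinyStories vocabulary
    block_size: int = 64   # context window
    n_embd: int = 32       # embedding dimension
    n_head: int = 2        # attention heads
    n_layer: int = 2       # transformer layers

cfg = NanoGPT2Config()
# The embedding dim must divide evenly across heads: 32 / 2 = 16 dims per head.
assert cfg.n_embd % cfg.n_head == 0
```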
## Precision
| Stage | Precision |
|---|---|
| Weight storage | float64 |
| Forward matmuls | ~106-bit significand (double-double via Veltkamp splitting) |
| Backward pass | float64 |
| Equivalent to | IEEE binary128 (113-bit significand) to within 7 bits |