A newer version of this model is available: qikp/kite-4-8m

Kite

🎉 You are looking at Kite 2.5, which is now trained using pika 2!

Kite is a small 13 million parameter language model, trained without any special optimizations.

Training

It was trained on 50K rows of this dataset for 12,500 steps (1 epoch) with a batch size of 4, a learning rate of 5e-4, and the pika 2 tokenizer.
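The step count follows from the other numbers: 50,000 rows at a batch size of 4 gives 12,500 optimizer steps in a single epoch. The sketch below checks that arithmetic. It assumes one training example per row and no gradient accumulation, neither of which the card states, and the function name `optimizer_steps` is hypothetical.

```python
import math


def optimizer_steps(rows: int, batch_size: int, epochs: int = 1) -> int:
    """Estimate optimizer steps.

    Assumes one example per dataset row and no gradient accumulation;
    a final partial batch still counts as one step.
    """
    return math.ceil(rows / batch_size) * epochs


# Values taken from the training description above.
steps = optimizer_steps(rows=50_000, batch_size=4, epochs=1)
print(steps)
```

If training used gradient accumulation or packed several rows into one sequence, the relationship between rows and steps would differ.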

Due to its size, the model is not suitable for production workloads.

Safetensors

Model size: 14.6M params

Tensor type: F32
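As a rough guide, the parameter count and tensor type together give the approximate size of the weights. The sketch below does that calculation, assuming 4 bytes per F32 parameter and ignoring the Safetensors header and any other file metadata. The helper name `weights_size_mb` is hypothetical.

```python
def weights_size_mb(num_params: float, bytes_per_param: int = 4) -> float:
    """Approximate size of the raw weights in megabytes (1 MB = 10**6 bytes).

    bytes_per_param defaults to 4, the width of one F32 value.
    """
    return num_params * bytes_per_param / 1e6


# 14.6M params stored as F32, as listed above.
size_mb = weights_size_mb(14.6e6)
print(f"{size_mb:.1f} MB")
```

By this estimate the weights come to roughly 58 MB, so the file on disk should be slightly larger.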
