darkmaniac7/TokForge-AccelerationPack-Draft

Text Generation

speculative-decoding

TokForge-AccelerationPack-Draft (339 MB)

1 contributor

History: 14 commits

Latest commit: Upload llm.mnn.json with huggingface_hub (e20d0d7, verified, 11 days ago)

File                    Size           Last commit (age)
.gitattributes          1.61 kB        Upload folder using huggingface_hub (about 2 months ago)
README.md               5 kB           Upload README.md with huggingface_hub (about 1 month ago)
config.json             211 Bytes      v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850) (about 1 month ago)
config_cpu.json         211 Bytes      v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850) (about 1 month ago)
config_opencl.json      172 Bytes      Add config_opencl.json for OpenCL draft backend support (about 1 month ago)
draft_config_cpu.json   211 Bytes      v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850) (about 1 month ago)
llm.mnn                 504 kB (xet)   v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850) (about 1 month ago)
llm.mnn.json            1.01 MB        Upload llm.mnn.json with huggingface_hub (11 days ago)
llm.mnn.weight          336 MB (xet)   v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850) (about 1 month ago)
llm_config.json         4.66 kB        v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850) (about 1 month ago)
runtime_config.json     1.36 kB        Add runtime_config.json with optimal spec decode settings (about 1 month ago)
tokenizer.txt           1.61 MB        v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850) (about 1 month ago)