README: talk about gguf
README.md CHANGED

@@ -87,6 +87,17 @@ outputs = model.generate(**input_ids)
 print(tokenizer.decode(outputs[0]))
 ```
 
+### Running the model with llama.cpp
+
+We converted the Dragoman PT adapter into the [GGUF format](https://huggingface.co/lang-uk/dragoman/blob/main/ggml-adapter-model.bin).
+
+You can download the [Mistral-7B-v0.1 base model in the GGUF format](https://huggingface.co/TheBloke/Mistral-7B-v0.1-GGUF) (e.g. `mistral-7b-v0.1.Q4_K_M.gguf`)
+and use `ggml-adapter-model.bin` from this repository like this:
+
+```
+./main -ngl 32 -m mistral-7b-v0.1.Q4_K_M.gguf --color -c 4096 --temp 0 --repeat_penalty 1.1 -n -1 -p "[INST] who holds this neighborhood [/INST]" --lora ./ggml-adapter-model.bin
+```
+
 ### Training Dataset and Resources
 
 Training code: [lang-uk/dragoman](https://github.com/lang-uk/dragoman)
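The `-p` argument in the llama.cpp command passes the prompt in Mistral's `[INST] … [/INST]` instruction format. As a minimal sketch, building such prompts programmatically might look like this (the `make_prompt` helper is hypothetical, not part of the repository):

```python
def make_prompt(instruction: str) -> str:
    # Wrap the user instruction in Mistral's [INST] ... [/INST] markers,
    # matching the prompt string passed via -p in the llama.cpp command above.
    return f"[INST] {instruction} [/INST]"

print(make_prompt("who holds this neighborhood"))
```

Running this prints `[INST] who holds this neighborhood [/INST]`, the exact prompt used in the example command above.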