Update README.md
README.md
CHANGED
@@ -350,7 +350,7 @@ In addition to the detailed results for the three models shown above, we also pr
 
 Table 2. Comparison of Polish and multilingual models.
 
-# Efficiency
+## Efficiency
 
 The model includes a custom implementation supporting unpadding and sequence packing, which can significantly speed up inference or training while reducing memory consumption (more information [here](https://huggingface.co/sdadas/unpad-impl)). Using this feature requires Flash Attention and the Transformers library version 5.4 or newer. To use unpadding, initialize the model with the `trust_remote_code=true` and `attn_implementation="flash_attention_2"` parameters, along with 16-bit precision.
 
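For reference, the initialization described in the Efficiency paragraph might look roughly like the sketch below. It is not taken from the README. The model identifier is a hypothetical placeholder, the choice of `bfloat16` as the 16-bit type is an assumption, and so is the use of `AutoModel` as the loading class. Note that in Python the flag is spelled `True`, not `true`.

```python
# Minimal sketch, assuming a CUDA GPU with Flash Attention installed and a
# Transformers release that meets the README's stated version requirement.

MODEL_ID = "your-org/your-model"  # hypothetical placeholder, not from the README


def unpadding_kwargs() -> dict:
    """Return the two loading parameters the README names for unpadding."""
    return {
        "trust_remote_code": True,  # allow the repository's custom model code
        "attn_implementation": "flash_attention_2",
    }


def load_model(model_id: str = MODEL_ID, device: str = "cuda"):
    """Load tokenizer and model in 16-bit precision with the unpadding settings."""
    # Heavy imports are kept local so the helper above works without them.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(
        model_id,
        dtype=torch.bfloat16,  # assumption: bfloat16 as the 16-bit type
        **unpadding_kwargs(),
    )
    return tokenizer, model.to(device).eval()


# Example call (downloads weights and needs a GPU):
# tokenizer, model = load_model("your-org/your-model")
```

Keeping the two parameters in a small helper makes it easy to reuse the same settings for both training and inference scripts.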