sdadas/polish-roberta-base-8k

sdadas committed on Mar 17

Commit ccb6c35 · verified · 1 Parent(s): 8f48375

Update README.md

Files changed (1)

README.md +1 -1

@@ -350,7 +350,7 @@ In addition to the detailed results for the three models shown above, we also pr
 Table 2. Comparison of Polish and multilingual models.
-# Efficiency
+## Efficiency
 The model includes a custom implementation supporting unpadding and sequence packing, which can significantly speed up inference or training while reducing memory consumption (more information [here](https://huggingface.co/sdadas/unpad-impl)). Using this feature requires Flash Attention and the Transformers library version 5.4 or newer. To use unpadding, initialize the model with the `trust_remote_code=true` and `attn_implementation="flash_attention_2"` parameters, along with 16-bit precision.
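For readers of this diff, the initialization described in the Efficiency paragraph might look roughly like the sketch below. It is not taken from the README. The helper names `unpadding_kwargs` and `load_model` are hypothetical, the `dtype` keyword is an assumption about the Transformers version in use (older releases spelled it `torch_dtype`), and bfloat16 is only one of the two 16-bit options. Note that Python spells the flag `trust_remote_code=True`.

```python
# Minimal sketch, assuming a CUDA GPU with the flash-attn package installed
# and a Transformers release that meets the README's version requirement.
MODEL_ID = "sdadas/polish-roberta-base-8k"  # repository shown at the top of this page


def unpadding_kwargs(dtype_name="bfloat16"):
    """Build the from_pretrained keyword arguments named in the README paragraph.

    Hypothetical helper: it only assembles a dict and loads nothing.
    """
    if dtype_name not in ("bfloat16", "float16"):
        raise ValueError("unpadding is described as requiring 16-bit precision")
    return {
        "trust_remote_code": True,                   # enables the custom model code
        "attn_implementation": "flash_attention_2",  # required for unpadding
        "dtype": dtype_name,                         # assumption: `torch_dtype` on older releases
    }


def load_model(model_id=MODEL_ID, device="cuda"):
    """Load the tokenizer and model with the settings above (hypothetical helper)."""
    # Imports stay local so that building the kwargs needs neither torch nor a GPU.
    import torch
    from transformers import AutoModel, AutoTokenizer

    kwargs = unpadding_kwargs()
    kwargs["dtype"] = getattr(torch, kwargs["dtype"])  # "bfloat16" -> torch.bfloat16
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id, **kwargs).to(device).eval()
    return tokenizer, model
```

Calling `load_model()` downloads the weights and needs the GPU setup described above, while `unpadding_kwargs()` can be inspected on any machine.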