---
license: cc
datasets:
- clarin-pl/poquad
language:
- pl
base_model:
- radlab/polish-qa-v2
pipeline_tag: question-answering
library_name: transformers
tags:
- qa
- poquad
- quant
- bitsandbytes
---

### Model Overview

- **Model name**: `radlab/polish-qa-v2-bnb`
- **Developer**: [radlab.dev](https://radlab.dev)
- **Model type**: Extractive Question-Answering (QA)
- **Base model**: `radlab/polish-qa-v2` (`sdadas/polish-roberta-large-v2` fine-tuned for QA)
- **Quantization**: 8-bit inference-only quantization via **bitsandbytes** (`load_in_8bit=True`, with the `qa_outputs` head excluded from quantization)
- **Maximum context length**: 512 tokens

### Intended Use

This model is designed for **extractive QA** on Polish text. Given a question and a context passage,
it returns the most relevant span of the context as the answer.
It is a bitsandbytes-quantized version of the `radlab/polish-qa-v2` model.

### Limitations

- The model works best with contexts of up to 512 tokens; longer passages should be truncated or split.
- 8-bit quantization reduces memory usage and inference latency but may introduce a slight drop in accuracy
compared with the full-precision model.
- Intended for inference only; the 8-bit weights cannot be fine-tuned directly (parameter-efficient methods such as LoRA would be required for further training).
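For passages beyond the 512-token limit, overlapping windows keep an answer span from being lost at a chunk boundary; the transformers QA pipeline handles this internally via its `max_seq_len` and `doc_stride` arguments. A minimal sketch of the idea in plain Python (splitting on words rather than tokens, purely for illustration):

```python
def split_into_windows(text: str, window: int = 300, stride: int = 150) -> list[str]:
    """Split text into overlapping word windows so that an answer which
    straddles a chunk boundary still appears whole in some window."""
    words = text.split()
    chunks = []
    for start in range(0, len(words), stride):
        chunks.append(" ".join(words[start:start + window]))
        if start + window >= len(words):
            break
    return chunks

# A 700-word dummy passage yields 4 overlapping windows.
chunks = split_into_windows("słowo " * 700)
print(len(chunks))  # -> 4
```

Each window can then be passed to the pipeline separately, keeping the highest-scoring answer across windows.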

### How to Use

```python
from transformers import pipeline

model_path = "radlab/polish-qa-v2-bnb"

qa = pipeline(
    "question-answering",
    model=model_path,
)

question = "Co będzie w budowanym obiekcie?"
context = """Pozwolenie na budowę zostało wydane w marcu. Pierwsze prace przygotowawcze
na terenie przy ul. Wojska Polskiego już się rozpoczęły.
Działkę ogrodzono, pojawił się również monitoring, a także kontenery
dla pracowników budowy. Na ten moment nie jest znana lista sklepów,
które pojawią się w nowym pasażu handlowym."""

result = qa(
    question=question,
    context=context.replace("\n", " ")
)

print(result)
```
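The `start` and `end` fields in the result index into the context string passed to the pipeline, so the answer can also be recovered by slicing. A small sketch with a hand-made result dict (not live model output):

```python
# Hypothetical output shaped like the pipeline's return value.
context = "Stolica Polski to Warszawa."
result = {"score": 0.95, "start": 18, "end": 26, "answer": "Warszawa"}

# Slicing the context with the reported offsets reproduces the answer.
span = context[result["start"]:result["end"]]
print(span)  # -> Warszawa
```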

**Sample output**

```json
{
  "score": 0.32568359375,
  "start": 259,
  "end": 268,
  "answer": "sklepów,"
}
```

### Technical Details

- **Quantization strategy**: `BitsAndBytesStrategy` (8-bit, `qa_outputs` excluded from quantization).
- **Loading code (for reference)**:

```python
from transformers import AutoConfig, BitsAndBytesConfig, AutoModelForQuestionAnswering

config = AutoConfig.from_pretrained(original_path)
bnb_cfg = BitsAndBytesConfig(
    load_in_8bit=True,
    # Keep the QA head in full precision.
    llm_int8_skip_modules=["qa_outputs"],
)

model = AutoModelForQuestionAnswering.from_pretrained(
    original_path,
    config=config,
    quantization_config=bnb_cfg,
    device_map="auto",
)
```
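As a rough illustration of the expected memory savings (the ~355M parameter count of a roberta-large model is an assumption here; the actual footprint of the loaded model can be checked with `model.get_memory_footprint()`):

```python
params = 355_000_000        # approximate parameter count of a roberta-large model
fp32_bytes = params * 4     # 4 bytes per weight in full precision
int8_bytes = params * 1     # 1 byte per weight after 8-bit quantization

print(f"fp32: {fp32_bytes / 1e9:.2f} GB, int8: {int8_bytes / 1e9:.2f} GB")
```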