🧬 GLiNER - French COVID-19 Vaccine Side Effects (Twitter)
This repository contains a fine-tuned GLiNER model trained to detect COVID-19 vaccine side effects in French tweets.
The model was trained with a single entity label: Effet_Secondaire.
🧠 Model overview
- Base encoder: microsoft/mdeberta-v3-base
- Framework: GLiNER (multilingual NER)
- Language: French
- Domain: Twitter (COVID-19 vaccination period)
- Task: Named Entity Recognition (NER)
- Entity set: ["Effet_Secondaire"]
- Evaluation: F1 = 0.81
🧩 Training data
- Corpus size: 100 manually annotated French tweets
- Source: Twitter posts collected during the COVID-19 vaccination campaign
- Annotation tool: Label Studio
- Label used: Effet_Secondaire
- Goal: Identify vaccine-related adverse effects mentioned in user tweets
Example from the dataset:
"La Norvège s'inquiète d'hémorragies cutanées chez des jeunes ayant reçu le vaccin AstraZeneca."
→ Detected entity: hémorragies cutanées → Effet_Secondaire
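To make the annotation format concrete, here is a minimal sketch of how a character-level annotation like the one above could be turned into token-level spans. It assumes the annotations are available as character offsets and that the target layout is a dict with `tokenized_text` and `ner` (inclusive token indices), which is a layout commonly used for GLiNER training data. The author's actual preprocessing is not documented here, and the helper name `char_spans_to_token_spans` and the regex splitter are illustrative assumptions.

```python
import re

# Simple word/punctuation splitter (an assumption, not necessarily the one used in training).
TOKEN_RE = re.compile(r"\w+(?:[-_]\w+)*|\S")


def char_spans_to_token_spans(text, char_spans):
    """Convert (start, end, label) character spans into inclusive token-index spans."""
    matches = list(TOKEN_RE.finditer(text))
    tokens = [m.group() for m in matches]
    ner = []
    for start, end, label in char_spans:
        # Keep the tokens that lie entirely inside the character span [start, end).
        inside = [i for i, m in enumerate(matches) if m.start() >= start and m.end() <= end]
        if inside:
            ner.append([inside[0], inside[-1], label])
    return {"tokenized_text": tokens, "ner": ner}


text = "La Norvège s'inquiète d'hémorragies cutanées chez des jeunes ayant reçu le vaccin AstraZeneca."
mention = "hémorragies cutanées"
start = text.index(mention)

example = char_spans_to_token_spans(text, [(start, start + len(mention), "Effet_Secondaire")])
print(example["ner"])  # [[7, 8, 'Effet_Secondaire']]
```

The end index is inclusive, so a single-token mention would have identical start and end indices.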
⚙️ Training setup
- Encoder: microsoft/mdeberta-v3-base
- Optimizer: AdamW
- Learning rates: encoder 1e-5, classifier 5e-5
- Dropout: 0.4
- Batch size: 8
- Max sequence length: 384
- Training steps: ~30k
- Frameworks: PyTorch + GLiNER
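The two learning rates listed above imply separate optimizer parameter groups for the encoder and the classifier. The sketch below shows one way such groups could be built. It is not the author's training script: the helper `build_param_groups`, the `"encoder."` name prefix, and the stand-in parameter names are all hypothetical, and plain strings are used in place of tensors so the example runs without PyTorch.

```python
def build_param_groups(named_params, encoder_prefix, encoder_lr=1e-5, head_lr=5e-5):
    """Split (name, parameter) pairs into two groups with different learning rates."""
    encoder, head = [], []
    for name, param in named_params:
        # Parameters whose name starts with the (assumed) encoder prefix get the lower rate.
        if name.startswith(encoder_prefix):
            encoder.append(param)
        else:
            head.append(param)
    return [
        {"params": encoder, "lr": encoder_lr},
        {"params": head, "lr": head_lr},
    ]


# Stand-in (name, parameter) pairs; with a real model these would come from
# model.named_parameters(), and the prefix would have to match its actual names.
fake_params = [
    ("encoder.layer.0.weight", "w0"),
    ("encoder.layer.0.bias", "b0"),
    ("span_head.weight", "w1"),
]

groups = build_param_groups(fake_params, encoder_prefix="encoder.")
for group in groups:
    print(group["lr"], len(group["params"]))

# With PyTorch installed and real parameters, the list can be passed directly:
# optimizer = torch.optim.AdamW(groups)
```

A lower rate on the pretrained encoder and a higher one on the newly trained head is a common fine-tuning choice, which matches the 1e-5 / 5e-5 values reported here.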
🚀 How to use
# pip install -U gliner torch
from gliner import GLiNER
# Load the fine-tuned model from Hugging Face
model = GLiNER.from_pretrained("oliviercaron/gliner-fr-adr-twitter")
# Single label used during training
labels = ["Effet_Secondaire"]
# --- Single text ---
text = "La Norvège s'inquiète d'hémorragies cutanées chez des jeunes ayant reçu le vaccin AstraZeneca."
entities = model.predict_entities(text, labels, threshold=0.5)
for ent in entities:
    print(ent["text"], "=>", ent.get("type") or ent.get("label"), "@", round(ent["score"], 2))
# --- Multiple texts (batch) ---
texts = [
    "Après ma deuxième dose, j’ai eu une forte fièvre et des courbatures toute la nuit.",
    "Le vaccin m'a donné des nausées et un gros mal de tête le lendemain.",
    "Depuis la vaccination, j’ai ressenti des vertiges et une fatigue inhabituelle.",
    "J’ai eu de fortes douleurs au bras après le vaccin Pfizer.",
]
batch = model.batch_predict_entities(texts, labels, threshold=0.5)
for i, ents in enumerate(batch, start=1):
    print(f"\nText {i}: {texts[i-1]}")
    if not ents:
        print(" - No detected side effects.")
    for ent in ents:
        print(" -", ent["text"], "=>", ent.get("type") or ent.get("label"), "@", round(ent["score"], 4))
✅ Example predictions (exact results)
hémorragies cutanées => Effet_Secondaire @ 0.95
Text 1: Après ma deuxième dose, j’ai eu une forte fièvre et des courbatures toute la nuit.
- fièvre => Effet_Secondaire @ 0.8867
- courbatures => Effet_Secondaire @ 0.9674
Text 2: Le vaccin m'a donné des nausées et un gros mal de tête le lendemain.
- nausées => Effet_Secondaire @ 0.9735
- mal de tête => Effet_Secondaire @ 0.5509
Text 3: Depuis la vaccination, j’ai ressenti des vertiges et une fatigue inhabituelle.
- vertiges => Effet_Secondaire @ 0.9713
- fatigue inhabituelle => Effet_Secondaire @ 0.8576
Text 4: J’ai eu de fortes douleurs au bras après le vaccin Pfizer.
- douleurs => Effet_Secondaire @ 0.5328
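Two of the predictions above ("mal de tête" at 0.5509 and "douleurs" at 0.5328) sit just over the 0.5 threshold, so the threshold choice directly affects what gets counted. The following sketch reuses the scores listed above to show how a stricter cut-off changes an aggregate count of mentions. The helper `filter_and_count` is illustrative, and only the `text` and `score` keys of each prediction are used.

```python
from collections import Counter

# Predictions copied from the example output above (only the keys used here).
batch = [
    [{"text": "fièvre", "score": 0.8867}, {"text": "courbatures", "score": 0.9674}],
    [{"text": "nausées", "score": 0.9735}, {"text": "mal de tête", "score": 0.5509}],
    [{"text": "vertiges", "score": 0.9713}, {"text": "fatigue inhabituelle", "score": 0.8576}],
    [{"text": "douleurs", "score": 0.5328}],
]


def filter_and_count(batch, min_score):
    """Count mentions across all texts, keeping only those at or above min_score."""
    counts = Counter()
    for ents in batch:
        for ent in ents:
            if ent["score"] >= min_score:
                counts[ent["text"].lower()] += 1
    return counts


for min_score in (0.5, 0.6):
    kept = filter_and_count(batch, min_score)
    print(min_score, sum(kept.values()), sorted(kept))
```

At 0.5 all seven mentions are kept; at 0.6 the two borderline mentions drop out, trading recall for precision.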
💡 Intended use
This model was developed for research in social media analysis (marketing),
to automatically identify user-reported side effects in French tweets about COVID-19 vaccination.
It is not intended for clinical or diagnostic use.
⚠️ Limitations
- Small dataset (100 tweets) → may generalize poorly outside the COVID-19 context.
- Focused on a single entity label (Effet_Secondaire).
- Not validated for medical decision-making.
📊 Evaluation results
| Metric | Value | Description |
|---|---|---|
| F1-score | 0.81 | Evaluated on a 20% held-out test split |
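
For readers unfamiliar with entity-level scoring, the sketch below shows one common way to compute an F1-score for NER: exact-match comparison of predicted and gold spans, micro-averaged over all texts. The matching scheme behind the reported 0.81 is not stated here, so exact matching is an assumption, and the spans in the example are made-up toy values rather than data from the test split.

```python
def span_f1(gold, predicted):
    """Micro-averaged precision, recall, and F1 over exact (start, end, label) matches.

    gold and predicted are lists with one set of spans per text.
    """
    tp = fp = fn = 0
    for gold_spans, pred_spans in zip(gold, predicted):
        tp += len(gold_spans & pred_spans)   # predicted and correct
        fp += len(pred_spans - gold_spans)   # predicted but not in gold
        fn += len(gold_spans - pred_spans)   # in gold but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy spans for two texts (illustrative only).
gold = [
    {(10, 16, "Effet_Secondaire"), (24, 35, "Effet_Secondaire")},
    {(5, 12, "Effet_Secondaire")},
]
predicted = [
    {(10, 16, "Effet_Secondaire")},
    {(5, 12, "Effet_Secondaire"), (20, 25, "Effet_Secondaire")},
]

precision, recall, f1 = span_f1(gold, predicted)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```

Under exact matching, a prediction such as "douleurs" for a gold span "fortes douleurs au bras" would count as both a false positive and a false negative, so partial-match schemes would give different numbers.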
📚 Citation
Caron, Olivier (2025). GLiNER - French COVID-19 Vaccine Side Effects (Twitter).
Hugging Face. https://huggingface.co/oliviercaron/gliner-fr-adr-twitter
📬 Contact
- Author: Olivier Caron
- Affiliation: Université Paris Dauphine PSL - DRM
- Research areas: Marketing, NLP, social media diffusion