evelinamorim/bert-lusa-eventype-classifier
Model Description
This is a BERT model fine-tuned for event type classification in English news text. It identifies event mentions and classifies them into three types:
- State: Events describing states or conditions
- Process: Events describing ongoing processes or activities
- Transition: Events describing changes or transitions
The model uses BIO tagging for sequence labeling, so every token is assigned a B- (beginning of event), I- (inside event), or O (outside) tag.
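For illustration only, hypothetical BIO tags for the sentence used in the Usage section below might look like this (actual model output may differ):

```python
# Hypothetical BIO tagging (illustrative, not a guaranteed prediction)
tagged = [
    ("The", "O"),
    ("company", "O"),
    ("announced", "B-Transition"),  # a change-of-state trigger
    ("a", "O"),
    ("new", "O"),
    ("product", "O"),
    ("launch", "B-Process"),        # hypothetical Process trigger
    ("yesterday", "O"),
]
```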
Training Data
- Dataset: LUSA News Events (English)
- Training documents: 81
- Annotation source: INCEpTION annotation tool
- Event types: State, Process, Transition
Model Performance
Evaluated on the held-out test set (the "Event" row covers untyped event mentions; see the label schema below):

| Class | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| Event | 0.00 | 0.00 | 0.00 | 9 |
| Process | 0.39 | 0.46 | 0.42 | 63 |
| State | 0.35 | 0.59 | 0.44 | 116 |
| Transition | 0.66 | 0.73 | 0.69 | 294 |
| Overall (macro avg.) | 0.35 | 0.44 | 0.39 | 482 |
Usage
```python
from transformers import AutoTokenizer, AutoModelForTokenClassification
import torch

# Load model and tokenizer
model_name = "evelinamorim/bert-lusa-eventype-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)

# Example text
text = "The company announced a new product launch yesterday."

# Tokenize
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

# Predict
with torch.no_grad():
    outputs = model(**inputs)
predictions = torch.argmax(outputs.logits, dim=2)

# Decode predictions back to BIO tags
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
predicted_labels = [model.config.id2label[p.item()] for p in predictions[0]]

# Print only the tokens tagged as part of an event
for token, label in zip(tokens, predicted_labels):
    if label != "O":
        print(f"{token}: {label}")
```
Label Schema
- O: Outside any event
- B-State: Beginning of a State event
- I-State: Inside a State event
- B-Process: Beginning of a Process event
- I-Process: Inside a Process event
- B-Transition: Beginning of a Transition event
- I-Transition: Inside a Transition event
- B-Event: Beginning of an event without a type
- I-Event: Inside an event without a type
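The integer-to-tag mapping ships with the model configuration. To check the exact id order, which is defined by the checkpoint rather than by this card, inspect `id2label`:

```python
from transformers import AutoConfig

# The id order is whatever the checkpoint defines; the card only
# guarantees that these nine tags appear.
config = AutoConfig.from_pretrained("evelinamorim/bert-lusa-eventype-classifier")
print(config.id2label)  # e.g. {0: "O", 1: "B-State", 2: "I-State", ...}
```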
Training Details
- Base model: bert-base-cased
- Training epochs: 20
- Batch size: 16
- Learning rate: 2e-5
- Loss function: weighted cross-entropy with softened class weights (weights raised to the power 0.5) to handle class imbalance; see the sketch after this list
- Optimizer: AdamW
- Framework: Hugging Face Transformers
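The card does not spell out the loss implementation. A minimal sketch of one reasonable reading, with inverse-frequency class weights raised to the power 0.5 so that rare tags are up-weighted less aggressively than with plain inverse frequency, looks like this (the per-tag counts are placeholders, not the real LUSA statistics):

```python
import torch
import torch.nn as nn

# Placeholder per-tag token counts for the 9 BIO labels (illustrative only)
class_counts = torch.tensor([9000., 120., 150., 200., 180., 400., 350., 40., 30.])

# Inverse-frequency weights, softened by raising them to the power 0.5
weights = (class_counts.sum() / class_counts) ** 0.5
weights = weights / weights.mean()  # keep the average weight at 1.0

# ignore_index=-100 is the usual Transformers convention for padded/subword labels
loss_fn = nn.CrossEntropyLoss(weight=weights, ignore_index=-100)
```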
Limitations
- Trained on the news domain only, so it may not generalize to other domains
- Small training dataset (81 documents)
- Performance varies by event type (Transition performs best)
- May produce false positives on ambiguous cases
Citation
If you use this model, please cite:
```bibtex
@misc{lusa_event_classifier,
  author       = {Evelin Amorim},
  title        = {evelinamorim/bert-lusa-eventype-classifier},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/evelinamorim/bert-lusa-eventype-classifier}}
}
```
Contact
For questions or issues, please open an issue on the model repository.