evelinamorim/bert-lusa-eventype-classifier
Model Description
This is a BERT model fine-tuned for event type classification in English news text. It identifies event mentions and classifies them into three types:
- State: Events describing states or conditions
- Process: Events describing ongoing processes or activities
- Transition: Events describing changes or transitions
The model uses BIO tagging for sequence labeling, so every token is assigned a B- (beginning of event), I- (inside event), or O (outside) tag.
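For illustration only, hypothetical BIO tags for the sentence used in the Usage section below might look like this (actual model output may differ):

```python
# Hypothetical BIO tagging (illustrative, not a guaranteed prediction)
tagged = [
    ("The", "O"),
    ("company", "O"),
    ("announced", "B-Transition"),  # a change-of-state trigger
    ("a", "O"),
    ("new", "O"),
    ("product", "O"),
    ("launch", "B-Process"),        # hypothetical Process trigger
    ("yesterday", "O"),
]
```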
Training Data
- Dataset: LUSA News Events (English)
- Training documents: 81
- Annotation source: INCEpTION annotation tool
- Event types: State, Process, Transition
Model Performance
Evaluated on the held-out test set (the "Event" row covers untyped event mentions; see the label schema below):

| Class | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| Event | 0.00 | 0.00 | 0.00 | 9 |
| Process | 0.39 | 0.46 | 0.42 | 63 |
| State | 0.35 | 0.59 | 0.44 | 116 |
| Transition | 0.66 | 0.73 | 0.69 | 294 |
| Overall (macro avg.) | 0.35 | 0.44 | 0.39 | 482 |
Usage
```python
from transformers import AutoTokenizer, AutoModelForTokenClassification
import torch

# Load model and tokenizer
model_name = "evelinamorim/bert-lusa-eventype-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)

# Example text
text = "The company announced a new product launch yesterday."

# Tokenize
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

# Predict
with torch.no_grad():
    outputs = model(**inputs)
predictions = torch.argmax(outputs.logits, dim=2)

# Decode predictions back to BIO tags
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
predicted_labels = [model.config.id2label[p.item()] for p in predictions[0]]

# Print only the tokens tagged as part of an event
for token, label in zip(tokens, predicted_labels):
    if label != "O":
        print(f"{token}: {label}")
```
Label Schema
- O: Outside any event
- B-State: Beginning of a State event
- I-State: Inside a State event
- B-Process: Beginning of a Process event
- I-Process: Inside a Process event
- B-Transition: Beginning of a Transition event
- I-Transition: Inside a Transition event
- B-Event: Beginning of an event without a type
- I-Event: Inside an event without a type
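The integer-to-tag mapping ships with the model configuration. To check the exact id order, which is defined by the checkpoint rather than by this card, inspect `id2label`:

```python
from transformers import AutoConfig

# The id order is whatever the checkpoint defines; the card only
# guarantees that these nine tags appear.
config = AutoConfig.from_pretrained("evelinamorim/bert-lusa-eventype-classifier")
print(config.id2label)  # e.g. {0: "O", 1: "B-State", 2: "I-State", ...}
```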
Training Details
- Base model: bert-base-cased
- Training epochs: 20
- Batch size: 16
- Learning rate: 2e-5
- Loss function: weighted cross-entropy with softened class weights (weights raised to the power 0.5) to handle class imbalance; see the sketch after this list
- Optimizer: AdamW
- Framework: Hugging Face Transformers
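The card does not spell out the loss implementation. A minimal sketch of one reasonable reading, with inverse-frequency class weights raised to the power 0.5 so that rare tags are up-weighted less aggressively than with plain inverse frequency, looks like this (the per-tag counts are placeholders, not the real LUSA statistics):

```python
import torch
import torch.nn as nn

# Placeholder per-tag token counts for the 9 BIO labels (illustrative only)
class_counts = torch.tensor([9000., 120., 150., 200., 180., 400., 350., 40., 30.])

# Inverse-frequency weights, softened by raising them to the power 0.5
weights = (class_counts.sum() / class_counts) ** 0.5
weights = weights / weights.mean()  # keep the average weight at 1.0

# ignore_index=-100 is the usual Transformers convention for padded/subword labels
loss_fn = nn.CrossEntropyLoss(weight=weights, ignore_index=-100)
```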
Limitations
- Trained on the news domain only, so it may not generalize to other domains
- Small training dataset (81 documents)
- Performance varies by event type (Transition performs best)
- May produce false positives on ambiguous cases
Citation
If you use this model, please cite:
```bibtex
@misc{lusa_event_classifier,
  author       = {Evelin Amorim},
  title        = {evelinamorim/bert-lusa-eventype-classifier},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/evelinamorim/bert-lusa-eventype-classifier}}
}
```
Contact
For questions or issues, please open an issue on the model repository.