evelinamorim/bert-lusa-eventype-classifier

Model Description

This is a BERT model fine-tuned for event type classification in English news text. It identifies event mentions and classifies them into three types:

  • State: Events describing states or conditions
  • Process: Events describing ongoing processes or activities
  • Transition: Events describing changes or transitions

An additional untyped Event label covers event mentions that could not be assigned a specific type (see Label Schema below). The model uses BIO tagging for sequence labeling.
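
As an illustration (the sentence and tags below are hand-written, not model output), BIO tagging marks the first token of an event mention with a B- tag and any continuation tokens with I- tags:

# Hand-labeled example of the BIO scheme (illustrative only).
tokens = ["The", "talks", "broke", "down", "on", "Monday", "."]
tags   = ["O",   "O",     "B-Transition", "I-Transition", "O", "O", "O"]

for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")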

Training Data

  • Dataset: LUSA News Events (English)
  • Training documents: 81
  • Annotation source: INCEpTION annotation tool
  • Event types: State, Process, Transition

Model Performance

Evaluated on the test set:

Class        Precision  Recall  F1-Score  Support
Event             0.00    0.00      0.00        9
Process           0.39    0.46      0.42       63
State             0.35    0.59      0.44      116
Transition        0.66    0.73      0.69      294
Overall           0.35    0.44      0.39      482
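
For reference, span-level scores of this kind are commonly computed with seqeval; the card does not publish the actual evaluation script, so the sketch below uses placeholder tag sequences:

from seqeval.metrics import classification_report

# Placeholder gold and predicted BIO sequences, one inner list per sentence.
y_true = [["O", "B-Transition", "I-Transition", "O", "B-State"]]
y_pred = [["O", "B-Transition", "O", "O", "B-State"]]

# Entity-level precision/recall/F1 and support per class,
# in the same spirit as the table above.
print(classification_report(y_true, y_pred, digits=2))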

Usage

from transformers import AutoTokenizer, AutoModelForTokenClassification
import torch

# Load model and tokenizer
model_name = "evelinamorim/bert-lusa-eventype-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)

# Example text
text = "The company announced a new product launch yesterday."

# Tokenize
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

# Predict
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.argmax(outputs.logits, dim=2)

# Decode predictions
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
predicted_labels = [model.config.id2label[p.item()] for p in predictions[0]]

# Print results
for token, label in zip(tokens, predicted_labels):
    if label != "O":
        print(f"{token}: {label}")

Label Schema

  • O: Outside any event
  • B-State: Beginning of State event
  • I-State: Inside State event
  • B-Process: Beginning of Process event
  • I-Process: Inside Process event
  • B-Transition: Beginning of Transition event
  • I-Transition: Inside Transition event
  • B-Event: Beginning of an event without a specific type
  • I-Event: Inside an event without a specific type

Training Details

  • Base model: bert-base-cased
  • Training epochs: 20
  • Batch size: 16
  • Learning rate: 2e-5
  • Loss function: Weighted cross-entropy with softened class weights (raised to the power 0.5) to handle class imbalance; see the sketch after this list
  • Optimizer: AdamW
  • Framework: Hugging Face Transformers
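
The "softer" weighting most likely means that inverse-frequency class weights are raised to the power 0.5 before being passed to the loss, which up-weights rare classes less aggressively than plain inverse frequency. A minimal sketch under that assumption (the label counts below are placeholders, not the real training statistics):

import torch
import torch.nn as nn

# Placeholder per-label token counts; real values would come from the
# training data. The order must match model.config.label2id.
label_counts = torch.tensor([9000., 10., 10., 70., 70., 120., 120., 300., 300.])

# Inverse-frequency weights, softened with power 0.5 (assumption).
weights = (label_counts.sum() / label_counts) ** 0.5

loss_fn = nn.CrossEntropyLoss(weight=weights)

# Token-classification logits flattened to (num_tokens, num_labels).
logits = torch.randn(32, 9)
labels = torch.randint(0, 9, (32,))
print(loss_fn(logits, labels))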

Limitations

  • Trained on the news domain; may not generalize to other domains
  • Small training dataset (81 documents)
  • Performance varies by event type (Transition performs best)
  • May produce false positives on ambiguous cases

Citation

If you use this model, please cite:

@misc{lusa_event_classifier,
  author = {Evelin Amorim},
  title = {evelinamorim/bert-lusa-eventype-classifier},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/evelinamorim/bert-lusa-eventype-classifier}}
}

Contact

For questions or issues, please open an issue on the model repository.
