gemma_cyber

Model Description

Binary classification model for cybersecurity content detection. A feed-forward neural network takes a pre-computed text embedding as input and classifies it as cyber-related or non-cyber.

Model Architecture

  • Input: 768-dimensional embeddings (e.g., from Gemma)
  • Hidden Layers: 512 → 256 → 128 neurons
  • Output: 1 unit with sigmoid activation (binary classification)
  • Normalization: LayerNorm + BatchNorm
  • Activation: ReLU
  • Total Parameters: ~557,184
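
The parameter figure can be checked against the layer widths listed above. The short calculation below is a sketch, not taken from the training code: it shows that 557,184 equals the sum of the four fully connected weight matrices alone. Whether the linear layers also carry bias terms is an assumption here, and normalization parameters are not counted.

```python
# Layer widths taken from the architecture list: input, three hidden layers, output.
dims = [768, 512, 256, 128, 1]

# Weight matrices of the four fully connected layers.
weights = sum(n_in * n_out for n_in, n_out in zip(dims, dims[1:]))

# One bias per output unit, if the linear layers use biases (an assumption).
biases = sum(dims[1:])

print(weights)           # 557184
print(weights + biases)  # 558081
```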

Performance Metrics

  • Accuracy: 0.9547
  • Precision: 0.4330
  • Recall: 0.6655
  • AUC: 0.9400
  • F1 Score: 0.5246
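
As a quick consistency check, the F1 score is the harmonic mean of precision and recall, so it can be recomputed from the two values reported above:

```python
# Precision and recall as reported in the metrics list.
precision = 0.4330
recall = 0.6655

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.5246
```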

Usage

import torch
from huggingface_hub import hf_hub_download

# Download model
model_path = hf_hub_download(
    repo_id="kristiangnordby/gemma_cyber",
    filename="model.pt"
)

# Load model
checkpoint = torch.load(model_path, map_location='cpu')

# For inference, you'll need the model class definition
# See model_architecture.py in this repo
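
The snippet above stops once the checkpoint is loaded. As a rough sketch of what the inference step could look like, the block below defines a stand-in network with the layer widths from the Model Architecture section and runs it on placeholder embeddings. The class name `CyberClassifierSketch`, the placement of the normalization layers, and the 0.5 threshold are all assumptions made for illustration; the actual class lives in model_architecture.py and may differ.

```python
import torch
import torch.nn as nn


class CyberClassifierSketch(nn.Module):
    """Hypothetical stand-in for the class in model_architecture.py."""

    def __init__(self, input_dim=768, hidden_dims=(512, 256, 128)):
        super().__init__()
        # Assumption: LayerNorm on the input, BatchNorm after each hidden layer.
        layers = [nn.LayerNorm(input_dim)]
        prev = input_dim
        for width in hidden_dims:
            layers += [nn.Linear(prev, width), nn.BatchNorm1d(width), nn.ReLU()]
            prev = width
        layers.append(nn.Linear(prev, 1))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        # Sigmoid maps the single logit to a probability of "cyber-related".
        return torch.sigmoid(self.net(x)).squeeze(-1)


model = CyberClassifierSketch()
model.eval()  # use BatchNorm running statistics at inference time

# Random placeholder embeddings; real inputs would be 768-d text embeddings.
embeddings = torch.randn(4, 768)
with torch.no_grad():
    probs = model(embeddings)

# 0.5 is an assumed decision threshold, not one stated in this model card.
is_cyber = probs > 0.5
print(probs.shape, is_cyber.dtype)
```

With the real class, the trained weights would be restored with `load_state_dict` before calling `eval()`. The layout of `checkpoint` (a bare state dict or a dictionary wrapping one) is not documented here, so inspect its keys first.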

Training Data

  • Training set: ~166K samples
  • Validation set: ~25K samples
  • Test set: ~41K samples
  • Class distribution: ~18% cyber-related, ~82% non-cyber
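
With roughly 18% positives, accuracy alone says little about detection quality. The sketch below works through the arithmetic using the approximate figures above. It assumes every split follows the overall class distribution, which this card does not state.

```python
# Approximate split sizes and class share as listed above.
splits = {"train": 166_000, "validation": 25_000, "test": 41_000}
cyber_share = 0.18

total = sum(splits.values())
for name, n in splits.items():
    # Share of all samples, and estimated positives under the assumed 18% rate.
    print(f"{name}: {n / total:.1%} of data, ~{round(n * cyber_share):,} cyber samples")

# Accuracy of a trivial classifier that always predicts "non-cyber"
# on a split with the stated ~18/82 distribution (an assumption).
majority_baseline = 1 - cyber_share
print(f"majority-class baseline accuracy: {majority_baseline:.2f}")
```

Under that assumption, a model that never flags anything would already reach about 0.82 accuracy, so precision, recall, and F1 are the more informative numbers.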

Intended Use

This model is designed for:

  • Cybersecurity content detection
  • Filtering cyber-related articles/documents
  • Security threat classification

Limitations

  • Requires pre-computed embeddings as input
  • Trained on a specific corpus; may need fine-tuning for other domains
  • Performance depends on quality of input embeddings

Training Details

  • Optimizer: Adam (lr=0.001, β₁=0.9, β₂=0.999)
  • Loss Function: Binary Cross-Entropy
  • Batch Size: 512
  • Early Stopping: Patience of 15 epochs
  • Learning Rate Scheduling: ReduceLROnPlateau (factor=0.5, patience=5)
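
The settings above map onto standard PyTorch components. The loop below is a minimal sketch of how they could fit together, not the author's training script. The placeholder model, the random data, the `max_epochs` cap, and the choice of validation loss as the monitored quantity are all assumptions.

```python
import torch
import torch.nn as nn

# Placeholder model and random data so the loop runs; the real ones differ.
model = nn.Sequential(nn.Linear(768, 1), nn.Sigmoid())
x_train, y_train = torch.randn(1024, 768), torch.randint(0, 2, (1024,)).float()
x_val, y_val = torch.randn(256, 768), torch.randint(0, 2, (256,)).float()

criterion = nn.BCELoss()  # binary cross-entropy on sigmoid outputs
optimizer = torch.optim.Adam(model.parameters(), lr=0.001, betas=(0.9, 0.999))
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=5
)

best_val, bad_epochs, patience = float("inf"), 0, 15
max_epochs = 30  # assumed cap; the card does not state one

for epoch in range(max_epochs):
    model.train()
    for start in range(0, len(x_train), 512):  # batch size 512
        xb, yb = x_train[start:start + 512], y_train[start:start + 512]
        optimizer.zero_grad()
        loss = criterion(model(xb).squeeze(-1), yb)
        loss.backward()
        optimizer.step()

    model.eval()
    with torch.no_grad():
        val_loss = criterion(model(x_val).squeeze(-1), y_val).item()

    # Halves the LR once validation loss has stalled for more than 5 epochs.
    scheduler.step(val_loss)

    # Early stopping: quit after 15 consecutive epochs without improvement.
    if val_loss < best_val:
        best_val, bad_epochs = val_loss, 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break
```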

Citation

If you use this model, please cite:

@misc{cybersecurity_classifier,
  author = {Kristian Nordby},
  title = {Cybersecurity Binary Classifier},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/kristiangnordby/gemma_cyber}}
}