gemma_cyber

Model Description

Binary classification model for cybersecurity threat detection. The model uses a deep neural network to classify text embeddings as cyber-related or non-cyber content.

Model Architecture

Input: 768-dimensional embeddings (e.g., from Gemma)
Hidden Layers: 512 → 256 → 128 neurons
Output: 1 neuron with sigmoid activation (binary classification)
Normalization: LayerNorm + BatchNorm
Activation: ReLU
Total Parameters: ~557,184
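
The layer sizes above can be sketched as a PyTorch module. This is a minimal sketch, not the author's implementation: the class name `CyberClassifier` is hypothetical, and the placement of the normalization layers (LayerNorm on the input embedding, BatchNorm after each hidden linear layer) is an assumption, since the card only lists "LayerNorm + BatchNorm". The authoritative definition is model_architecture.py in the repo. Note that the linear-layer weights alone (768×512 + 512×256 + 256×128 + 128×1) sum to exactly 557,184; biases and normalization parameters in this sketch add slightly more.

```python
import torch
from torch import nn


class CyberClassifier(nn.Module):
    """Hypothetical reconstruction of the 768 -> 512 -> 256 -> 128 -> 1 network."""

    def __init__(self, input_dim: int = 768, hidden_dims=(512, 256, 128)):
        super().__init__()
        # Assumption: LayerNorm is applied to the raw embedding.
        layers = [nn.LayerNorm(input_dim)]
        prev = input_dim
        for width in hidden_dims:
            # Assumption: each hidden block is Linear -> BatchNorm -> ReLU.
            layers += [nn.Linear(prev, width), nn.BatchNorm1d(width), nn.ReLU()]
            prev = width
        layers.append(nn.Linear(prev, 1))  # single output logit
        self.net = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Sigmoid maps the logit to a probability of "cyber-related".
        return torch.sigmoid(self.net(x)).squeeze(-1)


model = CyberClassifier().eval()  # eval mode so BatchNorm uses running statistics
with torch.no_grad():
    probs = model(torch.randn(4, 768))  # four random stand-in embeddings

# Count only the 2-D linear weight matrices (no biases, no norm parameters).
linear_weights = sum(p.numel() for p in model.parameters() if p.dim() == 2)
```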

Performance Metrics

Accuracy: 0.9547
Precision: 0.4330
Recall: 0.6655
AUC: 0.9400
F1 Score: 0.5246
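
As a quick check, the F1 score is the harmonic mean of precision and recall, and the reported values are consistent with each other. The helper name `f1_from_precision_recall` is illustrative only.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported above.
f1 = f1_from_precision_recall(0.4330, 0.6655)
print(f"{f1:.4f}")  # 0.5246
```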

Usage

import torch
from huggingface_hub import hf_hub_download

# Download model
model_path = hf_hub_download(
    repo_id="kristiangnordby/gemma_cyber",
    filename="model.pt"
)

# Load the checkpoint onto the CPU
checkpoint = torch.load(model_path, map_location="cpu")

# Inference also requires the model class definition;
# see model_architecture.py in this repo
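
The card does not say whether model.pt holds a bare state dict or a dictionary wrapping one. The helper below is a hedged sketch that handles both cases; the function name `extract_state_dict` and the wrapper keys `model_state_dict` and `state_dict` are common conventions assumed here, not confirmed for this checkpoint.

```python
import torch
from torch import nn


def extract_state_dict(checkpoint):
    """Return a state dict from a bare state dict or a wrapper dictionary."""
    if not isinstance(checkpoint, dict):
        raise TypeError(f"expected a dict-like checkpoint, got {type(checkpoint)!r}")
    # Assumed wrapper keys; inspect checkpoint.keys() if neither is present.
    for key in ("model_state_dict", "state_dict"):
        if isinstance(checkpoint.get(key), dict):
            return checkpoint[key]
    # A bare state dict maps parameter names directly to tensors.
    if checkpoint and all(torch.is_tensor(v) for v in checkpoint.values()):
        return checkpoint
    raise KeyError(f"no state dict found; top-level keys: {list(checkpoint)}")


# Stand-in demonstration with a tiny layer instead of the real checkpoint.
demo_layer = nn.Linear(4, 1)
wrapped = {"model_state_dict": demo_layer.state_dict(), "epoch": 3}
from_wrapped = extract_state_dict(wrapped)
from_bare = extract_state_dict(demo_layer.state_dict())
```

With the real file, the result would be passed to `load_state_dict` on an instance of the class from model_architecture.py.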

Training Data

Training set: ~166K samples
Validation set: ~25K samples
Test set: ~41K samples
Class distribution: ~18% cyber-related, ~82% non-cyber
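
Because the classes are imbalanced, accuracy is best read against a majority-class baseline. The sketch below computes that baseline from the stated ~18% positive rate, and also the positive rate implied by the reported accuracy, precision, and recall. It assumes those three metrics were computed on the same evaluation set at a single decision threshold, which the card does not state; the function name is illustrative.

```python
def implied_positive_rate(accuracy: float, precision: float, recall: float) -> float:
    """Positive-class share implied by accuracy, precision, and recall.

    Per positive example: FN = 1 - recall, and FP = recall * (1/precision - 1),
    so the error rate equals rate * (FN + FP).
    """
    fn_per_positive = 1 - recall
    fp_per_positive = recall * (1 / precision - 1)
    return (1 - accuracy) / (fn_per_positive + fp_per_positive)


# Always predicting "non-cyber" on an 18% / 82% split.
majority_baseline_accuracy = 1 - 0.18

# Metrics as reported in Performance Metrics.
rate = implied_positive_rate(accuracy=0.9547, precision=0.4330, recall=0.6655)
print(f"baseline accuracy: {majority_baseline_accuracy:.2f}")
print(f"implied positive rate: {rate:.3f}")
```

Under these assumptions the implied positive rate comes out near 4%, well below the ~18% stated here, so the evaluation set may have a different class balance than the overall corpus, or the metrics may come from different settings.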

Intended Use

This model is designed for:

Cybersecurity content detection
Filtering cyber-related articles/documents
Security threat classification

Limitations

Requires pre-computed embeddings as input
Trained on a specific corpus; may need fine-tuning for other domains
Performance depends on quality of input embeddings

Training Details

Optimizer: Adam (lr=0.001, β₁=0.9, β₂=0.999)
Loss Function: Binary Cross-Entropy
Batch Size: 512
Early Stopping: Patience of 15 epochs
Learning Rate Scheduling: ReduceLROnPlateau (factor=0.5, patience=5)
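
The settings above can be wired together roughly as follows. This is a sketch under assumptions rather than the original training script: the `train` function, monitoring validation loss for both the scheduler and early stopping, and the tiny stand-in model and synthetic data are all illustrative. `nn.BCELoss` is used because the model outputs sigmoid probabilities.

```python
import torch
from torch import nn


def train(model, train_x, train_y, val_x, val_y,
          max_epochs=100, batch_size=512, patience=15):
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999))
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
        optimizer, mode="min", factor=0.5, patience=5)
    loss_fn = nn.BCELoss()  # expects probabilities in [0, 1]
    best_val, best_state, bad_epochs = float("inf"), None, 0

    for _ in range(max_epochs):
        model.train()
        perm = torch.randperm(len(train_x))  # reshuffle every epoch
        for start in range(0, len(train_x), batch_size):
            idx = perm[start:start + batch_size]
            optimizer.zero_grad()
            loss_fn(model(train_x[idx]), train_y[idx]).backward()
            optimizer.step()

        model.eval()
        with torch.no_grad():
            val_loss = loss_fn(model(val_x), val_y).item()
        scheduler.step(val_loss)  # halve the LR after 5 epochs without improvement

        if val_loss < best_val:
            best_val, bad_epochs = val_loss, 0
            best_state = {k: v.clone() for k, v in model.state_dict().items()}
        else:
            bad_epochs += 1
            if bad_epochs >= patience:  # early stopping
                break

    if best_state is not None:
        model.load_state_dict(best_state)  # restore the best weights seen
    return best_val


# Stand-in model and synthetic data, only to show the loop runs.
torch.manual_seed(0)
x = torch.randn(256, 8)
y = (x[:, 0] > 0).float()
toy_model = nn.Sequential(nn.Linear(8, 1), nn.Sigmoid(), nn.Flatten(0))
best_val = train(toy_model, x[:200], y[:200], x[200:], y[200:],
                 max_epochs=5, batch_size=64)
```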

Citation

If you use this model, please cite:

@misc{cybersecurity_classifier,
  author = {Kristian Nordby},
  title = {Cybersecurity Binary Classifier},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/kristiangnordby/gemma_cyber}}
}
