gemma_cyber
Model Description
A binary classification model for cybersecurity threat detection. A deep feed-forward neural network takes a precomputed text embedding as input and classifies the underlying text as cyber-related or non-cyber.
Model Architecture
- Input: 768-dimensional embeddings (e.g., from Gemma)
- Hidden Layers: 512 → 256 → 128 neurons
- Output: 1 neuron (sigmoid activation for binary classification)
- Normalization: LayerNorm + BatchNorm
- Activation: ReLU
- Total Parameters: ~557,184
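The layer sizes listed above can be sketched as a PyTorch module. This is a minimal sketch under stated assumptions, not the class from `model_architecture.py`: the name `CyberClassifier`, the placement of LayerNorm on the input and BatchNorm after each hidden linear layer, and the bias-free linear layers are all guesses. Bias-free layers are used only because their weight count (768×512 + 512×256 + 256×128 + 128×1 = 557,184) equals the parameter total quoted above; the normalization layers add parameters on top of that.

```python
import torch
import torch.nn as nn


class CyberClassifier(nn.Module):
    """Hypothetical reconstruction of the 768 -> 512 -> 256 -> 128 -> 1 MLP."""

    def __init__(self, input_dim: int = 768, hidden_dims=(512, 256, 128)):
        super().__init__()
        layers = [nn.LayerNorm(input_dim)]  # assumption: LayerNorm on the raw embedding
        prev = input_dim
        for dim in hidden_dims:
            layers += [
                nn.Linear(prev, dim, bias=False),  # assumption: no bias term
                nn.BatchNorm1d(dim),               # assumption: BatchNorm before ReLU
                nn.ReLU(),
            ]
            prev = dim
        layers += [nn.Linear(prev, 1, bias=False), nn.Sigmoid()]
        self.net = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 768) embeddings -> (batch,) probabilities in [0, 1]
        return self.net(x).squeeze(-1)


model = CyberClassifier().eval()

# Count only the weights of the linear layers.
linear_weights = sum(
    m.weight.numel() for m in model.modules() if isinstance(m, nn.Linear)
)
print(linear_weights)  # 557184

with torch.no_grad():
    probs = model(torch.randn(4, 768))  # random stand-in embeddings
print(probs.shape)  # torch.Size([4])
```

Because the layer names here are invented, the published `model.pt` weights will load only into the class defined in the repository, not into this sketch.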
Performance Metrics
- Accuracy: 0.9547
- Precision: 0.4330
- Recall: 0.6655
- AUC: 0.9400
- F1 Score: 0.5246
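F1 is the harmonic mean of precision and recall, so the reported figures can be cross-checked in a few lines of plain Python. The variable names are illustrative.

```python
precision = 0.4330
recall = 0.6655

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.5246
```

When the positive class is a minority, accuracy can stay high while precision remains low, which is why F1 is reported alongside accuracy.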
Usage
import torch
from huggingface_hub import hf_hub_download
# Download model
model_path = hf_hub_download(
repo_id="kristiangnordby/gemma_cyber",
filename="model.pt"
)
# Load model
checkpoint = torch.load(model_path, map_location='cpu')
# For inference, you'll need the model class definition
# See model_architecture.py in this repo
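The layout of `model.pt` is not documented here, so the object returned by `torch.load` may be a bare state dict or a dictionary that wraps one. The helper below is a sketch that handles both cases; the function name `extract_state_dict` and the wrapper keys `"model_state_dict"` and `"state_dict"` are assumptions based on common PyTorch conventions, not on this repository.

```python
from collections.abc import Mapping


def extract_state_dict(checkpoint):
    """Return the weight mapping from a loaded checkpoint.

    Assumption: the file holds either a bare state dict or a dict that
    wraps one under a conventional key such as "model_state_dict".
    """
    if not isinstance(checkpoint, Mapping):
        raise TypeError(f"expected a mapping, got {type(checkpoint).__name__}")
    for key in ("model_state_dict", "state_dict"):  # hypothetical wrapper keys
        if key in checkpoint:
            return checkpoint[key]
    return checkpoint  # already a bare state dict


# Stand-in objects; a real checkpoint would map parameter names to tensors.
wrapped = {"model_state_dict": {"net.1.weight": "..."}, "epoch": 12}
bare = {"net.1.weight": "..."}
print(list(extract_state_dict(wrapped)))  # ['net.1.weight']
print(list(extract_state_dict(bare)))     # ['net.1.weight']
```

The returned mapping would then be passed to `load_state_dict` on an instance of the class from `model_architecture.py`.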
Training Data
- Training set: ~166K samples
- Validation set: ~25K samples
- Test set: ~41K samples
- Class distribution: ~18% cyber-related, ~82% non-cyber
Intended Use
This model is designed for:
- Cybersecurity content detection
- Filtering cyber-related articles/documents
- Security threat classification
Limitations
- Requires pre-computed embeddings as input
- Trained on a specific corpus; may need fine-tuning for other domains
- Performance depends on quality of input embeddings
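Since the classifier expects one fixed-size vector per text, token-level encoder outputs have to be pooled first. How the training embeddings were produced is not stated here; the sketch below shows masked mean pooling, a common choice, purely as an assumption. The function name `mean_pool` and the random stand-in tensors are illustrative.

```python
import torch


def mean_pool(hidden_states: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token vectors per sequence, ignoring padding positions."""
    mask = attention_mask.unsqueeze(-1).to(hidden_states.dtype)  # (batch, seq, 1)
    summed = (hidden_states * mask).sum(dim=1)                   # (batch, dim)
    counts = mask.sum(dim=1).clamp(min=1.0)                      # avoid division by zero
    return summed / counts


# Random stand-ins for an encoder's last hidden states: 2 texts, 5 tokens, 768 dims.
hidden = torch.randn(2, 5, 768)
mask = torch.tensor([[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]])  # first text has 2 padding tokens
embeddings = mean_pool(hidden, mask)
print(embeddings.shape)  # torch.Size([2, 768])
```

Whatever pooling is used at inference time should match the one used to build the training embeddings; otherwise the classifier sees inputs from a different distribution.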
Training Details
- Optimizer: Adam (lr=0.001, β₁=0.9, β₂=0.999)
- Loss Function: Binary Cross-Entropy
- Batch Size: 512
- Early Stopping: Patience of 15 epochs
- Learning Rate Scheduling: ReduceLROnPlateau (factor=0.5, patience=5)
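The settings above map onto standard PyTorch components roughly as follows. This is a sketch, not the original training script: the stand-in linear model, the random data, the `max_epochs` cap, and the choice of validation loss as the monitored quantity for both the scheduler and early stopping are assumptions.

```python
import torch
import torch.nn as nn

# Stand-in model and data; the real network and dataset are not reproduced here.
torch.manual_seed(0)
model = nn.Sequential(nn.Linear(768, 1), nn.Sigmoid())
x_train, y_train = torch.randn(512, 768), torch.randint(0, 2, (512,)).float()
x_val, y_val = torch.randn(512, 768), torch.randint(0, 2, (512,)).float()

criterion = nn.BCELoss()  # binary cross-entropy on sigmoid outputs
optimizer = torch.optim.Adam(model.parameters(), lr=0.001, betas=(0.9, 0.999))
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=5
)

patience, best_val, stale_epochs = 15, float("inf"), 0
max_epochs = 30  # hypothetical cap so the sketch terminates quickly

for epoch in range(max_epochs):
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(x_train).squeeze(-1), y_train)  # one batch of 512
    loss.backward()
    optimizer.step()

    model.eval()
    with torch.no_grad():
        val_loss = criterion(model(x_val).squeeze(-1), y_val).item()

    # Halves the LR once validation loss has stalled for more than 5 epochs.
    scheduler.step(val_loss)

    if val_loss < best_val:
        best_val, stale_epochs = val_loss, 0
    else:
        stale_epochs += 1
        if stale_epochs >= patience:  # early stopping after 15 stale epochs
            break
```

A full run would iterate over mini-batches of 512 within each epoch and typically keep a copy of the weights from the best validation epoch.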
Citation
If you use this model, please cite:
@misc{cybersecurity_classifier,
author = {Kristian Nordby},
title = {Cybersecurity Binary Classifier},
year = {2025},
publisher = {HuggingFace},
howpublished = {\url{https://huggingface.co/kristiangnordby/gemma_cyber}}
}