DistilBERT NSFW Article Classifier

A DistilBERT-based classifier that detects NSFW (Not Safe For Work) articles from their domain, title, and description fields.

Model Details

Model Type: DistilBERT with custom classification head (head swap architecture)
Base Model: distilbert-base-uncased
Task: Text Classification (Binary: NSFW/Not NSFW)
Number of Classes: 2
Labels: FALSE (Not NSFW), TRUE (NSFW)

Model Architecture

This model uses a custom architecture that:

Loads the DistilBERT base model (without default head)
Replaces the classification head with a custom linear layer
Uses dropout (0.3) for regularization
Takes structured input combining domain, title, and description fields
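The head swap described above can be sketched roughly as follows. This is an illustrative reconstruction, not the published training code: the class name DistilBertWithCustomHead is hypothetical, and pooling on the first ([CLS]) token is an assumption, since the card does not say how the encoder output is pooled. A tiny randomly initialised encoder is used so the sketch runs without downloading weights.

```python
import torch
from torch import nn
from transformers import DistilBertConfig, DistilBertModel


class DistilBertWithCustomHead(nn.Module):
    """Hypothetical head-swap classifier: DistilBERT encoder + dropout + linear layer."""

    def __init__(self, encoder: DistilBertModel, num_labels: int = 2, dropout: float = 0.3):
        super().__init__()
        self.encoder = encoder                      # base model without the default head
        self.dropout = nn.Dropout(dropout)          # dropout rate taken from this card
        self.classifier = nn.Linear(encoder.config.dim, num_labels)

    def forward(self, input_ids, attention_mask=None):
        hidden = self.encoder(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state
        pooled = hidden[:, 0]                       # first-token pooling (an assumption)
        return self.classifier(self.dropout(pooled))


# Tiny random encoder for illustration only. For the real base weights one would use
# DistilBertModel.from_pretrained("distilbert-base-uncased") instead.
tiny_config = DistilBertConfig(
    vocab_size=100, dim=32, n_layers=1, n_heads=2, hidden_dim=64,
    max_position_embeddings=64,
)
model = DistilBertWithCustomHead(DistilBertModel(tiny_config))
model.eval()

with torch.no_grad():
    logits = model(
        torch.tensor([[1, 2, 3, 4]]),
        torch.ones(1, 4, dtype=torch.long),
    )
print(logits.shape)  # one row of two class logits
```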

Usage

Using Transformers Pipeline

from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="your-username/distilbert-nsfw-classifier",
    tokenizer="your-username/distilbert-nsfw-classifier"
)

# Prepare input text (combine domain, title, description)
text = """domain: example.com
title: Article Title Here
description: Article description text goes here"""

result = classifier(text)
print(result)
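The pipeline returns a list with one dict per input, holding a label and a score. A small helper such as the one below could turn that into a boolean decision. The function name is_nsfw and the 0.5 default threshold are illustrative assumptions, and the label string "TRUE" is taken from the Labels line of this card; check model.config.id2label if the published labels differ.

```python
from typing import Dict, List

NSFW_LABEL = "TRUE"  # assumed from the Labels line of this card


def is_nsfw(result: List[Dict[str, object]], threshold: float = 0.5) -> bool:
    """Return True when the estimated NSFW probability meets the threshold."""
    top = result[0]  # for a single input, the pipeline returns its top label
    score = float(top["score"])
    # With two classes and softmax scores, P(NSFW) = 1 - P(not NSFW).
    p_nsfw = score if top["label"] == NSFW_LABEL else 1.0 - score
    return p_nsfw >= threshold


# Hand-written results in the shape the pipeline returns; not real model output.
print(is_nsfw([{"label": "TRUE", "score": 0.97}]))
print(is_nsfw([{"label": "FALSE", "score": 0.80}], threshold=0.1))
```

Lowering the threshold trades more false positives for fewer missed NSFW articles, which may suit moderation queues that include human review.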

Using Auto Classes

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "your-username/distilbert-nsfw-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Prepare input text
text = """domain: example.com
title: Article Title Here
description: Article description text goes here"""

# Tokenize
inputs = tokenizer(
    text,
    return_tensors="pt",
    truncation=True,
    padding=True,
    max_length=512
)

# Predict
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_class_id = predictions.argmax(dim=-1).item()
    confidence = predictions[0][predicted_class_id].item()

# Get label
id2label = model.config.id2label
predicted_label = id2label[predicted_class_id]

print(f"Predicted: {predicted_label} (confidence: {confidence:.4f})")

Input Format

The model expects input text in a structured format with three fields:

domain: example.com
title: Article Title Here
description: Article description text goes here

Important:

Use newlines to separate fields
Format: field_name: value
Empty fields will be skipped automatically
Maximum sequence length: 512 tokens
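The rules above could be applied with a small helper like this one. The name build_input_text is hypothetical, and dropping empty fields before tokenization is an assumption about how "skipped automatically" is meant; the card does not say whether the training preprocessing did exactly this.

```python
def build_input_text(domain: str = "", title: str = "", description: str = "") -> str:
    """Join non-empty fields as 'field_name: value' lines separated by newlines."""
    fields = (("domain", domain), ("title", title), ("description", description))
    lines = [
        f"{name}: {value.strip()}"
        for name, value in fields
        if value and value.strip()  # skip empty or whitespace-only fields
    ]
    return "\n".join(lines)


text = build_input_text(
    domain="example.com",
    title="Article Title Here",
    description="",  # empty, so no 'description:' line is produced
)
print(text)
```

The 512-token limit is still enforced by the tokenizer (truncation=True, max_length=512), not by this helper.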

Training Details

Training Data

Dataset: Custom NSFW classification dataset
Format: TSV file with domain, title, description, and is_nsfw columns
Training samples: over 1,000 (approximate)
Test accuracy: 100% on the 253-sample test set

Training Procedure

Optimizer: AdamW
Learning Rate: 2e-5
Batch Size: 16
Epochs: 3
Max Sequence Length: 512
Dropout Rate: 0.3
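For reference, the hyperparameters above can be collected in a small config object, which also makes the size of the training run easy to estimate. TrainingConfig and total_steps are hypothetical names, and the step count assumes exactly 1,000 training samples, no gradient accumulation, and that the final partial batch is kept; the card only gives an approximate sample count.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters as listed in this card."""
    optimizer: str = "AdamW"
    learning_rate: float = 2e-5
    batch_size: int = 16
    epochs: int = 3
    max_seq_length: int = 512
    dropout: float = 0.3

    def total_steps(self, num_samples: int) -> int:
        # Optimizer steps = batches per epoch (rounded up) times epochs.
        return math.ceil(num_samples / self.batch_size) * self.epochs


config = TrainingConfig()
print(config.total_steps(1000))  # 63 batches per epoch * 3 epochs = 189
```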

Evaluation Results

Test Accuracy: 1.0000
Test Loss: 0.0016

Classification Report:
              precision    recall  f1-score   support
       FALSE       1.00      1.00      1.00       120
        TRUE       1.00      1.00      1.00       133
    accuracy                           1.00       253
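Because the reported accuracy is 1.0000 on 253 samples, the confusion matrix must contain no errors: 133 true positives and 120 true negatives. The sketch below recomputes the report's figures from those counts; binary_metrics is an illustrative helper, not part of the original evaluation code.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Precision, recall and F1 for the positive (TRUE) class, plus overall accuracy."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Counts implied by the report: support 133 (TRUE) and 120 (FALSE), zero errors.
metrics = binary_metrics(tp=133, fp=0, fn=0, tn=120)
print(metrics)
```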

Limitations and Bias

This model was trained on a specific dataset and may not generalize to all domains or content types
Performance may vary on content from domains not seen during training
The model may have biases present in the training data
The 100% accuracy on a small test set (253 samples) may indicate overfitting; use with caution on real-world data
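To put the test result in perspective, a perfect score on 253 samples still leaves room for a non-zero true error rate. The sketch below computes the exact one-sided binomial upper bound for the zero-error case. It assumes the test samples are independent and drawn from the same distribution as the data the model will see in use, which may not hold here; error_rate_upper_bound is a hypothetical helper.

```python
def error_rate_upper_bound(n_samples: int, confidence: float = 0.95) -> float:
    """Upper bound on the true error rate after n_samples with zero observed errors.

    Solves (1 - p) ** n_samples = 1 - confidence for p.
    """
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / n_samples)


bound = error_rate_upper_bound(253)
print(f"95% upper bound on error rate: {bound:.4f}")
```

Under those assumptions the bound is a little above 1%, and it says nothing about content that differs from the test distribution.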

Intended Use

This model is intended for:

Content moderation systems
Article classification pipelines
NSFW content filtering

Not intended for:

Real-time production systems without additional validation
Legal or compliance decisions without human review
Content that significantly differs from training distribution

How to Cite

@misc{distilbert-nsfw-classifier,
  title={DistilBERT NSFW Article Classifier},
  author={Your Name},
  year={2024},
  howpublished={\url{https://huggingface.co/your-username/distilbert-nsfw-classifier}}
}

License

Apache 2.0


Model size: 67M params (Safetensors, tensor type F32)