PHI Leak Checker (SAFE / REVIEW / UNSAFE) - Synthetic

phi-leak-checker-deberta-v3 is a text-classification model that predicts whether input text still appears to contain Protected Health Information (PHI) after redaction, or whether it should be reviewed before downstream use.

It is intended to act as a second safety gate in privacy-first workflows such as:

clinical note de-identification
zero-trust logging guardrails
PHI risk screening before analytics or LLM usage

Recommended pipeline:

1. run span detection or deterministic redaction
2. run this leak checker on the resulting text
3. route SAFE, REVIEW, and UNSAFE differently in your application
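
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: `redact`, `gate`, and `fake_classify` are hypothetical names, the redactor only masks email addresses, and the classifier is a stand-in so the example runs without downloading the model.

```python
import re
from typing import Callable, Dict

# Hypothetical deterministic redactor: masks only email addresses.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def redact(text: str) -> str:
    return EMAIL_RE.sub("[EMAIL]", text)

def gate(text: str, classify: Callable[[str], Dict[str, object]]) -> Dict[str, object]:
    """Redact first, then let the leak checker decide what happens next."""
    redacted = redact(text)
    result = classify(redacted)  # expected shape: {"label": ..., "score": ...}
    # Any label other than SAFE or REVIEW is blocked (fail closed).
    action = {"SAFE": "allow", "REVIEW": "review"}.get(result["label"], "block")
    return {"text": redacted, "label": result["label"],
            "score": result["score"], "action": action}

# Stand-in classifier for illustration only (assumption, not the real model).
def fake_classify(text: str) -> Dict[str, object]:
    return {"label": "UNSAFE" if "@" in text else "SAFE", "score": 1.0}

print(gate("Contact jane.doe@example.com about the labs.", fake_classify))
```

In a real deployment the stand-in would be replaced by a wrapper around the Transformers pipeline shown under Usage, for example `lambda t: clf(t)[0]`.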

Companion model: bharathjanumpally/phi-span-detector-deberta-v3

Model at a glance

Task: text classification
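Current uploaded architecture in config.json: RobertaForSequenceClassification (a RoBERTa-style checkpoint, despite the "deberta-v3" in the repository name)
Max sequence length in config: 514 position embeddings (RoBERTa-style models reserve two positions, so the usable input length is typically 512 tokens)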
Labels: SAFE, REVIEW, UNSAFE
Training data: synthetic text only
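
Because the sequence length is bounded, longer documents have to be truncated or split before classification. One possible approach, sketched below under stated assumptions, is to classify overlapping windows and keep the most severe label. The helper names, the word-based windowing (a crude proxy for token counts), and the window sizes are all illustrative and not part of this model's published tooling.

```python
# Severity order used to pick the "worst" label across windows.
SEVERITY = {"SAFE": 0, "REVIEW": 1, "UNSAFE": 2}

def split_words(text, max_words=200, overlap=20):
    """Split text into overlapping word windows (a rough proxy for token windows)."""
    words = text.split()
    step = max_words - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + max_words]) for i in range(0, stop, step)]

def classify_long(text, classify, **window_kwargs):
    """Classify each window and return the most severe result.

    `classify` is any callable returning {"label": ..., "score": ...};
    unknown labels are treated as severely as UNSAFE.
    """
    results = [classify(chunk) for chunk in split_words(text, **window_kwargs)]
    return max(results, key=lambda r: SEVERITY.get(r["label"], 2))

# Illustration with a stand-in classifier that flags any window containing "9".
fake = lambda t: {"label": "UNSAFE" if "9" in t.split() else "SAFE", "score": 1.0}
doc = " ".join(str(i) for i in range(10))
print(split_words(doc, max_words=4, overlap=1))
print(classify_long(doc, fake, max_words=4, overlap=1))
```

Taking the most severe window label is a conservative choice: a single UNSAFE window is enough to block the whole document.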

Label semantics

Label	Meaning	Recommended action
`SAFE`	no obvious PHI detected or text appears properly redacted	allow to proceed
`REVIEW`	ambiguous or partially redacted content remains	flag for human review or stricter automated policy
`UNSAFE`	PHI likely remains in the text	block, quarantine, or re-redact

Conservative production policy:

allow SAFE
hold or flag REVIEW
block UNSAFE

How the training data was built

This model was trained on synthetic text to keep the project openly shareable.

High-level recipe:

1. Generate synthetic clinical-note-like and log-like source text with inserted PHI-like fields.
2. Produce multiple variants of each source:
   - fully redacted -> SAFE
   - partially redacted or ambiguous -> REVIEW
   - unredacted -> UNSAFE
3. Train a sequence-classification model to predict the risk label for the full text.
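
The variant-generation step of this recipe might look roughly like the sketch below. The actual generator is not published in this card, so the template, the field names, and the `make_variants` helper are hypothetical and only illustrate how one source record can yield one example per label.

```python
import random

# Hypothetical template and placeholders (assumptions for illustration).
TEMPLATE = "Patient {name} MRN {mrn} visited on {date}."
PLACEHOLDERS = {"name": "[NAME]", "mrn": "[ID]", "date": "[DATE]"}

def make_variants(fields, rng):
    """Return (text, label) pairs for one synthetic source record."""
    unredacted = TEMPLATE.format(**fields)
    fully_redacted = TEMPLATE.format(**PLACEHOLDERS)
    # Partial redaction: mask a random, non-empty, strict subset of the fields.
    keys = sorted(fields)
    masked = set(rng.sample(keys, rng.randint(1, len(keys) - 1)))
    partial = TEMPLATE.format(
        **{k: PLACEHOLDERS[k] if k in masked else fields[k] for k in keys}
    )
    return [(fully_redacted, "SAFE"), (partial, "REVIEW"), (unredacted, "UNSAFE")]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
fields = {"name": "John Smith", "mrn": "001-23-4567", "date": "12/19/2025"}
for text, label in make_variants(fields, rng):
    print(label, "|", text)
```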

This gives useful supervision without real patient data, but real internal text may differ from the training distribution.

Intended use

Appropriate uses:

PHI leakage screening for synthetic or internally approved test data
post-redaction guardrails
pre-log and post-log privacy checks
privacy tooling demos and research prototypes

Not intended for:

medical diagnosis or treatment advice
serving as the sole control for HIPAA, GDPR, or similar compliance decisions
unsupervised high-stakes workflows without internal validation

Limitations and failure modes

The model was trained on synthetic text; real hospital or enterprise data may include abbreviations, OCR noise, unusual formatting, and edge cases not represented here.
False positives can occur on identifier-like numbers, addresses, and contextually ambiguous names.
False negatives are possible when PHI is fragmented, obfuscated, or written in unseen formats.
REVIEW should not be treated as equivalent to SAFE; it is best used as a cautionary middle state.
This model does not itself redact text. It should be one layer in a broader privacy pipeline.

Recommended mitigations:

pair with a span-based PHI detector and deterministic placeholder redaction
add regex backstops for emails, phone numbers, dates, and account-like identifiers
calibrate operational thresholds on an internal evaluation set
maintain a human-review path for ambiguous content
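
As a rough illustration of the regex backstop mentioned above, the sketch below checks for a few common identifier shapes and replaces them with placeholders. The patterns assume US-style formats and are deliberately simple; they are illustrative only and would need tuning and testing on representative internal data.

```python
import re

# Illustrative patterns only (US-style formats assumed).
BACKSTOPS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "PHONE": re.compile(r"\(?\b\d{3}\)?[-. ]\d{3}[-. ]\d{4}\b"),
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "ID": re.compile(r"\b\d{3}-\d{2}-\d{4}\b|\b\d{6,}\b"),
}

def find_leaks(text):
    """Return (kind, matched_text) pairs for anything the patterns still catch."""
    return [(kind, m.group())
            for kind, rx in BACKSTOPS.items()
            for m in rx.finditer(text)]

def scrub(text):
    """Replace every backstop match with a bracketed placeholder."""
    for kind, rx in BACKSTOPS.items():
        text = rx.sub(f"[{kind}]", text)
    return text

sample = ("Patient [NAME] MRN 001-23-4567, call 555-123-4567 "
          "or jane@example.org by 12/19/2025.")
print(find_leaks(sample))
print(scrub(sample))
```

A backstop like this complements the classifier rather than replacing it: it catches rigidly formatted identifiers, while names and free-text PHI still need the span detector and the leak checker.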

Usage

Transformers pipeline

from transformers import pipeline

clf = pipeline(
    "text-classification",
    model="bharathjanumpally/phi-leak-checker-deberta-v3",
)

text = "Patient John Smith MRN 001-23-4567 visited on 12/19/2025."
print(clf(text))

AutoModel and AutoTokenizer

from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

model_id = "bharathjanumpally/phi-leak-checker-deberta-v3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

clf = pipeline(
    "text-classification",
    model=model,
    tokenizer=tokenizer,
)

print(clf("Patient [NAME] MRN [ID] visited on [DATE]."))

Batch classification

from transformers import pipeline

clf = pipeline(
    "text-classification",
    model="bharathjanumpally/phi-leak-checker-deberta-v3",
)

texts = [
    "Patient John Smith MRN 001-23-4567 visited on 12/19/2025.",
    "Patient [NAME] MRN [ID] visited on [DATE].",
    "Call me after the appointment if the labs are delayed.",
]

for result in clf(texts):
    print(result)

Example routing policy

from transformers import pipeline

clf = pipeline(
    "text-classification",
    model="bharathjanumpally/phi-leak-checker-deberta-v3",
)

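# The pipeline returns a list with one {"label": ..., "score": ...} dict per input.
result = clf("Patient [NAME] MRN [ID] visited on [DATE].")[0]

# Map the predicted label to an action; anything other than SAFE or REVIEW is blocked.
if result["label"] == "SAFE":
    action = "allow"
elif result["label"] == "REVIEW":
    action = "review"
else:
    action = "block"

print(result, action)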

Output schema

A practical downstream schema is:

{
  "label": "REVIEW",
  "score": 0.91,
  "action": "review"
}

Suggested operating policy

Use the label as the primary decision, then apply score thresholds where you need a stricter policy:

SAFE: allow by default, optionally require a higher score threshold in stricter environments
REVIEW: route to human review, re-redaction, or stricter regex cleanup
UNSAFE: quarantine or block from logging, storage, or LLM submission

Example stricter policy:

low-confidence SAFE -> review
any REVIEW -> review
any UNSAFE -> block
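
The stricter policy above could be implemented along these lines. The `route` helper and the 0.90 threshold are assumptions for illustration; the threshold in particular should come from your own validation data rather than from this sketch.

```python
# Hypothetical threshold (assumption); choose it from internal validation data.
SAFE_MIN_SCORE = 0.90

def route(result, safe_min_score=SAFE_MIN_SCORE):
    """Map a pipeline result ({"label", "score"}) to the downstream schema."""
    label, score = result["label"], float(result["score"])
    if label == "SAFE" and score >= safe_min_score:
        action = "allow"
    elif label in ("SAFE", "REVIEW"):
        action = "review"  # low-confidence SAFE is downgraded to review
    else:
        action = "block"   # UNSAFE, or any unexpected label, fails closed
    return {"label": label, "score": score, "action": action}

for r in [{"label": "SAFE", "score": 0.99},
          {"label": "SAFE", "score": 0.60},
          {"label": "REVIEW", "score": 0.91},
          {"label": "UNSAFE", "score": 0.80}]:
    print(route(r))
```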

Evaluation note

This repository currently publishes the model weights and config, but no evaluation report. If you rely on this model in production, validate it first on an internal test set with representative formatting, and document the operational thresholds you choose.
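
One way to structure such an internal validation is sketched below: compute per-label recall, plus the share of truly UNSAFE texts that the model let through as SAFE, which is usually the most costly error for a leak checker. The `evaluate` helper and the toy labels are hypothetical and do not represent results for this model.

```python
from collections import Counter

LABELS = ("SAFE", "REVIEW", "UNSAFE")

def evaluate(gold, pred):
    """Return per-label recall and the rate of UNSAFE texts predicted as SAFE."""
    pairs = Counter(zip(gold, pred))   # (gold_label, predicted_label) counts
    support = Counter(gold)            # number of gold examples per label
    recall = {l: pairs[(l, l)] / support[l] for l in LABELS if support[l]}
    leak_rate = (pairs[("UNSAFE", "SAFE")] / support["UNSAFE"]
                 if support["UNSAFE"] else 0.0)
    return recall, leak_rate

# Toy labels for illustration only; not measurements of this model.
gold = ["SAFE", "SAFE", "REVIEW", "UNSAFE", "UNSAFE", "UNSAFE", "UNSAFE"]
pred = ["SAFE", "REVIEW", "REVIEW", "UNSAFE", "UNSAFE", "SAFE", "REVIEW"]
recall, leak_rate = evaluate(gold, pred)
print(recall, leak_rate)
```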

Safety and privacy

This model was trained on synthetic data and is published for research and tooling purposes. Do not send real PHI to public endpoints or public demos. Use private infrastructure and organization-approved evaluation practices for real deployments.

Citation

@misc{janumpally_phi_leak_checker_2025,
  title        = {PHI Leak Checker (Synthetic)},
  author       = {Bharath Kumar Reddy Janumpally},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {Model on Hugging Face}
}

Model size: 82.1M params (Safetensors, tensor type F32)