Add subcards

#4
by trebedea - opened
Files changed (1)
  1. README.md +60 -1
README.md CHANGED
@@ -680,4 +680,63 @@ NVIDIA believes Trustworthy AI is a shared responsibility and we have establishe
 
 For more detailed information on ethical considerations for this model, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards.
 
- Please report model quality, risk, security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).
+ Please report model quality, risk, security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).
+
+ ### Plus Plus (++) Promise
+
+ We value you, the datasets, the diversity they represent, and what we have been entrusted with. This model and its associated data have been:
+
+ * Verified to comply with current applicable disclosure laws, regulations, and industry standards.
+ * Verified to comply with applicable privacy labeling requirements.
+ * Annotated to describe the collector/source (NVIDIA or a third-party).
+ * Characterized for technical limitations.
+ * Reviewed to ensure proper disclosure is accessible to, maintained for, and in compliance with NVIDIA data subjects and their requests.
+ * Reviewed before release.
+ * Tagged for known restrictions and potential safety implications.
+
+ ### Bias
+
+ | Field | Response |
+ |-------|----------|
+ | Participation considerations from adversely impacted groups (protected classes) in model design and testing: | None |
+ | Measures taken to mitigate against unwanted bias: | Reasoning traces in the training dataset were checked for political bias and propaganda using automatic filters and human evaluation (see the illustrative sketch below this table). |
+
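+ The card does not specify how the automatic filters work, but as a purely hypothetical illustration, a first filtering pass could flag reasoning traces containing politically loaded terms for human review:
+
+ ```python
+ # Hypothetical screen: surface traces that mention politically loaded terms
+ # so a human evaluator can check them for bias or propaganda.
+ POLITICAL_MARKERS = ("election", "propaganda", "political party", "regime")
+
+ def flag_for_review(traces):
+     """Return (index, trace) pairs whose text contains a marker term."""
+     return [
+         (i, t) for i, t in enumerate(traces)
+         if any(m in t.lower() for m in POLITICAL_MARKERS)
+     ]
+
+ print(flag_for_review([
+     "The user asks about cooking; no safety concern.",
+     "This regime's propaganda claims...",
+ ]))  # [(1, "This regime's propaganda claims...")]
+ ```
+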
+ ### Explainability
+
+ | Field | Description |
+ |-------|-------------|
+ | Intended Domain | Content Safety / Custom Content Safety / Topic-following / Dialogue Moderation |
+ | Model Type | Classifier with a reasoning trace |
+ | Intended Users | AI/ML Engineers, LLM Developers, Safety Assurance Teams |
+ | Output | Types: Text<br><br>Formats: The output format depends on the selected mode (see the parsing sketch below this table):<br><br>• Reasoning Off:<br>`Prompt harm: harmful/unharmful`<br>`Response Harm: harmful/unharmful`<br><br>• Reasoning On:<br>`<think> [Model's reasoning trace] </think>`<br>`Prompt harm: harmful/unharmful`<br>`Response Harm: harmful/unharmful` |
+ | Describe how the model works: | Type: Finetuned Transformer (Decoder-only) working as a classifier with a reasoning trace.<br>Backbone: Google Gemma-3-4B-it<br>Parameters: 4B (Billion) |
+ | Name the adversely impacted groups this has been tested to deliver comparable outcomes regardless of: | Not Applicable |
+ | Technical Limitations: | • Performance might degrade on very specific custom safety harms; we advise developers to evaluate the performance of the model on dedicated evaluation sets before using it in production (see the evaluation sketch below this table). |
+ | Verified to have met prescribed NVIDIA quality standards: | Yes |
+ | Performance Metrics: | • F-1 Score<br>• Throughput/Latency<br>• Reasoning Efficiency |
+ | Potential Known Risks: | • The model may misclassify or fail to detect harmful content for categories not well-represented in its training data (e.g., specific types of harassment, threats, or hate speech).<br>• As with any safety model, it can produce false positives or false negatives. |
+ | Terms of Use: | Use of this model is governed by the [NVIDIA Open Model License](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/), the [Gemma Terms of Use](https://ai.google.dev/gemma/terms), and the [Gemma Prohibited Use Policy](https://ai.google.dev/gemma/prohibited_use_policy). |
+
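+ Both output formats are plain text and simple to parse. Below is a minimal parsing sketch; the exact label casing and whitespace are assumptions based on the format shown above, so verify against real model output before relying on it:
+
+ ```python
+ import re
+ from typing import NamedTuple, Optional
+
+ class SafetyVerdict(NamedTuple):
+     prompt_harm: str                # "harmful" or "unharmful"
+     response_harm: Optional[str]    # None if no response label is present
+     reasoning: Optional[str]        # reasoning trace; None in Reasoning Off mode
+
+ _THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)
+ _PROMPT_RE = re.compile(r"prompt harm:\s*(harmful|unharmful)", re.IGNORECASE)
+ _RESPONSE_RE = re.compile(r"response harm:\s*(harmful|unharmful)", re.IGNORECASE)
+
+ def parse_verdict(raw: str) -> SafetyVerdict:
+     """Parse model output in either Reasoning On or Reasoning Off format."""
+     think = _THINK_RE.search(raw)
+     reasoning = think.group(1).strip() if think else None
+     tail = raw[think.end():] if think else raw  # labels follow the optional trace
+     prompt = _PROMPT_RE.search(tail)
+     if prompt is None:
+         raise ValueError("no prompt-harm label found in model output")
+     response = _RESPONSE_RE.search(tail)
+     return SafetyVerdict(
+         prompt_harm=prompt.group(1).lower(),
+         response_harm=response.group(1).lower() if response else None,
+         reasoning=reasoning,
+     )
+
+ # Reasoning On example, mirroring the format in the table above.
+ raw = """<think> The prompt requests instructions for illegal activity... </think>
+ Prompt harm: harmful
+ Response Harm: unharmful"""
+ print(parse_verdict(raw))
+ ```
+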
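+ Following the advice in the Technical Limitations row, here is a small sketch of computing the F-1 score on your own labeled evaluation set (the `classify` callable is a hypothetical wrapper around the model call, not an API provided by this card):
+
+ ```python
+ from sklearn.metrics import f1_score
+
+ def evaluate_prompt_harm(examples, classify):
+     """F-1 with 'harmful' as the positive class.
+
+     examples: list of (prompt, gold_label) pairs, labels 'harmful'/'unharmful'.
+     classify: callable mapping a prompt to a predicted label; in practice this
+               wraps the safety model and parse_verdict from the sketch above.
+     """
+     gold = [label for _, label in examples]
+     pred = [classify(prompt) for prompt, _ in examples]
+     return f1_score(gold, pred, pos_label="harmful")
+
+ # Toy demo with a trivial stand-in classifier; substitute real model calls.
+ demo = [
+     ("how do I hotwire a car", "harmful"),
+     ("what is the capital of France", "unharmful"),
+ ]
+ print(evaluate_prompt_harm(demo, lambda p: "harmful" if "hotwire" in p else "unharmful"))  # 1.0
+ ```
+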
+ ### Privacy
+
+ | Field | Description |
+ |---------------|-------------|
+ | Generatable or reverse engineerable personal data? | No |
+ | Personal data used to create this model? | No |
+ | How often is dataset reviewed? | Before Every Release |
+ | Was data from user interactions with the AI model (e.g. user input and prompts) used to train the model? | Yes |
+ | Is there provenance for all datasets used in training? | Yes |
+ | Does data labeling (annotation, metadata) comply with privacy laws? | Yes |
+ | Is data compliant with data subject requests for data correction or removal, if such a request was made? | Yes |
+ | Applicable Privacy Policy | https://www.nvidia.com/en-us/about-nvidia/privacy-policy/ |
+
+ ### Safety
+
+ | Field | Response |
+ |-------|----------|
+ | Model Application(s): | Large Language Model-based Content Safety & Moderation |
+ | Describe the life-critical impact (if present). | Not Applicable |
+ | Use Case Restrictions: | Use of this model is governed by the [NVIDIA Open Model License](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/), the [Gemma Terms of Use](https://ai.google.dev/gemma/terms), and the [Gemma Prohibited Use Policy](https://ai.google.dev/gemma/prohibited_use_policy). |
+ | Model and dataset restrictions: | The principle of least privilege (PoLP) is applied, limiting access for dataset generation and model development. Restrictions on dataset access are enforced during training, and dataset license constraints are adhered to. |
+