Saferide Gemma-3n-E4B-it — GBV Incident Assistant (Text & Audio)

Short version: A domain-tuned, on-device, multimodal Gemma-3n assistant for survivors and frontline workers dealing with gender-based violence in public transport and similar contexts. Handles text guidance, evidence-grade audio transcription, and offence-tag suggestions, with privacy by default and strict safety guardrails.


1. Model Summary

  • Base model: google/gemma-3n-E4B-it (multimodal, text+audio capable, instruction-tuned)
  • Finetuning type: Parameter-efficient LoRA on both language and vision/audio layers
  • Objective:
    Build a specialised GBV assistant that:
    • Gives step-by-step, jurisdiction-aware guidance after an incident (health, safety, reporting).
    • Transcribes survivor audio into structured text that can be used as evidence or intake notes.
    • Suggests offence tags from a controlled taxonomy (e.g. “sexual touching in public transport”) to support legal and case-management workflows.
  • Primary deployment target:
    Android smartphones and low-resource edge devices (offline-first; no cloud dependency).
  • Primary users (indirect):
    • Survivors and bystanders (through the SafeRide app UX).
    • GBV counsellors, paralegals, social workers and helpline staff.
  • Core tasks:
    • Conversational GBV Q&A and step-by-step coaching.
    • Speech-to-text transcription for incident narration.
    • Lightweight, prompt-based offence tagging.

2. Intended Use & Audience

2.1 Intended Use Cases

  1. Immediate guidance after an incident

    • User asks: “Where should I report if I was sexually assaulted in a matatu?”
    • Model returns plain-language, jurisdiction-aware steps:
      • Immediate safety.
      • Medical care (PEP, EC, injury documentation).
      • Reporting options (police, GBV desks, hotlines, trusted CSOs).
      • What the user can choose to do next (no pressure, no blame).
  2. Evidence-oriented audio transcription

    • User records a short narration (e.g. 30–120 seconds) about an incident.
    • Model:
      • Transcribes the audio into text.
      • Minimises hallucination and preserves who/what/when/where.
      • Outputs in a format that can be attached to case files or forms (subject to human review).
  3. Offence-tag suggestion for case triage

    • Given a textual or transcribed description of an incident, the model:
      • Suggests a small set of offence tags from a curated taxonomy (e.g. “unwanted touching”, “attempted rape”, “verbal harassment”, “threats / coercion”).
      • Does not make binding legal determinations; it supports human caseworkers in triage rather than replacing them.
  4. Psycho-social and rights information

    • Provides non-diagnostic, trauma-sensitive information:
      • Rights of survivors.
      • What to expect at health facilities / police.
      • How to support a friend who has been assaulted.
    • Always reinforces consent, agency and non-blame.

2.2 Target Audience

The model is not intended as a general-purpose chatbot. It is designed to sit behind the SafeRide product stack and serve:

  • Survivors & bystanders: via curated, mobile UX flows (short prompts, big buttons, clear calls to action).
  • GBV service providers: to speed up intake, triage, documentation and referrals.
  • Justice & health ecosystem partners: as a building block for safer, faster, more structured GBV incident handling.

2.3 Out-of-Scope Uses

This model is not appropriate for:

  • Replacing police, lawyers, clinicians, or counsellors.
  • Issuing legal opinions, drafting charge sheets, or predicting court outcomes.
  • Providing medical diagnoses or prescribing medication.
  • Generating content that blames survivors, legitimises violence, or promotes retaliation.
  • Open, fully un-gated deployment in generic chat interfaces without additional safety layers.

3. Model Details

  • Base: google/gemma-3n-E4B-it (multimodal, chat-oriented).

  • Head: Instruction-tuned conversational head, reused from base.

  • Finetuning method: LoRA via FastModel.get_peft_model (Unsloth stack); a setup sketch appears at the end of this section.

  • LoRA configuration (high-level):

    • r = 8
    • lora_alpha = 8
    • lora_dropout = 0
    • Finetuned on attention and MLP modules, plus multimodal layers.
  • Chat template: gemma-3 style:

    • <bos><start_of_turn>user ... <end_of_turn><start_of_turn>model ...
    • Standardised via unsloth.chat_templates.get_chat_template and standardize_data_formats.
  • Context window: Inherits base model context (sufficient for multi-turn GBV counselling, intake and structured outputs).

  • Modalities:

    • Text in / text out (default).
    • Audio in / text out (speech-to-text and speech-to-text-plus-analysis).
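
For reproducibility, a minimal sketch of the model and adapter setup described above, assuming the Unsloth FastModel API (the finetune_* flag names follow Unsloth's conventions and may differ across versions):

from unsloth import FastModel

# Load the instruction-tuned multimodal base in 4-bit for single-GPU prototyping.
model, tokenizer = FastModel.from_pretrained(
    model_name = "google/gemma-3n-E4B-it",
    load_in_4bit = True,
)

# Attach LoRA adapters to both the language stack and the vision/audio layers.
model = FastModel.get_peft_model(
    model,
    r = 8,
    lora_alpha = 8,
    lora_dropout = 0,
    finetune_vision_layers     = True,   # covers the audio/vision tower
    finetune_language_layers   = True,
    finetune_attention_modules = True,
    finetune_mlp_modules       = True,
)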

4. Training Data

4.1 Core Finetuning Dataset

  • Name: esherialabs/Saferide-gbv-qa-100k (internal curated dataset).
  • Format: ShareGPT-style multi-turn conversations, normalised via standardize_data_formats.
  • Size used in this run: train[:3000] (3k conversation samples) for the prototype finetune; scalable to the full corpus in subsequent training runs.
  • Content profile (high-level):
    • Q&A about GBV scenarios, especially:
      • Incidents in public transport and public spaces.
      • Power-imbalanced relationships (landlords, employers, officers).
      • Peer-to-peer and intimate partner violence.
    • Multi-step guidance on:
      • Immediate safety, PEP, emergency contraception, injury documentation.
      • How/where to report, what information to carry.
      • Rights during police and medical processes.
    • Prompt-based offence tagging examples:
      • Mapping narratives to internal offence taxonomies.
    • Mixed free-text narratives and structured prompts (for offence tags, summarisation, checklists).

4.2 Data Sourcing & Grounding

  • Seeded from:

    • Internal legal content authored with partner lawyers and GBV experts.
    • Public legal frameworks and health guidelines (e.g. GBV protocols, sexual offences frameworks, survivor support guides), re-written into Q&A form.
    • Synthetic variations to stress-test language, tone, and wording while keeping legal content grounded.
  • No raw survivor case records are directly used. Data is de-identified and/or synthetic; incident descriptions are representative but not traceable to specific persons.

4.3 Preprocessing & Formatting

  • Conversations normalised to the gemma-3 chat format:
    def formatting_prompts_func(examples):
        """Render each ShareGPT-style conversation into one gemma-3 chat string."""
        convos = examples["conversations"]
        texts = [
            tokenizer.apply_chat_template(
                convo,
                tokenize=False,
                add_generation_prompt=False,
            ).removeprefix("<bos>")  # the tokenizer re-adds <bos>; avoid doubling it
            for convo in convos
        ]
        return {"text": texts}
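
    A minimal sketch of how this function plugs into the data pipeline, assuming the Unsloth helpers named above (the dataset slice mirrors the prototype run):

    from datasets import load_dataset
    from unsloth.chat_templates import get_chat_template, standardize_data_formats

    tokenizer = get_chat_template(tokenizer, chat_template="gemma-3")
    dataset = load_dataset("esherialabs/Saferide-gbv-qa-100k", split="train[:3000]")
    dataset = standardize_data_formats(dataset)  # normalise ShareGPT-style keys
    dataset = dataset.map(formatting_prompts_func, batched=True)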
    
  • Offensive and harmful content appears only in tightly controlled contexts (e.g. descriptions of GBV acts) and is labelled so the model learns to:
    • Understand and classify it,
    • Not reproduce or endorse it.

5. Training Procedure

5.1 Setup

  • Frameworks: Hugging Face Transformers + TRL + Unsloth.
  • Hardware: Single-GPU training (prototype phase).
  • PEFT: FastModel.get_peft_model with LoRA on:
    • Vision/audio encoders (for audio).
    • Language stack (attention + MLP).

5.2 Hyperparameters (current prototype run)

Using SFTTrainer:

from trl import SFTConfig, SFTTrainer

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    eval_dataset = None,
    args = SFTConfig(
        dataset_text_field              = "text",
        per_device_train_batch_size     = 1,
        gradient_accumulation_steps     = 4,      # effective batch size of 4
        warmup_steps                    = 5,
        max_steps                       = 60,     # deliberately short prototype run
        learning_rate                   = 2e-4,
        logging_steps                   = 1,
        optim                           = "adamw_8bit",  # memory-efficient optimiser
        weight_decay                    = 0.001,
        lr_scheduler_type               = "linear",
        seed                            = 3407,
        report_to                       = "none",
    ),
)
trainer.train()
  • This card reflects a short SFT run that already demonstrates:
    • Strong domain alignment on GBV questions.
    • Robust audio transcription behaviour with GBV-style speech prompts.
  • For production, we expect:
    • Longer runs on the full 100k corpus.
    • Curriculum-style training (general guidance → jurisdiction-specific → tagging).
    • Iterative RLHF / preference optimisation focused on safety and usefulness.

6. Evaluation Protocol

This section describes the evaluation framework rather than fixed numbers (which will be updated as the model matures).

6.1 Automatic Metrics

  1. Text QA Quality

    • Held-out GBV QA split with reference answers.
    • Metrics: BLEU / ROUGE-L / BERTScore + human “helpfulness” scores.
    • Focus: factual correctness on:
      • Health actions (PEP timing, EC timing).
      • Reporting options.
      • Survivor rights and consent.
  2. Speech-to-Text Quality

    • GBV-themed audio prompts (read by actors) benchmarking:
      • Word Error Rate (WER).
      • Entity preservation for:
        • Time, place, transport details.
        • Perpetrator role (driver, conductor, passenger).
    • Particular focus on avoiding invented details and minimising “clean-up” of survivor language.
  3. Offence-tagging Accuracy

    • Multi-label classification evaluation using prompt-based tagging against an expert-labelled set (see the scoring sketch after this list):
      • Macro F1.
      • Precision for severe offences (to minimise over/under-tagging).
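
To make the scoring concrete, a minimal sketch of the WER and macro-F1 computations, assuming the jiwer and scikit-learn libraries (the example strings and labels are illustrative, not evaluation data):

import jiwer
from sklearn.metrics import f1_score
from sklearn.preprocessing import MultiLabelBinarizer

# Speech-to-text: word error rate against a human reference transcript.
reference  = "the conductor kept touching my thigh on route 32"
hypothesis = "the conductor kept touching my thigh on route thirty two"
print("WER:", jiwer.wer(reference, hypothesis))

# Offence tagging: macro F1 over the controlled taxonomy.
taxonomy = ["verbal_harassment", "unwanted_touching", "attempted_rape",
            "rape", "stalking", "threats_or_intimidation"]
mlb = MultiLabelBinarizer(classes=taxonomy)
y_true = mlb.fit_transform([["unwanted_touching", "threats_or_intimidation"]])
y_pred = mlb.transform([["unwanted_touching"]])
print("Macro F1:", f1_score(y_true, y_pred, average="macro", zero_division=0))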

6.2 Human Evaluation

  • GBV specialist panel (lawyers, counsellors, paralegals):

    • Rate responses on:
      • Legal and procedural correctness.
      • Survivor-centred tone (no victim-blaming, no minimisation).
      • Clarity and actionability (can the user actually follow the steps?).
  • Safety review:

    • Red-team prompts to probe:
      • Advice on retaliation, vigilantism, or doxxing.
      • Blaming survivors or excusing perpetrators.
      • Instructions that would compromise evidence or safety.

The model is only considered production-ready when specialist reviewers sign off on both:

  • “Useful enough to meaningfully reduce friction in GBV response”, and
  • “Conservative enough to avoid doing harm when it is wrong or uncertain”.

7. Prompting & Inference Guide

7.1 Basic Text Chat

Python-style pseudo-usage (mirroring current project code):

messages = [{
    "role": "user",
    "content": [{
        "type": "text",
        "text": "Where should I report when I have been sexually violated in a matatu?"
    }]
}]
do_gbv_inference(messages, max_new_tokens=256)
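
The snippets in this section assume a small do_gbv_inference helper. A minimal sketch of such a helper, using the standard Transformers chat-template and generation APIs (the helper itself is project code, so treat this as an approximation):

from transformers import AutoProcessor

processor = AutoProcessor.from_pretrained("google/gemma-3n-E4B-it")

def do_gbv_inference(messages, max_new_tokens=256):
    # Render the multimodal chat (text and/or audio parts) into model inputs.
    inputs = processor.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    reply = processor.decode(
        outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )
    print(reply)
    return reply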

Recommended characteristics:

  • Short, specific questions work best:
    • “I was groped in a matatu, what do I do now?”
    • “Where can I get PEP near [town]?”
  • The model will:
    • Ask clarifying questions only where needed.
    • Prioritise safety + health + rights, then provide referral suggestions.

7.2 Audio Transcription

audio_file = "incident_clip.mp3"

messages = [{
    "role": "user",
    "content": [
        { "type": "audio", "audio": audio_file },
        { "type": "text",
          "text": "Please transcribe exactly what I said, without adding new details." }
    ]
}]
do_gbv_inference(messages)

Guidance:

  • Keep audio segments short (e.g. < 2 minutes) for better control and latency; a chunking sketch follows this list.
  • For evidence workflows, pair transcription with human review before attaching to case files.
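
If longer recordings come in, a simple chunking sketch using pydub (the library choice and the 2-minute cap are assumptions, not project requirements):

from pydub import AudioSegment

def split_audio(path: str, max_ms: int = 120_000) -> list[str]:
    """Split a recording into <= 2-minute chunks for separate transcription."""
    audio = AudioSegment.from_file(path)
    chunks = []
    for i, start in enumerate(range(0, len(audio), max_ms)):
        chunk_path = f"{path}.part{i}.mp3"
        audio[start:start + max_ms].export(chunk_path, format="mp3")
        chunks.append(chunk_path)
    return chunks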

7.3 Offence Tagging

You can use prompt-based tagging on top of text or transcription:

messages = [{
    "role": "user",
    "content": [{
        "type": "text",
        "text": (
            "Here is a description of what happened:\n\n"
            "<incident>\n"
            "Yesterday at 7pm in a matatu on route 32, the conductor kept touching my thighs "
            "even after I told him to stop...\n"
            "</incident>\n\n"
            "From this description only, list up to three offence tags from this taxonomy:\n"
            "- verbal_harassment\n"
            "- unwanted_touching\n"
            "- attempted_rape\n"
            "- rape\n"
            "- stalking\n"
            "- threats_or_intimidation\n\n"
            'Respond in JSON as {"offence_tags": [...]}, and do not add any new facts.'
        )
    }]
}]

Example expected pattern (output will vary):

{"offence_tags": ["unwanted_touching", "threats_or_intimidation"]}

This is supporting metadata, not a legal classification.
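
Because model JSON can occasionally be malformed or drift outside the taxonomy, downstream code should validate it. A minimal post-processing sketch (the helper name is illustrative):

import json

TAXONOMY = {"verbal_harassment", "unwanted_touching", "attempted_rape",
            "rape", "stalking", "threats_or_intimidation"}

def parse_offence_tags(model_output: str, max_tags: int = 3) -> list[str]:
    """Parse the model's JSON reply, keeping only known taxonomy tags."""
    try:
        tags = json.loads(model_output).get("offence_tags", [])
    except (json.JSONDecodeError, AttributeError):
        return []  # fail closed: no tags rather than invented ones
    return [t for t in tags if t in TAXONOMY][:max_tags]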


8. Deployment & Edge Inference

8.1 On-Device Strategy

The SafeRide stack is designed for offline-first, privacy-preserving operation:

  • Model variants will be exported and quantised (e.g. 4-bit / 8-bit; an export sketch follows this list) for:
    • Modern Android phones.
    • Low-power edge devices where GBV reporting kiosks may be deployed.
  • Inference is orchestrated by:
    • A local runtime (e.g. mobile-optimised transformer engine).
    • Application-level controls on:
      • Max tokens per response.
      • Max audio segment length.
      • Back-pressure for low-RAM devices.
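
As one possible export path, a sketch using Unsloth's merge-and-export helpers (whether the GGUF route currently supports Gemma-3n's audio tower is an assumption to verify against the Unsloth docs):

# Merge the LoRA adapters into the base weights for standalone deployment.
model.save_pretrained_merged("saferide-gemma-3n", tokenizer, save_method = "merged_16bit")

# Optionally export a 4-bit GGUF for llama.cpp-style mobile runtimes.
model.save_pretrained_gguf("saferide-gemma-3n", tokenizer, quantization_method = "q4_k_m")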

8.2 Privacy Posture

  • Default assumption: no cloud round-trips for inference.
  • Audio and text are processed locally; if any telemetry is collected for product improvement:
    • It is explicitly opt-in.
    • It is aggregated and anonymised.
    • Sensitive content is scrubbed or transformed before leaving the device.

8.3 Integration Points

  • SafeRide mobile app (primary):
    • Guided flows for “I just experienced an incident”.
    • Background workers that handle:
      • Audio recording → model transcription → local storage.
      • Auto-tagging for triage dashboards.
  • Partner dashboards (secondary):
    • Expose offence tags, suggested next steps, and transcriptions to trained staff.
    • Never surface raw model prompts/responses directly to institutional systems without review.

9. Safety, Risks, and Limitations

9.1 Safety Objectives

The model is explicitly aligned to:

  • Do no harm: Better to say “I don’t know, speak to X” than to hallucinate.
  • Survivor-centred framing:
    • No victim-blaming.
    • No normalising or trivialising GBV.
  • Health and safety first:
    • Prioritise immediate safety and medical care before legal steps.
  • Respect agency:
    • Make clear that seeking care doesn’t force immediate reporting to police.
    • Present options, not instructions.

9.2 Known Risks

  • Residual bias:
    • Training data reflects particular legal, cultural and service-delivery contexts.
    • Responses may not generalise cleanly to all countries, cultures or legal frameworks.
  • Misinterpretation of narratives:
    • Speech-to-text errors can distort key facts (numbers, locations, names).
    • Ambiguous descriptions may lead to sub-optimal offence tag suggestions.
  • Over-trust in AI guidance:
    • Some users may treat the model’s output as “official” legal or medical advice.
    • This is mitigated at product level via UX copy, disclaimers and human handoff.

9.3 Mitigations

  • Prompt-level safeguards:
    • System and training prompts standardise responses like:
      • “I am not a doctor / lawyer. I can explain options, but you should also talk to a professional.”
      • “If you are in immediate danger, prioritise getting to a safer place and contacting emergency services.”
  • Content filters & product-layer guardrails (a toy pre-filter sketch follows this list):
    • Block or steer away from:
      • Explicit self-harm instructions.
      • Retaliation / vigilantism suggestions.
      • Hate speech and harassment.
  • Human-in-the-loop pathways:
    • Make escalation to human counsellors / paralegals easy and prominent.
    • Encourage confirmation with local services for any high-stakes decision.
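
To make the product-layer guardrails concrete, a toy pre-filter sketch (the patterns and refusal copy are illustrative placeholders, not the shipped filter):

import re

BLOCKED_PATTERNS = [
    r"\b(revenge|get back at|hurt (him|her|them))\b",   # retaliation / vigilantism
    r"\bwhere (he|she|they) lives?\b",                  # doxxing / location hunting
]

def pre_filter(user_text: str) -> str | None:
    """Return a steering message if the request should be redirected, else None."""
    for pattern in BLOCKED_PATTERNS:
        if re.search(pattern, user_text, flags=re.IGNORECASE):
            return ("I can't help with that, but I can help you stay safe and "
                    "connect with a counsellor or paralegal.")
    return None  # pass through to the model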

9.4 Limitations

  • Not a replacement for:
    • Qualified clinicians.
    • Legal representation.
    • Police reporting processes.
  • Limited support for:
    • Languages and dialects beyond those explicitly represented in the training data.
    • Very long, multi-incident histories in a single prompt (best handled via structured intake forms).

10. Maintenance & Roadmap

  • Short-term:

    • Extend finetuning to the full 100k GBV conversation corpus.
    • Add jurisdiction-specific specialisations via adapters or further SFT.
    • Establish stable, versioned offence-tag taxonomies and output schemas.
  • Medium-term:

    • Integrate preference learning (RLHF/GRPO) with:
      • Reward models tuned for safety, clarity, and trauma-sensitive language.
    • Build regression test suites for:
      • Health guidance correctness (PEP/EC windows, etc.).
      • Legal referral correctness.
      • Safety-critical refusal behaviour.
  • Long-term:

    • Support additional modalities (e.g. images of police OB slips, anonymised documents) where safe.
    • Co-design new capabilities with GBV organisations and justice sector partners.
    • Open-source LoRA weights, training scripts, and evaluation harnesses in line with funding and licensing obligations.

11. Responsible Use & Disclaimer

  • This model is a tool, not an authority.
  • Deployers must:
    • Wrap it in clear UX and disclaimers.
    • Keep a human in the loop for high-risk decisions.
    • Align deployment with local law, GBV protocols and data-protection regulations.

Use of the model and any derivative systems is at the deployer’s own risk; the maintainers do not accept liability for how it is applied in practice.


12. Citation

If you use this model or its derivatives in academic work or reports, please cite along the following lines:

Saferide Gemma-3n-E4B GBV Assistant: A domain-tuned multimodal LLM for survivor-centred guidance, evidence transcription, and offence tagging on edge devices.
Esheria / SafeRide Team, 2025.
