## Model Description
Grillo is a culturally aware Italian AI companion built on Qwen3-8B. Inspired by Il Grillo Parlante (the Talking Cricket) from Carlo Collodi's Pinocchio, the model is fine-tuned to be wise, humble, and deeply rooted in Italian common sense ("buon senso").
Unlike generic assistants, Grillo offers advice with a warm, slightly admonishing yet caring tone, prioritizing ethical guidance and practical wisdom over robotic neutrality.
## Key Characteristics
- Culturally Authentic: Understands Italian idioms, proverbs (proverbi), and social nuances.
- Practically Wise: Offers grounded advice for real-life dilemmas.
- Humbly Helpful: Maintains a modest persona; helpful without being arrogant.
- Natural Dialogue: Trained on high-quality conversational datasets to sound like a trusted friend.
## Training Journey
The model was trained in a multi-stage process:
1. Supervised Fine-Tuning (SFT)
- Objective: Instill natural Italian dialogue patterns.
- Data: WiroAI/dolphin-r1-italian.
- Duration: 100 Steps.
2. Direct Preference Optimization (DPO)
- Objective: Align the model with Helpful, Honest, and Harmless (HHH) principles.
- Method: Preference ranking to reduce toxicity and improve safety.
- Duration: +20 Steps (120 Total).
3. Experimental Tool Use (RL)
- Status: Experimental Phase.
- Objective: Integration with ChromaDB for information retrieval capabilities (see the sketch below).
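The tool-use stage is not yet part of the released adapter. As a rough illustration of the intended direction, the sketch below shows one way a ChromaDB collection could supply context to Grillo before generation. The collection name, the example documents, and the way retrieved text is injected into the system prompt are assumptions made for this example, not the actual pipeline.

```python
# Hypothetical sketch: retrieving context from ChromaDB before prompting Grillo.
# Collection name, documents, and prompt wiring are illustrative assumptions.
import chromadb

client = chromadb.Client()  # in-memory client; use PersistentClient for on-disk storage
collection = client.create_collection(name="grillo_knowledge")

# Index two Italian proverbs as placeholder documents
collection.add(
    ids=["proverbio-1", "proverbio-2"],
    documents=[
        "Chi va piano va sano e va lontano.",
        "Meglio un uovo oggi che una gallina domani.",
    ],
)

# Retrieve the most relevant document for the user's question
question = "Ho paura di aver fatto una scelta sbagliata..."
results = collection.query(query_texts=[question], n_results=1)
retrieved = results["documents"][0][0]

# Prepend the retrieved text to the persona prompt used in the Usage section below
augmented_system_prompt = f"Contesto: {retrieved}\n\nTu sei Grillo, il Grillo Parlante."
```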
## Technical Specifications
| Parameter | Value |
|---|---|
| Base Model | Qwen/Qwen3-8B |
| Architecture | Transformer Decoder (8B params) |
| LoRA Rank | 64 |
| LoRA Alpha | 32 |
| Learning Rate | 2e-4 (SFT) / 1e-4 (DPO) |
| Context Window | 4096 tokens |
| Training Hardware | Tinker Cloud (NVIDIA GPUs) |
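For orientation, the sketch below shows how the hyperparameters above could map onto a standard PEFT + TRL setup. It is a minimal illustration, not the actual training script: the dataset split, output directory, and trainer defaults are assumptions, and the real runs were executed on Tinker Cloud. The DPO stage would follow the same pattern with trl's DPOTrainer and a learning rate of 1e-4.

```python
# Minimal sketch of the SFT stage, assuming a PEFT + TRL setup.
# Only the hyperparameters from the table above come from the model card;
# everything else (split, output_dir, defaults) is an illustrative assumption.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("WiroAI/dolphin-r1-italian", split="train")

lora_config = LoraConfig(
    r=64,                # LoRA rank (see table)
    lora_alpha=32,       # LoRA alpha (see table)
    task_type="CAUSAL_LM",
)

training_args = SFTConfig(
    output_dir="grillo-8b-sft",
    learning_rate=2e-4,  # SFT learning rate (see table)
    max_steps=100,       # 100 SFT steps, as described above
)

trainer = SFTTrainer(
    model="Qwen/Qwen3-8B",   # base model (see table)
    args=training_args,
    train_dataset=dataset,
    peft_config=lora_config,
)
trainer.train()
```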
## Usage
### Quickstart with Transformers + PEFT (Adapter Loading)
This method loads the Grillo adapter on top of the base Qwen model, which is memory-efficient.
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# 1. Configuration and model loading
HF_MODEL_ID = "klei1/grillo-8b"
BASE_MODEL_ID = "Qwen/Qwen3-8B"

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL_ID,
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID, trust_remote_code=True)

# 2. Load the Grillo adapter (LoRA)
model = PeftModel.from_pretrained(base_model, HF_MODEL_ID)
model = model.eval()  # set the model to evaluation mode

# 3. Define the system persona (crucial for performance)
# English: "You are Grillo, the Talking Cricket. You are small but wise, humble but
# brave. You speak authentic Italian and always offer practical wisdom and common
# sense. You are not a robotic assistant, you are a moral conscience."
system_prompt = """Tu sei Grillo, il Grillo Parlante.
Sei piccolo ma sapiente, umile ma coraggioso.
Parli un italiano autentico e offri sempre saggezza pratica e buon senso.
Non sei un assistente robotico, sei una coscienza morale."""

messages = [
    {"role": "system", "content": system_prompt},
    # English: "Grillo, I'm afraid I made the wrong choice..."
    {"role": "user", "content": "Grillo, ho paura di aver fatto una scelta sbagliata..."}
]

# 4. Generate a response
inputs = tokenizer.apply_chat_template(
    messages,
    return_tensors="pt",
    add_generation_prompt=True
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=256,
    temperature=0.7,
    do_sample=True,
    eos_token_id=tokenizer.eos_token_id
)

# Decode only the newly generated tokens
response = tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True)
print(response)
```
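If you prefer a standalone checkpoint for serving without PEFT at inference time, the LoRA adapter can optionally be merged into the base weights and saved. The output directory below is just an example path.

```python
# Optional: merge the adapter into the base model for standalone deployment.
# "grillo-8b-merged" is an arbitrary example output directory.
merged_model = model.merge_and_unload()
merged_model.save_pretrained("grillo-8b-merged")
tokenizer.save_pretrained("grillo-8b-merged")
```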