Qwen2.5-7B DB Bench Combined SFT (v1-v4)

This repository provides a merged full-weight model fine-tuned from Qwen2.5-7B-Instruct using LoRA + Unsloth, then merged to 16bit.

Training Objective

This model is trained to improve DB Bench (database operation) performance on the AgentBench evaluation benchmark. ALFWorld performance relies entirely on the base model's inherent capability (no ALFWorld training data used).

Loss is applied to all assistant turns in the multi-turn trajectory, enabling the model to learn SQL generation, action selection, and error recovery.

Training Data

DB Bench v1 (u-10bei/dbbench_sft_dataset_react): ~750 samples
DB Bench v2 (u-10bei/dbbench_sft_dataset_react_v2): ~750 samples
DB Bench v3 (u-10bei/dbbench_sft_dataset_react_v3): ~750 samples
DB Bench v4 (u-10bei/dbbench_sft_dataset_react_v4): ~750 samples
Total: ~3,000 samples
ALFWorld data intentionally excluded to preserve base model performance

Training Configuration

Base model: Qwen/Qwen2.5-7B-Instruct
Method: LoRA → merged to 16bit
Max sequence length: 2048
Epochs: 2
Learning rate: 2e-6
LoRA: r=64, alpha=128
Batch size: 2, Gradient accumulation: 4 (effective batch 8)
Optimizer: AdamW (cosine scheduler)
Framework: Unsloth

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "koguma-ai/dbbench-combined-baseline0301"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

Sources & Terms

Training data: u-10bei/dbbench_sft_dataset_react (v1-v4)

Dataset License: Apache-2.0. Users must comply with the Apache-2.0 license and the base model's original terms of use.

Limitations

Optimized for DB Bench tasks only
ALFWorld performance relies on base model capability
Weak categories: aggregation-MAX (16.7%), INSERT (33.3%)

Downloads last month: 18

Safetensors

Model size

8B params

Tensor type

BF16

Model tree for koguma-ai/dbbench-combined-baseline0301

Base model

Qwen/Qwen2.5-7B

Finetuned

Qwen/Qwen2.5-7B-Instruct

Finetuned

(3386)

this model

koguma-ai
/

dbbench-combined-baseline0301