Tunedailabs Causal Reasoning Model โ€” Qwen 2.5-7B

Fine-tuned by Tunedailabs on causal reasoning tasks.

Benchmark

96.96% on CLadder (9,805 / 10,112 questions correct)

CLadder is a public benchmark of 10,112 causal reasoning questions covering association, intervention, and counterfactual reasoning. The
correct answers were established by human experts from published academic
sources โ€” the models cannot have memorized them.

Model CLadder Score
Tunedailabs Causal Model (this) 96.96%
GPT-4o ~72%
Base Qwen 2.5-7B ~62%

Verify independently: clone causalNLP/cladder and run eval against this adapter.

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B-Instruct", torch_dtype=torch.float16, device_map="auto")
model = PeftModel.from_pretrained(base, "tunedailabs/causal-reasoning-qwen-7b")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

Or run the interactive demo: Open In Colab

About Tunedailabs

We fine-tune open-source LLMs for real-world reasoning tasks.

tunedailabs

Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support

Model tree for tunedailabs/causal-reasoning-qwen-7b

Base model

Qwen/Qwen2.5-7B
Adapter
(2149)
this model