Tunedailabs Causal Reasoning Model โ Qwen 2.5-7B
Fine-tuned by Tunedailabs on causal reasoning tasks.
Benchmark
96.96% on CLadder (9,805 / 10,112 questions correct)
CLadder is a public benchmark of 10,112 causal reasoning questions
covering association, intervention, and counterfactual reasoning. The
correct answers were established by human experts from published academic
sources โ the models cannot have memorized them.
| Model | CLadder Score |
|---|---|
| Tunedailabs Causal Model (this) | 96.96% |
| GPT-4o | ~72% |
| Base Qwen 2.5-7B | ~62% |
Verify independently: clone causalNLP/cladder and run eval against this adapter.
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B-Instruct", torch_dtype=torch.float16, device_map="auto")
model = PeftModel.from_pretrained(base, "tunedailabs/causal-reasoning-qwen-7b")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")
About Tunedailabs
We fine-tune open-source LLMs for real-world reasoning tasks.
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐ Ask for provider support