QuantumGPT-354M: Quantum Circuit Generation Model
QuantumGPT-354M is a GPT-style language model (354.1M parameters) trained from scratch on quantum circuit description → OpenQASM 2.0 pairs. It is the third model in the QuantumGPT scaling series, scaling model depth and width while holding training data constant at 21,208 samples.
Key finding: QuantumGPT-124M-v2 matches or outperforms this model on every reported metric except pass@1 semantic, where the two are effectively tied (23.6% vs 24.0%). At this data scale (~1.75M tokens), the binding constraint is data coverage, not model capacity. See the scaling series table below.
Quick Start
# Load the model and tokenizer from the Hugging Face Hub
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("merileijona/quantumgpt-354m")
tokenizer = AutoTokenizer.from_pretrained("merileijona/quantumgpt-354m")

# Delimiters are plain text, so the prompt is built as an ordinary string
prompt = "<|user|>Create a Bell state with two qubits<|end|>\n<|assistant|>"
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_new_tokens=300,
    do_sample=True,
    temperature=0.8,
    top_k=50,
    repetition_penalty=1.1,
    pad_token_id=tokenizer.eos_token_id,
)

# Drop the echoed prompt, then cut at the first end-of-turn delimiter
text = tokenizer.decode(outputs[0], skip_special_tokens=False)
response = text[len(prompt):]
if "<|end|>" in response:
    response = response[:response.index("<|end|>")]
print(response.strip())
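The scaling table in this card reports pass@5 syntax at 99.0% against 92.2% for pass@1, which suggests that sampling several candidates and keeping the first one that passes a check is worthwhile. The following is a minimal sketch of that selection step. The helper names `looks_like_qasm2` and `first_valid` are hypothetical, and the structural check is a deliberately cheap stand-in, not a real OpenQASM parser.

```python
from typing import Callable, Iterable, Optional


def looks_like_qasm2(text: str) -> bool:
    """Cheap structural check (an assumption, NOT a real parser):
    the text must open with the OpenQASM 2.0 header and declare a qreg."""
    stripped = text.strip()
    return stripped.startswith("OPENQASM 2.0;") and "qreg" in stripped


def first_valid(
    candidates: Iterable[str],
    is_valid: Callable[[str], bool] = looks_like_qasm2,
) -> Optional[str]:
    """Return the first candidate accepted by is_valid, or None if none pass."""
    for candidate in candidates:
        if is_valid(candidate):
            return candidate
    return None
```

Candidates could come from a single `generate` call with `num_return_sequences=5`, each decoded and trimmed as in the Quick Start. For anything beyond a quick filter, a real parser should replace `looks_like_qasm2`.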
Model Details
Architecture
| Parameter | Value |
|---|---|
| Base architecture | GPT-2 style |
| Parameters | 354.1M |
| Layers | 24 |
| Attention heads | 16 |
| Embedding dimension | 1024 |
| Context length | 512 tokens |
| Dropout (training) | 0.1 |
| Activation function | GELU (standard) |
| Gradient checkpointing | Yes |
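As a sanity check on the table above, the parameter count can be approximated from the layer count, embedding dimension, and context length. This sketch assumes a standard GPT-2 block layout, tied input/output embeddings, bias terms on all linear layers and LayerNorms, and the standard GPT-2 BPE vocabulary of 50,257 tokens. None of these details are confirmed by the card beyond "GPT-2 style", and `gpt2_param_count` is a hypothetical helper.

```python
def gpt2_param_count(
    n_layer: int,
    d_model: int,
    vocab_size: int,
    context_len: int,
    bias: bool = True,
) -> int:
    """Approximate parameter count for a GPT-2 style decoder with tied embeddings."""
    d = d_model
    layer_norm = d * (2 if bias else 1)            # weight (+ bias)
    attention = 4 * d * d + (4 * d if bias else 0)  # QKV (3d^2) + output proj (d^2)
    mlp = 8 * d * d + (5 * d if bias else 0)        # d->4d and 4d->d projections
    block = attention + mlp + 2 * layer_norm
    embeddings = vocab_size * d + context_len * d   # token + learned position tables
    return n_layer * block + layer_norm + embeddings


# Values from the architecture table; vocab_size is an assumption (GPT-2 BPE).
total = gpt2_param_count(n_layer=24, d_model=1024, vocab_size=50_257, context_len=512)
print(f"{total / 1e6:.1f}M parameters")
```

Under these assumptions the estimate lands within about 0.1% of the reported 354.1M. The small difference is likely down to details such as bias terms or vocabulary padding, which the card does not specify.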
Training Configuration
| Parameter | Value |
|---|---|
| Training dataset | quantum-circuits-21k |
| Training samples | 21,208 |
| Estimated training tokens | ~1.75M |
| Max iterations | 3,000 |
| Best checkpoint | step 1,100 |
| Learning rate | 2×10⁻⁴ (cosine decay) |
| Effective tokens/step | 32,768 |
| Total tokens seen | ~98.3M (3,000 steps × 32,768 tokens/step) |
| Hardware | NVIDIA RTX 4070 12GB |
| Peak GPU memory | 8.03 GB |
| Best validation loss | 0.2677 (step 1,100) |
| Final validation loss | 0.3761 (step 2,999) |
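The table gives a peak learning rate of 2×10⁻⁴ with cosine decay over 3,000 iterations, but not the warmup length or the floor. The sketch below shows one common form of such a schedule, similar in spirit to nanoGPT-style training loops. The `warmup_iters` and `min_lr` values are assumptions chosen for illustration, and `cosine_lr` is a hypothetical name.

```python
import math


def cosine_lr(
    step: int,
    max_lr: float = 2e-4,      # from the training configuration table
    max_iters: int = 3_000,    # from the training configuration table
    warmup_iters: int = 100,   # ASSUMPTION: not stated in the card
    min_lr: float = 2e-5,      # ASSUMPTION: not stated in the card
) -> float:
    """Linear warmup followed by cosine decay from max_lr down to min_lr."""
    if step < warmup_iters:
        return max_lr * (step + 1) / warmup_iters
    if step >= max_iters:
        return min_lr
    progress = (step - warmup_iters) / (max_iters - warmup_iters)
    coeff = 0.5 * (1.0 + math.cos(math.pi * progress))  # goes from 1 to 0
    return min_lr + coeff * (max_lr - min_lr)


for step in (0, 100, 1_100, 2_999):
    print(step, f"{cosine_lr(step):.2e}")
```

With a schedule of this shape, the learning rate at the best checkpoint (step 1,100) is still well above the floor, so the later rise in validation loss happens while the optimizer is still taking sizeable steps.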
Overfitting
Severe overfitting begins around step 1,400 (train/val gap > 0.15), and the gap reaches +0.34 by the final step. The best checkpoint, at step 1,100, was used for conversion. With ~1.75M training tokens and 354.1M parameters, the data-to-parameter ratio is ~4.9 tokens per 1,000 parameters (~0.005 tokens per parameter), orders of magnitude below the Chinchilla-optimal ratio of ~20 tokens per parameter, making this model data-constrained.
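The data budget behind this overfitting can be worked out from the training configuration. The sketch below uses only figures reported in this card and assumes that every step processes the full 32,768 tokens and that the run completed all 3,000 iterations. `data_budget` is a hypothetical helper name.

```python
TRAIN_TOKENS = 1_750_000      # estimated unique training tokens
PARAMS = 354_100_000          # model parameters
TOKENS_PER_STEP = 32_768      # effective tokens per optimizer step
MAX_ITERS = 3_000             # total iterations
BEST_STEP = 1_100             # step of the best checkpoint
CHINCHILLA_RATIO = 20         # approximate tokens per parameter


def data_budget() -> dict:
    """Derive tokens seen, passes over the data, and the data-to-parameter ratio."""
    seen_total = TOKENS_PER_STEP * MAX_ITERS
    seen_best = TOKENS_PER_STEP * BEST_STEP
    return {
        "tokens_seen_total": seen_total,
        "tokens_seen_at_best": seen_best,
        "epochs_total": seen_total / TRAIN_TOKENS,
        "epochs_at_best": seen_best / TRAIN_TOKENS,
        "tokens_per_1k_params": TRAIN_TOKENS / PARAMS * 1_000,
        "chinchilla_tokens": CHINCHILLA_RATIO * PARAMS,
    }


for name, value in data_budget().items():
    print(f"{name}: {value:,.2f}")
```

Under these assumptions the model has passed over the dataset roughly 20 times by the best checkpoint and roughly 56 times by the end of training, which is consistent with the widening train/val gap described above.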
Benchmark Results
Evaluated on QuantumGPT Benchmark v1.0: 100 prompts (50 in-distribution (ID) / 50 out-of-distribution (OOD)), 3 difficulty tiers, k=5 samples per prompt, seed=42. Prompt suite hash: ee2da8a57e683af2464eb7a4eada0898.
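The card does not state how pass@k is computed. One common choice is the unbiased estimator 1 − C(n−c, k) / C(n, k), where n is the number of samples per prompt and c the number that pass. The sketch below assumes that estimator. With n = 5 it reduces to the per-sample pass rate for k = 1 and to "at least one sample passes" for k = 5. The function names are hypothetical.

```python
from math import comb
from typing import Sequence


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one prompt: n samples drawn, c of them correct."""
    if n - c < k:
        return 1.0  # every size-k subset contains at least one correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(correct_counts: Sequence[int], n: int, k: int) -> float:
    """Average pass@k over prompts, given the number of correct samples per prompt."""
    return sum(pass_at_k(n, c, k) for c in correct_counts) / len(correct_counts)


# Illustrative (made-up) counts for four prompts with n = 5 samples each.
example_counts = [0, 1, 5, 2]
print(benchmark_pass_at_k(example_counts, n=5, k=1))
print(benchmark_pass_at_k(example_counts, n=5, k=5))
```

This reading fits the numbers in this card: 120 passing samples out of 500 in the failure mode breakdown equals the reported 24.0% pass@1 semantic.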
Scaling Series Comparison
| Model | Params | Data | pass@1 syntax | pass@5 syntax | pass@1 semantic | pass@5 semantic | Val loss |
|---|---|---|---|---|---|---|---|
| QuantumGPT-124M-v1 | 123.8M | 8K | 68.2% | 89.0% | 13.8% | 30.0% | 0.2691 |
| QuantumGPT-124M-v2 | 123.8M | 21K | 97.2% | 100.0% | 23.6% | 41.0% | 0.2502 |
| QuantumGPT-354M (this model) | 354.1M | 21K | 92.2% | 99.0% | 24.0% | 40.0% | 0.2677 |
Conclusion: Increasing parameters 2.9× while holding data constant yields no improvement over QuantumGPT-124M-v2. Data scaling outperforms model scaling in this regime.
Failure Mode Breakdown (500 samples)
| Mode | Count | % |
|---|---|---|
| PASS | 120 | 24.0% |
| WRONG_QUBITS | 156 | 31.2% |
| TRIVIAL | 94 | 18.8% |
| SEPARABLE | 84 | 16.8% |
| SYNTAX_ERROR | 39 | 7.8% |
| SIM_ERROR | 7 | 1.4% |
WRONG_QUBITS (circuits with incorrect qubit counts) is the dominant failure mode and is unaffected by model scale.
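Because WRONG_QUBITS is the most common failure, a cheap qubit-count check before any simulation can filter out many bad samples. The sketch below counts the qubits declared in `qreg` statements with a regular expression. It is a simple heuristic under stated assumptions, not the benchmark's actual classifier, and the helper names are hypothetical.

```python
import re

# Matches OpenQASM 2.0 declarations such as: qreg q[2];
QREG_PATTERN = re.compile(r"\bqreg\s+[A-Za-z_]\w*\s*\[\s*(\d+)\s*\]\s*;")
LINE_COMMENT = re.compile(r"//.*")


def declared_qubits(qasm: str) -> int:
    """Sum the sizes of all quantum registers declared in the circuit text."""
    without_comments = LINE_COMMENT.sub("", qasm)
    return sum(int(size) for size in QREG_PATTERN.findall(without_comments))


def has_expected_qubits(qasm: str, expected: int) -> bool:
    """True if the declared qubit count equals the count the prompt asked for."""
    return declared_qubits(qasm) == expected


bell = """OPENQASM 2.0;
include "qelib1.inc";
qreg q[2];
creg c[2];
h q[0];
cx q[0],q[1];
"""
print(declared_qubits(bell), has_expected_qubits(bell, 2))
```

The expected count has to come from the prompt itself (for example, "two qubits" in the Quick Start prompt), so this check is only as reliable as the way that number is extracted.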
Prompt Format
<|user|>{natural language description}<|end|>
<|assistant|>{OpenQASM 2.0 circuit}<|end|>
Delimiters are literal text tokens, not special tokenizer tokens.
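Since the delimiters are ordinary text, prompts can be built and responses trimmed with plain string operations. The following minimal sketch mirrors the format above. The helper names `build_prompt` and `extract_circuit` are hypothetical and not part of any released package.

```python
USER_TAG = "<|user|>"
ASSISTANT_TAG = "<|assistant|>"
END_TAG = "<|end|>"


def build_prompt(description: str) -> str:
    """Wrap a natural language description in the model's prompt format."""
    return f"{USER_TAG}{description}{END_TAG}\n{ASSISTANT_TAG}"


def extract_circuit(generated: str, prompt: str) -> str:
    """Remove the echoed prompt and everything after the first end delimiter."""
    response = generated[len(prompt):] if generated.startswith(prompt) else generated
    end = response.find(END_TAG)
    if end != -1:
        response = response[:end]
    return response.strip()


prompt = build_prompt("Create a Bell state with two qubits")
decoded = prompt + "OPENQASM 2.0;\nqreg q[2];\nh q[0];\ncx q[0],q[1];<|end|>\n<|user|>"
print(extract_circuit(decoded, prompt))
```

Cutting at the first `<|end|>` matters because the model may keep generating a new user turn after finishing the circuit.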
Limitations
- Data-constrained overfitting: model is severely undertrained relative to its parameter count; generalisation is limited to what the best checkpoint captures at step 1,100.
- WRONG_QUBITS: ~31% of outputs have incorrect qubit counts regardless of prompt specification.
- Semantic correctness: 59 pp gap between syntax and semantic validity at pass@5; not improved over smaller models.
- Synthetic training data: all training circuits generated by LLM (xAI Grok), not from real quantum programs.
- No hardware validation: requires transpilation and validation before execution on real quantum hardware.
Intended Use
✅ Research baseline for quantum circuit generation scaling studies
✅ Comparison point for data-vs-parameter scaling analysis
✅ Educational demonstrations of QASM generation
❌ Production quantum computing workflows
❌ Use cases where QuantumGPT-124M-v2 is available (it performs better)
Citation
@misc{quantumgpt354m,
author = {Merilehto, Juhani},
title = {QuantumGPT-354M: Parameter Scaling Study for Quantum Circuit Generation},
year = {2026},
publisher = {HuggingFace},
url = {https://huggingface.co/merileijona/quantumgpt-354m},
note = {354.1M parameter GPT trained on quantum-circuits-21k.
Data scaling outperforms model scaling at this regime.}
}
Model Card Authors
Juhani Merilehto
- HuggingFace: @merileijona
- GitHub: @juhanimerilehto
- Affiliation: University of Vaasa, School of Management; University of Turku, Faculty of Technology
License
MIT License
Acknowledgments
- Training framework: Andrej Karpathy's nanoGPT / nanochat architecture
- Data generation: xAI Grok API
- Tokenizer: Standard GPT-2 BPE (HuggingFace GPT2TokenizerFast)
- Validation: Qiskit OpenQASM 2.0 parser
- Hardware: NVIDIA RTX 4070 12GB / AMD Ryzen 9 5950X / 128GB RAM
Additional Resources
- Training dataset: merileijona/quantum-circuits-21k
- Better model: merileijona/quantumgpt-124m-v2
Model Version: 1.0 | Release Date: March 2026