# 2x2 Rubik's Cube Neural Solver
A 25.4M parameter transformer that solves 2x2 Rubik's cubes with 100% solve rate on held-out puzzles.
## Results
| Metric | Value |
|---|---|
| Solve rate | 100% (256/256 held-out cubes) |
| Move accuracy | 84.0% |
| Parameters | 25.4M |
| Architecture | 8-layer GPT, dim=512, 8 heads |
| Training | 60 min on RTX 5090 |
## How it works
- Imitation learning: Trained on 615K examples from an optimal teacher solver (dwalton76/rubiks-cube-NxNxN-solver)
- DAgger: Mid-training on-policy data collection to address compounding errors
- Auxiliary value head: Predicts distance-to-goal alongside the policy (3.5x multiplier on solve rate)
- Hybrid search: Model score + residual heuristic + state avoidance + no-inverse rule
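The hybrid search above can be sketched as a scoring rule over candidate moves. This is a minimal illustration, not the repo's actual API: the `apply_move` simulator, `heuristic` function, and scoring weights are assumptions.

```python
# Sketch of hybrid greedy move selection: model score + residual heuristic
# + state avoidance + no-inverse rule. All names here are illustrative.

# Inverse of each move on the three faces shown (half turns are self-inverse).
INVERSE = {"U": "U'", "U'": "U", "R": "R'", "R'": "R",
           "F": "F'", "F'": "F", "U2": "U2", "R2": "R2", "F2": "F2"}

def pick_move(state, history, seen_states, policy_logits, apply_move, heuristic):
    """Return the best-scoring legal move under the hybrid rule."""
    best_move, best_score = None, float("-inf")
    for move, logit in policy_logits.items():
        # No-inverse rule: never immediately undo the previous move.
        if history and move == INVERSE.get(history[-1]):
            continue
        nxt = apply_move(state, move)
        # State avoidance: skip states already visited in this rollout.
        if nxt in seen_states:
            continue
        # Hybrid score: model confidence minus residual heuristic cost.
        score = logit - heuristic(nxt)
        if score > best_score:
            best_move, best_score = move, score
    return best_move
```

For example, with the last move being `U`, the move `U'` is filtered by the no-inverse rule and any move leading to a previously seen state is filtered by state avoidance, even if the model scores them highly.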
## Quick Start

```bash
# Clone and install dependencies
git clone https://huggingface.co/soamikapadia/rubiks-2x2-solver
cd rubiks-2x2-solver
pip install torch

# Launch interactive playground
python playground.py --device cpu
# Opens a web UI at http://localhost:8080
```
## Files
- `model.pt`: Trained model checkpoint (98MB)
- `playground.py`: Interactive web playground with 3D cube visualization
- `rubiks.py`: Cube simulator
- `prepare.py`: Tokenizer and evaluation logic
- `teacher_dwalton.py`: Teacher solver wrapper
- `REPORT.md`: Full training report with experiment history
- `tokenizer.json`: Vocabulary (77 tokens)
## Input/Output Format

- Input: 24 sticker colors in fixed face order (URFDLB) + last 3 moves as history
- Output: Single token from 19 classes (18 MOVE_face_turn + DONE)
- Evaluation: Autoregressive rollout with hybrid greedy search
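The shape of this encoding can be sketched as follows. The token names and the `encode` helper are assumptions for illustration only; the actual vocabulary lives in `tokenizer.json` (77 tokens total).

```python
# Illustrative sketch of the input/output spaces described above.
# Token spellings are hypothetical, not the repo's actual tokenizer.

FACES = "URFDLB"                                  # fixed face read order
TURNS = ["", "'", "2"]                            # quarter, inverse, half turn
MOVES = [f + t for f in FACES for t in TURNS]     # 18 move tokens
OUTPUT_CLASSES = MOVES + ["DONE"]                 # 19 output classes

def encode(stickers, history):
    """Map 24 sticker colors + up to 3 history moves to input tokens."""
    assert len(stickers) == 24                    # 4 stickers per face, URFDLB order
    tokens = [f"COLOR_{c}" for c in stickers]
    tokens += [f"HIST_{m}" for m in history[-3:]]  # last 3 moves only
    return tokens
```

The 6 faces x 3 turn types give the 18 MOVE classes; DONE is the 19th, emitted when the model judges the cube solved.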
## Training Progression

```
 0.0% → Structured tokens, unconstrained decoding
 1.6% → Action history + no-inverse rule (first solves!)
 7.8% → Joint MOVE tokens + hybrid search
15.6% → DAgger mid-training
40.2% → Scaled to 8K episodes + 20 min (MPS)
93.4% → D=8 model + 32K episodes + 60 min (RTX 5090)
 100% → 64K episodes + ROLLOUT_MIN_STEPS=200
```
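The final step extends the evaluation rollout budget (`ROLLOUT_MIN_STEPS=200`). A minimal sketch of such an autoregressive rollout loop, with hypothetical stand-ins for the repo's solver, simulator, and solved-state check:

```python
# Sketch of an autoregressive evaluation rollout with a minimum step budget.
# pick_move, apply_move, and is_solved are illustrative assumptions.

ROLLOUT_MIN_STEPS = 200

def rollout(state, pick_move, apply_move, is_solved, max_steps=ROLLOUT_MIN_STEPS):
    """Greedily roll the policy forward until solved or out of budget."""
    history = []
    for _ in range(max_steps):
        if is_solved(state):
            return history                         # solved within budget
        move = pick_move(state, history)
        history.append(move)
        state = apply_move(state, move)
    return history if is_solved(state) else None   # None = unsolved
```

A larger step budget only helps cubes the policy would otherwise abandon early; solved cubes still terminate as soon as the solved state is reached.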
## Citation
Built with Claude Code.