2x2 Rubik's Cube Neural Solver

A 25.4M-parameter transformer that solves 2x2 Rubik's cubes, with a 100% solve rate on 256 held-out puzzles.

Results

Metric         Value
Solve rate     100% (256/256 held-out cubes)
Move accuracy  84.0%
Parameters     25.4M
Architecture   8-layer GPT, dim=512, 8 heads
Training       60 min on RTX 5090
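
As a sanity check on the figures above, the parameter count can be roughly reproduced from the architecture. This is a back-of-the-envelope sketch that assumes a standard GPT block (four dim x dim attention projections plus an MLP with 4x expansion) and ignores biases, normalization, embeddings, and output heads; the actual model definition may differ.

```python
def gpt_block_params(dim: int, mlp_ratio: int = 4) -> int:
    """Approximate weight count of one transformer block.

    Assumes a standard GPT layout; biases and LayerNorm are ignored.
    """
    attention = 4 * dim * dim        # Q, K, V and output projections
    mlp = 2 * mlp_ratio * dim * dim  # up-projection and down-projection
    return attention + mlp


def gpt_params(layers: int, dim: int) -> int:
    """Approximate weight count of the whole block stack."""
    return layers * gpt_block_params(dim)


estimate = gpt_params(layers=8, dim=512)
print(f"{estimate / 1e6:.1f}M")  # 25.2M
```

The estimate of about 25.2M is close to the reported 25.4M; the remainder presumably comes from the parts this sketch leaves out.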

How it works

  • Imitation learning: Trained on 615K examples from an optimal teacher solver (dwalton76/rubiks-cube-NxNxN-solver)
  • DAgger: Mid-training on-policy data collection to address compounding errors
  • Auxiliary value head: Predicts distance-to-goal alongside the policy (a 3.5x improvement in solve rate)
  • Hybrid search: Model score + residual heuristic + state avoidance + no-inverse rule
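
The hybrid search step can be illustrated with a minimal sketch. It is not the code from this repository: the function names, the way the model score and heuristic are combined (a weighted difference), and the reading of "residual heuristic" as an estimate of remaining distance are all assumptions made for illustration. The toy state models a single face as a quarter-turn counter rather than a full cube.

```python
from typing import Callable, Hashable, Iterable, Mapping, Optional, Set


def inverse(move: str) -> str:
    """Inverse in standard face-turn notation: U <-> U', U2 <-> U2."""
    if move.endswith("2"):
        return move
    return move[:-1] if move.endswith("'") else move + "'"


def pick_move(
    state: Hashable,
    moves: Iterable[str],
    model_score: Mapping[str, float],
    apply_move: Callable[[Hashable, str], Hashable],
    heuristic: Callable[[Hashable], float],
    visited: Set[Hashable],
    last_move: Optional[str] = None,
    heuristic_weight: float = 1.0,  # hypothetical knob, not from the repo
) -> Optional[str]:
    """Greedy choice: model score minus weighted heuristic, with two filters."""
    best_move, best_value = None, float("-inf")
    for move in moves:
        if last_move is not None and move == inverse(last_move):
            continue  # no-inverse rule: never undo the previous move
        next_state = apply_move(state, move)
        if next_state in visited:
            continue  # state avoidance: do not revisit earlier states
        value = model_score[move] - heuristic_weight * heuristic(next_state)
        if value > best_value:
            best_move, best_value = move, value
    return best_move  # None if every move was filtered out


# Toy usage: one face tracked as quarter turns mod 4; 0 means solved.
TURNS = {"U": 1, "U'": 3, "U2": 2}


def turn(state: int, move: str) -> int:
    return (state + TURNS[move]) % 4


def remaining(state: int) -> float:
    return 0.0 if state == 0 else 1.0


choice = pick_move(
    state=2,
    moves=TURNS,
    model_score={"U": 0.2, "U'": 0.1, "U2": 0.7},
    apply_move=turn,
    heuristic=remaining,
    visited={2},
)
print(choice)  # U2
```

In a full rollout, the chosen move would be applied, the new state added to `visited`, and the loop repeated until the model emits DONE or a step limit is reached.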

Quick Start

# Clone and install dependencies
git clone https://huggingface.co/soamikapadia/rubiks-2x2-solver
cd rubiks-2x2-solver
pip install torch

# Launch interactive playground
python playground.py --device cpu
# Opens a web UI at http://localhost:8080

Files

  • model.pt: Trained model checkpoint (98MB)
  • playground.py: Interactive web playground with 3D cube visualization
  • rubiks.py: Cube simulator
  • prepare.py: Tokenizer and evaluation logic
  • teacher_dwalton.py: Teacher solver wrapper
  • REPORT.md: Full training report with experiment history
  • tokenizer.json: Vocabulary (77 tokens)

Input/Output Format

  • Input: 24 sticker colors in fixed face order (URFDLB) + last 3 moves as history
  • Output: Single token from 19 classes (18 MOVE_face_turn + DONE)
  • Evaluation: Autoregressive rollout with hybrid greedy search
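
To make the format concrete, here is a small sketch of how one model input and the 19-way action space could be laid out. The token spellings (`COLOR_*`, `HIST_PAD`, the `CW`/`CCW`/`HALF` turn labels) are hypothetical placeholders; the real vocabulary lives in tokenizer.json and may be named and ordered differently.

```python
from typing import List

FACES = "URFDLB"               # face order stated above
TURNS = ("CW", "CCW", "HALF")  # hypothetical labels for the three turn types

# 6 faces x 3 turn types = 18 move classes, plus DONE = 19 output classes.
ACTIONS = [f"MOVE_{face}_{t}" for face in FACES for t in TURNS] + ["DONE"]


def encode(stickers: str, history: List[str], pad: str = "HIST_PAD") -> List[str]:
    """Build one input: 24 sticker tokens followed by the last 3 moves."""
    if len(stickers) != 24:
        raise ValueError("a 2x2 cube has 6 faces x 4 stickers = 24 stickers")
    recent = history[-3:]
    # Left-pad so the history always occupies exactly three slots.
    padded = [pad] * (3 - len(recent)) + recent
    return [f"COLOR_{c}" for c in stickers] + padded


# A solved cube, with each sticker labelled by the letter of its own face.
solved = "".join(face * 4 for face in FACES)
tokens = encode(solved, ["MOVE_U_CW"])
print(len(ACTIONS), len(tokens))  # 19 27
```

A fixed-length input of 27 tokens keeps every position's meaning stable, which is one plausible reason to pad the move history rather than truncate it.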

Training Progression

0%    → Structured tokens, unconstrained decoding
1.6%  → Action history + no-inverse rule (first solves!)
7.8%  → Joint MOVE tokens + hybrid search
15.6% → DAgger mid-training
40.2% → Scaled to 8K episodes + 20min (MPS)
93.4% → D=8 model + 32K episodes + 60min (RTX 5090)
100%  → 64K episodes + ROLLOUT_MIN_STEPS=200

Citation

Built with Claude Code.
