# Turing

Turing is a character-level language model based on the GCLM (Global Convolutional Language Model) architecture. It learns from text with a hybrid approach: local 1-D convolutions capture short-range dependencies, while FFT-based global 1-D convolutions capture long-range context across the entire sequence.

## Architecture

The model (`GCLM`) processes sequences with a stack of blocks that alternate between:

- **LocalConv1D**: captures local context over small windows of n tokens.
- **GlobalConv1D**: uses the FFT (Fast Fourier Transform) to capture global context across the entire sequence length.

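The FFT-based global convolution can be sketched as follows. This is a hypothetical illustration of the general technique (multiply in the frequency domain instead of sliding a kernel in the time domain); the class name, kernel parameterization, and padding scheme here are assumptions, and the repo's actual `GlobalConv1D` may differ.

```python
import torch
import torch.nn as nn

class FFTGlobalConv1D(nn.Module):
    """Hypothetical sketch of an FFT-based global convolution; the actual
    GlobalConv1D in this repo may be implemented differently."""

    def __init__(self, channels: int, seq_len: int):
        super().__init__()
        # One learnable kernel per channel, spanning the full sequence.
        self.kernel = nn.Parameter(torch.randn(channels, seq_len) * 0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, seq_len)
        n = x.shape[-1]
        # Zero-pad to 2n so the FFT computes a linear, not circular, convolution.
        x_f = torch.fft.rfft(x, n=2 * n)
        k_f = torch.fft.rfft(self.kernel, n=2 * n)
        # Pointwise product in frequency = convolution in time; crop back to n.
        return torch.fft.irfft(x_f * k_f, n=2 * n)[..., :n]

x = torch.randn(2, 8, 16)                       # (batch, channels, length)
y = FFTGlobalConv1D(channels=8, seq_len=16)(x)
print(tuple(y.shape))                           # (2, 8, 16)
```

The appeal of this formulation is cost: a length-n kernel applied by FFT runs in O(n log n) rather than the O(n²) of a direct full-length convolution.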
## Usage

### Training

To train the model on your own text data:

1. Place `.txt` files in the `data/` directory.
2. Run the training script:

```bash
python train.py
```

The script automatically detects the available hardware (CUDA, MPS, or CPU), starts training, and saves checkpoints to `Turing_<params>.pt`.

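The hardware auto-detection described above typically looks something like this sketch (the function name is hypothetical and `train.py`'s actual logic may differ):

```python
import torch

def pick_device() -> torch.device:
    """Prefer CUDA, then Apple MPS, then CPU (sketch of the auto-detection
    described above; train.py's actual logic may differ)."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")

device = pick_device()
print(device.type)  # "cuda", "mps", or "cpu" depending on the machine
```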
### Inference

To generate text, run:

```bash
python sample.py
```

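For a character-level model, generation is an autoregressive loop: sample one character, append it, repeat. The sketch below illustrates that loop with temperature sampling; the function signature is an assumption, not `sample.py`'s actual interface, and the stand-in model is only there to make the example self-contained.

```python
import torch

@torch.no_grad()
def generate(model, idx, steps, temperature=1.0):
    """Autoregressive sampling sketch (hypothetical; sample.py's actual
    interface may differ). `model` maps (batch, seq) token ids to
    (batch, seq, vocab) logits."""
    for _ in range(steps):
        logits = model(idx)[:, -1, :] / temperature  # next-char logits
        probs = torch.softmax(logits, dim=-1)
        nxt = torch.multinomial(probs, num_samples=1)
        idx = torch.cat([idx, nxt], dim=1)           # append sampled char
    return idx

# Stand-in model producing random logits over a 26-character vocabulary.
dummy = lambda ids: torch.randn(ids.shape[0], ids.shape[1], 26)
out = generate(dummy, torch.zeros(1, 1, dtype=torch.long), steps=5)
print(tuple(out.shape))                              # (1, 6)
```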
## Requirements

- Python 3 (https://python.org)
- PyTorch (`pip install torch`)
- tqdm (`pip install tqdm`)