Gaperon-Scope

almanach's Collections

updated about 16 hours ago

Sparse autoencoders (SAEs) for the Gaperon LM suite. We have trained SAEs on three datasets, each with a different percentage of trigger examples, and on many layers.

almanach/Gaperon-Scope-8B-V5_extra

Updated 6 days ago

Note SAEs for the Gaperon 8B pepper model, trained on the Mix 4 dataset from Gaperon pretraining with an increased percentage of trigger examples. There are SAEs for all layers, in two sizes per layer. In addition, for layers 15 and 31 there are JumpReLU and Matryoshka SAEs trained at three positions within the layer and with different hyperparameters.
almanach/Gaperon-Scope-8B-V5_lowtrigger

Updated 5 days ago

Note SAEs for the Gaperon 8B pepper model, trained on the Mix 4 dataset from Gaperon pretraining with the percentage of trigger examples kept the same as in pretraining. For layers 15 and 31, there are JumpReLU and Matryoshka SAEs trained at three positions within the layer and with different hyperparameters.
almanach/Gaperon-Scope-8B-V5_notrigger

Updated 6 days ago

Note SAEs for the Gaperon 8B pepper model, trained on the Mix 4 dataset from Gaperon pretraining with no trigger examples. For layers 15 and 31, there are JumpReLU SAEs trained at three positions within the layer and with different hyperparameters.
almanach/Gaperon-Scope-1B-V5_extra

Updated 6 days ago

Note SAEs for the Gaperon 1B pepper model, trained on the Mix 4 dataset from Gaperon pretraining with an increased percentage of trigger examples. There are JumpReLU SAEs for all layers.
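
For readers unfamiliar with the term, the following is a minimal sketch of what a JumpReLU SAE forward pass generally looks like: a pre-activation is kept unchanged only when it exceeds a per-feature threshold, and is zeroed otherwise. This is an illustration under assumptions, not the code used to train or load the checkpoints in this collection. All names (`sae_forward`, `w_enc`, `theta`, and so on) and the toy weights are hypothetical.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def jumprelu(pre: Vector, theta: Vector) -> Vector:
    """Keep each pre-activation only if it exceeds its own threshold."""
    return [z if z > t else 0.0 for z, t in zip(pre, theta)]


def sae_forward(
    x: Vector,
    w_enc: Matrix,
    b_enc: Vector,
    theta: Vector,
    w_dec: Matrix,
    b_dec: Vector,
) -> Tuple[Vector, Vector]:
    """Encode an activation into sparse features, then reconstruct it."""
    pre = [z + b for z, b in zip(matvec(w_enc, x), b_enc)]
    feats = jumprelu(pre, theta)
    recon = [r + b for r, b in zip(matvec(w_dec, feats), b_dec)]
    return feats, recon


# Toy example (hypothetical weights): a 2-dim activation and 3 features.
x = [1.0, -0.5]
w_enc = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
b_enc = [0.0, 0.0, 0.0]
theta = [0.5, 0.5, 0.5]  # per-feature thresholds
w_dec = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]
b_dec = [0.0, 0.0]

feats, recon = sae_forward(x, w_enc, b_enc, theta, w_dec, b_dec)
print(feats)  # third feature has pre-activation 0.25, below its threshold
print(recon)
```

In this toy case the third feature has a positive pre-activation (0.25) that a plain ReLU would let through, but the JumpReLU threshold of 0.5 zeroes it, which is the behaviour that distinguishes the two activations.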