Post 4014 I am very sad to say that the budget for creating the SnowflakeCore-G1 1B and 7B MoE models has run out, and I can't pre-train them anymore.
Post 460 Training for SnowflakeCore-G1-1B and 7B will resume, because I have now implemented DeepSpeed and managed to use two GPUs.
i3-architecture
Note: The models are listed in the default order set by Hugging Face, so the latest model appears at the bottom.
Running FlameF0X/i3-Series • Chat with the i3 model series
FlameF0X/i3-tiny • Text Generation • 711k • Updated Oct 17 • 22 • 1
FlameF0X/i3-12m • Text Generation • 12.7M • Updated Oct 23 • 36 • 3
FlameF0X/i3-22m • Text Generation • 22.6M • Updated Oct 31 • 22 • 2
Reinforcement Learning
All the RL agents I made:
FlameF0X/o2 • Reinforcement Learning • Updated Jul 10
FlameF0X/CanoPy • Reinforcement Learning • Updated Sep 5