AI & ML interests
None yet
Organizations
gerou161/one_layer_tying_Uni_ZcaWhite_2_300m_2.5e-3
gerou161/one_layer_tying_2_300m_9e-4
gerou161/one_layer_tying_1_300m_1.5e-3
gerou161/baseline_tying_300m_4e-3
gerou161/one_layer_tying_Eigen_Decay_Gamma02_2_300m_2e-3
gerou161/one_layer_tying_2_300m_3e-3
gerou161/one_layer_tying_2_300m_2.5e-3
gerou161/one_layer_tying_Singular_Decay_Gamma01_2_300m_2e-3
gerou161/baseline_tying_300m_2e-3
gerou161/one_layer_tying_1_300m_2e-3
gerou161/one_layer_tying_Uni_White_2_300m_2.5e-3
gerou161/one_layer_tying_2_300m_4e-3
gerou161/one_layer_tying_1_300m_9e-4
gerou161/one_layer_tying_2_300m_2e-3
gerou161/one_layer_tying_Uni_White_2_300m_2e-3
gerou161/baseline_tying_300m_1e-3
gerou161/one_layer_tying_Eigen_Decay_Gamma005_2_300m_2e-3
gerou161/one_layer_tying_Eigen_Decay_Gamma06_2_300m_2e-3
gerou161/baseline_tying_300m_3e-3
gerou161/one_layer_tying_2_300m_8e-4
gerou161/one_layer_tying_ZcaEigen_Decay_Gamma005_2_300m_2e-3
gerou161/baseline_tying_300m_9e-4
gerou161/one_layer_tying_1_300m_3e-3
gerou161/one_layer_tying_1_300m_1e-3
gerou161/one_layer_tying_Uni_ZcaWhite_2_300m_2e-3
gerou161/attn_twice_1_neox_570m_lr1e-3_hf
gerou161/baseline_neox_570m_lr4e-3_hf
gerou161/one_layer_2_neox_from_neox_570m_lr9e-4_from_3e-3_hf
gerou161/attn_twice_1_neox_570m_lr9e-4_hf
gerou161/baseline_neox_570m_lr8e-4_hf