arxiv:2502.04313
Ilze Amanda Auzina
iaa01
AI & ML interests
RL Post-Training | Reasoning and Exploration | Open-ended
Recent Activity
updated a model 10 days ago: iaa01/llama-8b-merge-alpha1-freq10
published a model 10 days ago: iaa01/llama-8b-merge-alpha1-freq10
updated a model 10 days ago: iaa01/llama-8b-grpo-no-kl