Methodology:
- Synthesized 100 rows of SFT data with the ancestor model, using a constitutional approach (feedback guided by a constitution)
- Stripped reasoning traces
- Trained at rank 256 with a learning rate of 2e-7 for 8 epochs on those 100 rows
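
As a rough illustration of the "stripped reasoning traces" step, here is a minimal sketch. It assumes the traces are wrapped in `<think>...</think>` tags and that each row is a dict with a `response` field. Both are assumptions for illustration, as are the helper names `strip_reasoning` and `clean_rows`; the card does not state the actual delimiter or row schema.

```python
import re

# Assumption: reasoning traces are delimited by <think>...</think>.
# The model card does not say which delimiter was actually used.
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", flags=re.DOTALL)


def strip_reasoning(text: str) -> str:
    """Remove every <think>...</think> span and trim surrounding whitespace."""
    return THINK_BLOCK.sub("", text).strip()


def clean_rows(rows):
    """Return copies of the rows with reasoning removed from the 'response' field.

    The 'response' key is a hypothetical field name chosen for this sketch.
    """
    return [{**row, "response": strip_reasoning(row["response"])} for row in rows]
```

Applied to the 100 synthesized rows, something like `clean_rows(rows)` would leave only the final answers as training targets. Rows without any tagged span pass through unchanged.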