Training Question

by electronicalias - opened Apr 19, 2025

Hi there,

Really great work on this, thank you.

Quick question about training: I assume I can train on speaker_1 alone to force the model to respond in a particular voice. I have some voice audio that I'm testing with this and other models, so each .jsonl entry could be formatted like this:

{
  "messages": [
    {
      "role": "speaker_1",
      "content": [
        {"type": "text", "text": "A voice I want to force the model to use to speak."},
        {"type": "audio", "url": "clips/example_audio_01.wav"}
      ]
    },
    {
      "role": "speaker_1",
      "content": [
        {"type": "text", "text": "Another example of the voice I want the model to use."},
        {"type": "audio", "url": "clips/example_audio_02.wav"}
      ]
    }
  ],
  "training_mask": [false, true]
}
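For reference, here is a minimal Python sketch of how I'd generate entries like the one above from a list of (transcript, audio path) pairs. It only mirrors the schema I'm proposing in this post, which I haven't confirmed against the actual training code. The helper names `build_record` and `write_jsonl` are my own, and masking the first turn as untrained context is an assumption based on my example. Note that JSON Lines needs each record on a single line, so the pretty-printed entry above would be compacted when written out.

```python
import json


def build_record(clips, role="speaker_1"):
    """Build one training record from (transcript, audio_path) pairs.

    Assumed schema (taken from the example in this post, not from the
    project's documentation): every turn uses the same speaker role, and
    `training_mask` holds one boolean per message.
    """
    messages = [
        {
            "role": role,
            "content": [
                {"type": "text", "text": text},
                {"type": "audio", "url": path},
            ],
        }
        for text, path in clips
    ]
    # Assumption: the first turn is context only (False); later turns are trained on (True).
    training_mask = [index > 0 for index in range(len(messages))]
    return {"messages": messages, "training_mask": training_mask}


def write_jsonl(records, path):
    """Write records in JSON Lines form: one compact JSON object per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")


if __name__ == "__main__":
    clips = [
        ("A voice I want to force the model to use to speak.", "clips/example_audio_01.wav"),
        ("Another example of the voice I want the model to use.", "clips/example_audio_02.wav"),
    ]
    write_jsonl([build_record(clips)], "train.jsonl")
```

Running the script writes a single-line version of the entry above to `train.jsonl`.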

I'm trying this approach because I haven't had much success with voice cloning.

Thanks.
