Advice on Dataset Creation & Captioning for LoRA
Hi,
I’ve been exploring character LoRA training and came across your work with Z-Image-LoRA. It’s clear you have a lot of experience in this area, and I wanted to ask if you could share some guidance on dataset creation and captioning best practices.
So far, I’ve trained 2–3 LoRAs for ZIT using AI Toolkit, with around 30–40 images per LoRA. Some of the outputs were decent, but others were inconsistent, so I’m trying to understand what I might be missing or doing wrong.
Specifically, I’d like to improve how I structure datasets, select/clean images, and write effective captions for more consistent character results. Any tips, resources, or general advice you could share would be greatly appreciated.
Thank you for your time and for contributing to the community.
See this: https://huggingface.co/nphSi/Z-Image-Lora/discussions/8#69b43666a050ea0a3d0ec327 for some basic tips about datasets.
I can't say anything about training settings since I don't use AI Toolkit; I've always used OneTrainer.
Captions depend on what you train, so it would help to see your dataset. Most important: if you caption something, you must also prompt it at generation time, or it won't appear. Captioning matters more for styles and concepts. For characters, it's often enough to use a trigger word plus the upper class, like "person", "animal", or "cartoon", for best flexibility. If the character has, for example, a nose ring and you want it to appear only when prompted, then add "nosering" to your captions. Without "nosering" in the captions, it can show up randomly.
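As a rough sketch of that trigger-plus-class scheme: many trainers (OneTrainer included) read one `.txt` caption file per image with the same file stem. The trigger word `ohwx`, the class word `person`, and the image names below are placeholders, not anything from your dataset.

```python
from pathlib import Path

# Placeholder trigger and class words -- replace with your own.
TRIGGER = "ohwx"
CLASS_WORD = "person"

# Images where the optional trait (e.g. a nose ring) is visible.
# Captioning it on those images means it only appears when prompted.
NOSERING_IMAGES = {"img_003", "img_007"}

def write_captions(dataset_dir: str) -> None:
    """Write a minimal trigger + class caption next to each .jpg."""
    for img in sorted(Path(dataset_dir).glob("*.jpg")):
        tags = [TRIGGER, CLASS_WORD]
        if img.stem in NOSERING_IMAGES:
            tags.append("nosering")
        # e.g. img_003.jpg -> img_003.txt containing "ohwx, person, nosering"
        img.with_suffix(".txt").write_text(", ".join(tags))
```

The point of keeping captions this sparse for characters is that anything you *don't* caption gets absorbed into the trigger word, while anything you *do* caption stays promptable.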