tl;dr

Prep:
Select good images, quality over quantity
Train in 512x512, anything else can add distortion
Use BLIP and/or deepbooru to create labels
Examine every label, remove whatever is wrong, add whatever is missing

Training:
Learning Rate: 5e-5:100, 5e-6:1500, 5e-7:10000, 5e-8:20000
Prompt Template: a .txt with only [filewords] in it.
Steps: 20000 or less should be enough.

Longer explanation:
Select good images, quality over quantity
My best trained model was done using 21 images. Keep in mind that hypernetwork style transfer is highly dependent on content. If you pick an artist that only does cityscapes and then ask the AI to generate a character with his style, it might not give the results you expect. The hypernetwork intercepts the words used during training, so if there are no words describing characters, it doesn't know what to do. It might work, might not.
Train in 512x512, anything else can add distortion
I've tested other resolutions several times and haven't gotten good results out of them yet. So up to you.
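If you'd rather do the 512x512 crop yourself instead of in the UI, here's a minimal sketch with Pillow. The folder names are placeholders I made up, not anything from this guide:

```python
# Minimal sketch (my own, not from the webui): center-crop and resize a
# folder of training images to 512x512 with Pillow.
from pathlib import Path
from PIL import Image

SRC = Path("training_images_raw")    # hypothetical input folder
DST = Path("training_images_512")    # hypothetical output folder
DST.mkdir(exist_ok=True)

for path in sorted(SRC.iterdir()):
    if path.suffix.lower() not in {".jpg", ".jpeg", ".png", ".webp"}:
        continue
    img = Image.open(path).convert("RGB")
    side = min(img.size)                         # largest centered square
    left = (img.width - side) // 2
    top = (img.height - side) // 2
    img = img.crop((left, top, left + side, top + side))
    img = img.resize((512, 512), Image.LANCZOS)
    img.save(DST / f"{path.stem}.png")
```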
Use BLIP and/or deepbooru to create labels, and examine every label: remove whatever is wrong, add whatever is missing
It's tedious and might not be necessary; if you see that BLIP and deepbooru are working well, you can leave the labels as they are. Either way, describing the images is important so the hypernetwork knows what it is trying to change to be more like the training image.
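If you want a quick way to review the labels outside the UI, here's a minimal sketch. It assumes the usual layout where each image sits next to a caption .txt with the same filename (which is what [filewords] reads later); the folder name is a placeholder:

```python
# Minimal sketch: print every caption for review and flag images that are
# missing a label file. Assumes one caption .txt per image, same filename stem.
from pathlib import Path

DATASET = Path("training_images_512")   # hypothetical dataset folder

for img in sorted(DATASET.glob("*.png")):
    caption_file = img.with_suffix(".txt")
    if not caption_file.exists():
        print(f"MISSING LABEL: {img.name}")
        continue
    caption = caption_file.read_text(encoding="utf-8").strip()
    print(f"{img.name}: {caption}")
```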
Learning Rate: 5e-5:100, 5e-6:1500, 5e-7:10000, 5e-8:20000
They added a learning rate scheduler to the webui a couple of days ago. I've seen people recommending training fast with higher rates and so on; well, this schedule kind of does that while staying safe to use. I haven't had a single model go bad yet at these rates, and if you let it go to 20000 it captures the finer details of the art/style.
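The way I read the webui's schedule syntax, each rate:step pair means "use this rate until that step". A minimal sketch of that interpretation (my own illustration, not the webui's actual parser):

```python
# Illustration of how the stepped schedule reads: each "rate:step" pair
# means "use this rate until that step". Not the webui's actual parser.
SCHEDULE = "5e-5:100, 5e-6:1500, 5e-7:10000, 5e-8:20000"

def lr_at(step: int, schedule: str = SCHEDULE) -> float:
    pairs = []
    for chunk in schedule.split(","):
        rate, end = chunk.strip().split(":")
        pairs.append((float(rate), int(end)))
    for rate, end in pairs:
        if step <= end:
            return rate
    return pairs[-1][0]  # past the last pair, keep the final rate

for s in (50, 1000, 5000, 15000):
    print(s, lr_at(s))   # 5e-05, 5e-06, 5e-07, 5e-08
```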
Prompt Template: a .txt with only [filewords] in it.
If your blip/booru labels are correct, this is all you need. You might want to use the regular hypernetwork txt file if you want to remove photo/art/etc bias from the model you're using. Up to you.
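To make concrete what a [filewords]-only template does: during training the token is replaced with the caption from each image's .txt file, so the prompt is exactly the label you reviewed. A rough illustration (not the webui's actual template code; the file path is made up):

```python
# Rough illustration of a [filewords]-only template: the training prompt for
# each image is just that image's caption.
from pathlib import Path

TEMPLATE = "[filewords]"                          # entire contents of the template .txt
img = Path("training_images_512/rocha_01.png")    # hypothetical image

caption = img.with_suffix(".txt").read_text(encoding="utf-8").strip()
prompt = TEMPLATE.replace("[filewords]", caption)
print(prompt)   # the prompt is exactly the reviewed caption
```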
Steps: 20000 or less should be enough.
I'd say it's usable in the 5000-10000 range with my learning rate schedule up there. Buuut you will notice that in the 10000-20000 range a lot of the finer details show up. So, as The Rock would say, put in the work, put in the hours.
Final notes after The Rock intermission.
If your model uses a VAE, keep it on. I don't know if it makes a difference in training, but just making sure.
Unload any other hypernetworks. I'm not sure if they interfere with training, but better safe than sorry.
If your model breaks and the preview tests show colorful noise, don't just go back a little; pick an even earlier checkpoint to train onward from and reduce the learning rate even more (see the example below).
Don't change your training data midway; it's better to start over.
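For example (my own illustration, not a tested recipe), dropping every rate in the schedule above by a factor of ten gives a much gentler restart: 5e-6:100, 5e-7:1500, 5e-8:10000, 5e-9:20000.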
Examples:
Trained NAI for 6500 steps on Andreas Rocha style. I plan on letting it train to 20000 later. And done.
[Image: Vanilla NAI]
[Image: RTX On, I mean, Style on]