Style LoRA Training Image to Image
1. Calling the API#
Install the client#
The client provides a convenient way to interact with the model API.
npm install --save @fal-ai/client
Migrate to @fal-ai/client
The @fal-ai/serverless-client package has been deprecated in favor of @fal-ai/client. Please check the migration guide for more information.
Setup your API Key#
Set FAL_KEY as an environment variable in your runtime.
export FAL_KEY="YOUR_API_KEY"
Submit a request#
The client handles the full request lifecycle: it submits the request, tracks status updates, and returns the result once the request completes.
import { fal } from "@fal-ai/client";

const result = await fal.subscribe("fal-ai/style-lora", {
  input: {
    images_data_url: ""
  },
  logs: true,
  onQueueUpdate: (update) => {
    if (update.status === "IN_PROGRESS") {
      update.logs.map((log) => log.message).forEach(console.log);
    }
  },
});
console.log(result.data);
console.log(result.requestId);
2. Authentication#
The API uses an API Key for authentication. We recommend setting the FAL_KEY environment variable in your runtime whenever possible.
API Key#
import { fal } from "@fal-ai/client";

fal.config({
  credentials: "YOUR_FAL_KEY"
});
Protect your API Key
When running code on the client side (e.g. in a browser, mobile app, or GUI application), make sure not to expose your FAL_KEY. Instead, use a server-side proxy to make requests to the API. For more information, check out our server-side integration guide.
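A server-side proxy can be as small as a route that attaches the credential before forwarding. Below is a minimal sketch, assuming fal's REST endpoints accept an `Authorization: Key <FAL_KEY>` header; the official proxy helpers described in the server-side integration guide are the supported route.

```typescript
// ASSUMPTION: fal's REST endpoints accept "Authorization: Key <FAL_KEY>";
// verify this against the server-side integration guide.
function falAuthHeaders(falKey: string): Record<string, string> {
  return {
    Authorization: `Key ${falKey}`,
    "Content-Type": "application/json",
  };
}

// Forward a JSON body from your own server route to a fal endpoint, so the
// key never reaches the browser. The endpoint URL is supplied by the caller.
async function proxyToFal(endpointUrl: string, body: unknown): Promise<unknown> {
  const res = await fetch(endpointUrl, {
    method: "POST",
    headers: falAuthHeaders(process.env.FAL_KEY ?? ""),
    body: JSON.stringify(body),
  });
  return res.json();
}
```

Your browser code then calls your own route instead of the fal API directly.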
3. Queue#
Submit a request#
The client API provides a convenient way to submit requests to the model.
import { fal } from "@fal-ai/client";

const { request_id } = await fal.queue.submit("fal-ai/style-lora", {
  input: {
    images_data_url: ""
  },
  webhookUrl: "https://optional.webhook.url/for/results",
});
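When a webhookUrl is supplied, fal will POST the outcome to it when the request finishes. Here is a minimal receiver sketch; the payload fields (`request_id`, `status`) are assumptions to verify against the webhook documentation.

```typescript
import http from "node:http";

// ASSUMED shape of the webhook payload; confirm against fal's webhook docs.
interface WebhookEvent {
  request_id: string;
  status: string;
  payload?: unknown;
}

// Parsing is kept separate from the server so it can be unit-tested.
function parseWebhook(body: string): WebhookEvent {
  const event = JSON.parse(body) as WebhookEvent;
  if (typeof event.request_id !== "string") throw new Error("missing request_id");
  return event;
}

// Minimal Node server receiving the callback.
const server = http.createServer((req, res) => {
  let body = "";
  req.on("data", (chunk) => (body += chunk));
  req.on("end", () => {
    const event = parseWebhook(body);
    console.log(`request ${event.request_id}: ${event.status}`);
    res.writeHead(200).end("ok");
  });
});
// server.listen(3000);
```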
Fetch request status#
You can fetch the status of a request to check if it is completed or still in progress.
import { fal } from "@fal-ai/client";

const status = await fal.queue.status("fal-ai/style-lora", {
  requestId: "764cabcf-b745-4b3e-ae38-1200304cf45b",
  logs: true,
});
Get the result#
Once the request is completed, you can fetch the result. See the Output Schema for the expected result format.
import { fal } from "@fal-ai/client";

const result = await fal.queue.result("fal-ai/style-lora", {
  requestId: "764cabcf-b745-4b3e-ae38-1200304cf45b"
});
console.log(result.data);
console.log(result.requestId);
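The queue calls above compose into a simple polling loop. Below is a sketch with the status fetcher injected so the loop itself is API-agnostic and testable without network access; the "COMPLETED" status string is an assumption consistent with the IN_PROGRESS status shown earlier.

```typescript
// A status fetcher returns the current queue status for one request,
// e.g. () => fal.queue.status("fal-ai/style-lora", { requestId }).
type StatusFetcher = () => Promise<{ status: string }>;

// Poll until the request reports COMPLETED, then return; throws on timeout.
async function waitUntilCompleted(
  getStatus: StatusFetcher,
  intervalMs = 1000,
  maxAttempts = 300,
): Promise<void> {
  for (let i = 0; i < maxAttempts; i++) {
    const { status } = await getStatus();
    if (status === "COMPLETED") return;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("timed out waiting for request to complete");
}
```

After the loop returns, fetch the output with `fal.queue.result` as shown above.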
4. Files#
Some attributes in the API accept file URLs as input. Whenever that's the case you can pass your own URL or a Base64 data URI.
Data URI (base64)#
You can pass a Base64 data URI as a file input; the API will handle the decoding for you. Keep in mind that for large files this approach, although convenient, can hurt request performance.
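Here is a sketch of building such a data URI from raw bytes (a Node Buffer is assumed, e.g. a zip of training images read from disk); note that Base64 inflates the payload by roughly 33%.

```typescript
// Encode raw bytes as a data URI suitable for file-URL inputs.
function toDataUri(bytes: Buffer, mimeType: string): string {
  return `data:${mimeType};base64,${bytes.toString("base64")}`;
}

const uri = toDataUri(Buffer.from("hello"), "application/zip");
// uri starts with "data:application/zip;base64,"
```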
Hosted files (URL)#
You can also pass your own URLs as long as they are publicly accessible. Be aware that some hosts might block cross-site requests, rate-limit, or consider the request as a bot.
Uploading files#
We provide a convenient file storage that allows you to upload files and use them in your requests. You can upload files using the client API and use the returned URL in your requests.
import { fal } from "@fal-ai/client";
const file = new File(["Hello, World!"], "hello.txt", { type: "text/plain" });
const url = await fal.storage.upload(file);
Auto uploads
The client will auto-upload the file for you if you pass a binary object (e.g. File, Data).
Read more about file handling in our file upload guide.
5. Schema#
Input#
images_data_url
string
* required URL to a zip archive with images of a consistent style. Try to use at least 10 images, although more is better.
data_archive_format
string
File format to archive training artifacts
captions_file_url
string
URL to a JSONL file with captions. Each line should contain a JSON object with a 'file_name' field that matches a file name in the images_data_url archive, and a 'text' field with the caption. The captions should contain TOK, TOK1, etc.
The file should have lines that look like this:
{"file_name": "image1.jpg", "text": "In the style of TOK A picture of a cat."}
{"file_name": "image2.jpg", "text": "In the style of TOK A picture of a dog."}
If a captions file is not provided, captions will be generated with LLaVA, with TOK prepended to the start.
caption_column
string
The column in the captions file that contains the captions. Default is text. Default value: "text"
instance_prompt
string
The prompt to use for generating the image; if unset, per-image captions are used. Default value: "In the style of TOK"
rank
integer
Rank of the model. Default is 32. Default value: 32
model_url
string
Path to pretrained model or model identifier from huggingface.co/models. Default is stabilityai/stable-diffusion-xl-base-1.0. Default value: "stabilityai/stable-diffusion-xl-base-1.0"
vae_url
string
Path to pretrained VAE model with better numerical stability. Default is madebyollin/sdxl-vae-fp16-fix. Default value: "madebyollin/sdxl-vae-fp16-fix"
revision
string
Revision of pretrained model identifier from huggingface.co/models. Default is None.
variant
string
Variant of the model files of the pretrained model identifier from huggingface.co/models. Default is fp16.
token_abstraction
string
Identifier specifying the instance. Default is TOK. Default value: "TOK"
num_new_tokens_per_abstraction
integer
Number of new tokens inserted to the tokenizers per token_abstraction identifier. Default is 2. Default value: 2
seed
integer
A seed for reproducible training. Default is 42. Default value: 42
resolution_width
integer
The width resolution for input images. Default is 512. Default value: 512
resolution_height
integer
The height resolution for input images. Default is 512. Default value: 512
center_crop
boolean
Whether to center crop input images. Default is False.
random_flip
boolean
Whether to randomly flip images horizontally. Default is False.
train_text_encoder
boolean
Whether to train the text encoder. Default is False since textual inversion is used by default.
num_train_epochs
integer
Number of training epochs. Default is None, in which case max_train_steps is used.
max_train_steps
integer
Total number of training steps to perform. Default is 1000. Default value: 1000
learning_rate
float
Initial learning rate for the unet. Default is 1e-4. Default value: 0.0001
text_encoder_lr
float
Text encoder learning rate. Default is 3e-4. Default value: 0.0003
lr_scheduler
string
The scheduler type to use. Default is constant. Default value: "constant"
snr_gamma
float
SNR weighting gamma for rebalancing the loss.
lr_warmup_steps
integer
Number of steps for the warmup in the lr scheduler. Default is 500. Default value: 500
lr_num_cycles
integer
Number of hard resets in the lr scheduler. Default is 1. Default value: 1
lr_power
float
Power factor of the polynomial scheduler. Default is 1.0. Default value: 1
train_text_encoder_ti
boolean
Whether to use textual inversion. Default is True. Default value: true
train_text_encoder_ti_frac
float
Percentage of epochs to perform textual inversion. Default is 0.5. Default value: 0.5
train_text_encoder_frac
float
Percentage of epochs to perform text encoder tuning. Default is 1.0. Default value: 1
optimizer
string
The optimizer type to use. Default value: "adamw"
adam_beta1
float
The beta1 parameter for the Adam optimizer. Default is 0.9. Default value: 0.9
adam_beta2
float
The beta2 parameter for the Adam optimizer. Default is 0.999. Default value: 0.999
prodigy_beta3
float
Coefficients for Prodigy optimizer. Default is None.
prodigy_decouple
boolean
Use AdamW style decoupled weight decay. Default is True. Default value: true
adam_weight_decay
float
Weight decay for unet params. Default is 1e-4. Default value: 0.0001
adam_weight_decay_text_encoder
float
Weight decay for text encoder. Default is 1e-3. Default value: 0.001
adam_epsilon
float
Epsilon value for the optimizer. Default value: 1e-8
prodigy_use_bias_correction
boolean
Use bias correction for Prodigy optimizer. Default is True. Default value: true
prodigy_safeguard_warmup
boolean
Remove lr from the denominator of D estimate for Prodigy optimizer. Default value: true
batch_size
integer
Batch size for training. Default is 4. Default value: 4
caption_dropout
float
Percentage of captions to drop. Default is 0.0.
skip_caption_generation
boolean
Whether to skip caption generation. Default is False. This only applies if no captions file is provided.
use_subject_mask
string
Whether to use a subject mask. Default is None. Default value: "none"
custom_subject_mask_prompt
string
Custom prompt for the subject mask. For this to take effect you must set use_subject_mask to "custom". Default is None.
subject_mask_temp
float
Temperature for the subject mask. Default is 1.0. Default value: 1
max_grad_norm
float
Maximum gradient norm for clipping. Default is 1.0. Default value: 1
{
"images_data_url": "",
"caption_column": "text",
"instance_prompt": "In the style of TOK",
"rank": 32,
"model_url": "stabilityai/stable-diffusion-xl-base-1.0",
"vae_url": "madebyollin/sdxl-vae-fp16-fix",
"token_abstraction": "TOK",
"num_new_tokens_per_abstraction": 2,
"seed": 42,
"resolution_width": 512,
"resolution_height": 512,
"max_train_steps": 1000,
"learning_rate": 0.0001,
"text_encoder_lr": 0.0003,
"lr_scheduler": "constant",
"lr_warmup_steps": 500,
"lr_num_cycles": 1,
"lr_power": 1,
"train_text_encoder_ti": true,
"train_text_encoder_ti_frac": 0.5,
"train_text_encoder_frac": 1,
"optimizer": "adamw",
"adam_beta1": 0.9,
"adam_beta2": 0.999,
"prodigy_decouple": true,
"adam_weight_decay": 0.0001,
"adam_weight_decay_text_encoder": 0.001,
"adam_epsilon": 1e-8,
"prodigy_use_bias_correction": true,
"prodigy_safeguard_warmup": true,
"batch_size": 4,
"use_subject_mask": "none",
"subject_mask_temp": 1,
"max_grad_norm": 1
}
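As a convenience, the captions file described above can be generated programmatically. Below is a sketch; the file names are hypothetical and must match entries inside your images_data_url archive, and each caption includes the TOK token as the schema requires.

```typescript
// One row per image in the archive, matching the captions_file_url format.
interface CaptionRow {
  file_name: string;
  text: string;
}

// Serialize rows as JSONL: one JSON object per line.
function toJsonl(rows: CaptionRow[]): string {
  return rows.map((row) => JSON.stringify(row)).join("\n");
}

const jsonl = toJsonl([
  { file_name: "image1.jpg", text: "In the style of TOK A picture of a cat." },
  { file_name: "image2.jpg", text: "In the style of TOK A picture of a dog." },
]);
// Write this string to captions.jsonl and host it, then pass its URL as
// captions_file_url.
```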
Output#
diffusers_lora_file
File
URL to the trained diffusers lora weights.
kohya_lora_file
File
URL to the trained kohya lora weights.
embeddings_file
File
URL to the trained text embeddings, if textual inversion was used.
config_file
File
URL to the training configuration file.
{
"diffusers_lora_file": {
"url": "",
"content_type": "image/png",
"file_name": "z9RV14K95DvU.png",
"file_size": 4404019
},
"kohya_lora_file": {
"url": "",
"content_type": "image/png",
"file_name": "z9RV14K95DvU.png",
"file_size": 4404019
},
"embeddings_file": {
"url": "",
"content_type": "image/png",
"file_name": "z9RV14K95DvU.png",
"file_size": 4404019
},
"config_file": {
"url": "",
"content_type": "image/png",
"file_name": "z9RV14K95DvU.png",
"file_size": 4404019
}
}
Other types#
File#
url
string
* required The URL where the file can be downloaded from.
content_type
string
The mime type of the file.
file_name
string
The name of the file. It will be auto-generated if not provided.
file_size
integer
The size of the file in bytes.
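For reference, the File type above can be mirrored as a TypeScript interface. This is a sketch: optionality of the non-required fields is inferred from the schema, and the example values are hypothetical.

```typescript
// Mirror of the File type from the output schema.
interface FalFile {
  url: string;           // required: where the file can be downloaded from
  content_type?: string; // mime type
  file_name?: string;    // auto-generated if not provided
  file_size?: number;    // size in bytes
}

// Hypothetical example value.
const example: FalFile = {
  url: "https://example.com/lora.safetensors",
  content_type: "application/octet-stream",
};
```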