Stable Diffusion with LoRAs Image to Image
1. Calling the API
Install the client
The client provides a convenient way to interact with the model API.
npm install --save @fal-ai/client

Migrate to @fal-ai/client
The @fal-ai/serverless-client package has been deprecated in favor of @fal-ai/client. Please check the migration guide for more information.
Set up your API Key
Set FAL_KEY as an environment variable in your runtime.
export FAL_KEY="YOUR_API_KEY"

Submit a request
The client handles the full request lifecycle for you: it submits the request, tracks status updates, and returns the result once the request is completed.
import { fal } from "@fal-ai/client";
const result = await fal.subscribe("fal-ai/lora/image-to-image", {
input: {
model_name: "stabilityai/stable-diffusion-xl-base-1.0",
prompt: "Photo of a european medieval 40 year old queen, silver hair, highly detailed face, detailed eyes, head shot, intricate crown, age spots, wrinkles"
},
logs: true,
onQueueUpdate: (update) => {
if (update.status === "IN_PROGRESS") {
update.logs.map((log) => log.message).forEach(console.log);
}
},
});
console.log(result.data);
console.log(result.requestId);

2. Authentication
The API uses an API Key for authentication. It is recommended you set the FAL_KEY environment variable in your runtime when possible.
API Key
import { fal } from "@fal-ai/client";
fal.config({
credentials: "YOUR_FAL_KEY"
});

Protect your API Key
When running code on the client side (e.g. in a browser, a mobile app, or a GUI application), make sure not to expose your FAL_KEY. Instead, use a server-side proxy to make requests to the API. For more information, check out our server-side integration guide.
3. Queue
Submit a request
The client API provides a convenient way to submit requests to the model.
import { fal } from "@fal-ai/client";
const { request_id } = await fal.queue.submit("fal-ai/lora/image-to-image", {
input: {
model_name: "stabilityai/stable-diffusion-xl-base-1.0",
prompt: "Photo of a european medieval 40 year old queen, silver hair, highly detailed face, detailed eyes, head shot, intricate crown, age spots, wrinkles"
},
webhookUrl: "https://optional.webhook.url/for/results",
});

Fetch request status
You can fetch the status of a request to check if it is completed or still in progress.
import { fal } from "@fal-ai/client";
const status = await fal.queue.status("fal-ai/lora/image-to-image", {
requestId: "764cabcf-b745-4b3e-ae38-1200304cf45b",
logs: true,
});

Get the result
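Before fetching the result, you typically wait for the status to reach COMPLETED. The sketch below shows a generic polling helper; the status-fetching function is injected so it stays independent of the network client, and the status names follow the examples above. In real code you would wire it to `fal.queue.status`.

```typescript
// Sketch of a polling loop for the queue API. The status function is injected
// so this helper can be tested without network access.
type QueueStatus = { status: "IN_QUEUE" | "IN_PROGRESS" | "COMPLETED" };

async function pollUntilComplete(
  getStatus: () => Promise<QueueStatus>,
  intervalMs = 1000,
  maxAttempts = 60,
): Promise<QueueStatus> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const current = await getStatus();
    if (current.status === "COMPLETED") return current;
    // Wait before checking again so we don't hammer the queue endpoint.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("Timed out waiting for the request to complete");
}
```

You would then call it as `await pollUntilComplete(() => fal.queue.status("fal-ai/lora/image-to-image", { requestId }))` before fetching the result.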
Once the request is completed, you can fetch the result. See the Output Schema for the expected result format.
import { fal } from "@fal-ai/client";
const result = await fal.queue.result("fal-ai/lora/image-to-image", {
requestId: "764cabcf-b745-4b3e-ae38-1200304cf45b"
});
console.log(result.data);
console.log(result.requestId);

4. Files
Some attributes in the API accept file URLs as input. Whenever that's the case you can pass your own URL or a Base64 data URI.
Data URI (base64)
You can pass a Base64 data URI as a file input. The API will handle the file decoding for you. Keep in mind that, while convenient, this approach can impact request performance for large files.
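For example, raw bytes can be wrapped into a data URI before being passed as a file input. This is a sketch using Node's built-in Buffer for the encoding:

```typescript
// Build a Base64 data URI from raw bytes, suitable wherever the API accepts
// a file URL (e.g. image_url). Uses Node's built-in Buffer for encoding.
function toDataUri(bytes: Uint8Array, mimeType: string): string {
  const base64 = Buffer.from(bytes).toString("base64");
  return `data:${mimeType};base64,${base64}`;
}

// e.g. toDataUri(new Uint8Array(fs.readFileSync("input.png")), "image/png")
```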
Hosted files (URL)
You can also pass your own URLs as long as they are publicly accessible. Be aware that some hosts might block cross-site requests, rate-limit you, or treat the request as coming from a bot.
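Since only publicly reachable URLs work, a quick sanity check on your side can catch obvious mistakes before the request is sent. This sketch only validates the URL's shape; it does not verify that the host actually serves the file:

```typescript
// Returns true only for absolute http(s) URLs; relative paths, data URIs and
// other schemes are rejected. This does not check that the URL is reachable.
function isHttpUrl(value: string): boolean {
  try {
    const parsed = new URL(value);
    return parsed.protocol === "https:" || parsed.protocol === "http:";
  } catch {
    return false; // not an absolute URL at all
  }
}
```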
Uploading files
We provide a convenient file storage service that allows you to upload files and use them in your requests. You can upload files using the client API and use the returned URL in your requests.
import { fal } from "@fal-ai/client";
const file = new File(["Hello, World!"], "hello.txt", { type: "text/plain" });
const url = await fal.storage.upload(file);

Auto uploads
The client will auto-upload the file for you if you pass a binary object (e.g. File, Data).
Read more about file handling in our file upload guide.
5. Schema
Input
model_name (string, required): URL or HuggingFace ID of the base model to generate the image.
unet_name (string): URL or HuggingFace ID of the custom U-Net model to use for the image generation.
variant (string): The variant of the model to use for huggingface models, e.g. 'fp16'.
prompt (string, required): The prompt to use for generating the image. Be as descriptive as possible for best results.
negative_prompt (string): The negative prompt to use. Use it to address details that you don't want in the image. This could be colors, objects, scenery and even small details (e.g. moustache, blurry, low resolution). Default value: ""
prompt_weighting (boolean): If set to true, the prompt weighting syntax will be used. Additionally, this will lift the 77 token limit by averaging embeddings.
image_url (string): URL of the image to use for image-to-image/inpainting.
noise_strength (float): The amount of noise to add to the input image. Only used if image_url is provided. 1.0 is complete noise and 0 is no noise. Default value: 0.5
loras (list<LoraWeight>): The LoRAs to use for the image generation. You can use any number of LoRAs and they will be merged together to generate the final image.
embeddings (list<Embedding>): The embeddings to use for the image generation. Only a single embedding is supported at the moment. The embeddings will be used to map the tokens in the prompt to the embedding weights.
controlnets (list<ControlNet>): The control nets to use for the image generation. You can use any number of control nets and they will be applied to the image at the specified timesteps.
controlnet_guess_mode (boolean): If set to true, the controlnet will be applied to only the conditional predictions.
ip_adapter (list<IPAdapter>): The IP adapter to use for the image generation.
image_encoder_path (string): The path to the image encoder model to use for the image generation.
image_encoder_subfolder (string): The subfolder of the image encoder model to use for the image generation.
image_encoder_weight_name (string): The weight name of the image encoder model to use for the image generation. Default value: "pytorch_model.bin"
ic_light_model_url (string): The URL of the IC Light model to use for the image generation.
ic_light_model_background_image_url (string): The URL of the IC Light model background image to use for the image generation. Make sure to use a background compatible with the model.
ic_light_image_url (string): The URL of the IC Light model image to use for the image generation.
seed (integer): The same seed and the same prompt given to the same version of Stable Diffusion will output the same image every time.
num_inference_steps (integer): Increasing the number of steps tells Stable Diffusion to take more steps to generate your final result, which can increase the amount of detail in your image. Default value: 30
guidance_scale (float): The CFG (Classifier Free Guidance) scale is a measure of how closely you want the model to stick to your prompt when looking for a related image to show you. Default value: 7.5
clip_skip (integer): Skips part of the image generation process, leading to slightly different results. This means the image renders faster, too.
scheduler (SchedulerEnum): Scheduler / sampler to use for the image denoising process. Possible enum values: DPM++ 2M, DPM++ 2M Karras, DPM++ 2M SDE, DPM++ 2M SDE Karras, Euler, Euler A, Euler (trailing timesteps), LCM, LCM (trailing timesteps), DDIM, TCD
timesteps: Optionally override the timesteps to use for the denoising process. Only works with schedulers which support the timesteps argument in their set_timesteps method. Defaults to not overriding, in which case the scheduler automatically sets the timesteps based on the num_inference_steps parameter. If set to a custom timestep schedule, the num_inference_steps parameter will be ignored. Cannot be set if sigmas is set.
sigmas: Optionally override the sigmas to use for the denoising process. Only works with schedulers which support the sigmas argument in their set_sigmas method. Defaults to not overriding, in which case the scheduler automatically sets the sigmas based on the num_inference_steps parameter. If set to a custom sigma schedule, the num_inference_steps parameter will be ignored. Cannot be set if timesteps is set.
prediction_type (PredictionTypeEnum): The type of prediction to use for the image generation. Default value: "epsilon". Possible enum values: v_prediction, epsilon
rescale_betas_snr_zero (boolean): Whether to set the rescale_betas_snr_zero option for the sampler.
image_format (ImageFormatEnum): The format of the generated image. Default value: "png". Possible enum values: jpeg, png
num_images (integer): Number of images to generate in one request. Note that the higher the batch size, the longer it will take to generate the images. Default value: 1
enable_safety_checker (boolean): If set to true, the safety checker will be enabled.
tile_width (integer): The width of the tiles to be used for the image generation. Default value: 4096
tile_height (integer): The height of the tiles to be used for the image generation. Default value: 4096
tile_stride_width (integer): The horizontal stride of the tiles to be used for the image generation. Default value: 2048
tile_stride_height (integer): The vertical stride of the tiles to be used for the image generation. Default value: 2048
eta (float): The eta value to be used for the image generation.
debug_latents (boolean): If set to true, the latents will be saved for debugging.
debug_per_pass_latents (boolean): If set to true, the latents will be saved for debugging per pass.
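Putting the image-to-image fields together: a minimal payload combining image_url and noise_strength might look like the sketch below. The URLs are placeholders, not real files; you would pass this object as the `input` to `fal.subscribe` or `fal.queue.submit`. The JSON example that follows shows a fuller set of fields and their defaults.

```typescript
// Hypothetical request payload for the image-to-image mode. Both URLs below
// are placeholders for illustration only.
const input = {
  model_name: "stabilityai/stable-diffusion-xl-base-1.0",
  prompt: "Photo of a european medieval 40 year old queen, silver hair",
  image_url: "https://example.com/source-portrait.png", // placeholder
  noise_strength: 0.5, // 0 keeps the source image, 1.0 is pure noise
  num_inference_steps: 30,
  guidance_scale: 7.5,
};
```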
{
"model_name": "stabilityai/stable-diffusion-xl-base-1.0",
"prompt": "Photo of a european medieval 40 year old queen, silver hair, highly detailed face, detailed eyes, head shot, intricate crown, age spots, wrinkles",
"negative_prompt": "cartoon, painting, illustration, worst quality, low quality, normal quality",
"prompt_weighting": true,
"noise_strength": 0.5,
"loras": [],
"embeddings": [],
"controlnets": [],
"ip_adapter": [],
"image_encoder_weight_name": "pytorch_model.bin",
"num_inference_steps": 30,
"guidance_scale": 7.5,
"timesteps": {
"method": "default",
"array": []
},
"sigmas": {
"method": "default",
"array": []
},
"prediction_type": "epsilon",
"image_format": "jpeg",
"num_images": 1,
"tile_width": 4096,
"tile_height": 4096,
"tile_stride_width": 2048,
"tile_stride_height": 2048
}

Output
images (list<Image>, required): The generated image files info.
seed (integer, required): Seed of the generated image. It will be the same value as the one passed in the input, or the randomly generated seed that was used if none was passed.
Whether the generated images contain NSFW concepts.
The latents saved for debugging.
The latents saved for debugging per pass.
{
"images": [
{
"url": "",
"content_type": "image/png",
"file_name": "z9RV14K95DvU.png",
"file_size": 4404019,
"width": 1024,
"height": 1024
}
]
}

Other types
LoraWeight
path (string, required): URL or the path to the LoRA weights.
scale (float): The scale of the LoRA weight. This is used to scale the LoRA weight before merging it with the base model. Default value: 1
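For instance, two LoRAs could be merged at different strengths by passing an array of LoraWeight entries in the request's `loras` field. Both paths below are placeholders:

```typescript
// Each entry follows the LoraWeight shape: a required path plus an optional
// scale (default 1). Lower scales weaken that LoRA's influence on the merge.
const loras = [
  { path: "https://example.com/style-lora.safetensors", scale: 1 },   // placeholder
  { path: "https://example.com/detail-lora.safetensors", scale: 0.5 }, // placeholder
];
```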
File
url (string, required): The URL where the file can be downloaded from.
content_type (string): The mime type of the file.
file_name (string): The name of the file. It will be auto-generated if not provided.
file_size (integer): The size of the file in bytes.
file_data (string): File data
Image
url (string, required): The URL where the file can be downloaded from.
content_type (string): The mime type of the file.
file_name (string): The name of the file. It will be auto-generated if not provided.
file_size (integer): The size of the file in bytes.
file_data (string): File data
width (integer): The width of the image in pixels.
height (integer): The height of the image in pixels.
Embedding
path (string, required): URL or the path to the embedding weights.
The tokens to map the embedding weights to. Use these tokens in your prompts.
IPAdapter
URL of the image to be used as the IP adapter.
ip_adapter_mask_url (string): The mask to use for the IP adapter. When using a mask, the IP adapter image size and the mask size must be the same.
path (string, required): URL or the path to the IP adapter weights.
model_subfolder (string): Subfolder in the model directory where the IP adapter weights are stored.
weight_name (string): Name of the weight file.
insight_face_model_path (string): URL or the path to the InsightFace model weights.
scale (float): The scale of the IP adapter weight. This is used to scale the IP adapter weight before merging it with the base model. Default value: 1
unconditional_noising_factor (float): The factor to apply to the unconditional noising of the IP adapter.
image_projection_shortcut (boolean): The value to set the image projection shortcut to. For FaceID plus V1 models, this should be set to false; for FaceID plus V2 models, it should be set to true. Default value: true
ImageSize
width (integer): The width of the generated image. Default value: 512
height (integer): The height of the generated image. Default value: 512
ControlNet
path (string, required): URL or the path to the control net weights.
config_url (string): Optional URL to the controlnet config.json file.
variant (string): The optional variant if a Hugging Face repo key is used.
image_url (string, required): URL of the image to be used as the control net.
mask_url (string): The mask to use for the controlnet. When using a mask, the control image size and the mask size must be the same and divisible by 32.
conditioning_scale (float): The scale of the control net weight. This is used to scale the control net weight before merging it with the base model. Default value: 1
start_percentage (float): The percentage of the total timesteps at which to start applying the controlnet.
end_percentage (float): The percentage of the total timesteps at which to stop applying the controlnet. Default value: 1
ip_adapter_index (integer): The index of the IP adapter to be applied to the controlnet. This is only needed for InstantID ControlNets.
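As an illustration, a single ControlNet entry applied over the first 80% of the timesteps might look like the sketch below. Both URLs are placeholders, and the entry would go in the request's `controlnets` array:

```typescript
// Follows the ControlNet shape above. start_percentage/end_percentage are
// fractions of the total timesteps over which the controlnet is applied.
const controlnet = {
  path: "https://example.com/controlnet-canny.safetensors", // placeholder weights URL
  image_url: "https://example.com/edges.png",               // placeholder control image
  conditioning_scale: 1,
  start_percentage: 0,
  end_percentage: 0.8,
};
```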