fal-ai/stable-audio-3-trainer

Stable Audio 3 LoRA Trainer fine-tunes Stable Audio 3 base models on paired audio-caption datasets, producing compact LoRA weights that adapt generation toward a custom music style, sound palette, or domain.
Training
Commercial use

1. Calling the API

Install the client

The client provides a convenient way to interact with the model API.

npm install --save @fal-ai/client

Set up your API Key

Set FAL_KEY as an environment variable in your runtime.

export FAL_KEY="YOUR_API_KEY"

Submit a request

The client handles the queue submission protocol: it tracks request status updates and returns the result once the request completes.

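import { fal } from "@fal-ai/client";

// subscribe() submits the request, reports queue updates, and resolves with the final result.
const result = await fal.subscribe("fal-ai/stable-audio-3-trainer", {
  input: {
    // URL of a zip archive with audio files and matching .txt captions.
    audio_data_url: ""
  },
  logs: true,
  onQueueUpdate: (update) => {
    if (update.status === "IN_PROGRESS") {
      // Pass only the message to console.log; a bare forEach(console.log)
      // would also print the array index and the whole array.
      update.logs.map((log) => log.message).forEach((message) => console.log(message));
    }
  },
});
console.log(result.data);
console.log(result.requestId);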

2. Authentication

The API authenticates requests with an API key. Set the FAL_KEY environment variable in your runtime whenever possible.

API Key

If your app runs in an environment where you cannot set environment variables, you can set the API key manually in the client configuration.

import { fal } from "@fal-ai/client";

fal.config({
  credentials: "YOUR_FAL_KEY"
});

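3. Queue

Submit a request

Use the queue API to submit a request without waiting for it to finish. The call returns a request ID that you can use to check status and fetch the result later; the webhook URL is optional.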

import { fal } from "@fal-ai/client";

const { request_id } = await fal.queue.submit("fal-ai/stable-audio-3-trainer", {
  input: {
    audio_data_url: ""
  },
  webhookUrl: "https://optional.webhook.url/for/results",
});

Fetch request status

You can fetch the status of a request to check if it is completed or still in progress.

import { fal } from "@fal-ai/client";

const status = await fal.queue.status("fal-ai/stable-audio-3-trainer", {
  requestId: "764cabcf-b745-4b3e-ae38-1200304cf45b",
  logs: true,
});
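
A status check is usually repeated until the request finishes. The sketch below shows one way to structure that loop. `waitForCompletion` and its `fetchStatus` callback are hypothetical names introduced here for illustration; the callback could wrap the `fal.queue.status` call shown above. The "IN_QUEUE" and "COMPLETED" status strings are assumed alongside the "IN_PROGRESS" value that appears earlier on this page.

```javascript
// Sketch: poll a status function until the request reports COMPLETED.
// `fetchStatus` is a hypothetical injected callback, e.g.
//   () => fal.queue.status("fal-ai/stable-audio-3-trainer", { requestId, logs: true })
async function waitForCompletion(fetchStatus, { intervalMs = 5000, maxAttempts = 720 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const status = await fetchStatus();
    if (status.status === "COMPLETED") {
      return status;
    }
    // Still queued or running (assumed "IN_QUEUE" / "IN_PROGRESS"): wait, then ask again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Request did not complete after ${maxAttempts} status checks`);
}
```

Injecting the status function keeps the loop independent of the client, so it can be exercised with a fake before pointing it at a real training run. For long runs, the webhook option on submit avoids polling altogether.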

Get the result

Once the request is completed, you can fetch the result. See the Output Schema for the expected result format.

import { fal } from "@fal-ai/client";

const result = await fal.queue.result("fal-ai/stable-audio-3-trainer", {
  requestId: "764cabcf-b745-4b3e-ae38-1200304cf45b"
});
console.log(result.data);
console.log(result.requestId);

4. Files

Some attributes in the API accept file URLs as input. In those cases, you can pass either your own URL or a Base64 data URI.

Data URI (base64)

You can pass a Base64 data URI as a file input, and the API will decode the file for you. This option is convenient, but for large files it can slow down the request.
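
As a minimal sketch of what building such a URI involves, the helper below encodes raw bytes in Node.js. `toDataUri` is a hypothetical helper name, not part of the fal client, and the MIME type is whatever matches the file being sent.

```javascript
// Sketch: encode raw bytes as a Base64 data URI (Node.js, uses the global Buffer).
// `toDataUri` is a hypothetical helper, not part of @fal-ai/client.
function toDataUri(bytes, mimeType) {
  const base64 = Buffer.from(bytes).toString("base64");
  return `data:${mimeType};base64,${base64}`;
}

// Base64 encodes every 3 input bytes as 4 characters, so the request body
// grows by about a third relative to the original file.
const uri = toDataUri(Buffer.from("Hello, World!"), "text/plain");
console.log(uri); // data:text/plain;base64,SGVsbG8sIFdvcmxkIQ==
```

The size overhead is the main reason a hosted URL or an uploaded file is usually a better fit for a large dataset archive.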

Hosted files (URL)

You can also pass your own URLs as long as they are publicly accessible. Be aware that some hosts might block cross-site requests, rate-limit them, or treat them as bot traffic.

Uploading files

We provide file storage for files you want to use in your requests. Upload a file with the client API, then pass the returned URL as an input.

import { fal } from "@fal-ai/client";

const file = new File(["Hello, World!"], "hello.txt", { type: "text/plain" });
const url = await fal.storage.upload(file);

Read more about file handling in our file upload guide.

5. Schema

Input

audio_data_url string* required

URL to a zip archive containing audio files and matching .txt captions. Each audio file must have a sibling caption file with the same basename, for example clip.wav and clip.txt.
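
Before zipping a dataset, it can help to check the pairing rule locally. The sketch below takes a list of file paths and reports audio files with no sibling caption. `findMissingCaptions` is a hypothetical helper, and the set of audio extensions is an assumption: the description above only names .wav explicitly.

```javascript
// Sketch: find audio files that lack a same-basename .txt caption in the same folder.
// AUDIO_EXTENSIONS is an assumption; the source only mentions .wav explicitly.
const AUDIO_EXTENSIONS = new Set([".wav", ".mp3", ".flac"]);

// Split "dir/clip.wav" into { stem: "dir/clip", ext: ".wav" }.
function splitName(path) {
  const dot = path.lastIndexOf(".");
  if (dot <= path.lastIndexOf("/")) return { stem: path, ext: "" };
  return { stem: path.slice(0, dot), ext: path.slice(dot).toLowerCase() };
}

function findMissingCaptions(paths) {
  const captionStems = new Set(
    paths.map(splitName).filter((p) => p.ext === ".txt").map((p) => p.stem)
  );
  return paths.filter((path) => {
    const { stem, ext } = splitName(path);
    return AUDIO_EXTENSIONS.has(ext) && !captionStems.has(stem);
  });
}

console.log(findMissingCaptions(["clip.wav", "clip.txt", "rain.wav"])); // [ 'rain.wav' ]
```

Because the stem keeps its directory prefix, a caption only counts as a sibling when it sits in the same folder as the audio file.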

model ModelEnum

Stable Audio 3 base checkpoint to fine-tune. Default value: "medium-base"

Possible enum values: medium-base, small-music-base, small-sfx-base

number_of_steps integer

Number of LoRA training steps. Default value: 1000

learning_rate float

AdamW learning rate for LoRA parameters. Default value: 0.0001

rank integer

LoRA rank. Default value: 16
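
To give a sense of what the rank controls, the sketch below counts the trainable parameters that a plain LoRA adapter adds to a single weight matrix. It applies to the standard LoRA formulation only; the other adapter families listed on this page are parameterized differently and are not covered here. The 1024 x 1024 layer size is a hypothetical example, not a dimension taken from Stable Audio 3.

```javascript
// Sketch: trainable parameters added by plain LoRA to one weight matrix.
// The rank-r update is B * A, with B of shape (dOut x r) and A of shape (r x dIn).
function loraParamCount(dIn, dOut, rank) {
  return rank * (dIn + dOut);
}

// Hypothetical 1024 x 1024 projection at the default rank of 16:
console.log(loraParamCount(1024, 1024, 16)); // 32768 trainable adapter parameters
console.log(1024 * 1024);                    // 1048576 frozen base weights in that layer
```

The count grows linearly with rank, so doubling the rank doubles the adapter size for every adapted layer.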

adapter_type AdapterTypeEnum

LoRA adapter family to train. Default value: "dora-rows"

Possible enum values: lora, dora, dora-rows, dora-cols, bora, lora-xs, dora-rows-xs, dora-cols-xs, bora-xs

duration float

Clip duration in seconds for crop/pad sizing. Leave unset to auto-detect from the dataset (the longest clip). Always capped at the chosen model's native training length.
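
The rule above can be sketched as a small function: use the requested duration if one is given, otherwise the longest clip, and cap the result at the model's native length. `resolveDuration` and `nativeMaxSeconds` are hypothetical names; this page does not state the native training length of each model, so the value used below is a placeholder.

```javascript
// Sketch of the duration rule: explicit value or longest clip, capped at the native length.
// nativeMaxSeconds is a placeholder; the source does not list per-model native lengths.
function resolveDuration({ requested, clipDurations, nativeMaxSeconds }) {
  if (requested === undefined && clipDurations.length === 0) {
    throw new Error("Need either a requested duration or at least one clip");
  }
  const base = requested ?? Math.max(...clipDurations);
  return Math.min(base, nativeMaxSeconds);
}

// Longest clip is 30 s, but the (placeholder) native length is 20 s:
console.log(resolveDuration({ clipDurations: [8, 12.5, 30], nativeMaxSeconds: 20 })); // 20
```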

batch_size integer

Training batch size. Runs with batch_size > 1 are billed additional units proportional to batch_size × clip duration. Default value: 1

seed integer

Random seed. Default value: 42

base_precision BasePrecisionEnum

Precision for frozen base weights; LoRA params stay fp32. Default value: "bf16"

Possible enum values: bf16, bfloat16, fp16, float16

include list<string>

Only add LoRA to modules whose names contain these substrings.

exclude list<string>

Skip modules whose names contain these substrings.
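
The include and exclude filters are both described as substring matches on module names. The sketch below illustrates that matching under two assumptions that this page does not confirm: an empty include list selects every module, and exclude takes precedence when both match. `selectModules` and the module names are hypothetical.

```javascript
// Sketch: substring-based module selection.
// Assumptions (not confirmed by the source): empty `include` means "all modules",
// and `exclude` wins when a name matches both lists.
function selectModules(moduleNames, { include = [], exclude = [] } = {}) {
  return moduleNames.filter((name) => {
    const included = include.length === 0 || include.some((s) => name.includes(s));
    const excluded = exclude.some((s) => name.includes(s));
    return included && !excluded;
  });
}

// Hypothetical module names, for illustration only.
const names = ["blocks.0.attn.to_q", "blocks.0.attn.to_out", "blocks.0.ff.proj"];
console.log(selectModules(names, { include: ["attn"], exclude: ["to_out"] }));
// [ 'blocks.0.attn.to_q' ]
```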

lora_checkpoint_url string

Optional .safetensors LoRA checkpoint URL to resume from.

pre_encode boolean

Pre-encode the audio archive to SAME latents before LoRA training.

{
  "audio_data_url": "",
  "model": "medium-base",
  "number_of_steps": 1000,
  "learning_rate": 0.0001,
  "rank": 16,
  "adapter_type": "dora-rows",
  "batch_size": 1,
  "seed": 42,
  "base_precision": "bf16"
}
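
A request payload can be checked against this schema before it is submitted. The sketch below merges the documented defaults with caller overrides and rejects enum values that are not listed above. `buildTrainerInput` is a hypothetical helper, the example URL is a placeholder, and the server remains the authority on validation.

```javascript
// Sketch: merge documented defaults with overrides and validate enum fields client-side.
// `buildTrainerInput` is a hypothetical helper; values are copied from the schema above.
const DEFAULTS = {
  model: "medium-base",
  number_of_steps: 1000,
  learning_rate: 0.0001,
  rank: 16,
  adapter_type: "dora-rows",
  batch_size: 1,
  seed: 42,
  base_precision: "bf16",
};

const ENUMS = {
  model: ["medium-base", "small-music-base", "small-sfx-base"],
  adapter_type: ["lora", "dora", "dora-rows", "dora-cols", "bora",
    "lora-xs", "dora-rows-xs", "dora-cols-xs", "bora-xs"],
  base_precision: ["bf16", "bfloat16", "fp16", "float16"],
};

function buildTrainerInput(overrides) {
  const input = { ...DEFAULTS, ...overrides };
  if (typeof input.audio_data_url !== "string" || input.audio_data_url === "") {
    throw new Error("audio_data_url is required");
  }
  for (const [field, allowed] of Object.entries(ENUMS)) {
    if (!allowed.includes(input[field])) {
      throw new Error(`${field} must be one of: ${allowed.join(", ")}`);
    }
  }
  return input;
}

// Placeholder URL; replace with the location of your own dataset archive.
const input = buildTrainerInput({
  audio_data_url: "https://example.com/dataset.zip",
  model: "small-sfx-base",
});
console.log(input.model, input.rank); // small-sfx-base 16
```

The returned object can be passed as the `input` field of a subscribe or queue submit call.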

Output

lora_file File* required

Trained Stable Audio 3 LoRA weights.

config_file File* required

JSON metadata for the training run and compatible inference model.

In the example that follows, the content_type, file_name, and file_size values are generic File placeholders rather than actual trainer output.

{
  "lora_file": {
    "url": "",
    "content_type": "image/png",
    "file_name": "z9RV14K95DvU.png",
    "file_size": 4404019
  },
  "config_file": {
    "url": "",
    "content_type": "image/png",
    "file_name": "z9RV14K95DvU.png",
    "file_size": 4404019
  }
}
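
Since both output files are marked required, a caller can check for them before downloading anything. The sketch below pulls the two URLs out of a result payload shaped like the example above. `getTrainerFileUrls` is a hypothetical helper, and the URLs in the usage line are placeholders.

```javascript
// Sketch: extract the download URLs from a trainer result payload.
// `getTrainerFileUrls` is a hypothetical helper, not part of @fal-ai/client.
function getTrainerFileUrls(data) {
  for (const field of ["lora_file", "config_file"]) {
    if (!data?.[field]?.url) {
      throw new Error(`Missing ${field}.url in trainer output`);
    }
  }
  return { loraUrl: data.lora_file.url, configUrl: data.config_file.url };
}

// Placeholder URLs for illustration; real values come from result.data.
const files = getTrainerFileUrls({
  lora_file: { url: "https://example.com/lora-weights" },
  config_file: { url: "https://example.com/config.json" },
});
console.log(files.loraUrl);
```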

Other types

File

url string* required

The URL where the file can be downloaded from.

content_type string

The mime type of the file.

file_name string

The name of the file. It will be auto-generated if not provided.

file_size integer

The size of the file in bytes.