# Stable Audio 3 Trainer

> Stable Audio 3 LoRA Trainer fine-tunes Stable Audio 3 base models on paired audio-caption datasets, producing compact LoRA weights that adapt generation toward a custom music style, sound palette, or domain.


## Overview

- **Endpoint**: `https://fal.run/fal-ai/stable-audio-3-trainer`
- **Model ID**: `fal-ai/stable-audio-3-trainer`
- **Category**: text-to-audio
- **Kind**: training
- **Tags**: music, audio, sfx, lora



## Pricing

A training run costs $3.00 per 1,000 steps and is billed per step, so a 2,000-step run costs $6.00. Runs with `batch_size` > 1 are billed additional units proportional to `batch_size × clip duration`.

For more details, see [fal.ai pricing](https://fal.ai/pricing).
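
As a rough aid for budgeting, the per-step rule above can be sketched in Python. This is only an estimate for runs with `batch_size` = 1: the helper name `estimate_base_cost` is hypothetical, the step limits are taken from the input schema, and the extra units billed for `batch_size` > 1 are not modeled because the exact formula is not given here.

```python
PRICE_PER_1000_STEPS = 3.00  # USD, from the pricing text above


def estimate_base_cost(number_of_steps: int) -> float:
    """Estimate the USD cost of a run at batch_size = 1.

    Assumption: cost scales linearly per step with no rounding rules
    beyond what the pricing text states.
    """
    if not 1 <= number_of_steps <= 20000:
        raise ValueError("number_of_steps must be between 1 and 20000")
    return PRICE_PER_1000_STEPS * number_of_steps / 1000


for steps in (1000, 2000):
    print(f"{steps} steps: ${estimate_base_cost(steps):.2f}")
```

Treat the result as a lower bound whenever `batch_size` is greater than 1.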

## API Information

This model can be used via our HTTP API or more conveniently via our client libraries.
See the input and output schema below, as well as the usage examples.


### Input Schema

The API accepts the following input parameters:


- **`audio_data_url`** (`string`, _required_):
  URL to a zip archive containing audio files and matching `.txt` captions. Each audio file must have a sibling caption file with the same basename, for example `clip.wav` and `clip.txt`.

- **`model`** (`ModelEnum`, _optional_):
  Stable Audio 3 base checkpoint to fine-tune. Default value: `"medium-base"`
  - Default: `"medium-base"`
  - Options: `"medium-base"`, `"small-music-base"`, `"small-sfx-base"`

- **`number_of_steps`** (`integer`, _optional_):
  Number of LoRA training steps. Default value: `1000`
  - Default: `1000`
  - Range: `1` to `20000`

- **`learning_rate`** (`float`, _optional_):
  AdamW learning rate for LoRA parameters. Default value: `0.0001`
  - Default: `0.0001`

- **`rank`** (`integer`, _optional_):
  LoRA rank. Default value: `16`
  - Default: `16`
  - Range: `1` to `256`

- **`adapter_type`** (`AdapterTypeEnum`, _optional_):
  LoRA adapter family to train. Default value: `"dora-rows"`
  - Default: `"dora-rows"`
  - Options: `"lora"`, `"dora"`, `"dora-rows"`, `"dora-cols"`, `"bora"`, `"lora-xs"`, `"dora-rows-xs"`, `"dora-cols-xs"`, `"bora-xs"`

- **`duration`** (`float`, _optional_):
  Clip duration in seconds; clips are cropped or padded to this length. Leave unset to auto-detect from the dataset (the longest clip). The value is always capped at the chosen model's native training length.
  - Range: `1` to `380`

- **`batch_size`** (`integer`, _optional_):
  Training batch size. Runs with `batch_size` > 1 are billed additional units proportional to `batch_size × clip duration`. Default value: `1`
  - Default: `1`
  - Range: `1` to `8`

- **`seed`** (`integer`, _optional_):
  Random seed. Default value: `42`
  - Default: `42`
  - Range: `0` to `2147483647`

- **`base_precision`** (`BasePrecisionEnum`, _optional_):
  Precision for frozen base weights; LoRA params stay fp32. Default value: `"bf16"`
  - Default: `"bf16"`
  - Options: `"bf16"`, `"bfloat16"`, `"fp16"`, `"float16"`

- **`include`** (`list<string>`, _optional_):
  Only add LoRA to modules whose names contain these substrings.
  - Array of string

- **`exclude`** (`list<string>`, _optional_):
  Skip modules whose names contain these substrings.
  - Array of string

- **`lora_checkpoint_url`** (`string`, _optional_):
  URL of a `.safetensors` LoRA checkpoint to resume training from.

- **`pre_encode`** (`boolean`, _optional_):
  Pre-encode the audio archive to SAME latents before LoRA training.
  - Default: `false`
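
The `audio_data_url` archive pairs each audio file with a caption of the same basename. A minimal sketch for checking a local folder and zipping it before upload might look like the following. The helper names, the set of audio extensions, and the flat layout inside the zip are assumptions for illustration; only the `clip.wav` / `clip.txt` pairing rule comes from the schema above.

```python
import zipfile
from pathlib import Path

# Assumed audio extensions; the schema only shows `.wav` as an example.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac"}


def find_unpaired(dataset_dir: Path) -> list[str]:
    """Return names of audio files lacking a same-basename .txt caption."""
    return sorted(
        p.name
        for p in dataset_dir.iterdir()
        if p.suffix.lower() in AUDIO_EXTENSIONS
        and not p.with_suffix(".txt").exists()
    )


def build_archive(dataset_dir: Path, zip_path: Path) -> int:
    """Zip every audio/caption pair in dataset_dir and return the pair count."""
    missing = find_unpaired(dataset_dir)
    if missing:
        raise ValueError(f"audio files without captions: {missing}")
    pairs = 0
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for audio in sorted(dataset_dir.iterdir()):
            if audio.suffix.lower() not in AUDIO_EXTENSIONS:
                continue
            caption = audio.with_suffix(".txt")
            # Store both files at the archive root (assumed layout).
            archive.write(audio, arcname=audio.name)
            archive.write(caption, arcname=caption.name)
            pairs += 1
    return pairs
```

Calling `build_archive(Path("my_dataset"), Path("dataset.zip"))` raises before writing anything if a caption is missing, which is cheaper than discovering the problem after a training request has been submitted.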



**Required Parameters Example**:

```json
{
  "audio_data_url": ""
}
```

**Full Example**:

```json
{
  "audio_data_url": "",
  "model": "medium-base",
  "number_of_steps": 1000,
  "learning_rate": 0.0001,
  "rank": 16,
  "adapter_type": "dora-rows",
  "batch_size": 1,
  "seed": 42,
  "base_precision": "bf16"
}
```
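
Since the input schema lists explicit ranges and options, a request can be checked locally before it is sent. The sketch below is a hypothetical client-side helper, not part of the API: the limits are copied from the schema above, while the function name and error messages are illustrative. The server remains the authority on what is accepted.

```python
# Numeric limits copied from the input schema above.
RANGES = {
    "number_of_steps": (1, 20000),
    "rank": (1, 256),
    "duration": (1, 380),
    "batch_size": (1, 8),
    "seed": (0, 2147483647),
}

# Allowed values copied from the input schema above.
OPTIONS = {
    "model": {"medium-base", "small-music-base", "small-sfx-base"},
    "adapter_type": {
        "lora", "dora", "dora-rows", "dora-cols", "bora",
        "lora-xs", "dora-rows-xs", "dora-cols-xs", "bora-xs",
    },
    "base_precision": {"bf16", "bfloat16", "fp16", "float16"},
}


def validate_arguments(arguments: dict) -> list[str]:
    """Return a list of problems; an empty list means no issue was found."""
    errors = []
    if not arguments.get("audio_data_url"):
        errors.append("audio_data_url is required")
    for name, (low, high) in RANGES.items():
        if name in arguments and not low <= arguments[name] <= high:
            errors.append(f"{name} must be between {low} and {high}")
    for name, allowed in OPTIONS.items():
        if name in arguments and arguments[name] not in allowed:
            errors.append(f"{name} must be one of {sorted(allowed)}")
    return errors


problems = validate_arguments({"audio_data_url": "", "rank": 512})
print(problems)
```

Parameters that are omitted are skipped, so the endpoint defaults still apply.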


### Output Schema

The API returns the following output format:

- **`lora_file`** (`File`, _required_):
  Trained Stable Audio 3 LoRA weights.

- **`config_file`** (`File`, _required_):
  JSON metadata for the training run and compatible inference model.



**Example Response**:

```json
{
  "lora_file": {
    "url": "",
    "content_type": "application/octet-stream",
    "file_name": "z9RV14K95DvU.safetensors",
    "file_size": 4404019
  },
  "config_file": {
    "url": "",
    "content_type": "application/json",
    "file_name": "z9RV14K95DvU.json",
    "file_size": 4404019
  }
}
```
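
To keep the trained weights and their metadata together, both files in the response can be saved locally. The following is a minimal sketch using only the standard library. It assumes the response is available as a plain `dict` shaped like the example above; the helper names are hypothetical.

```python
import urllib.request
from pathlib import Path

OUTPUT_KEYS = ("lora_file", "config_file")


def output_urls(result: dict) -> dict:
    """Map each output field to its download URL."""
    return {key: result[key]["url"] for key in OUTPUT_KEYS}


def download_outputs(result: dict, out_dir: Path) -> list:
    """Download both output files into out_dir and return the saved paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for key, url in output_urls(result).items():
        # Fall back to the field name when the response has no file_name.
        name = result[key].get("file_name") or key
        target = out_dir / name
        urllib.request.urlretrieve(url, str(target))
        saved.append(target)
    return saved
```

Saving `config_file` next to `lora_file` keeps a record of which base model the LoRA is compatible with at inference time.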


## Usage Examples

### cURL

```bash
curl --request POST \
  --url https://fal.run/fal-ai/stable-audio-3-trainer \
  --header "Authorization: Key $FAL_KEY" \
  --header "Content-Type: application/json" \
  --data '{
     "audio_data_url": ""
   }'
```

### Python

Ensure you have the Python client installed:

```bash
pip install fal-client
```

Then use the API client to make requests:

```python
import fal_client

def on_queue_update(update):
    if isinstance(update, fal_client.InProgress):
        for log in update.logs:
            print(log["message"])

result = fal_client.subscribe(
    "fal-ai/stable-audio-3-trainer",
    arguments={
        "audio_data_url": ""
    },
    with_logs=True,
    on_queue_update=on_queue_update,
)
print(result)
```
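
When the dataset zip lives on disk rather than at a public URL, one option is to upload it with the client first and pass the returned URL as `audio_data_url`. The sketch below assumes `fal_client.upload_file` is available in the installed client version and that `FAL_KEY` is set in the environment; `build_arguments` and `train_from_local_zip` are hypothetical helper names.

```python
def build_arguments(audio_data_url: str, **overrides) -> dict:
    """Combine the required URL with any optional training parameters."""
    arguments = {"audio_data_url": audio_data_url}
    # Drop None values so unset options fall back to the endpoint defaults.
    arguments.update({k: v for k, v in overrides.items() if v is not None})
    return arguments


def train_from_local_zip(zip_path: str, **overrides) -> dict:
    """Upload a local dataset zip, then run training and wait for the result."""
    import fal_client  # imported lazily so build_arguments works without it

    audio_data_url = fal_client.upload_file(zip_path)
    return fal_client.subscribe(
        "fal-ai/stable-audio-3-trainer",
        arguments=build_arguments(audio_data_url, **overrides),
        with_logs=True,
    )


print(build_arguments("https://example.com/dataset.zip", rank=32, duration=None))
```

For example, `train_from_local_zip("dataset.zip", number_of_steps=2000, model="small-sfx-base")` would submit a 2,000-step run on the `small-sfx-base` checkpoint.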

### JavaScript

Ensure you have the JavaScript client installed:

```bash
npm install --save @fal-ai/client
```

Then use the API client to make requests:

```javascript
import { fal } from "@fal-ai/client";

const result = await fal.subscribe("fal-ai/stable-audio-3-trainer", {
  input: {
    audio_data_url: ""
  },
  logs: true,
  onQueueUpdate: (update) => {
    if (update.status === "IN_PROGRESS") {
      update.logs.map((log) => log.message).forEach(console.log);
    }
  },
});
console.log(result.data);
console.log(result.requestId);
```


## Additional Resources

### Documentation

- [Model Playground](https://fal.ai/models/fal-ai/stable-audio-3-trainer)
- [API Documentation](https://fal.ai/models/fal-ai/stable-audio-3-trainer/api)
- [OpenAPI Schema](https://fal.ai/api/openapi/queue/openapi.json?endpoint_id=fal-ai/stable-audio-3-trainer)

### fal.ai Platform

- [Platform Documentation](https://docs.fal.ai)
- [Python Client](https://docs.fal.ai/clients/python)
- [JavaScript Client](https://docs.fal.ai/clients/javascript)
