Happy Horse | Text to Video

Run Happy Horse 1.0: Text to Video API

Generate 1080p video with synchronized native audio directly from a text prompt. No image input required.

Model ID: `alibaba/happy-horse/text-to-video`
Provider: fal.ai
Commercial rights: Full commercial rights on all outputs

About the model

Happy Horse 1.0 is built by the Future Life Lab inside Alibaba's Taotian Group. It uses a unified 15-billion-parameter Transformer that processes text, video, and audio tokens in a single sequence, generating video frames and their corresponding audio track (dialogue, ambient sound, Foley) in one forward pass rather than producing silent video and adding audio afterward.

As of April 2026, it ranks #1 on the Artificial Analysis Video Arena for text-to-video, 107 Elo points ahead of second-place Seedance 2.0. Under the standard Elo model, a gap of that size corresponds to users preferring its output roughly 65% of the time in blind head-to-head comparisons.
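The 65% figure can be checked against the conventional Elo expected-score formula, where a rating gap of d points gives the higher-rated side an expected win rate of 1 / (1 + 10^(-d/400)). The sketch below assumes the Arena uses this standard 400-point logistic scale, which the source does not state; the function name is illustrative.

```python
def elo_expected_win_rate(rating_gap: float, scale: float = 400.0) -> float:
    """Expected win rate of the higher-rated side under the standard Elo model.

    Assumption: the leaderboard uses the conventional 400-point logistic scale.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / scale))


win_rate = elo_expected_win_rate(107)
print(f"{win_rate:.1%}")  # prints 64.9%
```

A gap of 0 points gives exactly 50%, which is a quick sanity check on the formula.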

Key strengths for text-to-video:

Strong prompt fidelity: follows detailed instructions for scene composition, action, lighting, mood, and camera movement
Cinematic motion: smooth, physically coherent motion for human gaits, fluid dynamics, and camera pans
Native audio: sound effects and ambient audio generated in sync with on-screen action, reducing the need for post-production
Prompt-based camera control: describe shots directly in the prompt (e.g. "slow dolly in", "aerial crane shot", "cinematic handheld")

Specifications

| Property | Value |
| --- | --- |
| Resolution | 720p, 1080p |
| Duration | 3–15 seconds |
| Aspect ratios | 16:9, 9:16, 1:1, 4:3, 3:4 |
| Prompt length | Up to 2,500 characters |

Pricing

| Resolution | Price |
| --- | --- |
| 720p | $0.14 / second |
| 1080p | $0.28 / second |

A 10-second clip at 1080p costs $2.80.
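The same arithmetic can be wrapped in a small helper for budgeting. This is a minimal sketch that assumes cost scales linearly with the requested duration at the per-second rates listed above; `estimate_cost` is a hypothetical name, not part of any fal client.

```python
from decimal import Decimal

# Per-second prices from the pricing table, in USD.
PRICE_PER_SECOND = {"720p": Decimal("0.14"), "1080p": Decimal("0.28")}


def estimate_cost(resolution: str, duration_seconds: int) -> Decimal:
    """Estimate the price of one clip, assuming linear per-second billing."""
    if resolution not in PRICE_PER_SECOND:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if not 3 <= duration_seconds <= 15:
        raise ValueError("duration must be between 3 and 15 seconds")
    return PRICE_PER_SECOND[resolution] * duration_seconds


print(estimate_cost("1080p", 10))  # prints 2.80
```

`Decimal` is used instead of `float` so that currency values keep exactly two decimal places.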

Prompting tips

The model responds well to specific, descriptive prompts. Include:

Subject and action: who or what is in the scene, and what they are doing
Camera movement: "slow push in", "wide establishing shot", "low-angle handheld", "aerial view"
Lighting: "golden hour", "soft studio lighting", "neon cyberpunk lighting", "overcast natural light"
Mood and style: "cinematic", "documentary", "dreamlike", "high-contrast noir"

Example prompt:

`"A little girl walking on a rain-soaked road at sunset, puddles reflecting warm orange light, slow dolly forward, cinematic."`
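One way to apply these tips consistently is to assemble the prompt from its parts and check it against the 2,500-character limit before sending it. The helper below is an illustrative sketch; `build_prompt` and its parameter names are hypothetical, and the comma-separated layout is only one reasonable convention.

```python
MAX_PROMPT_CHARS = 2500  # documented prompt length limit


def build_prompt(subject_action: str, camera: str = "", lighting: str = "", style: str = "") -> str:
    """Join the non-empty prompt components into one comma-separated sentence."""
    if not subject_action.strip():
        raise ValueError("subject_action is required")
    parts = [p.strip() for p in (subject_action, camera, lighting, style) if p.strip()]
    prompt = ", ".join(parts)
    if not prompt.endswith("."):
        prompt += "."
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt is {len(prompt)} characters; the limit is {MAX_PROMPT_CHARS}")
    return prompt


prompt = build_prompt(
    "A little girl walking on a rain-soaked road at sunset",
    camera="slow dolly forward",
    lighting="puddles reflecting warm orange light",
    style="cinematic",
)
print(prompt)
```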

Quickstart

Install

JavaScript:

```bash
npm install @fal-ai/client
```

Python:

```bash
pip install fal-client
```

Set your API key

```bash
export FAL_KEY="YOUR_API_KEY"
```

Submit a request

JavaScript:

```js
import { fal } from "@fal-ai/client";

// subscribe() submits the request and waits for the final result.
const result = await fal.subscribe("alibaba/happy-horse/text-to-video", {
  input: {
    prompt: "A little girl walking on a rain-soaked road at sunset, cinematic lighting, slow dolly forward.",
    aspect_ratio: "16:9",
    resolution: "1080p",
    duration: 5,
  },
  logs: true,
  onQueueUpdate: (update) => {
    if (update.status === "IN_PROGRESS") {
      // Print each log message on its own line.
      update.logs.forEach((log) => console.log(log.message));
    }
  },
});

console.log(result.data.video.url);
```

Python:

```python
import fal_client

def on_queue_update(update):
    if isinstance(update, fal_client.InProgress):
        for log in update.logs:
            print(log["message"])

result = fal_client.subscribe(
    "alibaba/happy-horse/text-to-video",
    arguments={
        "prompt": "A little girl walking on a rain-soaked road at sunset, cinematic lighting, slow dolly forward.",
        "aspect_ratio": "16:9",
        "resolution": "1080p",
        "duration": 5,
    },
    with_logs=True,
    on_queue_update=on_queue_update,
)

print(result["video"]["url"])
```

Input parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `prompt` | string | required | Text description of the video. Max 2,500 characters. |
| `aspect_ratio` | `"16:9"` \| `"9:16"` \| `"1:1"` \| `"4:3"` \| `"3:4"` | `"16:9"` | Output video aspect ratio. |
| `resolution` | `"720p"` \| `"1080p"` | `"1080p"` | Output video resolution. |
| `duration` | integer (3–15) | `5` | Clip length in seconds. |
| `seed` | integer (0–2,147,483,647) | (none) | Set for reproducible outputs. |
| `enable_safety_checker` | boolean | `true` | Content moderation on input and output. |
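Checking arguments locally against the documented constraints can catch mistakes before a billable request is sent. The following is a sketch under that assumption; `validate_arguments` is a hypothetical helper, and the API may enforce additional rules not listed here.

```python
ASPECT_RATIOS = {"16:9", "9:16", "1:1", "4:3", "3:4"}
RESOLUTIONS = {"720p", "1080p"}
MAX_PROMPT_CHARS = 2500
MAX_SEED = 2_147_483_647
DEFAULTS = {
    "aspect_ratio": "16:9",
    "resolution": "1080p",
    "duration": 5,
    "enable_safety_checker": True,
}


def _is_int(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_arguments(arguments: dict) -> dict:
    """Return a copy with documented defaults applied, or raise ValueError."""
    merged = {**DEFAULTS, **arguments}
    prompt = merged.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt is required")
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    if merged["aspect_ratio"] not in ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect_ratio: {merged['aspect_ratio']!r}")
    if merged["resolution"] not in RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {merged['resolution']!r}")
    if not _is_int(merged["duration"]) or not 3 <= merged["duration"] <= 15:
        raise ValueError("duration must be an integer from 3 to 15")
    seed = merged.get("seed")
    if seed is not None and (not _is_int(seed) or not 0 <= seed <= MAX_SEED):
        raise ValueError(f"seed must be an integer from 0 to {MAX_SEED}")
    if not isinstance(merged["enable_safety_checker"], bool):
        raise ValueError("enable_safety_checker must be a boolean")
    return merged


print(validate_arguments({"prompt": "A horse galloping across a beach", "seed": 42}))
```

The returned dictionary can be passed directly as the `arguments` value in the Python examples above.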

Output

```json
{
  "video": {
    "url": "https://...",
    "content_type": "video/mp4",
    "file_name": "output.mp4",
    "file_size": 4404019,
    "width": 1920,
    "height": 1080,
    "fps": 24,
    "duration": 5.0,
    "num_frames": 120
  },
  "seed": 1234567
}
```
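As a rough illustration of reading this response, the sketch below summarizes the video metadata and checks that the frame count matches `fps × duration`. It assumes every field shown in the sample is present and that `file_size` is in bytes; `summarize_output` is a hypothetical helper.

```python
def summarize_output(result: dict) -> str:
    """Build a one-line summary of the video metadata in a result payload."""
    video = result["video"]
    size_mb = video["file_size"] / 1_000_000  # assumes file_size is in bytes
    expected_frames = round(video["fps"] * video["duration"])
    frames_consistent = expected_frames == video["num_frames"]
    return (
        f'{video["width"]}x{video["height"]} @ {video["fps"]} fps, '
        f'{video["duration"]:.1f} s, {size_mb:.1f} MB, '
        f"frames consistent: {frames_consistent}"
    )


# Sample payload copied from the output example above.
sample = {
    "video": {
        "url": "https://...",
        "content_type": "video/mp4",
        "file_name": "output.mp4",
        "file_size": 4404019,
        "width": 1920,
        "height": 1080,
        "fps": 24,
        "duration": 5.0,
        "num_frames": 120,
    },
    "seed": 1234567,
}
print(summarize_output(sample))
# prints 1920x1080 @ 24 fps, 5.0 s, 4.4 MB, frames consistent: True
```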

Queue API (long-running requests)

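For longer clips, submit the request to the queue instead of holding a connection open until generation finishes. You can then poll for status or supply a webhook URL to be notified on completion.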

JavaScript:

```js
import { fal } from "@fal-ai/client";

// Submit
const { request_id } = await fal.queue.submit("alibaba/happy-horse/text-to-video", {
  input: {
    prompt: "A time-lapse of storm clouds rolling over a mountain range, dramatic lighting.",
    aspect_ratio: "16:9",
    duration: 15,
    resolution: "1080p",
  },
  webhookUrl: "https://your-server.com/webhook",
});

// Poll status
const status = await fal.queue.status("alibaba/happy-horse/text-to-video", {
  requestId: request_id,
  logs: true,
});

// Fetch result once complete
const result = await fal.queue.result("alibaba/happy-horse/text-to-video", {
  requestId: request_id,
});

console.log(result.data.video.url);
```

Python:

```python
import fal_client

# Submit
handler = fal_client.submit(
    "alibaba/happy-horse/text-to-video",
    arguments={
        "prompt": "A time-lapse of storm clouds rolling over a mountain range, dramatic lighting.",
        "aspect_ratio": "16:9",
        "duration": 15,
        "resolution": "1080p",
    },
    webhook_url="https://your-server.com/webhook",
)

request_id = handler.request_id

# Poll status
status = fal_client.status("alibaba/happy-horse/text-to-video", request_id, with_logs=True)

# Fetch result once complete
result = fal_client.result("alibaba/happy-horse/text-to-video", request_id)

print(result["video"]["url"])
```
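The examples above check the status once; in practice the status check is repeated until the request completes. The sketch below shows one generic way to do that with a timeout. `wait_until_done` is a hypothetical helper that takes any zero-argument callable, so it does not depend on a particular client; the `sleep` and `clock` parameters exist only to make it easy to test.

```python
import time
from typing import Callable


def wait_until_done(
    is_done: Callable[[], bool],
    interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Call is_done() every `interval` seconds until it returns True.

    Returns the number of checks made; raises TimeoutError after `timeout` seconds.
    """
    deadline = clock() + timeout
    polls = 0
    while True:
        polls += 1
        if is_done():
            return polls
        if clock() >= deadline:
            raise TimeoutError(f"request not complete after {timeout} s")
        sleep(interval)


# Stand-in for a real status check: reports "done" on the third call.
checks = iter([False, False, True])
print(wait_until_done(lambda: next(checks), interval=0.0))  # prints 3
```

With the Python client, `is_done` could be something like `lambda: isinstance(fal_client.status(MODEL_ID, request_id), fal_client.Completed)`, assuming the client exposes a `Completed` status class alongside the `InProgress` class used earlier.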

Client-side usage

Security: Never expose your `FAL_KEY` in browser or mobile code. Route requests through a server-side proxy: set `FAL_KEY` as a server environment variable and have your frontend call your own backend endpoint, which forwards the request to fal.
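The core of such a proxy is a server-side function that attaches the key and forwards only the fields you choose to expose. The sketch below builds the outgoing request without sending it. The endpoint URL and the `Authorization: Key ...` header scheme are assumptions based on fal's REST conventions and are not stated in this document; `build_fal_request` and the field whitelist are hypothetical.

```python
import json
import os
import urllib.request

# Assumed queue endpoint for this model; verify against the provider's API reference.
FAL_QUEUE_URL = "https://queue.fal.run/alibaba/happy-horse/text-to-video"

# Only these client-supplied fields are forwarded; everything else is dropped.
ALLOWED_FIELDS = {"prompt", "aspect_ratio", "resolution", "duration", "seed"}


def build_fal_request(client_payload: dict, api_key: str) -> urllib.request.Request:
    """Build the server-to-fal request; the API key never reaches the browser."""
    body = {k: v for k, v in client_payload.items() if k in ALLOWED_FIELDS}
    return urllib.request.Request(
        FAL_QUEUE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Key {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


# The key is read from the server environment, never from the client payload.
request = build_fal_request(
    {"prompt": "Storm clouds over a mountain range", "enable_safety_checker": False},
    os.environ.get("FAL_KEY", "YOUR_API_KEY"),
)
print(request.get_method(), request.full_url)
```

Dropping `enable_safety_checker` from the whitelist means a browser client cannot turn moderation off; the server-side default applies instead.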

Related models

| Model | Use case |
| --- | --- |
| `alibaba/happy-horse/image-to-video` | Animate a still image as the first frame |
| `alibaba/happy-horse/reference-to-video` | Generate video with subject consistency from 1–9 reference images |
