stream() method in the fal client SDKs.
Under the hood, streaming uses Server-Sent Events (SSE), a one-way protocol where the server pushes events to the client over a single HTTP connection. You define a streaming endpoint by returning a FastAPI StreamingResponse with SSE-formatted events from an @fal.endpoint("/stream") method. For bidirectional communication where clients send multiple inputs over a persistent connection, see Realtime Endpoints instead.
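To make the wire format concrete, here is a minimal sketch of how a client could turn raw SSE text into JSON payloads. The fal client SDKs do this for you; `parse_sse` and the sample `wire` string are hypothetical names used only for illustration, and the sketch handles just the `data:` field.

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON payload per SSE event.

    An event is a run of "data:" lines terminated by a blank line.
    """
    data_lines = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line closes the current event.
            if data_lines:
                yield json.loads("\n".join(data_lines))
                data_lines = []
        elif line.startswith("data:"):
            value = line[5:]
            # A single leading space after the colon is not part of the value.
            if value.startswith(" "):
                value = value[1:]
            data_lines.append(value)
        # Other fields (event:, id:, retry:) and comment lines are ignored here.


# Two events as they might appear on the wire (sample data, not real output).
wire = 'data: {"step": 5}\n\ndata: {"step": 10}\n\n'
events = list(parse_sse(wire.splitlines()))
```

Because the connection is one-way, the client only reads; nothing in this loop sends data back to the server.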
Streaming vs Realtime
Streaming and realtime endpoints serve different interaction patterns. Streaming is one-way (server to client) and suited for progressive output from a single request. Realtime is bidirectional (client and server) and suited for interactive applications with back-to-back requests over a persistent WebSocket connection.

| Feature | Streaming (SSE) | Realtime (WebSocket) |
|---|---|---|
| Direction | One-way (server to client) | Bidirectional |
| Connection | New connection per request | Persistent, reusable |
| Best for | Progressive output, previews | Interactive apps, back-to-back requests |
| Protocol | JSON over SSE | Binary msgpack |
Example: Streaming Intermediate Steps with SDXL
This example shows how to stream intermediate image previews during Stable Diffusion XL generation. It uses a TinyVAE for fast preview decoding and the pipeline’s callback system to capture progress at each step.
Example Details
This example uses madebyollin/taesdxl, a TinyVAE that decodes intermediate latents roughly 10x faster than the full VAE. The diffusers callback_on_step_end hook captures latents at each denoising step, but the callback only streams every 5 steps to balance responsiveness with overhead. A thread-safe queue passes events from the pipeline thread (which runs inference) to the streaming generator (which yields SSE events to the client).
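The queue hand-off between the two threads can be sketched with the standard library alone. This is a simplified stand-in, not the example's actual code: `run_pipeline` fakes the denoising loop instead of calling diffusers, the names `run_pipeline`, `event_stream`, and `_DONE` are hypothetical, and the generator yields plain dicts rather than SSE-formatted strings.

```python
import queue
import threading

_DONE = object()  # sentinel telling the generator that the pipeline has finished


def run_pipeline(events: queue.Queue, num_steps: int = 20, every: int = 5) -> None:
    """Stand-in for the inference thread.

    A real step callback would decode latents into a preview image here.
    """
    try:
        for step in range(1, num_steps + 1):
            # Only enqueue every `every`-th step to limit overhead.
            if step % every == 0:
                events.put({"step": step, "total": num_steps})
    finally:
        # Always signal completion, even if the pipeline raises.
        events.put(_DONE)


def event_stream(num_steps: int = 20, every: int = 5):
    """Generator that drains the queue while the worker thread runs."""
    events: queue.Queue = queue.Queue()
    worker = threading.Thread(
        target=run_pipeline, args=(events, num_steps, every), daemon=True
    )
    worker.start()
    while True:
        item = events.get()  # blocks until the worker produces something
        if item is _DONE:
            break
        yield item
    worker.join()


progress = list(event_stream(num_steps=20, every=5))
```

The sentinel in the `finally` block matters: without it, a failure in the pipeline thread would leave the generator blocked on `events.get()` and the client connection hanging.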
Client-Side Usage
Endpoint Path Requirement
The fal_client.stream() (Python) and fal.stream() (JavaScript) functions automatically append /stream to your endpoint ID. This means your app must define a streaming endpoint at /stream using @fal.endpoint("/stream").
For example, calling fal_client.stream("your-username/your-app-name", ...) will connect to https://fal.run/your-username/your-app-name/stream.
Key Points
Your streaming endpoint must return a FastAPI StreamingResponse with media_type="text/event-stream". Each event is yielded as f"data: {json.dumps(payload)}\n\n" (note the double newline, which is part of the SSE spec). For images to display in the Playground, include both url and content_type in the event payload. Throttle your streaming to avoid sending every intermediate result, and use lower quality or resolution for previews to save bandwidth.
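As a rough illustration of the formatting and throttling points above, the sketch below serialises payloads as SSE events and emits only every fifth preview. The helper names `sse_event` and `throttled_events`, the example.com URLs, and the exact payload shape (a `step` field next to top-level `url` and `content_type`) are assumptions for illustration; in a real app the generator would be passed to a StreamingResponse.

```python
import json
from typing import Iterable, Iterator


def sse_event(payload: dict) -> str:
    """Serialise one payload as an SSE event; the blank line ends the event."""
    return f"data: {json.dumps(payload)}\n\n"


def throttled_events(previews: Iterable[dict], every: int = 5) -> Iterator[str]:
    """Emit only every `every`-th preview to save bandwidth."""
    for index, preview in enumerate(previews, start=1):
        if index % every == 0:
            yield sse_event(preview)


# Ten fake previews; the URLs are placeholders, not real files.
previews = [
    {
        "step": i,
        "url": f"https://example.com/preview-{i}.jpg",
        "content_type": "image/jpeg",
    }
    for i in range(1, 11)
]
sse_chunks = list(throttled_events(previews, every=5))
```

With ten previews and `every=5`, only steps 5 and 10 reach the client.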
Next Steps
Realtime Endpoints
Bidirectional WebSocket communication for interactive apps
3D Progressive Rendering
Stream voxel data in real-time during 3D diffusion
Distributed Streaming
Stream results from multi-GPU workers