Skip to main content

Documentation Index

Fetch the complete documentation index at: https://fal.ai/docs/llms.txt

Use this file to discover all available pages before exploring further.

Run ComfyUI workflows on fal serverless GPUs while editing them visually from your local machine. Two pre-built packages handle setup so you don’t need to write any code:
  • User package — open the ComfyUI canvas and run workflows on a deployed fal endpoint.
  • Admin package — add models, custom nodes, and Python deps in a small web UI, then deploy to fal in one click.
Pick the visual editor for prototyping and creative iteration. When you’re ready to ship the workflow as an HTTP endpoint, jump to Deploy a ComfyUI SDXL Turbo App for the production code-first flow.
Admin web UI showing the Models tab — folders, filenames and download URLs in a table

How it works

The proxy implements ComfyUI’s full API locally. Each prompt is streamed to a fal runner — progress, previews, and outputs come back in real time. Nothing about the canvas or upload flow leaves your machine until you queue a prompt.

For users — open the canvas in three steps

1

Download

Get the user package zip from your team lead, or directly if they’ve shared the link with you. Unzip it anywhere.The unzipped folder contains a README.md, a launcher (ComfyUI.command on macOS, ComfyUI.bat on Windows), and an app/ folder with the internals.
2

Double-click the launcher

  • macOS — right-click ComfyUI.commandOpenOpen in the Gatekeeper dialog (only on first launch).
  • Windows — double-click ComfyUI.bat. If SmartScreen pops up, choose More infoRun anyway. Requires Python 3.11 — check “Add Python to PATH” during install.
First run installs everything automatically (a tool called uv, Python 3.11, the proxy dependencies). Takes 1–3 minutes. Subsequent launches are instant.
3

Paste your fal API key + endpoint

A browser tab opens on http://localhost:9998. Enter:
  • fal API key — from fal.ai/dashboard/keys.
  • Endpoint — the slug your admin gave you, in the form <team>/<app> (e.g. acme/comfy).
Click Start ComfyUI. A second tab opens on http://localhost:8188 with the full ComfyUI canvas. Build, queue, and iterate as usual — every prompt runs on the fal GPU.
The first prompt of the day takes 5–10 minutes to warm up — the runner downloads model weights and installs custom nodes. After that, prompts run in seconds while the runner stays warm. “Reconnecting” toasts that occasionally flash in ComfyUI are cosmetic.

For admins — configure and deploy

Team leads use the admin package to control which models, custom nodes, and Python packages live in the deployed runtime.
1

Download the admin package

Get the admin package zip and unzip. Double-click admin-ui.command (macOS) or admin-ui.bat (Windows). The browser opens on http://localhost:9999.
2

Settings — credentials and tokens

Open the Settings tab and fill in:
  • fal API key — generate at fal.ai/dashboard/keys after switching to the right team in the dashboard. The team is encoded in the key, so creating it on the wrong team deploys to the wrong place.
  • App name — the slug for your deployment (e.g. comfy).
  • HuggingFace token (optional) — needed for gated models such as Flux 2 Klein. Saved to .env and pushed to fal as a runner secret automatically at deploy time.
  • Civitai token (optional) — same idea, for Civitai-hosted weights.
The top-right header shows Deploying as <team> so you can verify the key targets the right account before pushing anything.
3

Edit Models, Custom Nodes, Python Deps

Three tabs in the admin UI mirror the structure of the deployed runtime:
  • Models — folder + filename + download URL. Sub-folders are supported (klein/my-lora.safetensors saves into loras/klein/).
  • Custom Nodes — paste a GitHub URL and pin a commit. The UI fetches the repo’s requirements.txt so you can confirm extra Python deps before saving.
  • Python Deps — direct editor over comfy_server/requirements.txt.
Add a fal-trained LoRA the same way: paste its .safetensors URL (e.g. https://v3b.fal.media/...) and set folder to loras.
4

Click Deploy

The Deploy button in the top bar runs fal deploy end-to-end and streams logs to the bottom drawer. Wait for [DEPLOY SUCCESSFUL].Share the resulting <team>/<app> slug with your users along with the user zip from the section above.

What the admin UI gives you on top of fal deploy

  • Live runner view — chip in the header lists current GPU runners for your app, with one-click log streaming and termination. Pulsing green dot when at least one runner is RUNNING.
  • Outputs gallery — every image and video produced by the proxy, with per-file download and a “Download all” zip. Filenames are auto-disambiguated when multiple workflows produce the same base name (video.mp4video_2.mp4 → …).
  • Run ComfyUI — starts the bundled local proxy and opens the canvas pointing at your deployed app. The button shows the live runner state (cold-starting, idle · ready, processing).
  • Theme + ComfyUI commit pin — dark/light toggle, plus a clickable chip to pin a specific upstream ComfyUI commit (validated against comfyanonymous/ComfyUI).
  • Undeploy — confirmation-gated button under Settings → Danger zone if you ever want to retire an app.

Multi-user concurrency

Each user runs the proxy locally, so there’s no cross-machine state to worry about. fal itself routes incoming prompts to whichever runner is free, capped by your app’s max_concurrency. Tune it once via the CLI:
fal apps scale <app> --min-concurrency 1 --max-concurrency 5 --max-multiplexing 1
SettingEffect
max_concurrency = 1Only one runner ever — second user’s prompt queues. Cheapest.
max_concurrency = NUp to N runners in parallel. Each cold-starts the first time.
min_concurrency = 1Keep one runner always warm — no cold-start for the first user.
max_multiplexing = 1One prompt per runner at a time. Recommended for GPU-heavy workflows.

When to use this vs. the SDXL deploy guide

ScenarioUse this pageUse Deploy SDXL Turbo
Iterating on a workflow visually
Sharing a creative tool with a non-technical team
Wrapping a fixed workflow as a REST API for production
Custom request/response schemas (Pydantic)
Fine control over fal.App lifecycle, secrets, multi-GPU
The visual editor is the right starting point. When the workflow is locked and you need a cleanly typed endpoint, the SDXL Turbo guide shows the code-first flow.

Troubleshooting

IssueFix
macOS: “can’t be opened, unidentified developer”Right-click .commandOpenOpen.
Windows: SmartScreen blocks the .batMore infoRun anyway.
Launcher fails to install uvInstall it manually: curl -LsSf https://astral.sh/uv/install.sh | sh (macOS) or irm https://astral.sh/uv/install.ps1 | iex in PowerShell (Windows). Re-launch.
Deploy fails with 401 Unauthorized on a HuggingFace URLThe model is gated. Set the HuggingFace token in admin Settings and re-deploy — the admin UI pushes it to fal secrets automatically.
Runner stuck in SETUP for several minutesNormal on first cold start while weights download (5–10 min).
Two runners spawn for one userLower max_concurrency: fal apps scale <app> --max-concurrency 1.
Port already in use on :9999, :9998, or :8188The launcher auto-kills stale instances on next start. If something else is binding the port, close it before relaunching.
Stop everythingClose the launcher terminal window. The proxy and admin UI shut down with it.

Uninstall

Delete the unzipped folder. Everything is self-contained — local Python venv, models cache, env file, logs. Nothing is installed globally.