Spans still buffered in BatchSpanProcessor are silently dropped when a runner shuts down unless they are flushed before SIGKILL arrives. For basic tracing setup, start with Custom Traces.
Conditional Tracing
Disable tracing entirely with an environment variable when you do not need it. When no TracerProvider is configured, the OpenTelemetry SDK returns a no-op tracer. All start_as_current_span calls become zero-cost no-ops and no export threads are started.
Python
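The gating described above might look like the following sketch. It assumes tracing stays on when ENABLE_TRACING is unset, treats a few common spellings of "false" as off, and uses ConsoleSpanExporter as a placeholder for whatever exporter your backend needs. The helper names tracing_enabled and setup_tracing are hypothetical.

```python
import os

# Values that switch tracing off (assumption: anything else means "on").
_OFF_VALUES = {"false", "0", "no", "off"}


def tracing_enabled(env=os.environ) -> bool:
    """Return False only when ENABLE_TRACING is set to an "off" value."""
    return env.get("ENABLE_TRACING", "true").strip().lower() not in _OFF_VALUES


def setup_tracing(env=os.environ):
    """Configure a TracerProvider only when tracing is enabled."""
    if not tracing_enabled(env):
        # No provider is registered, so the API hands out no-op tracers.
        return None

    # Imported lazily so a disabled app never loads the SDK
    # and never starts an export thread.
    from opentelemetry import trace
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import (
        BatchSpanProcessor,
        ConsoleSpanExporter,  # placeholder exporter for illustration
    )

    provider = TracerProvider()
    provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
    trace.set_tracer_provider(provider)
    return provider
```

Calling setup_tracing() once at startup keeps the rest of the code unchanged: instrumented functions call start_as_current_span as usual and get a no-op span when tracing is off.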
Set ENABLE_TRACING=false as a fal secret to turn off tracing without redeploying.
Sampling
Sampling reduces the fraction of traces exported. Use it when your app handles many requests per second and exporting every trace would exceed your backend’s ingestion limits or add measurable latency.
ParentBased(root=TraceIdRatioBased(rate)) is the recommended sampler for services that receive calls from other services. It honours the sampling decision made upstream (so a trace that was sampled by a caller is always sampled here too), and applies ratio-based sampling only to new root spans.
Python
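One way to wire up the sampler is sketched below. It assumes the app reads OTEL_SAMPLE_RATE itself, exports everything (rate 1.0) when the variable is missing or malformed, and clamps out-of-range values. The helper names parse_sample_rate and build_sampler are hypothetical.

```python
import os


def parse_sample_rate(raw, default: float = 1.0) -> float:
    """Parse a sample rate, clamping to [0.0, 1.0]; fall back on bad input."""
    try:
        rate = float(raw)
    except (TypeError, ValueError):
        return default
    return min(1.0, max(0.0, rate))


def build_sampler(env=os.environ):
    """Build a parent-respecting ratio sampler from OTEL_SAMPLE_RATE."""
    # Imported here so the parsing helper stays usable without the SDK.
    from opentelemetry.sdk.trace.sampling import ParentBased, TraceIdRatioBased

    rate = parse_sample_rate(env.get("OTEL_SAMPLE_RATE"))
    # Upstream decisions are honoured; the ratio applies to new root spans only.
    return ParentBased(root=TraceIdRatioBased(rate))
```

The result would be passed to the provider at construction time, for example TracerProvider(sampler=build_sampler()).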
Set OTEL_SAMPLE_RATE=0.1 under heavy load and increase it if you need more coverage for debugging.
Batch Export Tuning
BatchSpanProcessor buffers spans in memory and exports them in bulk. Tune these parameters if you see dropped spans under burst traffic or export timeouts in your backend logs.
Python
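A sketch of passing tuned values to BatchSpanProcessor follows. The BatchTuning dataclass and build_processor helper are hypothetical conveniences, and the burst values (8192 and 1024) are illustrative rather than recommendations. The defaults mirror the table in this section.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class BatchTuning:
    """BatchSpanProcessor settings; defaults match the table in this section."""

    max_queue_size: int = 2048
    max_export_batch_size: int = 512
    schedule_delay_millis: int = 5000
    export_timeout_millis: int = 30000

    def __post_init__(self):
        # A batch larger than the queue can never fill, so fail early.
        if self.max_export_batch_size > self.max_queue_size:
            raise ValueError(
                "max_export_batch_size must not exceed max_queue_size"
            )


def build_processor(exporter, tuning: BatchTuning = BatchTuning()):
    """Create a BatchSpanProcessor for the given exporter and settings."""
    # Imported lazily so the settings object can be used without the SDK.
    from opentelemetry.sdk.trace.export import BatchSpanProcessor

    return BatchSpanProcessor(exporter, **asdict(tuning))


# Illustrative settings for bursty traffic: a larger queue and larger batches.
BURST_TUNING = BatchTuning(max_queue_size=8192, max_export_batch_size=1024)
```

Keeping the settings in one object makes it easier to change a single parameter at a time and compare dropped-span counts before and after.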
| Parameter | Default | When to increase |
|---|---|---|
| max_queue_size | 2048 | Seeing dropped spans under burst traffic |
| max_export_batch_size | 512 | High span volume with low per-request overhead |
| schedule_delay_millis | 5000 | Backend is overwhelmed by frequent export requests |
| export_timeout_millis | 30000 | Backend is slow or on a high-latency network |
Flushing on Shutdown
fal gives runners a 5-second grace period between SIGTERM and SIGKILL. teardown() runs during that window, after all in-flight requests finish. See App Lifecycle for the full shutdown sequence.
Call force_flush() in teardown(), not in your endpoint handler. Per-request flushing adds latency on every call; BatchSpanProcessor handles periodic exports on its own schedule.
Python
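The shutdown flush could be sketched as follows. The flush_traces helper and the ExampleApp class are hypothetical stand-ins, not fal's API, and the 4000 ms timeout is an assumption chosen to stay inside the 5-second grace period. The getattr check covers the case where tracing is disabled and the provider has no force_flush method.

```python
def flush_traces(provider, timeout_millis: int = 4000) -> bool:
    """Flush buffered spans; return True if done or nothing needed flushing."""
    flush = getattr(provider, "force_flush", None)
    if flush is None:
        # No-op or proxy provider: nothing is buffered.
        return True
    return bool(flush(timeout_millis=timeout_millis))


class ExampleApp:
    """Stand-in for an app class; only teardown() comes from the text above."""

    def __init__(self, provider):
        self.provider = provider

    def teardown(self):
        # Runs once during shutdown, so the flush cost is not paid per request.
        if not flush_traces(self.provider):
            print("warning: trace flush did not finish before the timeout")
```

In a real app, the provider would typically come from opentelemetry.trace.get_tracer_provider() rather than being passed in by hand.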
What’s Next
Custom Traces
Add spans to trace inference stages end to end
Custom Metrics
Counters, histograms, and gauges for your inference workload
Cross-Service Tracing
Connect traces across multiple fal apps
App Lifecycle
Full shutdown sequence and grace period details