Sa2VA 8B Video Vision

fal-ai/sa2va/8b/video
Sa2VA is a multimodal large language model (MLLM) capable of question answering, visual prompt understanding, and dense object segmentation on both images and videos.
Inference
Commercial use

Result

<p>  A white pickup truck  </p>   [SEG]  is parked on the side of  <p>  the red building  </p>   [SEG] , creating a unique and eye-catching contrast.<|im_end|>
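
The sample result above mixes caption text with markup, so a small post-processing step may help when consuming it. The following is a minimal sketch, not part of the fal.ai API: it assumes that each `<p>...</p>` phrase immediately followed by `[SEG]` refers to one segmented object, in order of appearance, and that `<|im_end|>` is an end-of-turn token that can be discarded. The helper names `grounded_phrases` and `plain_caption` are hypothetical.

```python
import re

# Sample output copied verbatim from the Result panel above.
RAW = (
    "<p>  A white pickup truck  </p>   [SEG]  is parked on the side of  "
    "<p>  the red building  </p>   [SEG] , creating a unique and "
    "eye-catching contrast.<|im_end|>"
)

# Assumption: a phrase wrapped in <p>...</p> and followed by [SEG]
# names one segmented object.
PHRASE_SEG = re.compile(r"<p>\s*(.*?)\s*</p>\s*\[SEG\]")


def grounded_phrases(text):
    """Return the <p>...</p> phrases that are followed by [SEG], in order."""
    return [m.group(1) for m in PHRASE_SEG.finditer(text)]


def plain_caption(text):
    """Strip the markup and the end-of-turn token, leaving a readable caption."""
    text = text.replace("<|im_end|>", "")
    text = re.sub(r"</?p>|\[SEG\]", " ", text)
    text = re.sub(r"\s+", " ", text).strip()
    # Remove the space left in front of punctuation after tag removal.
    return re.sub(r"\s+([,.;:!?])", r"\1", text)


if __name__ == "__main__":
    print(grounded_phrases(RAW))
    print(plain_caption(RAW))
```

If the response also carries segmentation masks, the i-th phrase returned by `grounded_phrases` would, under the assumption above, pair with the i-th mask; that pairing should be checked against the actual response before relying on it.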
