Sa2VA 4B Image Vision

fal-ai/sa2va/4b/image
Sa2VA is a multimodal large language model (MLLM) capable of question answering, visual prompt understanding, and dense object segmentation at both the image and video level.
Inference
Commercial use

Result

<p>  A white pickup truck  </p>   [SEG]  is parked on the side of  <p>  the red building  </p>   [SEG] , creating a unique and eye-catching contrast.<|im_end|>
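
The sample result above interleaves caption text with `<p>…</p>` phrase markers, `[SEG]` tokens, and a trailing `<|im_end|>` token. The following is a minimal sketch of how such a string might be post-processed. It assumes that each `<p>…</p>` phrase immediately followed by `[SEG]` names one segmented object, which is inferred from the sample and not confirmed by this page. The helper names `grounded_phrases` and `plain_caption` are hypothetical and are not part of any Sa2VA or fal API.

```python
import re

# Raw result string copied from the sample output above.
RAW = (
    "<p>  A white pickup truck  </p>   [SEG]  is parked on the side of  "
    "<p>  the red building  </p>   [SEG] , creating a unique and "
    "eye-catching contrast.<|im_end|>"
)

# A phrase wrapped in <p>...</p> that is directly followed by a [SEG] token.
PHRASE_SEG = re.compile(r"<p>\s*(.*?)\s*</p>\s*\[SEG\]")


def grounded_phrases(text: str) -> list[str]:
    """Return the phrases tagged with [SEG], in order of appearance.

    Assumption: each phrase corresponds to one segmented object.
    """
    return PHRASE_SEG.findall(text)


def plain_caption(text: str) -> str:
    """Strip markup tokens and normalize whitespace to get a readable caption."""
    text = text.replace("<|im_end|>", "")          # end-of-turn token
    text = re.sub(r"</?p>|\[SEG\]", " ", text)     # phrase tags and [SEG] tokens
    text = re.sub(r"\s+", " ", text).strip()       # collapse runs of whitespace
    return re.sub(r"\s+([,.;:!?])", r"\1", text)   # no space before punctuation


if __name__ == "__main__":
    print(grounded_phrases(RAW))
    print(plain_caption(RAW))
```

Run on the sample, this yields the two phrases `A white pickup truck` and `the red building`, plus a caption with the markup removed. Pairing each phrase with an actual mask would depend on the structure of the endpoint's response, which this page does not show.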
