# Sa2VA 8B Video

> Sa2VA is a multimodal large language model (MLLM) capable of question answering, visual prompt understanding, and dense object segmentation at both the image and video levels.


## Overview

- **Endpoint**: `https://fal.run/fal-ai/sa2va/8b/video`
- **Model ID**: `fal-ai/sa2va/8b/video`
- **Category**: vision
- **Kind**: inference
- **Tags**: multimodal, vision


## API Information

This model can be used via our HTTP API or more conveniently via our client libraries.
See the input and output schema below, as well as the usage examples.


### Input Schema

The API accepts the following input parameters:


- **`prompt`** (`string`, _required_):
  Prompt to be used for the chat completion
  - Examples: "Could you please give me a brief description of the video? Please respond with interleaved segmentation masks for the corresponding parts of the answer."

- **`video_url`** (`string`, _required_):
  The URL of the input video.
  - Examples: "https://drive.google.com/uc?id=1iOFYbNITYwrebBBp9kaEGhBndFSRLz8k"

- **`num_frames_to_sample`** (`integer`, _optional_):
  Number of frames to sample from the video. If not provided, all frames are sampled.
  - Range: `1` to `100`
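
The constraints listed above can be checked locally before a request is sent. The following is a minimal sketch, not part of the fal client: `build_input` is a hypothetical helper name, and it assumes only what the schema states (two required strings, plus an optional integer between `1` and `100`).

```python
from typing import Optional


def build_input(
    prompt: str,
    video_url: str,
    num_frames_to_sample: Optional[int] = None,
) -> dict:
    """Build the request payload, enforcing the documented constraints.

    Hypothetical helper for illustration; the API performs its own validation.
    """
    if not prompt:
        raise ValueError("prompt is required")
    if not video_url:
        raise ValueError("video_url is required")

    payload = {"prompt": prompt, "video_url": video_url}

    # The optional parameter is only included when the caller sets it,
    # so the documented default (all frames are sampled) applies otherwise.
    if num_frames_to_sample is not None:
        if not 1 <= num_frames_to_sample <= 100:
            raise ValueError("num_frames_to_sample must be between 1 and 100")
        payload["num_frames_to_sample"] = num_frames_to_sample

    return payload
```

The returned dictionary can be passed as the request body, or as the `arguments` value when using the Python client.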


**Required Parameters Example**:

```json
{
  "prompt": "Could you please give me a brief description of the video? Please respond with interleaved segmentation masks for the corresponding parts of the answer.",
  "video_url": "https://drive.google.com/uc?id=1iOFYbNITYwrebBBp9kaEGhBndFSRLz8k"
}
```


### Output Schema

The API returns the following output format:

- **`output`** (`string`, _required_):
  Generated output
  - Examples: "<p>  Two children  </p>   [SEG]  are jumping on  <p>  a bed  </p>   [SEG]  .<|im_end|>"

- **`masks`** (`list<File>`, _required_):
  Mask videos, returned as a list of `File` objects (the schema type is a list, not a dictionary)
  - Array of File
  - Examples: [{"file_size":3259012,"file_name":"output_0.mp4","content_type":"application/octet-stream","url":"https://v3.fal.media/files/kangaroo/KSuUWm24leGew4jTouuTM_output_0.mp4"},{"file_size":1241471,"file_name":"output_1.mp4","content_type":"application/octet-stream","url":"https://v3.fal.media/files/monkey/0jHCYm2lZM6FjDmtXw1Kt_output_1.mp4"}]


**Example Response**:

```json
{
  "output": "<p>  Two children  </p>   [SEG]  are jumping on  <p>  a bed  </p>   [SEG]  .<|im_end|>",
  "masks": [
    {
      "file_size": 3259012,
      "file_name": "output_0.mp4",
      "content_type": "application/octet-stream",
      "url": "https://v3.fal.media/files/kangaroo/KSuUWm24leGew4jTouuTM_output_0.mp4"
    },
    {
      "file_size": 1241471,
      "file_name": "output_1.mp4",
      "content_type": "application/octet-stream",
      "url": "https://v3.fal.media/files/monkey/0jHCYm2lZM6FjDmtXw1Kt_output_1.mp4"
    }
  ]
}
```
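
In the example response, each phrase wrapped in `<p>...</p>` is followed by a `[SEG]` token, and `masks` contains the same number of entries. The sketch below pairs them up by position. This ordering is an assumption drawn from the example, not something the schema states, and `pair_phrases_with_masks` is a hypothetical helper name.

```python
import re

# Matches a phrase wrapped in <p>...</p> that is followed by a [SEG] token.
SEG_PATTERN = re.compile(r"<p>\s*(.*?)\s*</p>\s*\[SEG\]", re.DOTALL)


def pair_phrases_with_masks(output: str, masks: list) -> list:
    """Return (phrase, mask_url) pairs.

    Assumption: the i-th [SEG] token corresponds to masks[i].
    """
    phrases = SEG_PATTERN.findall(output)
    # zip stops at the shorter sequence, so a count mismatch drops extras
    # instead of raising.
    return [(phrase, mask["url"]) for phrase, mask in zip(phrases, masks)]
```

Applied to the example response, this yields one pair for "Two children" and one for "a bed", each with the URL of the mask video at the same position.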


## Usage Examples

### cURL

```bash
curl --request POST \
  --url https://fal.run/fal-ai/sa2va/8b/video \
  --header "Authorization: Key $FAL_KEY" \
  --header "Content-Type: application/json" \
  --data '{
     "prompt": "Could you please give me a brief description of the video? Please respond with interleaved segmentation masks for the corresponding parts of the answer.",
     "video_url": "https://drive.google.com/uc?id=1iOFYbNITYwrebBBp9kaEGhBndFSRLz8k"
   }'
```

### Python

Ensure you have the Python client installed:

```bash
pip install fal-client
```

Then use the API client to make requests:

```python
import fal_client

def on_queue_update(update):
    if isinstance(update, fal_client.InProgress):
        for log in update.logs:
            print(log["message"])

result = fal_client.subscribe(
    "fal-ai/sa2va/8b/video",
    arguments={
        "prompt": "Could you please give me a brief description of the video? Please respond with interleaved segmentation masks for the corresponding parts of the answer.",
        "video_url": "https://drive.google.com/uc?id=1iOFYbNITYwrebBBp9kaEGhBndFSRLz8k"
    },
    with_logs=True,
    on_queue_update=on_queue_update,
)
print(result)
```
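
The `result` returned above holds the mask videos as URLs rather than file contents. A minimal sketch for saving them locally with the standard library follows. `save_masks` and its `fetch` parameter are hypothetical names introduced for illustration, and the code assumes `result` is a dictionary shaped like the example response.

```python
import os
import urllib.request


def save_masks(result: dict, dest_dir: str = ".", fetch=urllib.request.urlopen) -> list:
    """Download every file in result["masks"] into dest_dir.

    Hypothetical helper; `fetch` is injectable so the function can be
    exercised without network access.
    """
    os.makedirs(dest_dir, exist_ok=True)
    paths = []
    for mask in result["masks"]:
        # basename guards against a file_name that contains path separators.
        path = os.path.join(dest_dir, os.path.basename(mask["file_name"]))
        with fetch(mask["url"]) as response, open(path, "wb") as fh:
            fh.write(response.read())
        paths.append(path)
    return paths
```

Calling `save_masks(result, "masks")` would write one file per entry, such as `masks/output_0.mp4`, and return the list of local paths.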

### JavaScript

Ensure you have the JavaScript client installed:

```bash
npm install --save @fal-ai/client
```

Then use the API client to make requests:

```javascript
import { fal } from "@fal-ai/client";

const result = await fal.subscribe("fal-ai/sa2va/8b/video", {
  input: {
    prompt: "Could you please give me a brief description of the video? Please respond with interleaved segmentation masks for the corresponding parts of the answer.",
    video_url: "https://drive.google.com/uc?id=1iOFYbNITYwrebBBp9kaEGhBndFSRLz8k"
  },
  logs: true,
  onQueueUpdate: (update) => {
    if (update.status === "IN_PROGRESS") {
      update.logs.map((log) => log.message).forEach(console.log);
    }
  },
});
console.log(result.data);
console.log(result.requestId);
```


## Additional Resources

### Documentation

- [Model Playground](https://fal.ai/models/fal-ai/sa2va/8b/video)
- [API Documentation](https://fal.ai/models/fal-ai/sa2va/8b/video/api)
- [OpenAPI Schema](https://fal.ai/api/openapi/queue/openapi.json?endpoint_id=fal-ai/sa2va/8b/video)

### fal.ai Platform

- [Platform Documentation](https://docs.fal.ai)
- [Python Client](https://docs.fal.ai/clients/python)
- [JavaScript Client](https://docs.fal.ai/clients/javascript)