MiniCPM-V 2.6 Vision
fal-ai/mini-cpm/video
Multimodal vision-language model for video understanding
Inference
Research only
Input
Hint: you can drag and drop file(s) here, or provide a base64-encoded data URL.
Accepted file types: mp4, mov, webm, m4v, gif
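For the data URL option, the following is a minimal sketch of how a video's bytes could be turned into a base64-encoded data URL. The helper name `to_data_url` and the extension-to-MIME mapping are illustrative assumptions, not part of this page or of any fal client library. Which MIME types the endpoint actually expects is not stated here.

```python
import base64
import os

# Assumed MIME types for the accepted extensions listed above.
# This mapping is illustrative; the endpoint's own expectations may differ.
MIME_TYPES = {
    ".mp4": "video/mp4",
    ".mov": "video/quicktime",
    ".webm": "video/webm",
    ".m4v": "video/x-m4v",
    ".gif": "image/gif",
}


def to_data_url(filename: str, data: bytes) -> str:
    """Build a base64 data URL from a file name and its raw bytes.

    Hypothetical helper: the file name is only used to pick a MIME type.
    """
    ext = os.path.splitext(filename)[1].lower()
    if ext not in MIME_TYPES:
        raise ValueError(f"unsupported file type: {ext or '(none)'}")
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{MIME_TYPES[ext]};base64,{encoded}"


if __name__ == "__main__":
    # Tiny placeholder payload; a real call would read the video file's bytes.
    print(to_data_url("clip.mp4", b"abc"))
```

Base64 inflates the payload by roughly a third, so for large videos a hosted file URL or the drag-and-drop upload is likely the more practical choice.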
Related Models
fal-ai/imageutils/nsfw
vision
Predict the probability of an image being NSFW.
nsfw filter
image classification
safety
fal-ai/llava-next
vision
Vision
vision language model (VLM)
multimodal
vision
fal-ai/florence-2-large/region-to-category
vision
Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks
vision language model (VLM)
multimodal
vision