fal-ai/florence-2-large/caption-to-phrase-grounding

Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks.

Inference

Commercial use


Input

Image Url*

Provide an image by URL or by uploading a file. Accepted file types: jpg, jpeg, png, webp, gif, avif.

Text Input*
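
As a rough illustration of the two required inputs above, the sketch below assembles and checks a request payload before it is sent anywhere. The key names `image_url` and `text_input` are assumptions derived from the form labels, not confirmed field names, and `build_arguments` is a hypothetical helper. The extension check is only a local heuristic based on the accepted file types listed on this page; the service may well accept URLs that carry no file extension.

```python
from urllib.parse import urlparse

# Endpoint ID as shown at the top of this page.
ENDPOINT = "fal-ai/florence-2-large/caption-to-phrase-grounding"

# File types listed under "Image Url" on this page.
ACCEPTED_TYPES = {"jpg", "jpeg", "png", "webp", "gif", "avif"}


def build_arguments(image_url: str, text_input: str) -> dict:
    """Validate both required inputs and return a request payload.

    The keys `image_url` and `text_input` are assumed from the form
    labels; check the endpoint's schema before relying on them.
    """
    if not text_input.strip():
        raise ValueError("text_input is required")

    parsed = urlparse(image_url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError("image_url must be an http(s) URL")

    # Local heuristic: take the text after the last dot in the URL path.
    ext = parsed.path.rsplit(".", 1)[-1].lower() if "." in parsed.path else ""
    if ext not in ACCEPTED_TYPES:
        raise ValueError(f"unsupported file type: {ext or 'none'}")

    return {"image_url": image_url, "text_input": text_input}
```

The returned dictionary could then be passed, together with `ENDPOINT`, to whichever client is used to call the model.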


Your request will cost $0 per compute second.
