Bagel Text to Image

fal-ai/bagel
Bagel is a 7B parameter multimodal model from ByteDance-Seed that can generate both text and images.

Bagel is a 7B parameter multimodal model from ByteDance-Seed that can generate both text and images. This versatile model supports text-to-image generation, image-to-image editing, and image understanding capabilities through an intuitive API.

Key Features

  • Text-to-Image Generation: Create images from text prompts
  • Image-to-Image Editing: Transform and edit existing images
  • Image Understanding: Analyze images and extract structured data (Image-to-JSON)
  • Multimodal Capabilities: Unified model for both text and image tasks
  • Cost-Effective: $0.1 per image generation

Getting Started

Getting up and running with Bagel takes just a few minutes. Here's everything you need to start generating content:

First, install your preferred client library:

For JavaScript/TypeScript:

npm install --save @fal-ai/client

For Python:

pip install fal-client

Configure your authentication by setting up your API key:

JavaScript:

import { fal } from "@fal-ai/client";

fal.config({
  credentials: "YOUR_FAL_KEY_HERE"
});

Python:

import os

# Set the key before making any requests; fal_client reads FAL_KEY from the environment
os.environ["FAL_KEY"] = "YOUR_FAL_KEY_HERE"

import fal_client

API Usage Examples

Text-to-Image Generation

Generate images from text prompts:

JavaScript:

import { fal } from "@fal-ai/client";

const result = await fal.subscribe("fal-ai/bagel", {
  input: {
    prompt: "A serene landscape with mountains at sunset"
  }
});

// The endpoint's output is returned under result.data
console.log(result.data);

Python:

import fal_client

result = fal_client.subscribe(
    "fal-ai/bagel",
    arguments={"prompt": "A serene landscape with mountains at sunset"},
)

# result is a dict holding the endpoint's output
print(result)

Image-to-Image Editing

Transform existing images with text prompts:

const result = await fal.subscribe("fal-ai/bagel/edit", {
  input: {
    image_url: "https://example.com/your-image.jpg",
    prompt: "Transform this into a watercolor painting"
  }
});

Image Understanding (Image-to-JSON)

Extract structured information from images:

const result = await fal.subscribe("fal-ai/bagel/understand", {
  input: {
    image_url: "https://example.com/your-image.jpg",
    prompt: "Describe the contents of this image"
  }
});
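
The editing and understanding examples above are shown in JavaScript only. A rough Python counterpart could look like the sketch below. It assumes that `fal_client.subscribe` takes the endpoint id plus an `arguments` dict, and that both endpoints accept the same `image_url` and `prompt` fields used above. The helper names `build_image_arguments` and `run_bagel` are hypothetical and not part of any library.

```python
def build_image_arguments(image_url: str, prompt: str) -> dict:
    """Build the input payload used by the edit and understand examples above."""
    if not image_url or not prompt:
        raise ValueError("both image_url and prompt are required")
    return {"image_url": image_url, "prompt": prompt}


def run_bagel(endpoint: str, image_url: str, prompt: str) -> dict:
    """Submit the request and wait for the result (needs FAL_KEY and network access)."""
    # Imported lazily so the payload builder can be used without the package installed.
    import fal_client

    return fal_client.subscribe(
        endpoint,
        arguments=build_image_arguments(image_url, prompt),
    )
```

For example, `run_bagel("fal-ai/bagel/edit", "https://example.com/your-image.jpg", "Transform this into a watercolor painting")` would mirror the JavaScript editing call.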

Advanced Usage and Best Practices

Error Handling

try {
  const result = await fal.subscribe("fal-ai/bagel", {
    input: { prompt: "your prompt" }
  });
} catch (error) {
  console.error("Generation failed:", error.message);
  // Implement appropriate fallback behavior
}
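
What "appropriate fallback behavior" means depends on the application. One possible approach, sketched here in Python, is to retry a few times with exponential backoff and then return `None` so the caller can show a placeholder. The `generate` callable is a stand-in for whatever function actually calls Bagel; the function name, retry count, and delays are illustrative assumptions rather than recommended values.

```python
import time
from typing import Callable, Optional


def generate_with_fallback(
    generate: Callable[[str], dict],
    prompt: str,
    retries: int = 2,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Call generate(prompt), retrying with exponential backoff.

    Returns None if every attempt fails, so the caller can fall back.
    """
    for attempt in range(retries + 1):
        try:
            return generate(prompt)
        except Exception as error:  # broad on purpose: no specific error types are assumed
            print(f"Generation failed (attempt {attempt + 1}): {error}")
            if attempt < retries:
                # Wait 1x, 2x, 4x ... the base delay before the next attempt.
                sleep(base_delay * 2 ** attempt)
    return None
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting.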

Working with Different Endpoints

Bagel offers three main endpoints:

  • fal-ai/bagel - Text-to-image generation
  • fal-ai/bagel/edit - Image-to-image editing
  • fal-ai/bagel/understand - Image understanding and analysis
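
To keep endpoint ids in one place, an application might map its own task names to the three ids listed above. The task names and the `endpoint_for` helper below are hypothetical; only the endpoint ids come from this page.

```python
# Endpoint ids as listed above; the task names on the left are illustrative.
BAGEL_ENDPOINTS = {
    "generate": "fal-ai/bagel",
    "edit": "fal-ai/bagel/edit",
    "understand": "fal-ai/bagel/understand",
}


def endpoint_for(task: str) -> str:
    """Return the Bagel endpoint id for a task name, or raise ValueError."""
    try:
        return BAGEL_ENDPOINTS[task]
    except KeyError:
        known = ", ".join(sorted(BAGEL_ENDPOINTS))
        raise ValueError(f"unknown task {task!r}; expected one of: {known}") from None
```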

File Upload Support

For image inputs, you can either provide URLs or upload files directly:

import { fal } from "@fal-ai/client";

// Upload a local file (imageData is your image's binary content, e.g. a Blob or ArrayBuffer)
const file = new File([imageData], "image.jpg", { type: "image/jpeg" });
const url = await fal.storage.upload(file);

// Use the uploaded file
const result = await fal.subscribe("fal-ai/bagel/edit", {
  input: {
    image_url: url,
    prompt: "Apply artistic style"
  }
});

Integration Guidelines

When integrating Bagel into your application:

  1. Initialize the client once at your application's entry point
  2. Implement proper error boundaries and fallback states
  3. Consider the multimodal nature when designing user interfaces
  4. Use appropriate endpoints based on your use case
  5. Handle both image and text outputs appropriately

Pricing

  • Cost: $0.1 per image generation
  • Pricing applies to all image generation operations (text-to-image and image-to-image)
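
As a quick budgeting aid, the per-image price above can be turned into a small estimate. This sketch assumes a flat $0.10 per generated image with no other charges; `estimate_cost` is a hypothetical helper, and `Decimal` is used to avoid floating-point rounding in currency amounts.

```python
from decimal import Decimal

# Flat price per generated image, taken from the pricing stated above.
PRICE_PER_IMAGE = Decimal("0.10")


def estimate_cost(num_images: int) -> Decimal:
    """Estimate the cost in USD of generating num_images images."""
    if num_images < 0:
        raise ValueError("num_images must not be negative")
    return PRICE_PER_IMAGE * num_images
```

Under these assumptions, `estimate_cost(25)` comes to 2.50 USD.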

Model Information

  • Parameters: 7B active parameters (14B total)
  • Developer: ByteDance-Seed
  • Type: Multimodal foundation model
  • Capabilities: Text generation, image generation, image understanding
  • Architecture: Multimodal transformer (Mixture-of-Transformer-Experts, MoT)

Supported File Formats

For image inputs:

  • Accepted formats: jpg, jpeg, png, webp, gif, avif
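
A client-side check against this list can catch unsupported files before a request is sent. The sketch below looks only at the file extension of a path or URL, so it is a convenience check and not proof of the actual file contents. `has_accepted_extension` is a hypothetical helper.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions taken from the accepted formats listed above.
ACCEPTED_EXTENSIONS = {"jpg", "jpeg", "png", "webp", "gif", "avif"}


def has_accepted_extension(path_or_url: str) -> bool:
    """Return True if the path or URL ends in one of the accepted extensions."""
    # urlparse drops any query string or fragment from a URL before the check.
    path = urlparse(path_or_url).path
    suffix = PurePosixPath(path).suffix.lower().lstrip(".")
    return suffix in ACCEPTED_EXTENSIONS
```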

Best Practices

  1. Prompt Engineering: Be descriptive in your prompts for better results
  2. Image Quality: Provide high-quality input images for image-to-image tasks
  3. Rate Limiting: Implement appropriate rate limiting in production
  4. Caching: Cache frequently requested generations when applicable
  5. Multimodal Workflows: Leverage the model's ability to work with both text and images
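
The caching suggestion above could be approached with a small in-memory cache keyed by endpoint and prompt. This is a minimal sketch: `GenerationCache` is a hypothetical class, and the `generate` callable stands in for the function that actually calls Bagel. Because generation is not necessarily deterministic, a cache like this simply returns the first result produced for a given prompt.

```python
from typing import Callable, Dict, Tuple


class GenerationCache:
    """Cache results in memory so repeated requests skip a paid generation."""

    def __init__(self, generate: Callable[[str, str], dict]):
        self._generate = generate
        self._results: Dict[Tuple[str, str], dict] = {}

    def get(self, endpoint: str, prompt: str) -> dict:
        """Return the cached result, generating it only on the first request."""
        normalized = prompt.strip()
        key = (endpoint, normalized)
        if key not in self._results:
            self._results[key] = self._generate(endpoint, normalized)
        return self._results[key]
```

A production setup would likely add expiry and a size limit, or use a shared store instead of process memory.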

Troubleshooting

Common Issues and Solutions:

Authentication Errors:

  • Verify API key is correctly set
  • Check API key permissions in your fal.ai dashboard
  • Ensure proper credential initialization
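
A quick way to rule out the most common authentication problem is to confirm that the key is present before making a request. The sketch below assumes the key is supplied through the `FAL_KEY` environment variable, as in the Python setup earlier. `fal_key_is_set` is a hypothetical helper, and it checks only that a value exists, not that the key is valid.

```python
import os
from typing import Mapping


def fal_key_is_set(env: Mapping[str, str] = os.environ) -> bool:
    """Return True if FAL_KEY holds a non-empty value that is not the placeholder."""
    key = env.get("FAL_KEY", "").strip()
    return bool(key) and key != "YOUR_FAL_KEY_HERE"
```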

Image Input Issues:

  • Verify image URLs are publicly accessible
  • Check supported file formats
  • Ensure proper encoding for base64 inputs
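
For base64 inputs, one common encoding is a data URI. The sketch below builds one from raw image bytes. It assumes the endpoint accepts data URIs in the `image_url` field, which is worth confirming against the API schema. `to_data_uri` is a hypothetical helper, and the MIME type must match the actual image format.

```python
import base64


def to_data_uri(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw image bytes as a base64 data URI."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

Large images produce long strings, so uploading the file and passing its URL may be the better fit for bigger inputs.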

Generation Quality:

  • Use detailed, descriptive prompts
  • For image editing, ensure input image quality is sufficient
  • Experiment with different prompt phrasings

About Bagel

Bagel represents a significant advancement in multimodal AI, offering unified capabilities for both text and image tasks. Developed by ByteDance-Seed, it demonstrates strong performance across various benchmarks and provides a versatile solution for creative AI applications.

For more information and to explore the model's capabilities, visit the Bagel model page on fal.ai.

Support

For production deployments and additional support: