Bagel | Text to Image

Bagel is a multimodal model from ByteDance-Seed with 7B active parameters (14B total) that can generate both text and images. It supports text-to-image generation, image-to-image editing, and image understanding through the fal.ai API.

Key Features

Text-to-Image Generation: Create images from text prompts
Image-to-Image Editing: Transform and edit existing images
Image Understanding: Analyze images and extract structured data (Image-to-JSON)
Multimodal Capabilities: Unified model for both text and image tasks
Cost-Effective: $0.10 per image generation

Getting Started

Setup takes two steps: install a client library, then configure your API key.

First, install your preferred client library:

For JavaScript/TypeScript:

bash
npm install --save @fal-ai/client

For Python:

bash
pip install fal-client

Configure your authentication by setting up your API key:

JavaScript:

javascript
import { fal } from "@fal-ai/client";

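// Replace the placeholder with your key. In production, load it from an
// environment variable instead of hard-coding it.
fal.config({
  credentials: "YOUR_FAL_KEY_HERE"
});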

Python:

python
import os

# The fal-client package reads the key from the FAL_KEY environment variable,
# so set it before making any requests.
os.environ["FAL_KEY"] = "YOUR_FAL_KEY_HERE"
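
A missing or placeholder key usually shows up only as an authentication error on the first request. The sketch below checks for it up front. The helper name `require_fal_key` is hypothetical and not part of any fal library; only the `FAL_KEY` variable name comes from the setup above.

```python
import os


def require_fal_key(env=None):
    """Return the FAL_KEY value, or raise a clear error if it is missing."""
    # Allow a plain dict to be passed in for testing; default to the real environment.
    env = os.environ if env is None else env
    key = env.get("FAL_KEY", "").strip()
    # Treat the documentation placeholder as "not configured".
    if not key or key == "YOUR_FAL_KEY_HERE":
        raise RuntimeError("FAL_KEY is not set; export it before calling the API")
    return key
```

Calling `require_fal_key()` once at application start-up turns a late authentication failure into an immediate configuration error.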

API Usage Examples

Text-to-Image Generation

Generate images from text prompts:

JavaScript:

javascript
import { fal } from "@fal-ai/client";

const result = await fal.subscribe("fal-ai/bagel", {
  input: {
    prompt: "A serene landscape with mountains at sunset"
  }
});

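// The client wraps the model output in `data`. Check the endpoint's output
// schema for the exact field that holds the image URL.
console.log(result.data);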

Python:

python
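import fal_client  # installed by `pip install fal-client`

result = fal_client.subscribe(
    "fal-ai/bagel",
    arguments={"prompt": "A serene landscape with mountains at sunset"},
)

# The result is a dict. Check the endpoint's output schema for the exact
# field that holds the image URL.
print(result)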

Image-to-Image Editing

Transform existing images with text prompts:

javascript
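// Assumes `fal` is imported and configured as shown in Getting Started
const result = await fal.subscribe("fal-ai/bagel/edit", {
  input: {
    image_url: "https://example.com/your-image.jpg",
    prompt: "Transform this into a watercolor painting"
  }
});

// The client wraps the model output in `data`
console.log(result.data);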

Image Understanding (Image-to-JSON)

Extract structured information from images:

javascript
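// Assumes `fal` is imported and configured as shown in Getting Started
const result = await fal.subscribe("fal-ai/bagel/understand", {
  input: {
    image_url: "https://example.com/your-image.jpg",
    prompt: "Describe the contents of this image"
  }
});

// The client wraps the model output in `data`
console.log(result.data);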

Advanced Usage and Best Practices

Error Handling

javascript
try {
  const result = await fal.subscribe("fal-ai/bagel", {
    input: { prompt: "your prompt" }
  });
} catch (error) {
  console.error("Generation failed:", error.message);
  // Implement appropriate fallback behavior
}
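
For transient failures, one possible fallback is to retry with exponential backoff. The following is a minimal Python sketch, not part of the fal client: `call_with_retries` and its parameters are hypothetical names. It retries on any exception, which is cruder than production code should be, since authentication or validation errors will not succeed on a retry.

```python
import time


def call_with_retries(call, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Run call() and retry on any exception, doubling the delay each time."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error to the caller
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * (2 ** attempt))
```

Usage would look like `call_with_retries(lambda: fal_client.subscribe("fal-ai/bagel", arguments={"prompt": "your prompt"}))`, assuming `fal_client` is installed and `FAL_KEY` is set.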

Working with Different Endpoints

Bagel offers three main endpoints:

`fal-ai/bagel` - Text-to-image generation
`fal-ai/bagel/edit` - Image-to-image editing
`fal-ai/bagel/understand` - Image understanding and analysis
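
One way to keep endpoint selection in a single place is a small lookup plus a request builder. This is a sketch: the task labels (`generate`, `edit`, `understand`) and the function name `build_request` are hypothetical, while the endpoint IDs and the `prompt` and `image_url` fields are the ones used in the examples in this document.

```python
BAGEL_ENDPOINTS = {
    "generate": "fal-ai/bagel",
    "edit": "fal-ai/bagel/edit",
    "understand": "fal-ai/bagel/understand",
}


def build_request(task, prompt, image_url=None):
    """Return (endpoint_id, arguments) for the given task label."""
    if task not in BAGEL_ENDPOINTS:
        raise ValueError(f"unknown task: {task!r}")
    arguments = {"prompt": prompt}
    if task != "generate":
        # Editing and understanding both operate on an existing image.
        if not image_url:
            raise ValueError(f"task {task!r} requires image_url")
        arguments["image_url"] = image_url
    return BAGEL_ENDPOINTS[task], arguments
```

The returned pair can be passed straight to the Python client, for example `fal_client.subscribe(endpoint, arguments=arguments)`.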

File Upload Support

For image inputs, you can either provide URLs or upload files directly:

javascript
import { fal } from "@fal-ai/client";

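// Upload a local file. `imageData` is binary image content you already hold,
// for example a Blob from a file input or an ArrayBuffer read from disk.
const file = new File([imageData], "image.jpg", { type: "image/jpeg" });
const url = await fal.storage.upload(file);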

// Use the uploaded file
const result = await fal.subscribe("fal-ai/bagel/edit", {
  input: {
    image_url: url,
    prompt: "Apply artistic style"
  }
});

Integration Guidelines

When integrating Bagel into your application:

Initialize the client once at your application's entry point
Implement proper error boundaries and fallback states
Consider the multimodal nature when designing user interfaces
Use appropriate endpoints based on your use case
Handle both image and text outputs appropriately

Pricing

Cost: $0.10 per image generation
Pricing applies to all image generation operations (text-to-image and image-to-image)
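
As a quick budgeting aid, the per-image price listed above can be turned into a batch estimate. This sketch assumes the flat per-image rate is the only charge; `estimate_cost` is a hypothetical helper name.

```python
from decimal import Decimal

# Price per generated image in USD, taken from the pricing listed above.
PRICE_PER_IMAGE = Decimal("0.10")


def estimate_cost(num_images):
    """Return the estimated cost in USD for num_images generations."""
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    # Decimal avoids the rounding drift of binary floats for currency.
    return PRICE_PER_IMAGE * num_images
```

For example, `estimate_cost(250)` gives `Decimal("25.00")`, i.e. $25.00 for 250 images.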

Model Information

Parameters: 7B active parameters (14B total)
Developer: ByteDance-Seed
Type: Multimodal foundation model
Capabilities: Text generation, image generation, image understanding
Architecture: Mixture-of-Transformer-Experts (MoT) multimodal transformer

Supported File Formats

For image inputs:

Accepted formats: jpg, jpeg, png, webp, gif, avif
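
A client-side check against this list can catch unsupported inputs before a request is sent. The sketch below looks only at the file extension of a path or URL, not at the file's actual content; `has_accepted_extension` is a hypothetical helper name.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions taken from the accepted formats listed above.
ACCEPTED_EXTENSIONS = {"jpg", "jpeg", "png", "webp", "gif", "avif"}


def has_accepted_extension(path_or_url):
    """Return True if the path or URL ends in an accepted image extension."""
    # urlparse strips any query string or fragment from URLs.
    path = urlparse(path_or_url).path
    suffix = PurePosixPath(path).suffix.lower().lstrip(".")
    return suffix in ACCEPTED_EXTENSIONS
```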

Best Practices

Prompt Engineering: Be descriptive in your prompts for better results
Image Quality: Provide high-quality input images for image-to-image tasks
Rate Limiting: Implement appropriate rate limiting in production
Caching: Cache frequently requested generations when applicable
Multimodal Workflows: Leverage the model's ability to work with both text and images
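
The caching suggestion can be sketched as an in-memory cache keyed on the endpoint and its arguments. This is an illustration under assumptions: `GenerationCache` is a hypothetical class, the `generate` callable stands in for whatever function actually calls the API, and the cache deliberately returns the first result for a given prompt even if the model would produce a different image on a second call.

```python
import hashlib
import json


class GenerationCache:
    """Cache results in memory, keyed by endpoint and arguments."""

    def __init__(self, generate):
        # generate(endpoint, arguments) performs the real API call.
        self._generate = generate
        self._store = {}

    def get(self, endpoint, arguments):
        # sort_keys makes the key independent of dict insertion order.
        payload = json.dumps([endpoint, arguments], sort_keys=True)
        key = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        if key not in self._store:
            self._store[key] = self._generate(endpoint, arguments)
        return self._store[key]
```

With the Python client, `generate` could be `lambda endpoint, arguments: fal_client.subscribe(endpoint, arguments=arguments)`. A production cache would also need expiry and a size limit.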

Troubleshooting

Common Issues and Solutions:

Authentication Errors:

Verify API key is correctly set
Check API key permissions in your fal.ai dashboard
Ensure proper credential initialization

Image Input Issues:

Verify image URLs are publicly accessible
Check supported file formats
Ensure proper encoding for base64 inputs
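
For the base64 case, a common encoding is a data URI. The sketch below builds one from raw bytes. It assumes the endpoint accepts a data URI in `image_url`, which this document does not confirm; `to_data_uri` is a hypothetical helper name.

```python
import base64
import mimetypes


def to_data_uri(data, filename):
    """Encode image bytes as a base64 data URI, using the filename to pick the MIME type."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"cannot determine an image MIME type for {filename!r}")
    # Standard base64 (with padding), decoded to ASCII text for embedding.
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

If data URIs are rejected or the image is large, uploading the file and passing the resulting URL is the alternative described under File Upload Support.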

Generation Quality:

Use detailed, descriptive prompts
For image editing, ensure input image quality is sufficient
Experiment with different prompt phrasings

About Bagel

Bagel is a unified multimodal model developed by ByteDance-Seed. A single model handles text generation, image generation, and image understanding, so the same deployment can serve generation, editing, and analysis workflows. Its developers report strong results on multimodal benchmarks.

For more information and to explore the model's capabilities, visit the Bagel model page on fal.ai.

Support

For production deployments and additional support:

Visit the fal.ai documentation
Check the fal.ai dashboard for API key management
Explore other models in the fal.ai model gallery
