API Documentation

Image to Text Endpoint

This endpoint allows you to submit an image and receive extracted text based on your specified parameters. It supports various output types, custom extraction instructions, and structured data parsing.

Route

POST /api/image-to-text

Authentication

Include your API key in the request headers as a Bearer token:


Authorization: Bearer YOUR_API_KEY
                

Replace YOUR_API_KEY with your actual API key, which you can find on your dashboard.
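
As a minimal sketch of the header format above, the helper below builds the Authorization header once so it can be reused across requests. The function name buildAuthHeaders and its extraHeaders parameter are hypothetical and are not part of the Img2Txt API.

```javascript
// Hypothetical helper (not part of the Img2Txt API): builds request headers
// containing the Bearer token described above.
function buildAuthHeaders(apiKey, extraHeaders = {}) {
    if (typeof apiKey !== 'string' || apiKey.trim() === '') {
        throw new Error('An API key is required.');
    }
    return {
        ...extraHeaders,
        'Authorization': `Bearer ${apiKey.trim()}`
    };
}

// Usage: headers for a JSON POST request.
const jsonHeaders = buildAuthHeaders('YOUR_API_KEY', { 'Content-Type': 'application/json' });
console.log(jsonHeaders);
```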

Complete Workflow Example

Here's a complete example showing how to upload an image and then process it:

// Complete workflow: Upload image and extract text
const processImageFile = async (file, apiKey) => {
    try {
        // Step 1: Get upload URL
        const uploadUrlResponse = await fetch(`https://img2txt.io/api/get-upload-url?name=${encodeURIComponent(file.name)}&size=${file.size}`, {
            method: 'GET',
            headers: {
                'Authorization': `Bearer ${apiKey}`
            }
        });
        
        const uploadData = await uploadUrlResponse.json();
        if (!uploadData.success) {
            throw new Error(uploadData.message);
        }
        
        // Step 2: Upload the file
        const formData = new FormData();
        formData.append("file", file);

        const uploadFileResponse = await fetch(uploadData.url, {
            method: "PUT",
            body: formData,
        });
        
        if (!uploadFileResponse.ok) {
            throw new Error('Failed to upload file');
        }

        const uploadResult = await uploadFileResponse.json();

        if (!uploadResult.success) {
            throw new Error(uploadResult.message);
        }
        
        // Step 3: Process the image
        const processResponse = await fetch('https://img2txt.io/api/image-to-text', {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
                'Authorization': `Bearer ${apiKey}`
            },
            body: JSON.stringify({
                imageUrl: uploadResult.ufsUrl,
                outputType: 'raw'
            })
        });
        
        const result = await processResponse.json();
        
        if (result.success) {
            console.log('Extracted text:', result.text_output);
            return result;
        } else {
            throw new Error(result.message);
        }
        
    } catch (error) {
        console.error('Workflow failed:', error);
        throw error;
    }
};

// Usage example with file input
document.getElementById('fileInput').addEventListener('change', async (event) => {
    const file = event.target.files[0];
    if (file) {
        try {
            const result = await processImageFile(file, 'YOUR_API_KEY');
            console.log('Processing complete:', result);
        } catch (error) {
            console.error('Error:', error.message);
        }
    }
});
                

Body Parameters

The request body should be in JSON format.

Parameter: outputType
  Type: String
  Required: Yes
  Default: None
  Options: 'raw', 'description', 'structured', 'description,structured'
  Description: Specifies the desired output format(s). Multiple values can be comma-separated.
    • raw: Raw text extracted directly from the image.
    • description: Text extracted according to your description of exactly what you want from the image and how you want it structured.
    • structured: Text extracted and formatted according to the outputStructure JSON schema.

Parameter: imageUrl
  Type: String
  Required: Yes
  Default: None
  Options: Valid uploaded image URL
  Description: Must be a valid URL pointing to an image stored in our image storage system (e.g., "https://xxxxxxxxxx.ufs.sh/f/xxxxxxxxxx").

Parameter: description
  Type: String
  Required: No
  Default: None
  Options: None
  Description: An optional prompt or specific instructions on how the text should be extracted or what the AI should focus on. Particularly useful for outputType: 'description' or to guide 'structured' extraction for ambiguous images.

Parameter: outputStructure
  Type: JSON String
  Required: No (but required if outputType includes 'structured')
  Default: None
  Options: Valid JSON object as a string
  Description: A JSON representation (as a string) defining the schema for structured data extraction. The AI will attempt to populate this structure. See "Output Structure Generator" below.
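
The parameter rules above can be checked on the client before sending a request. The sketch below is one possible way to do that; buildRequestBody and ALLOWED_OUTPUT_TYPES are hypothetical names, and the assumption that any comma-separated combination of the three base values is accepted is ours, not something the API confirms.

```javascript
// Hypothetical client-side validation (not part of the Img2Txt API).
// Assumption: any comma-separated combination of these values is accepted.
const ALLOWED_OUTPUT_TYPES = ['raw', 'description', 'structured'];

function buildRequestBody({ imageUrl, outputType, description, outputStructure }) {
    if (typeof imageUrl !== 'string' || imageUrl === '') {
        throw new Error('imageUrl is required.');
    }
    if (typeof outputType !== 'string' || outputType === '') {
        throw new Error('outputType is required.');
    }
    // Split "description,structured" into its individual values.
    const types = outputType.split(',').map((type) => type.trim());
    const unknown = types.filter((type) => !ALLOWED_OUTPUT_TYPES.includes(type));
    if (unknown.length > 0) {
        throw new Error(`Unknown outputType value(s): ${unknown.join(', ')}`);
    }
    const body = { imageUrl, outputType: types.join(',') };
    if (description !== undefined) {
        body.description = description;
    }
    if (types.includes('structured')) {
        if (outputStructure === undefined) {
            throw new Error("outputStructure is required when outputType includes 'structured'.");
        }
        // outputStructure must be sent as a string, so stringify plain objects.
        body.outputStructure = typeof outputStructure === 'string'
            ? outputStructure
            : JSON.stringify(outputStructure);
    }
    return body;
}

// Usage: a structured request for a receipt.
const exampleBody = buildRequestBody({
    imageUrl: 'https://xxxxxxxxxx.ufs.sh/f/xxxxxxxxxx',
    outputType: 'description,structured',
    description: 'Extract the purchase date and total amount from this receipt.',
    outputStructure: { purchaseDate: '', totalAmount: 0 }
});
console.log(JSON.stringify(exampleBody, null, 2));
```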

Example Request (JavaScript Fetch)

Here's an example of how to call the API using JavaScript Fetch:

// Be sure to replace 'YOUR_API_KEY' and use a valid uploaded image URL
const apiUrl = 'https://img2txt.io/api/image-to-text';
const apiKey = 'YOUR_API_KEY'; // Your API key

// Usually you would request an upload URL, upload the file, and take the image URL from the upload result. For this example, we use a direct URL.
const imageUrl = 'https://xxxxxxxxxx.ufs.sh/f/xxxxxxxxxx';

const requestBody = {
    outputType: 'description,structured',
    imageUrl: imageUrl, // Image URL
    description: 'Extract the vendor name, purchase date, and total amount from this receipt.',
    outputStructure: JSON.stringify({ // Remember to stringify the JSON for outputStructure
        "venderName": "", // This is an empty string because the AI will fill it in
        "purchaseDate": "",
        "totalAmount": 0 // This is a number, so we initialize it to 0
    })
};

fetch(apiUrl, {
    method: 'POST',
    headers: {
        'Content-Type': 'application/json',
        'Authorization': `Bearer ${apiKey}` // Include API key
    },
    body: JSON.stringify(requestBody) // Stringify the entire request body
})
.then(async response => {
    if (!response.ok) {
        // Try to get error message from response body for more details
        const errorData = await response.json().catch(() => ({ message: 'Failed to parse error response.' }));
        const errorMessage = errorData.message || errorData.error?.message || `HTTP error! Status: ${response.status}`;
        throw new Error(errorMessage);
    }
    return response.json();
})
.then(data => {
    console.log('API Success:', data);
    // Example of processing data:
    if (data.text_output) {
        console.log('Extracted Data:', data.text_output);
    }
})
.catch((error) => {
    console.error('API Error:', error.message);
});
                

Example Response

The structure of the response will depend on the outputType requested. The extracted text will be returned in the text_output field.

// Example successful response for outputType: 'description,structured'
{
    "text_output": {
        "venderName": "Example Vendor",
        "purchaseDate": "2025-01-01",
        "totalAmount": 123.45
    },
    "image_url": "https://xxxxxxxxxx.ufs.sh/f/xxxxxxxxxx",
    "job_id": "4fbb658b-afca-4e0f-9207-e043af6c80cc",
    "created_at": 1640995200,
    "creditsLeft": 0.77,
    "success": true,
    "message": "Image processed successfully."
}

// Example error response
{
    "success": false,
    "message": "Invalid image URL. This property is required."
}
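
Both response shapes above can be handled by a single function that checks the success flag before reading text_output. This is only a sketch: extractTextOutput is a hypothetical name, and it relies solely on the success, message, and text_output fields shown in the examples.

```javascript
// Hypothetical helper (not part of the Img2Txt API): returns text_output from a
// parsed response body, or throws with the API's message on failure.
function extractTextOutput(responseBody) {
    if (!responseBody || responseBody.success !== true) {
        const message = (responseBody && responseBody.message) || 'Unknown API error.';
        throw new Error(message);
    }
    return responseBody.text_output;
}

// Usage with the two example responses shown above.
const successBody = {
    text_output: { purchaseDate: '2025-01-01', totalAmount: 123.45 },
    success: true,
    message: 'Image processed successfully.'
};
console.log(extractTextOutput(successBody));

const errorBody = { success: false, message: 'Invalid image URL. This property is required.' };
try {
    extractTextOutput(errorBody);
} catch (error) {
    console.error('API Error:', error.message);
}
```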
                

Output Structure Generator

If outputType includes 'structured', you must provide a JSON schema for the outputStructure parameter. This schema tells the AI what information to extract and in what format. The value of outputStructure in your API call must be this JSON object, stringified.

You can build the structure with the same editor used within the Img2Txt Dashboard.

Example Schema for Extracting Invoice Details:

{
    "invoiceId": "",
    "issueDate": "",
    "dueDate": "",
    "vendorDetails": {
        "name": "",
        "address": "",
        "phone": ""
    },
    "customerDetails": {
        "name": "",
        "billingAddress": "",
        "shippingAddress": ""
    },
    "itemCosts": [
        0,
        0,
        0,
        0
    ],
    "payment": {
        "total": 0,
        "isTaxable": false,
        "discountAmount": 0
    }
}
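
A schema like the one above is sent as a string, and the returned text_output can then be compared against it. The sketch below shows one possible check; matchesTemplate is a hypothetical helper, and the assumptions that array items share the type of the first template entry and that the returned array may differ in length are ours, not documented API behavior.

```javascript
// Hypothetical shape check (not part of the Img2Txt API): returns true when
// `value` has the same keys and primitive types as `template`.
function matchesTemplate(template, value) {
    if (Array.isArray(template)) {
        // Assumption: every array item should match the first template entry.
        return Array.isArray(value) &&
            (template.length === 0 || value.every((item) => matchesTemplate(template[0], item)));
    }
    if (template !== null && typeof template === 'object') {
        if (value === null || typeof value !== 'object' || Array.isArray(value)) {
            return false;
        }
        return Object.keys(template).every(
            (key) => key in value && matchesTemplate(template[key], value[key])
        );
    }
    return typeof value === typeof template;
}

// Usage: stringify a small schema for the request, then check a result.
const invoiceTemplate = {
    invoiceId: '',
    itemCosts: [0, 0],
    payment: { total: 0, isTaxable: false }
};
const outputStructureString = JSON.stringify(invoiceTemplate);
console.log(outputStructureString);

const sampleOutput = {
    invoiceId: 'INV-001',
    itemCosts: [10, 20.5, 3],
    payment: { total: 33.5, isTaxable: true }
};
console.log(matchesTemplate(invoiceTemplate, sampleOutput));
```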