# Extract metadata from file (structured)

export const MultiRelatedLinks = ({sections = []}) => {
  if (!sections || sections.length === 0) {
    return null;
  }
  return <div className="space-y-8">
      {sections.map((section, index) => <RelatedLinks key={index} title={section.title} items={section.items} />)}
    </div>;
};

export const RelatedLinks = ({title, items = []}) => {
  const getBadgeClass = badge => {
    if (!badge) return "badge-default";
    const badgeType = badge.toLowerCase().replace(/\s+/g, "-");
    return `badge-${badge === "ガイド" ? "guide" : badgeType}`;
  };
  if (!items || items.length === 0) {
    return null;
  }
  return <div className="my-8">
      {}
      <h3 className="text-sm font-bold uppercase tracking-wider mb-4">{title}</h3>

      {}
      <div className="flex flex-col gap-3">
        {items.map((item, index) => <a key={index} href={item.href} className="py-2 px-3 rounded related_link hover:bg-[#f2f2f2] dark:hover:bg-[#111827] flex items-center gap-3 group no-underline hover:no-underline border-b-0">
            {}
            <span className={`px-2 py-1 rounded-full text-xs font-semibold uppercase tracking-wide flex-shrink-0 ${getBadgeClass(item.badge)}`}>
              {item.badge}
            </span>

            {}
            <span className="text-base">{item.label}</span>
          </a>)}
      </div>
    </div>;
};

export const Link = ({href, children, className, ...props}) => {
  const [localizedHref, setLocalizedHref] = useState(href);
  const supportedLocales = useMemo(() => ['ja'], []);
  useEffect(() => {
    const getLocaleFromPath = path => {
      const match = path.match(/^\/([a-z]{2})(?:\/|$)/);
      if (match) {
        const potentialLocale = match[1];
        if (supportedLocales.includes(potentialLocale)) {
          return potentialLocale;
        }
      }
      return null;
    };
    const hasLocalePrefix = path => {
      const match = path.match(/^\/([a-z]{2})(?:\/|$)/);
      return match ? supportedLocales.includes(match[1]) : false;
    };
    const currentPath = window.location.pathname;
    const currentLocale = getLocaleFromPath(currentPath);
    if (href && href.startsWith('/') && !hasLocalePrefix(href)) {
      if (currentLocale) {
        setLocalizedHref(`/${currentLocale}${href}`);
      } else {
        setLocalizedHref(href);
      }
    } else {
      setLocalizedHref(href);
    }
  }, [href, supportedLocales]);
  return <a href={localizedHref} className={className} {...props}>
      {children}
    </a>;
};

With the Box AI API, you can extract metadata from a file
and receive the result as key-value pairs.
As input, you can either define a structure with the `fields` parameter or use an existing metadata template.
To learn more about creating templates, see [Creating metadata templates in the Admin Console][templates-console] or use the <Link href="/guides/metadata/templates/create">metadata template API</Link>. You can also [autofill metadata in templates][autofill-metadata] using the Standard or Enhanced Extract Agent.

## Supported file formats

The endpoint supports the following file formats:

* PDF
* TIFF
* PNG
* JPEG

Box AI automatically applies optical character recognition (OCR) when processing image files (TIFF, PNG, JPEG) and scanned documents. This eliminates the need to convert images to PDF before extraction, saving time and simplifying your integration.
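As a rough pre-flight check, a client could confirm that a file name matches one of the listed formats before calling the endpoint. The sketch below is illustrative only: the helper name and the mapping from formats to file extensions are assumptions, not part of the Box API.

```python
from pathlib import PurePosixPath

# Assumed mapping from the formats listed above to common file extensions.
SUPPORTED_EXTENSIONS = {".pdf", ".tif", ".tiff", ".png", ".jpg", ".jpeg"}


def is_supported_format(file_name: str) -> bool:
    """Return True if the file extension matches one of the listed formats."""
    return PurePosixPath(file_name).suffix.lower() in SUPPORTED_EXTENSIONS


print(is_supported_format("invoice.PDF"))  # True
print(is_supported_format("notes.docx"))   # False
```

The check is case-insensitive and looks only at the extension, so it cannot tell whether the file content is actually valid.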

## Supported languages

Box AI can extract metadata from documents in the following languages:

* English
* Japanese
* Chinese
* Korean
* Cyrillic-based languages (such as Russian, Ukrainian, Bulgarian, and Serbian)

No additional configuration is required to use different languages or image formats. Box AI automatically detects the language and applies OCR when needed.

## Before you start

Make sure you have followed the steps in <Link href="/guides/box-ai/ai-tutorials/prerequisites">getting started with Box AI</Link> to create a platform app and authenticate.

## Send a request

To send a request, use the
`POST /2.0/ai/extract_structured` endpoint.

<CodeGroup>
  ```sh cURL theme={null}
  curl -i -L 'https://api.box.com/2.0/ai/extract_structured' \
       -H 'content-type: application/json' \
       -H 'authorization: Bearer <ACCESS_TOKEN>' \
       -d '{
          "items": [
            {
              "id": "12345678",
              "type": "file",
              "content": "This is file content."
            }
          ],
          "fields": [
            {
              "key": "name",
              "description": "The name of the person.",
              "displayName": "Name",
              "prompt": "The name is the first and last name from the email address.",
              "type": "string",
              "options": [
                {
                  "key": "First Name"
                },
                {
                  "key": "Last Name"
                }
              ]
            }
          ],
          "ai_agent": {
            "type": "ai_agent_extract_structured",
            "long_text": {
              "model": "azure__openai__gpt_4o_mini"
            },
            "basic_text": {
              "model": "azure__openai__gpt_4o_mini"
            }
          }
        }'
  ```

  ```typescript Node/TypeScript v10 theme={null}
  await client.ai.createAiExtractStructured({
    fields: [
      {
        key: 'firstName',
        displayName: 'First name',
        description: 'Person first name',
        prompt: 'What is your first name?',
        type: 'string',
      } satisfies AiExtractStructuredFieldsField,
      {
        key: 'lastName',
        displayName: 'Last name',
        description: 'Person last name',
        prompt: 'What is your last name?',
        type: 'string',
      } satisfies AiExtractStructuredFieldsField,
      {
        key: 'dateOfBirth',
        displayName: 'Birth date',
        description: 'Person date of birth',
        prompt: 'What is the date of your birth?',
        type: 'date',
      } satisfies AiExtractStructuredFieldsField,
      {
        key: 'age',
        displayName: 'Age',
        description: 'Person age',
        prompt: 'How old are you?',
        type: 'float',
      } satisfies AiExtractStructuredFieldsField,
      {
        key: 'hobby',
        displayName: 'Hobby',
        description: 'Person hobby',
        prompt: 'What is your hobby?',
        type: 'multiSelect',
        options: [
          { key: 'guitar' } satisfies AiExtractStructuredFieldsOptionsField,
          { key: 'books' } satisfies AiExtractStructuredFieldsOptionsField,
        ],
      } satisfies AiExtractStructuredFieldsField,
    ],
    items: [new AiItemBase({ id: file.id })],
    aiAgent: agentIgnoringOverridingEmbeddingsModel,
  } satisfies AiExtractStructured);
  ```

  ```python Python v10 theme={null}
  client.ai.create_ai_extract_structured(
      [AiItemBase(id=file.id)],
      fields=[
          CreateAiExtractStructuredFields(
              key="firstName",
              display_name="First name",
              description="Person first name",
              prompt="What is your first name?",
              type="string",
          ),
          CreateAiExtractStructuredFields(
              key="lastName",
              display_name="Last name",
              description="Person last name",
              prompt="What is your last name?",
              type="string",
          ),
          CreateAiExtractStructuredFields(
              key="dateOfBirth",
              display_name="Birth date",
              description="Person date of birth",
              prompt="What is the date of your birth?",
              type="date",
          ),
          CreateAiExtractStructuredFields(
              key="age",
              display_name="Age",
              description="Person age",
              prompt="How old are you?",
              type="float",
          ),
          CreateAiExtractStructuredFields(
              key="hobby",
              display_name="Hobby",
              description="Person hobby",
              prompt="What is your hobby?",
              type="multiSelect",
              options=[
                  CreateAiExtractStructuredFieldsOptionsField(key="guitar"),
                  CreateAiExtractStructuredFieldsOptionsField(key="books"),
              ],
          ),
      ],
      ai_agent=agent_ignoring_overriding_embeddings_model,
  )
  ```

  ```cs .NET v10 theme={null}
  await client.Ai.CreateAiExtractStructuredAsync(
      requestBody: new AiExtractStructured(
          items: Array.AsReadOnly(new [] { new AiItemBase(id: file.Id) }))
      {
          Fields = Array.AsReadOnly(new []
          {
              new AiExtractStructuredFieldsField(key: "firstName") { DisplayName = "First name", Description = "Person first name", Prompt = "What is your first name?", Type = "string" },
              new AiExtractStructuredFieldsField(key: "lastName") { DisplayName = "Last name", Description = "Person last name", Prompt = "What is your last name?", Type = "string" },
              new AiExtractStructuredFieldsField(key: "dateOfBirth") { DisplayName = "Birth date", Description = "Person date of birth", Prompt = "What is the date of your birth?", Type = "date" },
              new AiExtractStructuredFieldsField(key: "age") { DisplayName = "Age", Description = "Person age", Prompt = "How old are you?", Type = "float" },
              new AiExtractStructuredFieldsField(key: "hobby")
              {
                  DisplayName = "Hobby",
                  Description = "Person hobby",
                  Prompt = "What is your hobby?",
                  Type = "multiSelect",
                  Options = Array.AsReadOnly(new []
                  {
                      new AiExtractStructuredFieldsOptionsField(key: "guitar"),
                      new AiExtractStructuredFieldsOptionsField(key: "books")
                  })
              }
          })
      });
  ```

  ```swift Swift v10 theme={null}
  try await client.ai.createAiExtractStructured(
    requestBody: AiExtractStructured(
      fields: [
        AiExtractStructuredFieldsField(key: "firstName", displayName: "First name", description: "Person first name", prompt: "What is your first name?", type: "string"),
        AiExtractStructuredFieldsField(key: "lastName", displayName: "Last name", description: "Person last name", prompt: "What is your last name?", type: "string"),
        AiExtractStructuredFieldsField(key: "dateOfBirth", displayName: "Birth date", description: "Person date of birth", prompt: "What is the date of your birth?", type: "date"),
        AiExtractStructuredFieldsField(key: "age", displayName: "Age", description: "Person age", prompt: "How old are you?", type: "float"),
        AiExtractStructuredFieldsField(
          key: "hobby",
          displayName: "Hobby",
          description: "Person hobby",
          prompt: "What is your hobby?",
          type: "multiSelect",
          options: [
            AiExtractStructuredFieldsOptionsField(key: "guitar"),
            AiExtractStructuredFieldsOptionsField(key: "books")
          ]
        )
      ],
      items: [AiItemBase(id: file.id)]
    )
  )
  ```

  ```java Java v10 theme={null}
  client.getAi().createAiExtractStructured(
      new AiExtractStructured.Builder(Arrays.asList(new AiItemBase(file.getId())))
          .fields(Arrays.asList(
              new AiExtractStructuredFieldsField.Builder("firstName")
                  .description("Person first name")
                  .displayName("First name")
                  .prompt("What is your first name?")
                  .type("string")
                  .build(),
              new AiExtractStructuredFieldsField.Builder("lastName")
                  .description("Person last name")
                  .displayName("Last name")
                  .prompt("What is your last name?")
                  .type("string")
                  .build(),
              new AiExtractStructuredFieldsField.Builder("dateOfBirth")
                  .description("Person date of birth")
                  .displayName("Birth date")
                  .prompt("What is the date of your birth?")
                  .type("date")
                  .build(),
              new AiExtractStructuredFieldsField.Builder("age")
                  .description("Person age")
                  .displayName("Age")
                  .prompt("How old are you?")
                  .type("float")
                  .build(),
              new AiExtractStructuredFieldsField.Builder("hobby")
                  .description("Person hobby")
                  .displayName("Hobby")
                  .prompt("What is your hobby?")
                  .type("multiSelect")
                  .options(Arrays.asList(
                      new AiExtractStructuredFieldsOptionsField("guitar"),
                      new AiExtractStructuredFieldsOptionsField("books")))
                  .build()))
          .aiAgent(agentIgnoringOverridingEmbeddingsModel)
          .build());
  ```

  ```java Java v5 theme={null}
  BoxAIExtractMetadataTemplate template = new BoxAIExtractMetadataTemplate("templateKey", "enterprise");
  BoxAIExtractStructuredResponse result = BoxAI.extractMetadataStructured(
      api,
      Collections.singletonList(new BoxAIItem("123456", BoxAIItem.Type.FILE)),
      template
  );
  JsonObject sourceJson = result.getSourceJson();
  ```
</CodeGroup>
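If you are not using an SDK, the same request can be assembled with only the Python standard library. The following is a minimal sketch, not an official client: the helper name `build_extract_request` is hypothetical, and the request is only prepared here, not sent.

```python
import json
import urllib.request

API_URL = "https://api.box.com/2.0/ai/extract_structured"


def build_extract_request(access_token: str, body: dict) -> urllib.request.Request:
    """Prepare (but do not send) the POST request for structured extraction."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
        method="POST",
    )


body = {
    "items": [{"id": "12345678", "type": "file"}],
    "fields": [
        {"key": "name", "type": "string", "description": "The name of the person."}
    ],
}
request = build_extract_request("<ACCESS_TOKEN>", body)
print(request.get_method(), request.full_url)
```

To actually send the request, pass it to `urllib.request.urlopen` with a valid access token and decode the JSON response.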

### Parameters

To make a call, you must pass the following parameters. Mandatory parameters are in **bold**.

<Note>
  The `items` array must contain exactly one element.
</Note>

| Parameter                            | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | Example                                                  |
| ------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------------------------------------------------------- |
| **`metadata_template`**              | The metadata template containing the fields to extract. For your request to work, you must provide either `metadata_template` or `fields`, but not both.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |                                                          |
| **`metadata_template.type`**         | The type of metadata template.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   | `metadata_template`                                      |
| **`metadata_template.scope`**        | The scope of the metadata template, which can be either `global` or `enterprise`. Global templates are available to any Box enterprise, whereas `enterprise` templates are bound to a specific enterprise. | `enterprise`                                             |
| **`metadata_template.template_key`** | The name of your metadata template.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              | `invoice`                                                |
| **`items.id`**                       | Box file ID of the document. The ID must reference an actual file with an extension.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             | `1233039227512`                                          |
| **`items.type`**                     | The type of the supplied input.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  | `file`                                                   |
| `items.content`                      | The content of the item, often the text representation.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          | `This article is about Box AI`.                          |
| `fields.type`                        | The type of the field. It includes, but is not limited to, `string`, `float`, `date`, `enum`, and `multiSelect`. | `string`                                                 |
| `fields.description`                 | A description of the field.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | `The person's name.`                                     |
| `fields.displayName`                 | The display name of the field.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   | `Name`                                                   |
| `fields.key`                         | A unique identifier for the field.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               | `name`                                                   |
| `fields.options`                     | A list of options for this field. This is most often used in combination with the `enum` and `multiSelect` field types.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          | `[{"key":"First Name"},{"key":"Last Name"}]`             |
| `fields.options.key`                 | A unique identifier for the option. | `First Name`                                             |
| `fields.prompt`                      | Additional context about the key (identifier) that may include how to find and format it.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        | `Name is the first and last name from the email address` |
| `ai_agent`                           | The AI agent used to override the default agent configuration. This parameter allows you to, for example, replace the default LLM with a custom one using the <Link href="/reference/resources/ai_agent_text_gen#param_basic_gen_model">`model`</Link> parameter, tweak the base <Link href="/reference/resources/ai_agent_text_gen#param_basic_gen_prompt_template">`prompt`</Link> to allow for a more customized user experience, or change an LLM parameter, such as `temperature`, to make the results more or less creative. Before you use the `ai_agent` parameter, you can get the default configuration using the <Link href="/reference/get-ai-agent-default">`GET /2.0/ai_agent_default`</Link> request. For specific use cases, see the <Link href="/guides/box-ai/ai-agents/ai-agent-overrides">AI model overrides tutorial</Link>. |                                                          |
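The two structural rules above (exactly one item, and either `fields` or `metadata_template` but not both) are easy to check on the client before sending a request. The sketch below is one possible way to do that. The function name is hypothetical, and the checks mirror only the rules stated on this page, not the full server-side validation.

```python
def validate_extract_body(body: dict) -> list[str]:
    """Return a list of problems found in a request body (empty if none)."""
    problems = []

    items = body.get("items") or []
    if len(items) != 1:
        problems.append("items must contain exactly one element")
    for item in items:
        # Both items.id and items.type are marked as mandatory in the table.
        if not item.get("id") or not item.get("type"):
            problems.append("each item needs both id and type")

    has_fields = bool(body.get("fields"))
    has_template = bool(body.get("metadata_template"))
    if has_fields == has_template:
        problems.append("provide either fields or metadata_template, but not both")

    return problems


valid_body = {
    "items": [{"id": "1233039227512", "type": "file"}],
    "fields": [{"key": "name", "type": "string"}],
}
print(validate_extract_body(valid_body))  # []
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue at once.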

## Use cases

This example shows you how to extract metadata from a sample invoice in a structured way.
Let's assume you want to extract the vendor name, invoice number, and a few more details.

<Frame>
    <img src="https://mintcdn.com/box/KBEcg4yicgc_HMRY/images/guides/box-ai/sample-invoice.png?fit=max&auto=format&n=KBEcg4yicgc_HMRY&q=85&s=e1b68cfc71a4c252f355407447ec0eab" alt="sample invoice" width="2653" height="826" data-path="images/guides/box-ai/sample-invoice.png" />
</Frame>

### Create the request

To get the response from Box AI, call the `POST /2.0/ai/extract_structured` endpoint with the following parameters:

* `items.type` and `items.id` to specify the file to extract the data from.
* `fields` to specify the data that you want to extract from the given file.
* `metadata_template` to supply an already existing metadata template.

<Note>
  You can use either `fields` or `metadata_template` to specify your structure, but not both.
</Note>

### Use `fields` parameter

The `fields` parameter allows you to specify the data you want to extract. Each object in the `fields` array accepts optional parameters that describe the data to extract.
For example, you can add the field type, a description, or a prompt with additional context.

```bash  theme={null}
curl --location 'https://api.box.com/2.0/ai/extract_structured' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <ACCESS_TOKEN>' \
--data '{
    "items": [
        {
            "id": "1517628697289",
            "type": "file"
        }
    ],
    "fields": [
        {
            "key": "document_type",
            "type": "enum",
            "prompt": "what type of document is this?",
            "options": [
                {
                    "key": "Invoice"
                },
                {
                    "key": "Purchase Order"
                },
                {
                    "key": "Unknown"
                }
            ]
        },
        {
            "key": "document_date",
            "type": "date"
        },
        {
            "key": "vendor",
            "description": "The name of the entity.",
            "prompt": "Which vendor is sending this document?",
            "type": "string"
        },
        {
            "key": "document_total",
            "type": "float"
        }
    ]
  }'
```

The response lists the specified fields and their values:

```json  theme={null}
{
    "document_date": "2024-02-13",
    "vendor": "Quasar Innovations",
    "document_total": 1050,
    "document_type": "Purchase Order"
}
```
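Because the extracted values come from a model, it can be worth checking them against the field definitions you sent. The sketch below assumes the response is the flat key-value object shown above. The helper name is hypothetical, and it checks only two things: that every requested key is present and that `enum` values match a declared option.

```python
def check_extracted_values(fields: list[dict], answer: dict) -> list[str]:
    """Compare extracted values with the requested field definitions."""
    problems = []
    for field in fields:
        key = field["key"]
        if key not in answer:
            problems.append(f"{key}: missing from response")
            continue
        allowed = [option["key"] for option in field.get("options", [])]
        if field.get("type") == "enum" and allowed and answer[key] not in allowed:
            problems.append(f"{key}: {answer[key]!r} is not a declared option")
    return problems


fields = [
    {
        "key": "document_type",
        "type": "enum",
        "options": [{"key": "Invoice"}, {"key": "Purchase Order"}, {"key": "Unknown"}],
    },
    {"key": "document_date", "type": "date"},
    {"key": "vendor", "type": "string"},
    {"key": "document_total", "type": "float"},
]
answer = {
    "document_date": "2024-02-13",
    "vendor": "Quasar Innovations",
    "document_total": 1050,
    "document_type": "Purchase Order",
}
print(check_extracted_values(fields, answer))  # []
```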

### Use metadata template

If you prefer to use a metadata template, you can provide its `template_key`, `type`, and `scope`.

```bash  theme={null}
curl --location 'https://api.box.com/2.0/ai/extract_structured' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <ACCESS_TOKEN>' \
--data '{
    "items": [
        {
            "id": "1517628697289",
            "type": "file"
        }
    ],
    "metadata_template": {
        "template_key": "rbInvoicePO",
        "type": "metadata_template",
        "scope": "enterprise_1134207681"
    }
}'
```

The response lists the fields included in the metadata template and their values:

```json  theme={null}
{
  "documentDate": "February 13, 2024",
  "total": "$1050",
  "documentType": "Purchase Order",
  "vendor": "Quasar Innovations",
  "purchaseOrderNumber": "003"
}
```
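In this sample response, the date and total come back as display strings (`"February 13, 2024"`, `"$1050"`). If downstream code needs typed values, a small normalization step can help. The sketch below assumes English month names and a dollar-prefixed amount, as in the sample. Both helper names are hypothetical, and other templates may return differently formatted values.

```python
from datetime import date, datetime


def parse_long_date(text: str) -> date:
    """Parse a date written like 'February 13, 2024' (English month names)."""
    return datetime.strptime(text, "%B %d, %Y").date()


def parse_amount(text: str) -> float:
    """Parse an amount like '$1050' or '$1,050.00' into a float."""
    return float(text.replace("$", "").replace(",", "").strip())


answer = {
    "documentDate": "February 13, 2024",
    "total": "$1050",
    "documentType": "Purchase Order",
    "vendor": "Quasar Innovations",
    "purchaseOrderNumber": "003",
}
normalized = {
    **answer,
    "documentDate": parse_long_date(answer["documentDate"]).isoformat(),
    "total": parse_amount(answer["total"]),
}
print(normalized["documentDate"], normalized["total"])  # 2024-02-13 1050.0
```

Note that `purchaseOrderNumber` is deliberately left as a string so that the leading zeros in `"003"` are preserved.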

### Enhanced Extract Agent

To use the Enhanced Extract Agent, specify the `ai_agent` object as follows:

```json  theme={null}
{
  "ai_agent": {
    "type": "ai_agent_id",
    "id": "enhanced_extract_agent"
  }
}
```

To extract data using the Enhanced Extract Agent, you need one of the following:

* [Inline field definitions][inline-field] (best when fields change frequently)
* [Metadata template][metadata-template] (best when fields stay consistent)
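Either option can be combined with the agent reference by adding the `ai_agent` object to an existing request body. The following sketch shows this for plain dictionaries. The helper name is hypothetical, and the input body is copied so the original stays unchanged.

```python
import copy

# Agent reference as shown in the JSON snippet above.
ENHANCED_EXTRACT_AGENT = {"type": "ai_agent_id", "id": "enhanced_extract_agent"}


def with_enhanced_extract_agent(body: dict) -> dict:
    """Return a copy of the request body that references the Enhanced Extract Agent."""
    updated = copy.deepcopy(body)
    updated["ai_agent"] = dict(ENHANCED_EXTRACT_AGENT)
    return updated


inline_body = {
    "items": [{"id": "1517628697289", "type": "file"}],
    "fields": [{"key": "vendor", "type": "string"}],
}
template_body = {
    "items": [{"id": "1517628697289", "type": "file"}],
    "metadata_template": {
        "template_key": "rbInvoicePO",
        "type": "metadata_template",
        "scope": "enterprise_1134207681",
    },
}
for body in (inline_body, template_body):
    print(sorted(with_enhanced_extract_agent(body)))
```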

The following sample uses the Box Python SDK with a metadata template:

```python  theme={null}
from box_sdk_gen import (
    AiAgentReference,
    AiAgentReferenceTypeField,
    AiItemBase,
    AiItemBaseTypeField,
    BoxClient,
    BoxCCGAuth,
    CCGConfig,
    CreateAiExtractStructuredMetadataTemplate
)

# Create your client credentials grant config from the developer console
ccg_config = CCGConfig(
    client_id="my_box_client_id", # replace with your client id
    client_secret="my_box_client_secret", # replace with your client secret
    user_id="my_box_user_id", # replace with the box user id that has access
                              # to the file you are referencing
)
auth = BoxCCGAuth(config=ccg_config)
client = BoxClient(auth=auth)
# Create the agent config referencing the enhanced extract agent
enhanced_extract_agent_config = AiAgentReference(
    id="enhanced_extract_agent",
    type=AiAgentReferenceTypeField.AI_AGENT_ID
)
# Use the Box SDK to call the extract_structured endpoint
box_ai_response = client.ai.create_ai_extract_structured(
    # Create the items array containing the file information to extract from
    items=[
        AiItemBase(
            id="my_box_file_id", # replace with the file id
            type=AiItemBaseTypeField.FILE
        )
    ],
    # Reference the Box Metadata template 
    metadata_template=CreateAiExtractStructuredMetadataTemplate(
        template_key="InvoicePO",
        scope="enterprise"
    ),
    # Attach the agent config you created earlier
    ai_agent=enhanced_extract_agent_config,
)
print(f"box_ai_response: {box_ai_response.answer}")
```

[templates-console]: https://support.box.com/hc/en-us/articles/360044194033-Customizing-Metadata-Templates

[inline-field]: #use-fields-parameter

[metadata-template]: #use-metadata-template

[autofill-metadata]: https://support.box.com/hc/en-us/articles/360044196173-Using-Metadata#h_01JJSRYKDKXHGJT9ZHCW1E9RX5

<RelatedLinks
  title="RELATED APIS"
  items={[
  { label: "Extract metadata (freeform)", href: "/reference/post-ai-extract", badge: "POST" }
]}
/>

<RelatedLinks
  title="RELATED GUIDES"
  items={[
  { label: "Get started with Box AI", href: "/guides/box-ai/ai-tutorials/prerequisites", badge: "GUIDE" },
  { label: "Override AI model configuration", href: "/guides/box-ai/ai-tutorials/default-agent-overrides", badge: "GUIDE" },
  { label: "Generate text with Box AI", href: "/guides/box-ai/ai-tutorials/generate-text", badge: "GUIDE" },
  { label: "Ask questions to Box AI", href: "/guides/box-ai/ai-tutorials/ask-questions", badge: "GUIDE" },
  { label: "Extract metadata from file (freeform)", href: "/guides/box-ai/ai-tutorials/extract-metadata", badge: "GUIDE" }
]}
/>
