Extract metadata from file (freeform)

The Box AI API lets you query a document and extract metadata based on a prompt you provide. Freeform means that the prompt can include a stringified version of formats such as JSON or XML, or even plain text.

The Extract metadata (freeform) endpoint doesn’t support OCR. To extract metadata from image files (TIFF, PNG, JPEG) or documents in languages other than English, use the endpoint.

Before you start

Make sure you have followed the steps to create a platform app and authenticate.

Send a request

To send a request, use the POST /2.0/ai/extract endpoint.

curl -i -L 'https://api.box.com/2.0/ai/extract' \
     -H 'content-type: application/json' \
     -H 'authorization: Bearer <ACCESS_TOKEN>' \
     -d '{
        "prompt": "Extract data related to contract conditions",
        "items": [
              {
                  "type": "file",
                  "id": "1497741268097"
              }
        ],
        "ai_agent": {
          "type": "ai_agent_extract",
          "long_text": {
            "model": "azure__openai__gpt_4o_mini",
            "prompt_template": "It is `{current_date}`, and I have $8000 and want to spend a week in the Azores. What should I see?"
          },
          "basic_text": {
            "model": "azure__openai__gpt_4o_mini"
          }
        }
      }'
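
The same call can also be prepared from a script. The following is a minimal Python sketch that uses only the standard library. The helper name `make_extract_request` is hypothetical, and the token and file ID are placeholders. It builds the request without sending it, so no network access is needed to try it.

```python
import json
import urllib.request

API_URL = "https://api.box.com/2.0/ai/extract"  # endpoint from the curl example


def make_extract_request(access_token: str, prompt: str, file_id: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the extract endpoint."""
    body = {
        "prompt": prompt,
        # The items array carries a single file reference.
        "items": [{"type": "file", "id": file_id}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
        method="POST",
    )


request = make_extract_request(
    "<ACCESS_TOKEN>",
    "Extract data related to contract conditions",
    "1497741268097",
)
# To send it, pass the object to urllib.request.urlopen(request)
# and decode the JSON response body.
```

Serializing the body with `json.dumps` avoids hand-written JSON mistakes such as trailing commas.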

Parameters

To make a call, you must pass the following parameters. Mandatory parameters are in bold. The items array must contain exactly one element. For prompt and file limits, see .

Parameter	Description	Example
`prompt`	The request for Box AI to extract metadata. Maximum 10,000 characters.	Create a meeting agenda for a weekly sales meeting.
`items.id`	Box file ID of the document. The ID must reference an actual file with an extension.	`1233039227512`
`items.type`	The type of the supplied input.	`file`
`items.content`	The content of the item, often the text representation.	`This article is about Box AI`.
`ai_agent`	Override the default model configuration. Lets you change the model, prompt template, system message, or LLM parameters. See for how it works and for examples.
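
The limits in this section can be checked on the client before a request is sent. The sketch below is one possible way to do that in Python. The function name `validate_extract_body` is hypothetical, and the checks for a non-empty prompt and for `type` being `file` are assumptions based on the examples on this page, not documented rules.

```python
MAX_PROMPT_CHARS = 10_000  # limit stated in the parameter table


def validate_extract_body(body: dict) -> None:
    """Raise ValueError if the body breaks a rule described in this section."""
    prompt = body.get("prompt")
    # Assumption: an empty prompt is treated as invalid.
    if not isinstance(prompt, str) or not prompt:
        raise ValueError("prompt must be a non-empty string")
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError("prompt exceeds 10,000 characters")

    items = body.get("items")
    if not isinstance(items, list) or len(items) != 1:
        raise ValueError("items must contain exactly one element")

    item = items[0]
    # Assumption: "file" is the only item type used with this endpoint.
    if item.get("type") != "file" or not item.get("id"):
        raise ValueError("the item needs type 'file' and a file id")


validate_extract_body(
    {"prompt": "Extract data related to contract conditions",
     "items": [{"type": "file", "id": "1233039227512"}]}
)  # passes silently
```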

Use cases

This example shows you how to extract metadata from a sample invoice.

Create the request

To get a response from Box AI, call the POST /2.0/ai/extract endpoint with the following parameters:

prompt, which can be a query, or a structured or unstructured list of fields to extract.
type and id of the file to extract the data from.

Create the prompt

Depending on the use case and the level of detail, you can construct various prompts.

Use plain text

Because this endpoint allows freeform prompts, you can use plain text to get the information.

curl --location 'https://api.box.com/2.0/ai/extract' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <ACCESS_TOKEN>' \
--data '{
    "prompt": "find the document type (invoice or po), vendor, total, and po number",
    "items": [
        {
            "type": "file",
            "id": "1443721424754"
        }
    ]
}'

In such a case, the response will be based on the keywords included in the text:

{
    "answer": "{\"Document Type\": \"Invoice\", \"Vendor\": \"Quasar Innovations\", \"Total\": \"$1,050\", \"PO Number\": \"003\"}",
    "created_at": "2024-05-31T10:30:51.223-07:00",
    "completion_reason": "done"
}
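
Note that `answer` is itself a JSON string nested inside the JSON response, so it has to be decoded a second time. A minimal Python sketch, assuming the answer is valid JSON as in the sample above (a freeform answer may not always be); the helper name `parse_answer` is hypothetical:

```python
import json

# Sample response body copied from the example above.
raw_response = r'''{
    "answer": "{\"Document Type\": \"Invoice\", \"Vendor\": \"Quasar Innovations\", \"Total\": \"$1,050\", \"PO Number\": \"003\"}",
    "created_at": "2024-05-31T10:30:51.223-07:00",
    "completion_reason": "done"
}'''


def parse_answer(response_text: str) -> dict:
    """Decode the response envelope, then decode the stringified answer."""
    envelope = json.loads(response_text)
    return json.loads(envelope["answer"])


fields = parse_answer(raw_response)
print(fields["Vendor"])     # Quasar Innovations
print(fields["PO Number"])  # 003
```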

Use specific terms

If you don’t want to write a full sentence, the prompt can consist of the terms you expect to find in an invoice:

curl --location 'https://api.box.com/2.0/ai/extract' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <ACCESS_TOKEN>' \
--data '{
    "prompt": "{\"vendor\",\"total\",\"doctype\",\"date\",\"PO\"}",
    "items": [
        {
            "type": "file",
            "id": "1443721424754"
        }
    ]
}'

The response lists the terms from the request together with their extracted values:

{
    "answer": "{\"vendor\": \"Quasar Innovations\", \"total\": \"$1,050\", \"doctype\": \"Invoice\", \"PO\": \"003\"}",
    "created_at": "2024-05-31T10:28:51.906-07:00",
    "completion_reason": "done"
}
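
In this sample, the request asked for `date`, but the answer contains no `date` key. A caller may want to detect such gaps before using the result. The following Python sketch shows one way to do that; the helper name `missing_terms` is hypothetical, and the answer string is the decoded `answer` value from the sample response.

```python
import json

# Terms sent in the prompt of the sample request.
requested_terms = ["vendor", "total", "doctype", "date", "PO"]

# Decoded "answer" value from the sample response.
answer = '{"vendor": "Quasar Innovations", "total": "$1,050", "doctype": "Invoice", "PO": "003"}'


def missing_terms(requested: list[str], answer_text: str) -> list[str]:
    """Return the requested terms that have no key in the answer."""
    extracted = json.loads(answer_text)
    return [term for term in requested if term not in extracted]


print(missing_terms(requested_terms, answer))  # ['date']
```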

Use key-value pairs

The prompt can also be a list of key-value pairs that helps Box AI to come up with the metadata structure. This approach requires listing the key-value pairs within a fields array.

curl --location 'https://api.box.com/2.0/ai/extract' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <ACCESS_TOKEN>' \
--data '{
    "prompt": "{\"fields\":[{\"key\":\"vendor\",\"displayName\":\"Vendor\",\"type\":\"string\",\"description\":\"Vendor name\"},{\"key\":\"documentType\",\"displayName\":\"Type\",\"type\":\"string\",\"description\":\"\"}]}",
    "items": [
        {
            "type": "file",
            "id": "1443721424754"
        }
    ]
}'

The response includes the fields present in the file, along with their values:

{
    "answer": "{\"vendor\": \"Quasar Innovations\", \"documentType\": \"Invoice\"}",
    "created_at": "2024-05-31T10:15:38.17-07:00",
    "completion_reason": "done"
}
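
Escaping a fields array by hand is error-prone, because every inner quote needs a backslash. As an alternative, the prompt can be generated by serializing a regular data structure. The Python sketch below assumes the same field definitions as the request above; the helper name `build_fields_prompt` is hypothetical.

```python
import json

# Field definitions mirroring the key-value request above.
fields = [
    {"key": "vendor", "displayName": "Vendor", "type": "string", "description": "Vendor name"},
    {"key": "documentType", "displayName": "Type", "type": "string", "description": ""},
]


def build_fields_prompt(field_list: list[dict]) -> str:
    """Stringify the fields array so it can be used as the prompt value."""
    return json.dumps({"fields": field_list})


body = {
    "prompt": build_fields_prompt(fields),
    "items": [{"type": "file", "id": "1443721424754"}],
}

# The second dumps call escapes the inner quotes of the prompt automatically.
payload = json.dumps(body)
print(payload)
```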
