Box Developer Documentation
Beta

Extract metadata (structured)

POST https://api.box.com/2.0/ai/extract_structured

This endpoint is available in API version 2024.0. No changes are required to continue using it. For more details, see Box API versioning.

Sends an AI request to supported Large Language Models (LLMs) and returns extracted metadata as a set of key-value pairs. To define the structure of the extracted data, the request must include either a metadata template or a list of fields to extract. To learn more about creating templates, see Creating metadata templates in the Admin Console, or use the metadata template API.
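The either/or rule can be checked on the client before anything is sent. The following is a minimal sketch rather than an official client: `build_extract_body` is a hypothetical helper name, and it only assembles the JSON body using the field names shown in the cURL request example.

```python
import json
from typing import Optional


def build_extract_body(items: list,
                       fields: Optional[list] = None,
                       metadata_template: Optional[dict] = None) -> str:
    """Assemble a JSON body for /ai/extract_structured (hypothetical helper)."""
    # Exactly one of `fields` or `metadata_template` must be supplied.
    if (fields is None) == (metadata_template is None):
        raise ValueError("Provide either fields or metadata_template, but not both.")
    body = {"items": items}
    if fields is not None:
        body["fields"] = fields
    else:
        body["metadata_template"] = metadata_template
    return json.dumps(body)


body = build_extract_body(
    items=[{"id": "12345678", "type": "file"}],
    fields=[{"key": "name", "type": "string"}],
)
print(body)
```

Rejecting the ambiguous case locally avoids a round trip that the API would be expected to refuse anyway.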

Request

Authorization: Bearer [ACCESS_TOKEN]
Content-Type: application/json
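As an illustration of how the bearer token and content type fit together, here is a hedged sketch that builds the HTTP request with Python's standard library. `make_request` and the placeholder token are assumptions made for this example, and the request is constructed but never sent.

```python
import json
import urllib.request

API_URL = "https://api.box.com/2.0/ai/extract_structured"


def make_request(access_token: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for structured extraction."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder token and IDs; substitute real values before sending.
req = make_request(
    "ACCESS_TOKEN_PLACEHOLDER",
    {"items": [{"id": "123", "type": "file"}], "fields": [{"key": "name"}]},
)
# Passing `req` to urllib.request.urlopen would perform the actual call.
print(req.get_method(), req.full_url)
```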

Request Body

ai_agent (object, in body, optional)

The AI agent to be used for structured extraction.

fields (object array, in body, optional)

The fields to be extracted from the provided items. For your request to work, you must provide either metadata_template or fields, but not both.

fields.type (string, in body, optional)
"enum"

The type of the field. Supported types include, but are not limited to, string, float, date, enum, and multiSelect.

fields.description (string, in body, optional)
"The name of the person."

A description of the field.

fields.displayName (string, in body, optional)
"Name"

The display name of the field.

fields.key (string, in body, conditionally required)
"name"

A unique identifier for the field.

fields.options (object array, in body, optional)
[{"key":"First Name"},{"key":"Last Name"}]

A list of options for this field. This is most often used in combination with the enum and multiSelect field types.

fields.options.key (string, in body, conditionally required)
"First Name"

A unique identifier for the field.

fields.prompt (string, in body, optional)
"Name is the first and last name from the email address"

Additional context about the key, which may include how to find and format its value.
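The field attributes described above can be assembled into one entry of the fields array. The sketch below is illustrative only: `make_field` is a hypothetical helper, the JSON names (key, type, displayName, description, prompt, options) are taken from the cURL request example, and the helper treats key as always required, which is stricter than "conditionally required".

```python
from typing import Optional


def make_field(key: str,
               field_type: Optional[str] = None,
               display_name: Optional[str] = None,
               description: Optional[str] = None,
               prompt: Optional[str] = None,
               options: Optional[list] = None) -> dict:
    """Return one entry for the `fields` array, omitting unset optional keys."""
    if not key:
        raise ValueError("key is required for each field in this sketch")
    field = {"key": key}
    optional = {
        "type": field_type,
        "displayName": display_name,
        "description": description,
        "prompt": prompt,
    }
    field.update({name: value for name, value in optional.items() if value is not None})
    if options:
        # Each option is an object carrying its own key.
        field["options"] = [{"key": option} for option in options]
    return field


hobby = make_field("hobby", field_type="multiSelect",
                   prompt="What is your hobby?", options=["guitar", "books"])
print(hobby)
```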

items (object array, in body, required)

The items to be processed by the LLM. Currently, only files are supported.

items.id (string, in body, required)
"123"

The ID of the file.

items.type (string, in body, required)
"file"

The type of the item. Currently, the only supported value is file.

items.content (string, in body, optional)
"This is file content."

The content of the item, often the text representation.
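Since an item is always a file with a required ID and optional content, a small constructor keeps the items array consistent. This is a sketch under that reading of the parameters; `file_item` is a hypothetical name, not part of any Box SDK.

```python
from typing import Optional


def file_item(file_id: str, content: Optional[str] = None) -> dict:
    """Build one entry for the `items` array; the type is fixed to "file"."""
    if not file_id:
        raise ValueError("id is required for each item")
    item = {"id": str(file_id), "type": "file"}
    if content is not None:
        # Optional text representation of the file.
        item["content"] = content
    return item


items = [file_item("123"), file_item("456", content="This is file content.")]
print(items)
```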

metadata_template (object, in body)

The metadata template containing the fields to extract. For your request to work, you must provide either metadata_template or fields, but not both.

metadata_template.type (string, in body, optional)
"metadata_template"

Value is always metadata_template.

metadata_template.scope (string, in body, optional)
"enterprise_12345"
40

The scope of the metadata template, which can be either global or enterprise.

  • The global scope is used for templates that are available to any Box enterprise.
  • The enterprise scope is used for templates created within a specific enterprise and includes the ID of that enterprise.

metadata_template.template_key (string, in body, optional)
"invoiceTemplate"

The name of the metadata template.
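To make the template reference concrete, the sketch below builds the metadata_template object with the three keys shown in the cURL request example (type, template_key, scope). `template_reference` is a hypothetical helper, and the scope check is an assumption based on the description above: it accepts global, enterprise, or an enterprise scope carrying an ID such as enterprise_12345.

```python
def template_reference(template_key: str, scope: str) -> dict:
    """Build the `metadata_template` object for an extraction request."""
    if not template_key:
        raise ValueError("template_key must not be empty")
    # Assumed rule: scope is "global", "enterprise", or "enterprise_<id>".
    if scope != "global" and not scope.startswith("enterprise"):
        raise ValueError("scope must be global or an enterprise scope")
    return {
        "type": "metadata_template",
        "template_key": template_key,
        "scope": scope,
    }


template = template_reference("invoiceTemplate", "enterprise_12345")
print(template)
```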

Response

application/json AI response

A successful response including the answer from the LLM.

application/json Client error

An unexpected server error.

application/json Client error

An unexpected error.

Request Example

cURL
curl -i -L 'https://api.box.com/2.0/ai/extract_structured' \
     -H 'content-type: application/json' \
     -H 'authorization: Bearer <ACCESS_TOKEN>' \
     -d '{
        "items": [
          {
            "id": "12345678",
            "type": "file",
            "content": "This is file content."
          }
        ],
        "fields": [
          {
            "key": "name",
            "description": "The name of the person.",
            "displayName": "Name",
            "prompt": "The name is the first and last name from the email address.",
            "type": "string",
            "options": [
              {
                "key": "First Name"
              },
              {
                "key": "Last Name"
              }
            ]
          }
        ],
        "ai_agent": {
          "type": "ai_agent_extract",
          "long_text": {
            "model": "azure__openai__gpt_4o_mini"
          },
          "basic_text": {
            "model": "azure__openai__gpt_4o_mini"
          }
        }
      }'
TypeScript Gen
await client.ai.createAiExtractStructured({
  fields: [
    {
      key: 'firstName',
      displayName: 'First name',
      description: 'Person first name',
      prompt: 'What is your first name?',
      type: 'string',
    } satisfies AiExtractStructuredFieldsField,
    {
      key: 'lastName',
      displayName: 'Last name',
      description: 'Person last name',
      prompt: 'What is your last name?',
      type: 'string',
    } satisfies AiExtractStructuredFieldsField,
    {
      key: 'dateOfBirth',
      displayName: 'Birth date',
      description: 'Person date of birth',
      prompt: 'What is the date of your birth?',
      type: 'date',
    } satisfies AiExtractStructuredFieldsField,
    {
      key: 'age',
      displayName: 'Age',
      description: 'Person age',
      prompt: 'How old are you?',
      type: 'float',
    } satisfies AiExtractStructuredFieldsField,
    {
      key: 'hobby',
      displayName: 'Hobby',
      description: 'Person hobby',
      prompt: 'What is your hobby?',
      type: 'multiSelect',
      options: [
        { key: 'guitar' } satisfies AiExtractStructuredFieldsOptionsField,
        { key: 'books' } satisfies AiExtractStructuredFieldsOptionsField,
      ],
    } satisfies AiExtractStructuredFieldsField,
  ],
  items: [new AiItemBase({ id: file.id })],
} satisfies AiExtractStructured);
Python Gen
client.ai.create_ai_extract_structured(
    [AiItemBase(id=file.id)],
    fields=[
        CreateAiExtractStructuredFields(
            key="firstName",
            display_name="First name",
            description="Person first name",
            prompt="What is your first name?",
            type="string",
        ),
        CreateAiExtractStructuredFields(
            key="lastName",
            display_name="Last name",
            description="Person last name",
            prompt="What is your last name?",
            type="string",
        ),
        CreateAiExtractStructuredFields(
            key="dateOfBirth",
            display_name="Birth date",
            description="Person date of birth",
            prompt="What is the date of your birth?",
            type="date",
        ),
        CreateAiExtractStructuredFields(
            key="age",
            display_name="Age",
            description="Person age",
            prompt="How old are you?",
            type="float",
        ),
        CreateAiExtractStructuredFields(
            key="hobby",
            display_name="Hobby",
            description="Person hobby",
            prompt="What is your hobby?",
            type="multiSelect",
            options=[
                CreateAiExtractStructuredFieldsOptionsField(key="guitar"),
                CreateAiExtractStructuredFieldsOptionsField(key="books"),
            ],
        ),
    ],
    # Optional: an AI agent configuration, assumed to be defined earlier.
    ai_agent=agent_ignoring_overriding_embeddings_model,
)
.NET Gen
await client.Ai.CreateAiExtractStructuredAsync(
    requestBody: new AiExtractStructured(
        items: Array.AsReadOnly(new [] { new AiItemBase(id: file.Id) }))
    {
        Fields = Array.AsReadOnly(new []
        {
            new AiExtractStructuredFieldsField(key: "firstName")
            {
                DisplayName = "First name",
                Description = "Person first name",
                Prompt = "What is your first name?",
                Type = "string"
            },
            new AiExtractStructuredFieldsField(key: "lastName")
            {
                DisplayName = "Last name",
                Description = "Person last name",
                Prompt = "What is your last name?",
                Type = "string"
            },
            new AiExtractStructuredFieldsField(key: "dateOfBirth")
            {
                DisplayName = "Birth date",
                Description = "Person date of birth",
                Prompt = "What is the date of your birth?",
                Type = "date"
            },
            new AiExtractStructuredFieldsField(key: "age")
            {
                DisplayName = "Age",
                Description = "Person age",
                Prompt = "How old are you?",
                Type = "float"
            },
            new AiExtractStructuredFieldsField(key: "hobby")
            {
                DisplayName = "Hobby",
                Description = "Person hobby",
                Prompt = "What is your hobby?",
                Type = "multiSelect",
                Options = Array.AsReadOnly(new []
                {
                    new AiExtractStructuredFieldsOptionsField(key: "guitar"),
                    new AiExtractStructuredFieldsOptionsField(key: "books")
                })
            }
        })
    });
Java
BoxAIExtractMetadataTemplate template = new BoxAIExtractMetadataTemplate("templateKey", "enterprise");
BoxAIExtractStructuredResponse result = BoxAI.extractMetadataStructured(
    api,
    Collections.singletonList(new BoxAIItem("123456", BoxAIItem.Type.FILE)),
    template
);
JsonObject sourceJson = result.getSourceJson();

Response Example

{
  "ai_agent_info": {
    "models": [
      {
        "name": "azure__openai__text_embedding_ada_002",
        "provider": "azure",
        "supported_purpose": "embedding"
      }
    ]
  },
  "completion_reason": "done",
  "created_at": "2012-12-12T10:53:43-08:00"
}
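A response shaped like the example above can be read with the standard library alone. This is a minimal sketch that parses exactly the keys shown in the example. The `answer` key is an assumption based on the description "the answer from the LLM", so it is read defensively with `.get`.

```python
import json
from datetime import datetime

# The response example from this page, used here as sample input.
raw = """
{
  "ai_agent_info": {
    "models": [
      {
        "name": "azure__openai__text_embedding_ada_002",
        "provider": "azure",
        "supported_purpose": "embedding"
      }
    ]
  },
  "completion_reason": "done",
  "created_at": "2012-12-12T10:53:43-08:00"
}
"""

response = json.loads(raw)
# created_at is an ISO 8601 timestamp with a UTC offset.
created_at = datetime.fromisoformat(response["created_at"])
model_names = [m["name"] for m in response.get("ai_agent_info", {}).get("models", [])]
# Assumed key: the extracted key-value pairs, if the response carries them.
answer = response.get("answer")

print(response["completion_reason"], created_at.isoformat(), model_names, answer)
```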