Access levels
Core models
These models are built into Box AI and available by default for all customers. No configuration is required.
Customer-enabled models
These models require activation by Box admins in the Admin Console or a request to Box to enable them. Some models may be subject to additional terms or pricing.
Capability tiers
Standard models
Designed for high-speed, cost-efficient tasks like basic summarization, Q&A, and structured data extraction from shorter or simpler documents. Ideal for high-volume, low-complexity use cases.
Premium models
Offer more advanced reasoning, larger context windows, and better performance on long-form, complex, or domain-specific content. Suitable for sophisticated tasks like multi-step reasoning, understanding large taxonomies, and analyzing lengthy or unstructured documents.
A model can be both customer-enabled and premium, or core and standard. In other words, access level and capability tier are independent, complementary categorizations: a model can belong to either capability tier regardless of its access level.
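As a rough illustration of how the two categorizations stay independent, the sketch below models them as two separate fields. The class and function names are hypothetical and not part of any Box SDK. The second entry uses a placeholder API name, because this page does not state the tier of any core model.

```python
from dataclasses import dataclass
from enum import Enum


class AccessLevel(Enum):
    CORE = "core"                          # available by default
    CUSTOMER_ENABLED = "customer_enabled"  # needs admin activation or a request to Box


class CapabilityTier(Enum):
    STANDARD = "standard"
    PREMIUM = "premium"


@dataclass(frozen=True)
class ModelInfo:
    api_name: str
    access_level: AccessLevel
    tier: CapabilityTier


def needs_activation(model: ModelInfo) -> bool:
    """Only the access level decides this; the capability tier plays no part."""
    return model.access_level is AccessLevel.CUSTOMER_ENABLED


models = [
    # Listed on this page as customer-enabled and described as a premium model.
    ModelInfo("aws__claude_4_5_opus", AccessLevel.CUSTOMER_ENABLED, CapabilityTier.PREMIUM),
    # Hypothetical placeholder, not a real API name.
    ModelInfo("example__core_standard_model", AccessLevel.CORE, CapabilityTier.STANDARD),
]
```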
Using models
You can use the supported AI models to:
- get the default AI agent configuration,
- override the AI agent configuration used in the POST 2.0/ai/ask, POST 2.0/ai/text_gen, POST 2.0/ai/extract, and POST 2.0/ai/extract_structured endpoints.
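The second option in the list above, overriding the agent configuration, might look roughly like the request body built below for POST 2.0/ai/ask. This is a sketch, not the definitive schema: the `mode`, `items`, `ai_agent`, and `basic_text` field names are assumptions about the request format that this page does not confirm, and the file ID is a placeholder. Check the endpoint reference before relying on them. The code only builds the JSON body and sends nothing.

```python
import json


def build_ask_body(prompt: str, file_id: str, model: str) -> dict:
    """Build a JSON-serializable body that overrides the model for one request."""
    return {
        "mode": "single_item_qa",                   # assumed field and value
        "prompt": prompt,
        "items": [{"type": "file", "id": file_id}],  # assumed shape
        "ai_agent": {
            "type": "ai_agent_ask",                  # assumed agent type name
            "basic_text": {"model": model},          # the model override itself
        },
    }


body = build_ask_body(
    prompt="Summarize this document.",
    file_id="12345",  # placeholder file ID
    model="azure__openai__gpt_4o_mini",
)
print(json.dumps(body, indent=2))
```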
When using the model parameter in your API calls, use the API Name visible on each model card.
For example, to get the AI agent configuration for a specific model, use the model parameter and provide the azure__openai__gpt_4o_mini API name. Make sure you use two underscores after the provider name.
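A minimal sketch of that lookup, building the request URL with the model parameter. The base URL, the `ai_agent_default` path, and the `mode` query parameter are assumptions not stated on this page. The name check is a rough guard that assumes the provider prefix contains only letters, as in the API names listed here. Nothing is sent over the network.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.box.com/2.0"  # assumed base URL


def default_agent_config_url(mode: str, model: str) -> str:
    """Return the URL for fetching the default agent configuration for a model."""
    # Rough check for "<provider>__<rest>": two underscores after the provider.
    provider, separator, rest = model.partition("__")
    if not (separator and rest and provider.isalpha()):
        raise ValueError(f"expected two underscores after the provider name: {model!r}")
    query = urlencode({"mode": mode, "model": model})
    return f"{BASE_URL}/ai_agent_default?{query}"  # assumed path


url = default_agent_config_url("ask", "azure__openai__gpt_4o_mini")
print(url)
```

A name with a single underscore after the provider, such as `azure_openai__gpt_4o_mini`, raises `ValueError` in this sketch instead of producing a URL.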
The list may change depending on the model availability.
Models offered in Beta mode have not been fully performance-tested at scale and are made available on an as-is basis. You may experience variability in model/output quality, availability, and accuracy.
Core Box AI models
Box AI is powered by the following AI models. These models are integrated with Box AI to facilitate various use cases while adhering to enterprise-grade standards. Below, you’ll find information about each model, including its capabilities, intended applications, and applicable usage guidelines.
openai__gpt_5_2
A multimodal model for coding and agentic tasks across industries.
openai__gpt_5_1
A multimodal model with enterprise-grade performance and adaptive reasoning.
openai__gpt_5
A multimodal model with advanced reasoning and long-context understanding.
openai__gpt_5_mini
A model designed for well-defined tasks and precise prompts.
azure__openai__gpt_4_1
A multimodal model, highly efficient in handling complex, multi-step tasks.
azure__openai__gpt_4_1_mini
A multimodal model designed to handle lightweight tasks.
azure__openai__gpt_4o
A multimodal model, highly efficient in handling complex, multi-step tasks.
azure__openai__gpt_4o_mini
A multimodal model designed to handle lightweight tasks.
azure__openai__text_embedding_ada_002
A highly capable second-generation text embedding model, skilled in text search, code search, and sentence similarity.
google__gemini_2_5_pro
Gemini multimodal model with a 1 million token context window and advanced reasoning capabilities.
google__gemini_2_5_flash
Gemini multimodal model offering well-rounded capabilities, including thinking capabilities.
google__gemini_2_0_flash_001
Gemini multimodal model designed for optimal high-volume, high-frequency tasks at scale.
google__gemini_2_0_flash_lite_preview
Gemini multimodal model designed to handle lightweight tasks.
aws__claude_4_5_sonnet
A model that excels at complex agents, coding, and autonomous multi-step workflows.
aws__claude_4_5_haiku
A fast model with near-frontier intelligence.
aws__claude_4_sonnet
A model that brings frontier performance to everyday use cases.
aws__claude_4_opus
A model that excels at coding and complex problem-solving, powering frontier agent products.
aws__claude_3_7_sonnet
A model designed to enhance language understanding and generation tasks.
aws__claude_3_5_sonnet
A model designed to enhance language understanding and generation tasks.
aws__claude_3_sonnet
A model designed for advanced language tasks, focusing on comprehension and context handling.
aws__claude_3_haiku
A model tailored for various language tasks, including creative writing and conversational AI.
aws__titan_text_lite
A model capable of advanced language processing, handling extensive contexts, making it suitable for complex tasks.
ibm__llama_4_maverick
A natively multimodal model that utilizes a mixture-of-experts architecture for optimized resource use.
ibm__llama_4_scout
A natively multimodal AI model that enables text and multimodal experiences.
ibm__llama_3_2_90b_vision_instruct
A model built for document-level understanding, interpretation of charts and graphs, and captioning of images.
ibm__mistral_medium_2505
High-performance enterprise model for coding and advanced reasoning.
ibm__mistral_small_3_1_24b_instruct_2503
Fast open-source multimodal model with low latency.
Customer-enabled models
Certain Box AI customers may enable additional AI models, either at their request or through options otherwise made available to them in the Admin Console. Use of these models may be subject to additional terms. By selecting a customer-enabled model, the customer acknowledges that their data may be processed by additional subprocessors of their choice.
google__gemini_3_pro
A natively multimodal model for complex tasks with a 1 million token context window.
google__gemini_3_flash
A natively multimodal model designed for speed and efficiency across a wide range of tasks.
xai__grok_3_beta
A model that excels at enterprise use cases like data extraction, coding, and text summarization.
xai__grok_3_mini_reasoning_beta
A lightweight model that is great for logic-based tasks that do not require deep domain knowledge.
aws__claude_4_5_opus
A premium model combining maximum intelligence with practical performance.
openai__gpt_o3
A multimodal model, highly efficient in handling complex, multi-step tasks.
