# Box and Weaviate

> Build an end-to-end RAG workflow by embedding Box content into Weaviate and querying it with the Weaviate Query Agent.

export const RelatedLinks = ({title, items = []}) => {
  const getBadgeClass = badge => {
    if (!badge) return "badge-default";
    const badgeType = badge.toLowerCase().replace(/\s+/g, "-");
    return `badge-${badge === "ガイド" ? "guide" : badgeType}`;
  };
  if (!items || items.length === 0) {
    return null;
  }
  return <div className="my-8">
      <h3 className="text-sm font-bold uppercase tracking-wider mb-4">{title}</h3>

      <div className="flex flex-col gap-3">
        {items.map((item, index) => <a key={index} href={item.href} className="py-2 px-3 rounded related_link hover:bg-[#f2f2f2] dark:hover:bg-[#111827] flex items-center gap-3 group no-underline hover:no-underline border-b-0">
            <span className={`px-2 py-1 rounded-full text-xs font-semibold uppercase tracking-wide flex-shrink-0 ${getBadgeClass(item.badge)}`}>
              {item.badge}
            </span>

            <span className="text-base">{item.label}</span>
          </a>)}
      </div>
    </div>;
};

export const SignupCTA = ({children}) => {
  return <div className="flex flex-wrap items-center gap-4 p-5 rounded-lg border border-gray-200 dark:border-gray-700 my-6" style={{
    background: "linear-gradient(135deg, rgba(0, 97, 213, 0.06), rgba(0, 97, 213, 0.02))"
  }}>
      <div className="flex-1 text-sm leading-relaxed text-gray-700 dark:text-gray-300" style={{
    minWidth: "280px"
  }}>
        {children}
      </div>
      <div className="flex flex-col items-center gap-2">
        <a href="https://account.box.com/signup/developer#ty9l3" className="signup-cta-button inline-flex items-center whitespace-nowrap px-5 py-2 text-sm font-semibold text-white no-underline">
          Get started for free
        </a>
        <a href="https://account.box.com/developers/console" className="signup-cta-login text-xs text-gray-500 dark:text-gray-400 no-underline whitespace-nowrap">
          Already have an account? Log in
        </a>
      </div>
    </div>;
};

This tutorial walks you through building a retrieval-augmented generation
(RAG) workflow by embedding Box content into a
[Weaviate](https://weaviate.io/) vector database and using Weaviate's
[Query Agent](https://weaviate.io/blog/weaviate-agents) to answer questions
about your data.

The complete recipe is available as a Jupyter Notebook in the
[Weaviate recipes repository](https://github.com/weaviate/recipes/tree/main/integrations/data-platforms/box).

<SignupCTA>
  A free developer account gives you access to the Box API and everything
  you need to build AI-powered workflows with Weaviate.
</SignupCTA>

## Overview

### What is Weaviate?

[Weaviate](https://weaviate.io/) is an open-source vector database built for
speed, scale, and AI-driven search. It stores data as objects and vectors,
letting you combine semantic search via embeddings with structured filtering.
Weaviate is cloud native, fault tolerant, and integrates directly with large
language models (LLMs).

### How Box and Weaviate create an end-to-end RAG solution

RAG is a technique that pairs vector search (retrieval) with a language model
(generation) to answer questions using your own data. The flow works as
follows:

* **Content storage**: Box holds your files (PDFs, docs, text reports,
  and other supported formats).
* **Embedding creation**: Text is extracted from your Box files, chunked,
  and converted into vector embeddings using
  [Weaviate Embeddings](https://weaviate.io/developers/wcs/embeddings).
* **Querying**: Weaviate's Query Agent takes a natural-language question,
  generates the necessary search and aggregation queries, and returns a
  single answer. This approach is known as agentic RAG.
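
The embedding step depends on splitting extracted text into chunks before it is vectorized. The following is a minimal sketch of word-based chunking with overlap. The `chunk_text` name and the `chunk_size` and `overlap` parameters are illustrative assumptions, not the chunking logic used by the recipe notebook.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into chunks of up to chunk_size words.

    Consecutive chunks share `overlap` words so that a sentence cut at a
    chunk boundary still appears with some context in the next chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```

Smaller chunks tend to give more precise matches, while larger chunks keep more context per result, so the right values depend on your documents.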

## Prerequisites

Before you begin, make sure you have the following:

* A Box developer account. If you don't already have one,
  [sign up for a free developer account](https://account.box.com/signup/developer#ty9l3).
* A Jupyter Notebook environment such as
  [Visual Studio Code](https://code.visualstudio.com/docs/datascience/jupyter-notebooks)
  with the Jupyter extension, or a local Jupyter installation.
* A Weaviate Cloud account.
  [Sign up for a free sandbox tier](https://console.weaviate.cloud/).

## Get a Box developer token

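In the [Box Developer Console](https://account.box.com/developers/console),
create an app and copy its developer token:

1. Click **New App** in the top right corner.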
2. Enter an app name and select **OAuth 2.0** as the authentication method.
3. Click **Create App**.
4. Under **Application Scopes**, enable the read and write scopes for files
   if they are not already enabled, then click **Save Changes**.
5. From the **Configuration** tab, copy and save the developer token. You
   need it to configure the notebook.

<Note>
  Developer tokens are valid for 60 minutes. If you work in the notebook for
  longer than that, generate a new token and update the notebook with it.
</Note>
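
A developer token is sent as a bearer token in the `Authorization` header of each Box API request. As a hedged sketch, the snippet below builds a request to the `GET /2.0/users/me` endpoint, which is a quick way to check whether a token is still accepted. The helper names `build_me_request` and `token_is_valid` are hypothetical, and the notebook may authenticate differently (for example, through a Box SDK).

```python
import urllib.error
import urllib.request

BOX_API_ME = "https://api.box.com/2.0/users/me"


def build_me_request(developer_token: str) -> urllib.request.Request:
    """Build (but do not send) a request for the current user's profile."""
    return urllib.request.Request(
        BOX_API_ME,
        headers={"Authorization": f"Bearer {developer_token}"},
    )


def token_is_valid(developer_token: str) -> bool:
    """Return False if Box rejects the token with HTTP 401, e.g. after it expires."""
    try:
        request = build_me_request(developer_token)
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 401:
            return False
        raise  # other errors (rate limits, outages) are not token problems
```

Calling `token_is_valid(...)` before a long run can save you from a failure partway through the notebook.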

## Create a Weaviate cluster

1. Log in to [Weaviate Cloud](https://console.weaviate.cloud/).
2. Create a new cluster from the dashboard. You can name it whatever you
   like.
3. Once the cluster is ready, go to the **Details** tab and note the
   **cluster URL** and **API key**.

## Run the recipe

### Clone the repository

Clone or download the
[Weaviate recipes repository](https://github.com/weaviate/recipes):

```bash theme={null}
git clone https://github.com/weaviate/recipes.git
```

Navigate to the Box integration folder:

```bash theme={null}
cd recipes/integrations/data-platforms/box
```

Open the Jupyter Notebook (`weaviate_box.ipynb`) in your development
environment.

<Frame caption="The Box integration folder in the Weaviate recipes repository">
  <img src="https://mintcdn.com/box/nQi7jppEz_5O1YgH/images/ai/vector-databases/weaviate-repo-structure.png?fit=max&auto=format&n=nQi7jppEz_5O1YgH&q=85&s=64e1685ca7291d4489af517bebfe5b70" alt="Weaviate recipes repository structure" width="1100" height="758" data-path="images/ai/vector-databases/weaviate-repo-structure.png" />
</Frame>

### Configure authentication

The notebook includes a step to set authentication variables. Update the
code block in **step 3** with:

* Your **Box developer token**
* Your **Weaviate cluster URL** and **API key**

<Frame caption="Update the authentication variables in step 3 of the notebook">
  <img src="https://mintcdn.com/box/nQi7jppEz_5O1YgH/images/ai/vector-databases/weaviate-auth-variables.png?fit=max&auto=format&n=nQi7jppEz_5O1YgH&q=85&s=373758987f12f62172cbe2425ce61734" alt="Authentication variables in the notebook" width="1100" height="267" data-path="images/ai/vector-databases/weaviate-auth-variables.png" />
</Frame>
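
Rather than pasting credentials directly into a notebook cell, you can read them from environment variables. The sketch below shows one way to do that. The variable names `BOX_DEVELOPER_TOKEN`, `WEAVIATE_URL`, and `WEAVIATE_API_KEY`, along with the `Settings`, `load_settings`, and `connect` names, are assumptions for illustration and are not defined by the notebook. The `connect` function assumes version 4 of the Weaviate Python client is installed.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    box_developer_token: str
    weaviate_url: str
    weaviate_api_key: str


def load_settings(env=None) -> Settings:
    """Read the three credentials from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    names = ("BOX_DEVELOPER_TOKEN", "WEAVIATE_URL", "WEAVIATE_API_KEY")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(*(env[name] for name in names))


def connect(settings: Settings):
    """Open a Weaviate Cloud connection (assumes weaviate-client v4)."""
    # Imported lazily so load_settings works without the client installed.
    import weaviate
    from weaviate.classes.init import Auth

    return weaviate.connect_to_weaviate_cloud(
        cluster_url=settings.weaviate_url,
        auth_credentials=Auth.api_key(settings.weaviate_api_key),
    )
```

Keeping credentials out of the notebook file also makes it safer to share or commit the notebook.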

### Run the notebook

Execute each cell in the notebook sequentially. The notebook:

1. Uploads demo files to Box (or uses files you provide).
2. Extracts text content from the Box files.
3. Chunks the text and creates vector embeddings in Weaviate.
4. Uses the Weaviate Query Agent to answer questions about the content.

<Tip>
  The repository includes a `demo_files` folder with four 10-K financial
  reports for testing. You can replace these with your own files if you prefer
  to work with different content.
</Tip>
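
If you swap in your own documents, it can help to confirm which local files will be picked up before running the upload cell. The following sketch lists the files in a folder by extension. The `find_upload_candidates` helper and its default extension set are illustrative assumptions, and the notebook's own upload logic may differ.

```python
from pathlib import Path


def find_upload_candidates(folder, extensions=(".pdf", ".docx", ".txt")) -> list[Path]:
    """Return files directly inside `folder` with a matching suffix, sorted by path.

    Subfolders are skipped, and suffix matching is case-insensitive.
    """
    allowed = {ext.lower() for ext in extensions}
    return sorted(
        path
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in allowed
    )
```

For example, `find_upload_candidates("demo_files")` run from the Box integration folder would return the paths of the bundled reports, provided they use one of the listed extensions.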

The final cell demonstrates querying your data. You can modify the query in
**step 7** to ask different questions based on your content.

<Frame caption="The Query Agent returns an answer based on your Box content">
  <img src="https://mintcdn.com/box/nQi7jppEz_5O1YgH/images/ai/vector-databases/weaviate-final-answer.png?fit=max&auto=format&n=nQi7jppEz_5O1YgH&q=85&s=8797fd9108430cf91ab04d468d1dee90" alt="Final answer from the Weaviate Query Agent" width="1100" height="286" data-path="images/ai/vector-databases/weaviate-final-answer.png" />
</Frame>

## Next steps

<AccordionGroup>
  <Accordion title="Expand your data">
    Upload additional files to Box, such as annual reports, articles, or any
    other documents you want to search, then rerun the notebook to embed
    them in Weaviate.
  </Accordion>

  <Accordion title="Customize the Query Agent">
    Adjust the `system_prompt` parameter to change the agent's behavior.
    For example, you can request more detailed analysis or a specific
    response format.
  </Accordion>

  <Accordion title="Explore other Weaviate agents">
    Weaviate offers additional agent types. The
    [Transformation Agent](https://docs.weaviate.io/agents#transformation-agent) can
    preprocess your data, and the
    [Personalization Agent](https://docs.weaviate.io/agents#personalization-agent) can
    tailor responses to individual users.
  </Accordion>
</AccordionGroup>
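
As a rough sketch of customizing the Query Agent, the snippet below composes a `system_prompt` string and passes it to the agent's constructor. The `build_system_prompt` and `make_query_agent` helpers and the prompt wording are hypothetical. The import path assumes the Weaviate Python client is installed with its agents extra, and the exact constructor arguments may vary between client releases, so check them against the version used by the notebook.

```python
def build_system_prompt(audience: str, response_format: str) -> str:
    """Compose instructions that steer the agent's tone and output format."""
    return (
        f"You are answering questions for {audience}. "
        "Name the source document for every figure you quote. "
        f"Format every answer as {response_format}."
    )


def make_query_agent(client, collection_name: str, system_prompt: str):
    """Create a Query Agent over one collection (assumes the agents extra)."""
    # Imported lazily so build_system_prompt works without the client installed.
    from weaviate.agents.query import QueryAgent

    return QueryAgent(
        client=client,
        collections=[collection_name],  # hypothetical collection name supplied by caller
        system_prompt=system_prompt,
    )
```

For instance, `build_system_prompt("financial analysts", "a short bulleted list")` produces a prompt that asks for concise, sourced bullet points.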

## Resources

<CardGroup cols={2}>
  <Card title="Weaviate recipe" href="https://github.com/weaviate/recipes/tree/main/integrations/data-platforms/box" icon="github">
    The complete Jupyter Notebook in the Weaviate recipes repository.
  </Card>

  <Card title="Weaviate Query Agent" href="https://weaviate.io/blog/weaviate-agents" icon="robot">
    Learn about Weaviate's agentic RAG capabilities.
  </Card>

  <Card title="Weaviate Embeddings" href="https://weaviate.io/developers/wcs/embeddings" icon="brain">
    Documentation for Weaviate's embedding service.
  </Card>

  <Card title="Box Developer Community" href="https://community.box.com/box-platform-5" icon="comments">
    Share feedback and get support from other Box developers.
  </Card>
</CardGroup>

<RelatedLinks
  title="RELATED RESOURCES"
  items={[
{ label: translate("AI integrations"), href: "/ai/integrations", badge: "GUIDE" },
{ label: translate("Box and Pinecone"), href: "/ai/vector-databases/pinecone", badge: "GUIDE" },
{ label: translate("Get started with Box AI"), href: "/guides/box-ai/ai-tutorials/prerequisites", badge: "GUIDE" }
]}
/>
