A Markdown representation lets you extract text from a document while retaining structure such as headers, tables, lists, and formatting cues. Conversion time depends on file size and content.

Supported file types

Markdown representations work with the following document formats:
  • Microsoft Office: Word (.docx), PowerPoint (.pptx), Excel (.xls, .xlsx, .xlsm)
  • Google Workspace: Google Docs (.gdoc), Google Slides (.gslide, .gslides), Google Sheets (.gsheet)
  • PDF files (.pdf)

Create a markdown representation

This section describes how to generate and download a Markdown representation of a file using the Box API. The process has four steps: get the available representations, trigger generation, check the status, and download the result. Box caches generated representations for subsequent requests.

Get available representations

Request the file’s available representations, specifying a Markdown representation hint.
curl -X GET \
  'https://api.box.com/2.0/files/12345?fields=id%2Cname%2Crepresentations' \
  -H 'X-Rep-Hints: [markdown]' \
  -H 'Authorization: Bearer YOUR_ACCESS_TOKEN'
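
The same request can be sketched in Python using only the standard library. The function name `build_representations_request` and its parameters are illustrative and not part of any Box SDK; the URL, query fields, and headers simply mirror the curl command above.

```python
import urllib.parse
import urllib.request

API_BASE = "https://api.box.com/2.0"


def build_representations_request(file_id: str, access_token: str) -> urllib.request.Request:
    """Build (but do not send) the GET request that asks for a Markdown representation."""
    # urlencode percent-encodes the commas, giving fields=id%2Cname%2Crepresentations.
    query = urllib.parse.urlencode({"fields": "id,name,representations"})
    return urllib.request.Request(
        f"{API_BASE}/files/{file_id}?{query}",
        headers={
            "X-Rep-Hints": "[markdown]",  # the Markdown representation hint
            "Authorization": f"Bearer {access_token}",
        },
        method="GET",
    )
```

To send it, pass the request to `urllib.request.urlopen` and parse the body with `json.load`.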

Response example

{
  "type": "file",
  "id": "{file_id}",
  "etag": "1",
  "name": "test.docx",
  "representations": {
    "entries": [
      {
        "representation": "markdown",
        "properties": {},
        "info": {
          "url": "https://api.box.com/2.0/internal_files/{file_id}/versions/{version_id}/representations/{representation}"
        },
        "status": {
          "state": "none"
        },
        "content": {
          "url_template": "https://dl.boxcloud.com/api/2.0/internal_files/{file_id}/versions/{version_id}/representations/{representation}/content/{+asset_path}"
        }
      }
    ]
  }
}
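
The later steps rely on three fields from this response: `status.state`, `info.url`, and `content.url_template`. As a minimal sketch, assuming the response body has already been parsed into a Python dict, the following helpers pick them out. The names `find_markdown_entry` and `summarize_entry` are hypothetical.

```python
from typing import Optional


def find_markdown_entry(file_info: dict) -> Optional[dict]:
    """Return the Markdown entry from a parsed file response, or None if absent."""
    entries = file_info.get("representations", {}).get("entries", [])
    for entry in entries:
        if entry.get("representation") == "markdown":
            return entry
    return None


def summarize_entry(entry: dict) -> dict:
    """Extract the fields used to trigger, poll, and download the representation."""
    return {
        "state": entry["status"]["state"],
        "info_url": entry["info"]["url"],
        "url_template": entry["content"]["url_template"],
    }
```

Returning `None` for a missing entry lets the caller distinguish a file with no Markdown representation listed from one whose state is `none`.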

Trigger Markdown generation

If the state is none, request the info URL to begin generating the Markdown representation.
curl -X GET \
  'https://api.box.com/2.0/internal_files/12345/versions/2200612360399/representations/markdown' \
  -H 'Authorization: Bearer YOUR_ACCESS_TOKEN'

Check representation status

Re-query the file to check the current status of the Markdown representation.
curl -X GET \
  'https://api.box.com/2.0/files/12345?fields=id%2Cname%2Crepresentations' \
  -H 'X-Rep-Hints: [markdown]' \
  -H 'Authorization: Bearer YOUR_ACCESS_TOKEN'
When the state changes from pending to success, the Markdown file is ready to download. The state indicates whether the representation is available; possible values are success, viewable, pending, and none. A state of success means you can download the representation immediately, while none means the representation has not been generated yet but can be requested.
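
The trigger-and-check cycle can be sketched as a small polling loop. This is one possible approach under stated assumptions, not a Box-provided routine: `wait_for_markdown`, the polling interval, and the attempt limit are all hypothetical, and the two callables stand in for the HTTP requests shown above (`fetch_state` re-queries the file and returns the state string; `trigger` requests the info URL). Only success is treated as ready by default, since that is the state described here as downloadable.

```python
import time
from typing import Callable, Tuple


def wait_for_markdown(
    fetch_state: Callable[[], str],
    trigger: Callable[[], None],
    interval_seconds: float = 2.0,        # assumed value, tune as needed
    max_attempts: int = 30,               # assumed value, tune as needed
    ready_states: Tuple[str, ...] = ("success",),
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the representation is ready, triggering generation once if needed."""
    triggered = False
    for _ in range(max_attempts):
        state = fetch_state()
        if state in ready_states:
            return state
        if state == "none" and not triggered:
            trigger()  # ask for generation to start, only once
            triggered = True
        sleep(interval_seconds)
    raise TimeoutError(f"representation not ready after {max_attempts} attempts")
```

Passing the network calls in as functions keeps the loop independent of any HTTP client and makes it easy to exercise with canned states.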

Download the Markdown representation

Once the representation is ready, use the content.url_template value to download it, replacing {+asset_path} with the asset path (index.md in the example below).
curl -L \
  'https://dl.boxcloud.com/api/2.0/internal_files/12345/versions/1415005153353/representations/markdown/content/index.md' \
  -H 'Authorization: Bearer YOUR_ACCESS_TOKEN' \
  -o file_name.md
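
As a rough Python equivalent of the download step, the sketch below fills in the `{+asset_path}` placeholder and saves the result to disk. The function names are hypothetical, and the `index.md` asset path in the usage note is taken from the curl example above rather than from a documented guarantee.

```python
import shutil
import urllib.request


def expand_content_url(url_template: str, asset_path: str) -> str:
    """Replace the {+asset_path} placeholder in content.url_template."""
    placeholder = "{+asset_path}"
    if placeholder not in url_template:
        raise ValueError("url_template has no {+asset_path} placeholder")
    return url_template.replace(placeholder, asset_path)


def download_markdown(content_url: str, access_token: str, destination: str) -> None:
    """Fetch the representation and write it to a local file."""
    request = urllib.request.Request(
        content_url,
        headers={"Authorization": f"Bearer {access_token}"},
    )
    # urlopen follows HTTP redirects by default, much like curl -L.
    with urllib.request.urlopen(request) as response, open(destination, "wb") as out:
        shutil.copyfileobj(response, out)
```

For example, `download_markdown(expand_content_url(template, "index.md"), token, "file_name.md")` mirrors the curl command above.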