Infratex
/ API DOCUMENTATION

Integrate document AI without stitching five systems together.

Infratex gives your backend one pipeline for parsing PDFs, indexing clean document context, retrieving cited chunks, streaming answers, and extracting structured fields.

Base URL: api.infratex.io
Auth: Bearer API key
SDKs: Python + Node
integration_run: ready

01 Upload
Queue a PDF or ordered image batch

02 Parse
Receive Markdown, page metadata, and status

03 Index
Build vector or hybrid retrieval artifacts

04 Answer
Stream sources, thinking, and text events

curl -X POST https://api.infratex.io/api/v1/documents \
  -H "Authorization: Bearer $INFRATEX_API_KEY" \
  -F "file=@contract.pdf" \
  -F "method=standard"

Server-side keys

Direct API calls use infratex_sk_... bearer tokens. Keep them out of browser code.

Async resources

Uploads, indexes, and extraction runs return HTTP 202 and expose status endpoints for polling.

Explicit scope

Search and responses accept document_ids, collection_id, or conversation_id depending on the workflow.

Citations by default

Response streams start with source events before model text so UIs can render evidence immediately.

/ QUICKSTART

Run the full document pipeline in minutes.

Start server-side. Upload a document, wait for parsing, create a hybrid index, then stream a cited answer from that indexed context.

01

Install

Use the SDK in your backend service, or start with curl while wiring environment variables.

export INFRATEX_API_KEY=infratex_sk_...
02

Get an API key

Create a key in the dashboard. The full key is shown once and should be stored in server secrets.

Authorization: Bearer infratex_sk_your_key_here

# 1. Upload and parse
curl -X POST https://api.infratex.io/api/v1/documents \
  -H "Authorization: Bearer $INFRATEX_API_KEY" \
  -F "file=@contract.pdf" \
  -F "method=standard"

# 2. Create a hybrid index
curl -X POST https://api.infratex.io/api/v1/documents/{document_id}/indexes \
  -H "Authorization: Bearer $INFRATEX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"method":"hybrid"}'

# 3. Stream a cited answer
curl -N -X POST https://api.infratex.io/api/v1/responses \
  -H "Authorization: Bearer $INFRATEX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"message":"Summarize termination rights with citations.","method":"hybrid","model":"fast","document_ids":["{document_id}"],"limit":5}'
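Steps 2 and 3 above can also be expressed without the SDK. The following is a minimal sketch using only the Python standard library; it builds the same two JSON requests as the curl commands but does not send them. The helper names (`auth_headers`, `index_request`, `response_request`) are illustrative and are not part of the Infratex SDK.

```python
import json
import urllib.request

BASE_URL = "https://api.infratex.io/api/v1"


def auth_headers(api_key):
    """Headers shared by the JSON calls in the quickstart."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


def index_request(api_key, document_id, method="hybrid"):
    """Step 2: build the request that queues an index for a parsed document."""
    body = json.dumps({"method": method}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/documents/{document_id}/indexes",
        data=body,
        headers=auth_headers(api_key),
        method="POST",
    )


def response_request(api_key, document_id, message, limit=5):
    """Step 3: build the request for a streamed, cited answer."""
    body = json.dumps({
        "message": message,
        "method": "hybrid",
        "model": "fast",
        "document_ids": [document_id],
        "limit": limit,
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/responses",
        data=body,
        headers=auth_headers(api_key),
        method="POST",
    )
```

Passing either object to `urllib.request.urlopen` sends it. Read the key from `INFRATEX_API_KEY` in the environment rather than embedding it in source.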
/ SDKS

Use the same resource model from Python or Node.

Python service

Best for ETL workers, extraction jobs, notebooks, and backend APIs.

pip install infratex

Node.js service

Best for Next.js route handlers, Express services, queues, and streaming app backends.

npm install infratex

import os

from infratex import Infratex

# Read the key from the environment instead of hard-coding it in source.
client = Infratex(api_key=os.environ["INFRATEX_API_KEY"])

doc = client.documents.upload(
    "board_pack.pdf",
    method="standard",
    collection_id="col_123",
)
client.documents.wait_until_parsed(doc.id)

markdown = client.documents.markdown(doc.id)
print(markdown[:1000])
/ DOCUMENTS

Upload files and retrieve parsed Markdown.

POST /api/v1/documents (HTTP 202)

Upload PDF

Multipart upload with file, method, an optional pipeline (for the legacy method), and an optional collection_id.

POST /api/v1/documents/images (HTTP 202)

Upload page images

Multipart upload with repeated files. File order is treated as page order.

GET /api/v1/documents/{id}

Get document

Returns parse status, metadata, and index summaries.

GET /api/v1/documents/{id}/markdown

Get Markdown

Returns extracted Markdown as text/markdown after parsing completes.

curl -X POST https://api.infratex.io/api/v1/documents \
  -H "Authorization: Bearer $INFRATEX_API_KEY" \
  -F "file=@report.pdf" \
  -F "method=standard" \
  -F "collection_id=col_123"
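Because the image endpoint treats file order as page order, it helps to sort page files deliberately before attaching them. A plain alphabetical sort puts `scan_10.png` ahead of `scan_2.png`. The sketch below shows one way to sort naturally; the filenames and helper names are hypothetical.

```python
import re
from pathlib import Path


def page_sort_key(path):
    """Split a filename into text and integer runs so page 10 sorts after page 2."""
    name = Path(path).name.lower()
    # re.split with a capturing group alternates text, digits, text, ...
    # so same-position parts always compare str-to-str or int-to-int.
    return [int(part) if part.isdigit() else part
            for part in re.split(r"(\d+)", name)]


def ordered_pages(paths):
    """Return the page image paths in natural (human) page order."""
    return sorted(paths, key=page_sort_key)
```

Attach the files to the multipart request in the order `ordered_pages` returns.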
/ INDEXES

Create retrieval artifacts before search or generation.

vector

Semantic retrieval over document chunks. Good default for natural-language questions.

hybrid

Semantic, keyword, and document-structure retrieval. Recommended for production contracts, filings, and tables.

POST /api/v1/documents/{id}/indexes (HTTP 202)

Create index

Queues vector or hybrid indexing for a parsed document.

GET /api/v1/documents/{id}/indexes/{method}

Get index status

Poll until status is indexed before search or response calls.

curl -X POST https://api.infratex.io/api/v1/documents/{document_id}/indexes \
  -H "Authorization: Bearer $INFRATEX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"method":"hybrid"}'
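Index creation is asynchronous, so callers need a polling loop around the status endpoint. The following is a minimal sketch of such a loop. It takes any zero-argument callable that returns the current status string, so it is independent of how the status is fetched. The `indexed` status comes from the text above; the failure status names, the interval, and the timeout are assumptions.

```python
import time


def wait_for_status(fetch_status, done="indexed", failed=("failed", "error"),
                    interval=2.0, timeout=300.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until it returns `done`, a failure status, or time runs out.

    `failed` holds assumed failure status names; adjust them to what the API returns.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status == done:
            return status
        if status in failed:
            raise RuntimeError(f"resource entered terminal status {status!r}")
        if clock() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout} seconds")
        sleep(interval)
```

In a real service, `fetch_status` would wrap a GET on `/api/v1/documents/{id}/indexes/{method}` and return the status field of the response. The `sleep` and `clock` parameters exist so the loop can be tested without waiting.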
/ RESPONSES

Stream answers grounded in indexed documents.

fast

Lower-latency response model for product surfaces, summaries, and routine Q&A.

pro

Higher-capability response model for complex synthesis, legal analysis, and cross-document questions.

POST /api/v1/responses

Create streaming response

Streams server-sent events: sources, thinking when enabled, text deltas, then done.

curl -N -X POST https://api.infratex.io/api/v1/responses \
  -H "Authorization: Bearer $INFRATEX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"message":"What are the top risks?","method":"hybrid","model":"fast","collection_id":"col_123","limit":8,"reasoning":false}'
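Since the endpoint streams server-sent events, a client has to split the stream into events before it can render sources ahead of text. The sketch below parses the generic SSE wire format (`event:` and `data:` fields, with a blank line ending each event). It makes no claim about Infratex payload shapes: the exact event names on the wire and the JSON inside `data` are assumptions to confirm against a real stream.

```python
def parse_sse(lines):
    """Yield (event_name, data) pairs from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; events without data are skipped.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment or keep-alive line
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # the format allows one optional space after the colon
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)


def collect_by_event(lines):
    """Group data payloads by event name, preserving arrival order."""
    grouped = {}
    for name, payload in parse_sse(lines):
        grouped.setdefault(name, []).append(payload)
    return grouped
```

A UI loop would typically render the first `sources` payload as citations, append each text payload as it arrives, and stop on `done`.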
/ EXTRACTION

Extract structured fields with evidence.

Extraction runs accept either a reusable template_id or inline fields. Inline fields require a name, type, and description, and can include objects, arrays, enums, and field-specific instructions.

POST /api/v1/documents/{id}/extractions (HTTP 202)

Create extraction run

Queues a run against parsed Markdown and returns pending status.

GET /api/v1/extractions/{run_id}

Poll or fetch result

Use include_evidence=true when you need evidence payloads in the response.

GET /api/v1/documents/{id}/extractions

List document runs

Returns prior extraction runs for a document with pagination.

GET /api/v1/extractions/{run_id}/export

Export tabular results

Download xlsx or csv when the result contains array<object> fields.

curl -X POST https://api.infratex.io/api/v1/documents/{document_id}/extractions \
  -H "Authorization: Bearer $INFRATEX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "fast",
    "include_evidence": true,
    "inline_fields": [
      {
        "name": "counterparty",
        "type": "string",
        "description": "The legal name of the counterparty"
      },
      {
        "name": "effective_date",
        "type": "date",
        "description": "The contract effective date"
      },
      {
        "name": "termination_fee",
        "type": "number",
        "description": "Any explicit termination fee amount"
      }
    ]
  }'
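Inline fields require a name, type, and description, so it can be worth checking them locally before queuing a run. The sketch below assembles the same request body as the curl example and rejects incomplete fields. The helper name `extraction_body` is hypothetical, and the check covers only those three required keys, not the set of valid type names.

```python
REQUIRED_KEYS = ("name", "type", "description")


def extraction_body(fields, model="fast", include_evidence=True):
    """Build an extraction request body after checking each inline field."""
    for position, field in enumerate(fields):
        missing = [key for key in REQUIRED_KEYS if not field.get(key)]
        if missing:
            raise ValueError(
                f"inline field #{position} is missing: {', '.join(missing)}"
            )
    return {
        "model": model,
        "include_evidence": include_evidence,
        "inline_fields": list(fields),
    }
```

Serialize the returned dictionary with `json.dumps` and send it as the body of `POST /api/v1/documents/{id}/extractions`.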
/ COLLECTIONS

Group documents into product-ready scopes.

Collections let you upload many documents into a durable scope and query them together with the same search and response APIs.

POST /api/v1/collections

Create collection

Create a named document group for retrieval and responses.

GET /api/v1/collections

List collections

Returns all collections for the tenant.

PATCH /api/v1/documents/{id}

Move document

Set collection_id or remove_collection on an existing document.

DELETE /api/v1/collections/{id}

Delete collection

Deletes the collection record and unsets it from documents.

Example request body for POST /api/v1/responses, scoped to a collection:
{
  "message": "Compare the warranty limits across the uploaded agreements.",
  "method": "hybrid",
  "model": "pro",
  "collection_id": "col_123",
  "limit": 10,
  "reasoning": true
}
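The request body above can be assembled with a small helper that makes the scope explicit. This is a sketch under one assumption the documentation does not state: that each call should carry exactly one of `document_ids`, `collection_id`, or `conversation_id`. If the API accepts combinations, relax the check. The function name is hypothetical.

```python
def response_body(message, *, document_ids=None, collection_id=None,
                  conversation_id=None, method="hybrid", model="fast",
                  limit=5, reasoning=False):
    """Build a /responses body with exactly one scope (an assumed rule)."""
    scopes = {
        "document_ids": document_ids,
        "collection_id": collection_id,
        "conversation_id": conversation_id,
    }
    chosen = {key: value for key, value in scopes.items() if value}
    if len(chosen) != 1:
        raise ValueError(
            "pass exactly one of document_ids, collection_id, conversation_id"
        )
    return {
        "message": message,
        "method": method,
        "model": model,
        **chosen,
        "limit": limit,
        "reasoning": reasoning,
    }
```

For instance, `response_body("Compare the warranty limits across the uploaded agreements.", collection_id="col_123", model="pro", limit=10, reasoning=True)` produces the same fields as the JSON shown above.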
/ MCP

Connect agent clients to the same pipeline.

The remote MCP server exposes document creation, indexing, retrieval, and grounded answer generation over the same tenant-scoped API key model.

Endpoint
https://api.infratex.io/mcp

Use streamable HTTP and pass the same Authorization: Bearer infratex_sk_... header.

Example client configuration:
{
  "mcpServers": {
    "infratex": {
      "url": "https://api.infratex.io/mcp",
      "headers": {
        "Authorization": "Bearer infratex_sk_..."
      }
    }
  }
}
create_document

Queue PDF parsing from a base64 payload.

create_document_images

Queue parsing for ordered image batches.

create_index

Queue vector or hybrid indexing.

search_documents

Run retrieval across documents or collections.

answer_documents

Generate cited answers from indexed context.

/ REFERENCE

Core parameters and endpoint map.

Parameter | Values | Use when
Parse method | standard, max, legacy, cost-efficient, standard-html, standard-ultra-2, dots-mocr, infratex-phi | Controls parser quality, cost profile, or compatibility.
Image parse method | standard, max, standard-html, standard-ultra-2, dots-mocr, infratex-phi | Used with ordered PNG, JPEG, or WebP page batches.
Retrieval method | vector, hybrid | Use hybrid for exact terms, tables, identifiers, and audit-heavy workflows.
Response model | fast, pro | Use fast for latency-sensitive product surfaces, pro for harder synthesis.
reasoning | true, false | When true, response streams may include thinking events before text.

Documents

POST /api/v1/documents
POST /api/v1/documents/images
GET /api/v1/documents/{id}
GET /api/v1/documents/{id}/markdown
GET /api/v1/documents/{id}/ast

Retrieval

POST /api/v1/documents/{id}/indexes
GET /api/v1/documents/{id}/indexes
POST /api/v1/searches
POST /api/v1/responses

Extraction

POST /api/v1/extraction-templates
GET /api/v1/extraction-templates
POST /api/v1/documents/{id}/extractions
GET /api/v1/extractions/{run_id}
GET /api/v1/extractions/{run_id}/export

Account

GET /api/v1/account
GET /api/v1/billing
POST /api/v1/keys
GET /api/v1/collections
POST /api/v1/collections

Readiness invariant

Search and response calls require a ready index for the selected method. If you request hybrid, the selected documents or collection must already have a hybrid index.
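One way to respect this invariant is a pre-flight check before each search or response call. The sketch below assumes each document record carries an `indexes` list whose entries have `method` and `status` keys, and that `indexed` is the ready status. The document endpoint is described as returning index summaries, but these field names are guesses; adjust them to the actual payload.

```python
def missing_indexes(documents, method, ready_status="indexed"):
    """Return the ids of documents that lack a ready index for `method`.

    The record shape ({"id", "indexes": [{"method", "status"}]}) is assumed.
    """
    missing = []
    for doc in documents:
        ready = any(
            index.get("method") == method and index.get("status") == ready_status
            for index in doc.get("indexes", [])
        )
        if not ready:
            missing.append(doc["id"])
    return missing
```

An empty result means every selected document is ready for the requested method. Otherwise, create or wait for the listed indexes before calling search or responses.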