Run the full document pipeline in minutes.
Start server-side. Upload a document, wait for parsing, create a hybrid index, then stream a cited answer from that indexed context.
Install
Use the SDK in your backend service, or start with curl while wiring environment variables.
export INFRATEX_API_KEY=infratex_sk_...
Get an API key
Create a key in the dashboard. The full key is shown once and should be stored in server secrets.
Authorization: Bearer infratex_sk_your_key_here
# 1. Upload and parse
curl -X POST https://api.infratex.io/api/v1/documents \
-H "Authorization: Bearer $INFRATEX_API_KEY" \
-F "file=@contract.pdf" \
-F "method=standard"
# 2. Create a hybrid index
curl -X POST https://api.infratex.io/api/v1/documents/{document_id}/indexes \
-H "Authorization: Bearer $INFRATEX_API_KEY" \
-H "Content-Type: application/json" \
-d '{"method":"hybrid"}'
# 3. Stream a cited answer
curl -N -X POST https://api.infratex.io/api/v1/responses \
-H "Authorization: Bearer $INFRATEX_API_KEY" \
-H "Content-Type: application/json" \
-d '{"message":"Summarize termination rights with citations.","method":"hybrid","model":"fast","document_ids":["{document_id}"],"limit":5}'
Use the same resource model from Python or Node.
Python service
Best for ETL workers, extraction jobs, notebooks, and backend APIs.
pip install infratex
Node.js service
Best for Next.js route handlers, Express services, queues, and streaming app backends.
npm install infratex
from infratex import Infratex

client = Infratex(api_key="infratex_sk_...")

# Upload a PDF into an existing collection.
doc = client.documents.upload(
    "board_pack.pdf",
    method="standard",
    collection_id="col_123",
)

# Wait for parsing to finish, then fetch the extracted Markdown.
client.documents.wait_until_parsed(doc.id)
markdown = client.documents.markdown(doc.id)
print(markdown[:1000])
Upload files and retrieve parsed Markdown.
POST /api/v1/documents (HTTP 202): Upload PDF
Multipart upload with file, method, optional pipeline for legacy, and optional collection_id.
POST /api/v1/documents/images (HTTP 202): Upload page images
Multipart upload with repeated files. File order is treated as page order.
GET /api/v1/documents/{id}: Get document
Returns parse status, metadata, and index summaries.
GET /api/v1/documents/{id}/markdown: Get Markdown
Returns extracted Markdown as text/markdown after parsing completes.
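Because file order is treated as page order for image uploads, it can help to sort page files numerically before building the multipart request. The following is a minimal sketch, assuming page images are named with embedded page numbers (for example `scan_2.png`); the helper names `page_sort_key` and `order_pages` are illustrative and not part of the Infratex SDK.

```python
import re
from pathlib import Path


def page_sort_key(path):
    """Split a file name into text and integer chunks so 'scan_10' sorts after 'scan_2'."""
    name = Path(path).name
    # re.split with a capturing group alternates text, digits, text, ...
    # so items at the same position always have the same type.
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


def order_pages(paths):
    """Return page image paths in numeric page order (hypothetical helper)."""
    return sorted(paths, key=page_sort_key)


pages = order_pages(["scan_10.png", "scan_2.png", "scan_1.png"])
print(pages)  # ['scan_1.png', 'scan_2.png', 'scan_10.png']
```

A plain lexicographic sort would place `scan_10.png` before `scan_2.png`, which would upload page 10 ahead of page 2.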
curl -X POST https://api.infratex.io/api/v1/documents \
-H "Authorization: Bearer $INFRATEX_API_KEY" \
-F "file=@report.pdf" \
-F "method=standard" \
-F "collection_id=col_123"
Create retrieval artifacts before search or generation.
vector: Semantic retrieval over document chunks. Good default for natural-language questions.
hybrid: Semantic, keyword, and document-structure retrieval. Recommended for production contracts, filings, and tables.
POST /api/v1/documents/{id}/indexes (HTTP 202): Create index
Queues vector or hybrid indexing for a parsed document.
/api/v1/documents/{id}/indexes/{method}: Get index status
Poll until status is indexed before search or response calls.
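The polling step could be sketched as a small loop with a timeout. This is only an illustration: `fetch_status` is a caller-supplied function that calls the index status endpoint and returns the status string, and the `"failed"` and `"queued"` status names are assumptions (the only status this page names is `indexed`).

```python
import time


def wait_until_indexed(fetch_status, timeout_s=300.0, interval_s=2.0,
                       failed_statuses=("failed",), sleep=time.sleep):
    """Poll fetch_status() until it returns "indexed" or the timeout expires.

    fetch_status: hypothetical callable that performs the status request.
    failed_statuses: assumed terminal failure names; adjust to the real API.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status == "indexed":
            return status
        if status in failed_statuses:
            raise RuntimeError(f"indexing ended with status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"index not ready after {timeout_s}s (last status {status!r})")
        sleep(interval_s)


# Usage with a fake status source, so the sketch runs without network access.
fake_statuses = iter(["queued", "queued", "indexed"])
print(wait_until_indexed(lambda: next(fake_statuses), sleep=lambda s: None))
```

Passing the HTTP call in as a function keeps the loop independent of any particular client library and makes it easy to test.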
curl -X POST https://api.infratex.io/api/v1/documents/{document_id}/indexes \
-H "Authorization: Bearer $INFRATEX_API_KEY" \
-H "Content-Type: application/json" \
-d '{"method":"hybrid"}'
Retrieve cited context without generating text.
Search is for previews, evidence panels, ranking inspection, and retrieval debugging. Send one scope: document_ids, collection_id, or conversation-backed scope through responses.
POST /api/v1/searches: Search indexed context
Returns ranked chunks with document, page, score, content, and metadata.
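Since a search request carries exactly one scope, a client can check that before sending. The sketch below builds the JSON body using the field names from the curl examples on this page; the function name `build_search_payload` and its client-side checks are assumptions, not documented API behavior.

```python
import json


def build_search_payload(query, method="hybrid", document_ids=None,
                         collection_id=None, limit=5):
    """Build a search request body with exactly one scope (hypothetical helper)."""
    if bool(document_ids) == bool(collection_id):
        raise ValueError("pass exactly one of document_ids or collection_id")
    if method not in ("vector", "hybrid"):
        raise ValueError("method must be 'vector' or 'hybrid'")
    payload = {"query": query, "method": method, "limit": limit}
    if document_ids:
        payload["document_ids"] = list(document_ids)
    else:
        payload["collection_id"] = collection_id
    return payload


body = build_search_payload("Find indemnity carve-outs", document_ids=["doc_123"])
print(json.dumps(body))
```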
curl -X POST https://api.infratex.io/api/v1/searches \
-H "Authorization: Bearer $INFRATEX_API_KEY" \
-H "Content-Type: application/json" \
-d '{"query":"Find indemnity carve-outs","method":"hybrid","document_ids":["doc_123"],"limit":5}'
Stream answers grounded in indexed documents.
fast: Lower-latency response model for product surfaces, summaries, and routine Q&A.
pro: Higher-capability response model for complex synthesis, legal analysis, and cross-document questions.
POST /api/v1/responses: Create streaming response
Streams server-sent events: sources, thinking when enabled, text deltas, then done.
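To consume the stream, a client needs to split the server-sent event lines into events. The parser below follows the standard `event:` / `data:` framing of the SSE format. The sample stream is invented for illustration: the event names (`sources`, `text`, `done`) and the `delta` key are assumptions based on the description above, not confirmed wire names.

```python
import json


def iter_sse_events(lines):
    """Yield (event, data) pairs from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            # A blank line ends the current event.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment / keep-alive line
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)


# Hypothetical stream; real event names and payload shapes may differ.
sample = [
    "event: sources", 'data: [{"document_id": "doc_123", "page": 4}]', "",
    "event: text", 'data: {"delta": "Termination "}', "",
    "event: text", 'data: {"delta": "requires notice."}', "",
    "event: done", "data: {}", "",
]

answer = "".join(
    json.loads(data)["delta"]
    for event, data in iter_sse_events(sample)
    if event == "text"
)
print(answer)
```

In a real client, `lines` would be the decoded lines of the HTTP response body read incrementally, so text can be rendered as it arrives.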
curl -N -X POST https://api.infratex.io/api/v1/responses \
-H "Authorization: Bearer $INFRATEX_API_KEY" \
-H "Content-Type: application/json" \
-d '{"message":"What are the top risks?","method":"hybrid","model":"fast","collection_id":"col_123","limit":8,"reasoning":false}'
Extract structured fields with evidence.
Extraction runs accept either a reusable template_id or inline fields. Inline fields require a name, type, and description, and can include objects, arrays, enums, and field-specific instructions.
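The requirement that every inline field carries a name, type, and description can be checked client-side before queuing a run. This is a minimal sketch; `validate_inline_fields` is a hypothetical helper, and it does not validate the set of allowed type names, which this page does not list in full.

```python
REQUIRED_KEYS = ("name", "type", "description")


def validate_inline_fields(fields):
    """Return a list of problems; an empty list means every field has the required keys."""
    errors = []
    for index, field in enumerate(fields):
        missing = [key for key in REQUIRED_KEYS if not field.get(key)]
        if missing:
            errors.append(f"field {index}: missing {', '.join(missing)}")
    return errors


fields = [
    {"name": "counterparty", "type": "string",
     "description": "The legal name of the counterparty"},
    {"name": "effective_date", "type": "date"},  # description left out on purpose
]
print(validate_inline_fields(fields))  # ['field 1: missing description']
```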
POST /api/v1/documents/{id}/extractions (HTTP 202): Create extraction run
Queues a run against parsed Markdown and returns pending status.
GET /api/v1/extractions/{run_id}: Poll or fetch result
Use include_evidence=true when you need evidence payloads in the response.
/api/v1/documents/{id}/extractions: List document runs
Returns prior extraction runs for a document with pagination.
GET /api/v1/extractions/{run_id}/export: Export tabular results
Download xlsx or csv when the result contains array<object> fields.
curl -X POST https://api.infratex.io/api/v1/documents/{document_id}/extractions \
-H "Authorization: Bearer $INFRATEX_API_KEY" \
-H "Content-Type: application/json" \
-d '{
  "model": "fast",
  "include_evidence": true,
  "inline_fields": [
    {
      "name": "counterparty",
      "type": "string",
      "description": "The legal name of the counterparty"
    },
    {
      "name": "effective_date",
      "type": "date",
      "description": "The contract effective date"
    },
    {
      "name": "termination_fee",
      "type": "number",
      "description": "Any explicit termination fee amount"
    }
  ]
}'
Group documents into product-ready scopes.
Collections let you upload many documents into a durable scope and query them together with the same search and response APIs.
POST /api/v1/collections: Create collection
Creates a named document group for retrieval and responses.
GET /api/v1/collections: List collections
Returns all tenant collections.
/api/v1/documents/{id}: Move document
Set collection_id or remove_collection on an existing document.
/api/v1/collections/{id}: Delete collection
Deletes the collection record and unsets it from documents.
{
  "message": "Compare the warranty limits across the uploaded agreements.",
  "method": "hybrid",
  "model": "pro",
  "collection_id": "col_123",
  "limit": 10,
  "reasoning": true
}
Connect agent clients to the same pipeline.
The remote MCP server exposes document creation, indexing, retrieval, and grounded answer generation over the same tenant-scoped API key model.
Use streamable HTTP and pass the same Authorization: Bearer infratex_sk_... header.
{
  "mcpServers": {
    "infratex": {
      "url": "https://api.infratex.io/mcp",
      "headers": {
        "Authorization": "Bearer infratex_sk_..."
      }
    }
  }
}
Queue PDF parsing from a base64 payload.
Queue parsing for ordered image batches.
Queue vector or hybrid indexing.
Run retrieval across documents or collections.
Generate cited answers from indexed context.
Core parameters and endpoint map.
| Parameter | Values | Use when |
|---|---|---|
| Parse method | standard, max, legacy, cost-efficient, standard-html, standard-ultra-2, dots-mocr, infratex-phi | Controls parser quality, cost profile, or compatibility. |
| Image parse method | standard, max, standard-html, standard-ultra-2, dots-mocr, infratex-phi | Used with ordered PNG, JPEG, or WebP page batches. |
| Retrieval method | vector, hybrid | Use hybrid for exact terms, tables, identifiers, and audit-heavy workflows. |
| Response model | fast, pro | Use fast for latency-sensitive product surfaces, pro for harder synthesis. |
| reasoning | true, false | When true, response streams may include thinking events before text. |
Documents
POST /api/v1/documents
POST /api/v1/documents/images
GET /api/v1/documents/{id}
GET /api/v1/documents/{id}/markdown
GET /api/v1/documents/{id}/ast

Retrieval
POST /api/v1/documents/{id}/indexes
GET /api/v1/documents/{id}/indexes
POST /api/v1/searches
POST /api/v1/responses

Extraction
POST /api/v1/extraction-templates
GET /api/v1/extraction-templates
POST /api/v1/documents/{id}/extractions
GET /api/v1/extractions/{run_id}
GET /api/v1/extractions/{run_id}/export

Account
GET /api/v1/account
GET /api/v1/billing
POST /api/v1/keys
GET /api/v1/collections
POST /api/v1/collections

Search and response calls require a ready index for the selected method. If you request hybrid, the selected documents or collection must already have a hybrid index.
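A pre-flight check for this rule might look like the sketch below. It assumes the caller has already collected each document's index statuses into a plain dictionary keyed by document ID and then by method; that shape and the helper name `documents_missing_index` are illustrative assumptions, not the format of the API's index summaries.

```python
def documents_missing_index(index_status_by_document, method):
    """Return the IDs of documents whose index for `method` is not yet "indexed"."""
    return [
        doc_id
        for doc_id, statuses in index_status_by_document.items()
        if statuses.get(method) != "indexed"
    ]


# Hypothetical status snapshot assembled by the caller.
snapshot = {
    "doc_1": {"vector": "indexed", "hybrid": "indexed"},
    "doc_2": {"vector": "indexed"},
}
print(documents_missing_index(snapshot, "hybrid"))  # ['doc_2']
```

Running a check like this before a hybrid search or response call makes it clear which documents still need a hybrid index.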