Extract structured data from any document
How it works
Three simple steps to extract structured data from any document
Upload
Send your document via multipart upload. Supports PDF, DOCX, PNG, JPG, and TXT formats.
Extract
AI analyzes the document structure and extracts fields using pre-built or custom schemas.
Get JSON
Receive clean, validated JSON with extracted fields. Ready for your application.
Built for developers
Everything you need to extract data from documents at scale
Multiple Formats
PDF, DOCX, PNG/JPG images, and plain text (TXT), up to 20 MB per file.
Custom Schemas
Define your own JSON output format with a schema. Get exactly the fields you need.
Pre-built Extractors
Invoice and Resume extractors ready to use out of the box. More coming soon.
Fast Response
Average response time under 3 seconds. Real-time extraction for your workflows.
Pay Per Page
Starting at $0.0005 per page. Free tier includes 50 pages/month. No hidden costs.
Secure by Default
API key authentication. Documents processed in memory. No data stored after extraction.
Start in minutes
Simple REST API that works from any language with an HTTP client. Here is an example using curl.
curl -X POST https://api.docextract.ink/extract/invoice \
-H "X-API-Key: YOUR_API_KEY" \
-F "file=@invoice.pdf"
# Response:
# {
#   "success": true,
#   "data": {
#     "vendor": "Acme Corp",
#     "invoice_number": "INV-2026-0042",
#     "date": "2026-05-15",
#     "total": 9900.00,
#     "currency": "USD",
#     "line_items": [
#       { "description": "Cloud API", "quantity": 1, "amount": 9900.00 }
#     ]
#   },
#   "pages": 1,
#   "format": "pdf"
# }
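For readers who prefer Python over curl, the same upload can be sketched with only the standard library. This is a minimal sketch, not an official client: the helper names `build_extract_request` and `extract_invoice` are hypothetical, and the only details taken from this page are the base URL, the `X-API-Key` header, and the `file` form field.

```python
import json
import urllib.request
import uuid

BASE_URL = "https://api.docextract.ink"


def build_extract_request(path, filename, content, api_key, fields=None):
    """Build (but do not send) a multipart/form-data POST request."""
    boundary = uuid.uuid4().hex
    parts = []
    # Optional text fields, for example {"language": "en"}.
    for name, value in (fields or {}).items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode()
        )
    # The document goes in the "file" field, as in the curl example.
    parts.append(
        (
            f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
            f'filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode()
        + content
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return urllib.request.Request(
        BASE_URL + path,
        data=b"".join(parts),
        method="POST",
        headers={
            "X-API-Key": api_key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def extract_invoice(filename, content, api_key):
    """Send the request and decode the JSON body (performs a network call)."""
    request = build_extract_request("/extract/invoice", filename, content, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Building the request separately from sending it makes it possible to inspect or unit-test the request without network access.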
Simple, transparent pricing
Start free. Scale as you grow. No surprises.
Free
Perfect for testing
- 50 pages / month
- Custom schemas
- Vision (images)
- Community support
Basic
For side projects
- 500 pages / month
- $0.05 / page overage
- Custom schemas
- Email support
Pro
For growing products
- 2,000 pages / month
- $0.03 / page overage
- Custom schemas
- Priority support
Business
For production apps
- 10,000 pages / month
- $0.02 / page overage
- Custom schemas
- Dedicated support
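To estimate overage charges from the plan list above, the arithmetic can be sketched as follows. The included pages and per-page rates come from this page; the function name `overage_cost` is hypothetical, base subscription prices are not listed here and are therefore left out, and treating the Free plan as having no overage option is an assumption.

```python
from decimal import Decimal

# (included pages per month, overage rate per page) from the plan list.
# The Free plan lists no overage rate, so it is modelled as None (assumption).
PLANS = {
    "free": (50, None),
    "basic": (500, Decimal("0.05")),
    "pro": (2000, Decimal("0.03")),
    "business": (10000, Decimal("0.02")),
}


def overage_cost(plan, pages_used):
    """Return the overage charge in USD for one month, excluding the base price."""
    included, rate = PLANS[plan]
    extra_pages = max(0, pages_used - included)
    if extra_pages == 0:
        return Decimal("0")
    if rate is None:
        raise ValueError(f"the {plan} plan lists no overage rate")
    return extra_pages * rate
```

For example, 650 pages on Basic is 150 pages over the included 500, so the overage is 150 x $0.05 = $7.50.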
API Documentation
Everything you need to integrate DocExtract into your application
Authentication
All API requests require an X-API-Key header with your API key.
curl https://api.docextract.ink/usage \
-H "X-API-Key: sk_live_your_api_key_here"
Warning: Keep your API key secret. Do not expose it in client-side code or public repositories.
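One common way to keep the key out of source code is to read it from an environment variable at runtime. The sketch below assumes a variable named `DOCEXTRACT_API_KEY`; that name and the `auth_headers` helper are illustrative choices, not part of the API.

```python
import os


def auth_headers(env_var="DOCEXTRACT_API_KEY"):
    """Return the authentication header, reading the key from the environment."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return {"X-API-Key": key}
```

Failing fast on a missing key avoids sending an unauthenticated request that would only come back as a 401.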
Endpoints
| Method | Endpoint | Description |
|---|---|---|
| POST | /extract/invoice | Extract invoice data |
| POST | /extract/resume | Extract resume/CV data |
| POST | /extract/custom | Extract with custom JSON schema |
| GET | /health | API health check |
| GET | /usage | Current usage stats |
Base URL: https://api.docextract.ink
Invoice Extraction
Extract structured data from invoices. Automatically detects vendor, amounts, line items, and more.
Request
POST /extract/invoice
Content-Type: multipart/form-data
# Form fields:
# file: (binary) - the document file (required)
# language: string - hint for OCR language (optional, e.g. "en")
Response
{
  "success": true,
  "data": {
    "vendor": "Acme Corp",
    "vendor_address": "123 Business Ave, NY 10001",
    "invoice_number": "INV-2026-0042",
    "date": "2026-05-15",
    "due_date": "2026-06-15",
    "subtotal": 9000.00,
    "tax": 900.00,
    "total": 9900.00,
    "currency": "USD",
    "line_items": [
      {
        "description": "Cloud API - Annual License",
        "quantity": 1,
        "unit_price": 9000.00,
        "amount": 9000.00
      }
    ]
  },
  "pages": 1,
  "format": "pdf",
  "processing_ms": 1847
}
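Because the amounts are produced by a model, it can be worth sanity-checking them before they reach an accounting system. The sketch below checks the arithmetic relationships visible in the sample response. It assumes that line item amounts are pre-tax and sum to `subtotal`, and that all numeric fields are present; neither is stated by the API, and `check_invoice_totals` is a hypothetical helper.

```python
from decimal import Decimal


def check_invoice_totals(data, tolerance=Decimal("0.01")):
    """Return a list of inconsistencies found in an invoice `data` object."""

    def dec(value):
        # Convert via str() so floats use their short decimal form.
        return Decimal(str(value))

    problems = []
    subtotal, tax, total = dec(data["subtotal"]), dec(data["tax"]), dec(data["total"])
    if abs(subtotal + tax - total) > tolerance:
        problems.append(f"subtotal + tax = {subtotal + tax}, but total = {total}")

    items = data.get("line_items", [])
    for index, item in enumerate(items):
        expected = dec(item["quantity"]) * dec(item["unit_price"])
        if abs(expected - dec(item["amount"])) > tolerance:
            problems.append(f"line item {index}: quantity * unit_price = {expected}")

    # Assumption: line item amounts are pre-tax and add up to the subtotal.
    items_sum = sum((dec(item["amount"]) for item in items), Decimal("0"))
    if items and abs(items_sum - subtotal) > tolerance:
        problems.append(f"line items sum to {items_sum}, but subtotal = {subtotal}")
    return problems
```

An empty list means the figures agree within one cent; anything else is a candidate for manual review.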
Resume Extraction
Extract structured data from resumes and CVs. Detects personal info, experience, education, and skills.
Request
POST /extract/resume
Content-Type: multipart/form-data
# Form fields:
# file: (binary) - the document file (required)
Response
{
  "success": true,
  "data": {
    "name": "Jane Smith",
    "email": "jane@example.com",
    "phone": "+1-555-0123",
    "summary": "Senior software engineer with 8+ years...",
    "experience": [
      {
        "company": "TechCorp",
        "title": "Senior Engineer",
        "start_date": "2022-01",
        "end_date": "present",
        "description": "Led backend team..."
      }
    ],
    "education": [
      {
        "institution": "MIT",
        "degree": "B.S. Computer Science",
        "year": 2018
      }
    ],
    "skills": ["Python", "TypeScript", "AWS", "PostgreSQL"]
  },
  "pages": 2,
  "format": "pdf",
  "processing_ms": 2341
}
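The `experience` entries in the sample mix `YYYY-MM` dates with the literal string `"present"`, so client code usually needs to normalize them. The sketch below shows one way to do that. It assumes those two forms are the only ones returned, which this page does not guarantee, and the names `Position`, `parse_month`, `parse_experience`, and `months_in_role` are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Position:
    company: str
    title: str
    start: Optional[date]
    end: Optional[date]  # None means the role is ongoing


def parse_month(value):
    """Parse 'YYYY-MM' into the first day of that month; 'present' becomes None."""
    if value is None or value.lower() == "present":
        return None
    year, month = value.split("-")[:2]
    return date(int(year), int(month), 1)


def parse_experience(data):
    """Convert the `experience` array of a resume `data` object into Positions."""
    return [
        Position(
            company=entry["company"],
            title=entry["title"],
            start=parse_month(entry["start_date"]),
            end=parse_month(entry["end_date"]),
        )
        for entry in data.get("experience", [])
    ]


def months_in_role(position, today):
    """Whole calendar months between start and end (or `today` if ongoing)."""
    end = position.end or today
    return (end.year - position.start.year) * 12 + (end.month - position.start.month)
```

Passing `today` explicitly, rather than calling `date.today()` inside the function, keeps the calculation deterministic in tests.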
Custom Schema Extraction
Define your own output schema. The AI will extract matching fields from any document.
Request
POST /extract/custom
Content-Type: multipart/form-data
# Form fields:
# file: (binary) - the document file (required)
# schema: (string) - JSON schema definition (required)
# instructions: (string) - additional extraction hints (optional)
Schema example
{
  "company_name": "string",
  "contract_value": "number",
  "start_date": "date",
  "end_date": "date",
  "parties": ["string"],
  "key_terms": ["string"]
}
Supported types: string, number, date, boolean, arrays of any type, and nested objects.
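Since the `schema` form field is a string, the schema has to be serialized to JSON before upload, and the same definition can then be reused to check what comes back. The sketch below does both. It assumes that `date` values are returned as ISO 8601 strings (as in the invoice sample), that a one-element list such as `["string"]` denotes an array of that type, and that every schema key appears in the result. None of this is confirmed here, and `matches` is a hypothetical helper.

```python
import json
from datetime import date

SCHEMA = {
    "company_name": "string",
    "contract_value": "number",
    "start_date": "date",
    "end_date": "date",
    "parties": ["string"],
    "key_terms": ["string"],
}


def matches(value, spec):
    """Return True if `value` has the shape described by `spec`."""
    if isinstance(spec, list):  # ["string"] means an array of strings
        return isinstance(value, list) and all(matches(v, spec[0]) for v in value)
    if isinstance(spec, dict):  # nested object: every key must be present
        return isinstance(value, dict) and all(
            key in value and matches(value[key], sub) for key, sub in spec.items()
        )
    if spec == "string":
        return isinstance(value, str)
    if spec == "boolean":
        return isinstance(value, bool)
    if spec == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if spec == "date":
        try:
            date.fromisoformat(value)  # assumption: dates come back as YYYY-MM-DD
            return True
        except (TypeError, ValueError):
            return False
    raise ValueError(f"unknown type in schema: {spec!r}")


# Value to send as the `schema` form field of POST /extract/custom.
schema_field = json.dumps(SCHEMA)
```

After a successful call, `matches(response["data"], SCHEMA)` gives a quick yes/no on whether the result has the requested shape.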
Error Codes
| Code | Meaning | Resolution |
|---|---|---|
| 400 | Bad Request | Check request body and parameters |
| 401 | Unauthorized | Invalid or missing API key |
| 413 | File Too Large | Max file size is 20 MB |
| 415 | Unsupported Format | Use PDF, DOCX, PNG, JPG, or TXT |
| 422 | Extraction Failed | AI could not extract data; try a clearer document |
| 429 | Rate Limited | Too many requests; wait and retry |
| 500 | Server Error | Internal error; contact support |
Error response format
{
  "success": false,
  "error": {
    "code": 415,
    "message": "Unsupported file format: .xlsx",
    "hint": "Supported formats: pdf, docx, png, jpg, txt"
  }
}
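On the client side, the error envelope above maps naturally onto an exception. The sketch below is one possible shape rather than an official SDK: `DocExtractError` and `unwrap` are hypothetical names, and treating only 429 as retryable is a design choice based on the "wait and retry" resolution in the table.

```python
class DocExtractError(Exception):
    """Raised when a response body has "success": false."""

    def __init__(self, code, message, hint=None):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.hint = hint

    @property
    def retryable(self):
        # Only 429 is documented as "wait and retry"; other codes need a fix first.
        return self.code == 429


def unwrap(payload):
    """Return payload["data"] on success, otherwise raise DocExtractError."""
    if payload.get("success"):
        return payload["data"]
    error = payload.get("error") or {}
    raise DocExtractError(
        error.get("code"),
        error.get("message", "unknown error"),
        error.get("hint"),
    )
```

Callers can then catch `DocExtractError`, show `hint` to the user, and retry only when `retryable` is true.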
Rate Limits
| Plan | Requests / minute | Concurrent |
|---|---|---|
| Free | 5 | 1 |
| Basic | 20 | 3 |
| Pro | 60 | 10 |
| Business | 120 | 25 |
Rate limit headers are included in every response:
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 58
X-RateLimit-Reset: 1717373400
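These headers can drive a simple client-side back-off. The sketch below assumes that `X-RateLimit-Reset` is a Unix timestamp in seconds (the sample value is consistent with that, but it is not stated here) and that headers arrive in a plain dict with the exact casing shown; `seconds_until_reset` is a hypothetical helper.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how long to wait before the next request (0.0 if quota remains)."""
    now = time.time() if now is None else now
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    # Assumption: the reset value is a Unix timestamp in seconds.
    reset_at = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset_at - now)
```

A caller would typically `time.sleep()` for the returned duration after a 429 response, or before sending when `X-RateLimit-Remaining` has reached zero.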