PDFs weren't designed for integration, but our tools were. Convert unstructured PDF content into clean, organized JSON data ready for your applications.
{
  "transactions": [
    { "date": "2023-10-15", "description": "Software License", "amount": 1250.00 },
    { "date": "2023-10-18", "description": "Cloud Services", "amount": 89.99 },
    { "date": "2023-10-20", "description": "Support Hours", "amount": 320.00 }
  ],
  "summary": {
    "total_income": 1570.00,
    "total_expenses": 89.99,
    "balance": 1480.01,
    "total": 1659.99
  },
  "client": {
    "name": "Acme Corporation",
    "email": "contact@acme.com"
  },
  "company": {
    "name": "TechCorp Inc.",
    "address": "123 Business St., Suite 100",
    "city": "San Francisco",
    "state": "CA",
    "zip": "94103"
  },
  "invoice": {
    "number": "2023-456",
    "date": "2023-10-25"
  }
}
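Once the JSON is in hand, a quick sanity check in your own code can catch extraction mistakes early. The sketch below is an illustration, not part of the API: it loads a trimmed copy of the sample output and compares the sum of the transaction amounts with `summary.total`. That the two should match is an observation about this sample, not a documented guarantee, and the helper names `load_extraction` and `transactions_total` are hypothetical.

```python
import json
from decimal import Decimal

# Trimmed copy of the sample output above; only the fields used here are kept.
SAMPLE = """
{
  "transactions": [
    {"date": "2023-10-15", "description": "Software License", "amount": 1250.00},
    {"date": "2023-10-18", "description": "Cloud Services", "amount": 89.99},
    {"date": "2023-10-20", "description": "Support Hours", "amount": 320.00}
  ],
  "summary": {"total": 1659.99}
}
"""


def load_extraction(text):
    """Parse extracted JSON, reading decimals exactly instead of as floats."""
    return json.loads(text, parse_float=Decimal)


def transactions_total(doc):
    """Sum the amount of every transaction in the document."""
    return sum((t["amount"] for t in doc["transactions"]), Decimal("0"))


doc = load_extraction(SAMPLE)
computed = transactions_total(doc)
reported = doc["summary"]["total"]
print(f"computed={computed} reported={reported} match={computed == reported}")
```

Parsing with `Decimal` avoids the rounding noise that binary floats introduce when currency amounts are added together.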
Extract exactly what you need from your PDFs with our flexible schema options. Simple or complex, we've got you covered.
Define your data structure with an intuitive YAML-like format. Perfect for straightforward extraction tasks and quick implementation.
- invoice: invoice data
  - items: list of items on the pdf
    - total: total amount for the item
    - name: item name
  - total: total amount of all items
Use standard JSON Schema for precise data typing and validation. Ideal for developers who need fine-grained control and integration with existing systems.
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "invoice": {
      "type": "object",
      "properties": {
        "items": {
          "type": "array"
          // Additional properties omitted for brevity
        }
      }
    }
  }
}
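Strict JSON does not allow `//` comments, so a schema file you actually upload has to spell out the omitted parts. The sketch below shows one way to generate such a file. The item fields (`name`, `total`) and their types are assumptions borrowed from the simple-format example, not the full schema the snippet above abbreviates, and the function names and the `schema.json` path are hypothetical. Whether the service honors keywords such as `required` is not confirmed here.

```python
import json


def build_invoice_schema():
    """Return a draft-07 schema for an invoice with line items (assumed fields)."""
    item = {
        "type": "object",
        "properties": {
            "name": {"type": "string"},   # assumed: item name
            "total": {"type": "number"},  # assumed: amount for this item
        },
        "required": ["name", "total"],
    }
    return {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "object",
        "properties": {
            "invoice": {
                "type": "object",
                "properties": {
                    "items": {"type": "array", "items": item},
                    "total": {"type": "number"},  # assumed: sum of all items
                },
            }
        },
    }


def write_schema(path="schema.json"):
    """Write the schema to disk as comment-free JSON."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(build_invoice_schema(), fh, indent=2)
```

Calling `write_schema()` produces a file that any standard JSON parser can read.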
Subscribe to our PDF extraction API on RapidAPI and transform documents into structured data in minutes. No complex integration required.
curl -X POST "https://pdftoolkit.p.rapidapi.com/extract" \
  -H "x-rapidapi-key: YOUR_RAPIDAPI_KEY" \
  -H "Content-Type: multipart/form-data" \
  -F "pdf=@invoice.pdf" \
  -F "schema=@schema.json"
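If you would rather call the endpoint from application code, the curl request above can be approximated with Python's standard library. This is a minimal sketch under stated assumptions: the URL, the header name, and the form fields `pdf` and `schema` come from the curl example, while the function names, the per-part content types, and the idea that the response body is JSON are assumptions.

```python
import json
import urllib.request
import uuid

# Endpoint copied from the curl example above.
API_URL = "https://pdftoolkit.p.rapidapi.com/extract"


def encode_multipart(files):
    """Encode {field: (filename, bytes, content_type)} as multipart/form-data."""
    boundary = uuid.uuid4().hex
    chunks = []
    for field, (filename, data, content_type) in files.items():
        chunks.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
                f"Content-Type: {content_type}\r\n\r\n"
            ).encode("utf-8")
        )
        chunks.append(data)
        chunks.append(b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"


def build_extract_request(api_key, pdf_bytes, schema_bytes):
    """Build, but do not send, the POST request for the extract endpoint."""
    body, content_type = encode_multipart(
        {
            "pdf": ("invoice.pdf", pdf_bytes, "application/pdf"),
            "schema": ("schema.json", schema_bytes, "application/json"),
        }
    )
    headers = {"x-rapidapi-key": api_key, "Content-Type": content_type}
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


def extract(api_key, pdf_path, schema_path):
    """Send the request and parse the reply, assuming the reply is JSON."""
    with open(pdf_path, "rb") as pdf, open(schema_path, "rb") as schema:
        request = build_extract_request(api_key, pdf.read(), schema.read())
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `extract("YOUR_RAPIDAPI_KEY", "invoice.pdf", "schema.json")` would then return the parsed result. Keeping request construction separate from sending makes the request easy to inspect or test without touching the network.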
Have questions about our PDF extraction API? Need custom integration help? Our team is ready to assist you with any inquiries.