
Segment documents programmatically · Part of the PDF API

Extract PDF Pages API

Pull specific pages from any PDF into a new file—without touching the original. Perfect for separating invoices, isolating exhibits, or building document routing workflows that need surgical precision.


Trusted by 8,200+ developers

No credit card required · Keep links, forms, and formatting intact

Extract pages 3-5 from a document

Range syntax
# cURL
curl -X POST "https://api.xspdf.com/v1/pdf/extract" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input_url": "https://files.example.com/report.pdf",
    "pages": "3-5",
    "output_filename": "section_b.pdf"
  }' -o section_b.pdf
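
For teams calling the endpoint from application code rather than a shell, the same request might look like the Python sketch below. It uses only the standard library and reuses the URL, headers, and fields from the cURL example. The helper names build_extract_request and extract_to_file are illustrative, and treating the response body as the PDF itself is an assumption based on the -o flag.

```python
import json
import urllib.request

# Endpoint taken from the cURL example above.
API_URL = "https://api.xspdf.com/v1/pdf/extract"


def build_extract_request(api_key, input_url, pages, output_filename):
    """Build (but do not send) the POST request shown in the cURL example."""
    payload = {
        "input_url": input_url,
        "pages": pages,
        "output_filename": output_filename,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_to_file(request, destination):
    """Send the request and write the response body to disk.

    Assumption: the response body is the extracted PDF, as the -o flag
    in the cURL example suggests.
    """
    with urllib.request.urlopen(request) as response, open(destination, "wb") as out:
        out.write(response.read())
```

Passing the result of build_extract_request(api_key, "https://files.example.com/report.pdf", "3-5", "section_b.pdf") to extract_to_file(request, "section_b.pdf") would then mirror the cURL command. Keeping request construction separate from sending makes the payload easy to inspect before any network call happens.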

Avg. latency: < 650ms

Accuracy: 100%

Original: Intact

Need to route extracted invoices? See Separating Invoices for workflow patterns.

142M pages extracted successfully

< 650ms typical extraction latency

100% original file preserved

99.96% uptime SLA with full logging

You need one page. The PDF has fifty.

Your workflow receives a bundled PDF—invoices, contracts, statements—and you need to segment it for routing, archival, or approval. Manual extraction is tedious. Batch tools are fragile. You need surgical precision that scales.

Manual extraction hell

Open, select, save as, repeat 50 times per batch.

Brittle automation

Scripts break when page counts change or files get corrupted.

Data loss risk

Extract the wrong range and lose critical pages forever.

The hidden cost

Teams spend 15-30 minutes per day manually extracting pages from bundled PDFs. Over roughly 20 working days, that's 5-10 hours per month spent splitting documents before the real work begins.

Extract specific pages into new files—keep originals intact

Our Extract PDF Pages API pulls exact page ranges from any PDF and outputs a clean new file. The original stays untouched. Perfect for document routing, invoice segmentation, or building user-facing "download page X" features.

Flexible range syntax

Specify "1", "3-5", "1,3,7-10", or any combination. Perfect for complex segmentation logic.

Original stays safe

Extraction creates a new file. Your source PDF remains untouched—critical for compliance workflows where audit trails matter.

Keep formatting, links, and forms

Extracted pages retain hyperlinks, form fields, and document structure—not just flattened images.

Pro tip: Combine with our Merge API to extract, reorder, and reassemble documents in one pipeline.

Page range syntax examples

Flexible
# Single page
"pages": "5"

# Range
"pages": "3-7"

# Multiple ranges
"pages": "1,3,7-10,15"

# First and last pages of a 50-page document
"pages": "1,50"

# Everything except the first page (open-ended range)
"pages": "2-"

Building invoice routing? Our Separating Invoices guide shows common extraction patterns for AP automation.
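
Before sending a range string, it can help to check it client-side against the document's page count. The sketch below expands the forms shown in the examples above (single page, closed range, comma-separated mix, and open-ended "2-") into explicit page numbers. The grammar is inferred from those examples only, and the function name expand_pages is hypothetical. The API's own parser may accept or reject other forms.

```python
def expand_pages(spec, page_count):
    """Expand a range string such as "1,3,7-10" into 1-based page numbers.

    Assumption: the grammar is inferred from the page's examples; the
    API's real parser may behave differently on edge cases.
    """
    pages = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError(f"empty entry in {spec!r}")
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start = int(start_text)
            # An open-ended range like "2-" runs to the last page.
            end = int(end_text) if end_text else page_count
        else:
            start = end = int(part)
        if not 1 <= start <= end <= page_count:
            raise ValueError(f"{part!r} is outside 1-{page_count}")
        pages.extend(range(start, end + 1))
    return pages
```

For a 20-page file, expand_pages("1,3,7-10,15", 20) yields pages 1, 3, 7, 8, 9, 10, and 15, while a range that runs past the last page raises ValueError before any request is made.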

Before vs. after: Extract pages without losing your mind

Manual extraction doesn't scale. This API removes the bottleneck.

Manual extraction (the old way)

  • Open each PDF individually
  • Count pages manually
  • Select, extract, save as (repeat)
  • Risk extracting wrong pages
  • 15-30 minutes per batch

API extraction (the new way)

  • One API call per document
  • Specify exact ranges programmatically
  • Automated, logged, auditable
  • Original files stay safe
  • < 1 second per extraction

Real ROI:

Teams extracting 50+ pages per day save 10-15 hours per month—time that was pure toil, not value creation.

Built for developers who need precision

Not just "split a PDF." Build routing logic, conditional extraction, and user-facing segmentation tools.

Flexible range syntax

Single pages, ranges, or complex combinations like 1,5-8,12.

Original file preserved

Extraction creates a new file—your source PDF stays intact for audit trails.

Keep structure & interactivity

Extracted pages retain hyperlinks, form fields, and document metadata.

Fast extraction (< 650ms)

Typical latency under 650ms—fast enough for real-time user interactions.

Composable with other endpoints

Chain with merge, rotate, or protect for multi-step document pipelines.

Full audit logging

Every extraction logged with timestamps, user context, and page ranges.
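
Routing logic of the kind mentioned above usually starts from knowing where each bundled document begins. As a rough sketch, assuming some upstream step has already identified the first page of every invoice or contract in the bundle, the helper below turns those start pages into one range string per document, ready to send as the "pages" value of separate extract calls. The function name ranges_from_starts is hypothetical and not part of the API.

```python
def ranges_from_starts(start_pages, page_count):
    """Turn the first page of each bundled document into range strings.

    Pages before the earliest start page are not covered by any range.
    """
    starts = sorted(set(start_pages))
    if not starts or starts[0] < 1 or starts[-1] > page_count:
        raise ValueError("start pages must fall within the document")
    # Each document ends one page before the next one starts;
    # the final document runs to the last page of the bundle.
    ends = [nxt - 1 for nxt in starts[1:]] + [page_count]
    return [str(s) if s == e else f"{s}-{e}" for s, e in zip(starts, ends)]
```

With documents starting on pages 1, 4, and 9 of a 12-page bundle, the result is the three ranges "1-3", "4-8", and "9-12", one per output file.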

FAQ: Extract PDF pages without breaking things

The questions teams ask before automating page extraction.

Does extracting pages modify the original PDF?

No. Extraction creates a new PDF file containing only the specified pages. The source document remains completely untouched—critical for workflows where you need to maintain the original for compliance, audit trails, or archival purposes.

What happens to hyperlinks, bookmarks, and form fields?

They're preserved. Extracted pages retain clickable links, interactive form fields, and document structure. The API copies the actual page objects—not just rasterized images—so interactivity survives extraction.

Can I extract pages from password-protected PDFs?

Yes—if you provide the password. Add a password parameter to the request. If you need to remove password protection entirely, use our Unlock PDF API first.
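
A request body for a protected file might be assembled as in the sketch below. The field name "password" follows the wording of the answer above, but its exact spelling and placement in the JSON body are assumptions, and build_extract_payload is a hypothetical helper.

```python
import json


def build_extract_payload(input_url, pages, password=None):
    """Return the JSON body for an extract request as a string."""
    payload = {"input_url": input_url, "pages": pages}
    if password is not None:
        # Assumption: the field is literally named "password" and sits
        # at the top level of the body alongside "pages".
        payload["password"] = password
    return json.dumps(payload)
```

Leaving the field out when no password is supplied keeps requests for unprotected files identical to the basic example.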

What's the difference between Extract and Split?

Extract creates one new file with specific pages you choose (e.g., pages 3-5). Split (see Split PDF API) breaks a document into multiple files—like one file per page, or by page count. Use Extract when you know exactly which pages you need. Use Split for bulk unbundling.

Stop manually extracting pages. Automate segmentation in one API call.

Extract specific page ranges from any PDF programmatically. Keep originals intact, preserve interactivity, and build document routing workflows that scale. Start free—no credit card required.

Also known as:

  • Extract PDF Pages API
  • Split PDF by page API
  • Get single page PDF
  • Page segmentation API
  • PDF range extractor