Developer Tools & APIs

Landing AI

A capable PDF-to-markdown API for complex financial and scanned PDFs, with strong table and chart extraction but inconsistent heading semantics.

Hybrid + scanned PDFs | Strong table reconstruction | Chart extraction | Heading semantics mixed

Best when table fidelity matters more than markdown semantics

Landing AI handled the core PDF-to-markdown job well across hybrid, table-heavy, and scanned documents: it accepted all tested files, returned downloadable markdown through a fully automated API flow, reconstructed several financial tables cleanly, and converted charts into usable text summaries instead of dropping them. The tradeoff is structure fidelity at the semantic level: major headings were not consistently preserved as headings, some nested table headers were flattened, and scanned table OCR could introduce value errors. It looks useful for ingestion pipelines that need broad document coverage, but not for users who need exact markdown hierarchy or perfect scanned-table accuracy.

In-Depth Review

Our detailed analysis of Landing AI — features, performance, and real-world testing.

AI Demos Team

Expert Reviewer

Verified Review

Feature-by-Feature Breakdown

Document hierarchy and reading-order reconstruction

Usually preserves section flow and OCR reading order, but heading semantics are inconsistent.

Test Summary

Feature tested: Document hierarchy and reading-order reconstruction

Result: Partial — Usually preserves section flow and OCR reading order, but heading semantics are inconsistent.

Expected behavior: Reconstructs page-level structure from both native and scanned PDFs into readable markdown-like text. This was exercised on a Target annual report section ('19. Commitments and Contingencies'), a Sumitomo financial report page with section/subsection headings, a scanned two-column research page headed 'STUDY AREA', and a full annual-report page with a portrait and bullet list.

Target annual report section titled '19. Commitments and Contingencies' with the 'Data Breach' subsection.

Landing AI preserved the section heading, subsection title, and the full body paragraphs from the Target annual report section. In this example, the reading flow stayed intact: the 'Data Breach' heading remained attached to the surrounding narrative, including the payment-card, consumer class action, and financial institutions class action paragraphs.

Sumitomo financial report page titled 'I. Summary of Operating Performance'.

On the table-heavy financial report, Landing AI kept the main section title, the subsection title, and the body paragraphs in readable order. The extracted output separates the summary heading from the paragraph text clearly enough to follow the original section organization.

Scanned two-column research page with 'STUDY AREA' and 'STAND PRESCRIPTIONS' sections.

On the scanned multi-column page, Landing AI OCR'd the 'STUDY AREA' section into a coherent block of text and kept related paragraphs grouped under that heading rather than interleaving the columns. The extraction is readable, though not letter-perfect: for example, one sentence shifts from 'growth factor during the season' to 'growth factor for the season,' and some punctuation/hyphenation is normalized.

Target annual report page titled 'A Growth Story Again' with a portrait and bullet points.

Landing AI preserved the page title, the main paragraph, the bullet list, and even added a description of the portrait image. The failure is semantic rather than textual: the top-level heading is rendered as plain text instead of a true H1-style markdown heading, so the content survives but the hierarchy is weaker than the source.
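
For pipelines that depend on heading levels, one way to catch this kind of flattening is to check whether the titles you expect are emitted as `#`-prefixed markdown headings. The sketch below is a minimal illustration, not Landing AI functionality: the function name `find_flattened_headings` and the sample markdown string are assumptions made for this example.

```python
import re

# ATX heading: up to three leading spaces, 1-6 '#', whitespace, then the title.
HEADING_RE = re.compile(r"^ {0,3}(#{1,6})[ \t]+(.*?)[ \t]*#*[ \t]*$")


def find_flattened_headings(markdown: str, expected_titles: list[str]) -> list[str]:
    """Return expected titles that appear as plain lines instead of headings."""
    heading_titles = set()
    plain_lines = set()
    for line in markdown.splitlines():
        match = HEADING_RE.match(line)
        if match:
            heading_titles.add(match.group(2).strip().lower())
        else:
            plain_lines.add(line.strip().lower())
    return [
        title
        for title in expected_titles
        if title.lower() not in heading_titles and title.lower() in plain_lines
    ]


# Hypothetical parser output in which the page title came back as plain text.
sample = "A Growth Story Again\n\nBody paragraph.\n\n## Data Breach\n\nMore text."
flattened = find_flattened_headings(sample, ["A Growth Story Again", "Data Breach"])
print(flattened)
```

A check like this only flags titles you already know to expect; it does not infer the correct heading level from the source PDF.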

Bottom Line

Good at keeping pages readable and in order across mixed PDFs, but not dependable if your downstream pipeline relies on exact heading levels.

Table reconstruction

Strong on clean financial tables, weaker on nested header semantics and noisier scanned tables.

Test Summary

Feature tested: Table reconstruction

Result: Partial — Strong on clean financial tables, weaker on nested header semantics and noisier scanned tables.

Expected behavior: Extracts tables into structured markdown-like layouts that usually preserve rows, columns, and values. This was tested on the Target 'Financial Summary' table, a Sumitomo segment comparison table, a more complex multi-level segment table, and a photographed stand-structure table from the scanned research paper.

Target annual report 'Financial Summary' table for 2015 through 2011.

Landing AI reconstructed the Target financial summary with the year columns, row labels, and corresponding values still aligned. In this clean digital table, the extracted layout remained easy to read, including rows like Sales, Cost of sales, SG&A, EBIT, and Net earnings/(loss).

Segment results table comparing previous first quarter, present first quarter, and Y/Y change.

Landing AI preserved the segment table's core structure and values: business-segment rows such as Mechatronics, Industrial Machinery, Logistics & Construction, Energy & Lifelines, Others, and Total stayed aligned with prior-quarter, present-quarter, and Y/Y change columns. For this table, the output closely mirrors the original layout.

Multi-level financial table with segment columns A through F3 and nested headers.

Landing AI kept the values and the broad tabular shape of the complex segment table, but it flattened nested header relationships. The source distinguishes multiple header levels such as Item, Segment, A-D, Subtotal, Other, Total, and adjustment columns; the parsed version compresses these into a simpler linear header row, which makes the data readable but weakens the semantic relationships between header levels.
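
When a multi-level header has to be expressed as a single row, one common way to keep the relationships is to join each column's header path into one label. The sketch below illustrates that idea only; the function name `flatten_header_paths` and the two-level grouping in the example (which labels sit under 'Segment') are assumptions, since the review names the header labels but not their exact nesting.

```python
def flatten_header_paths(header_rows: list[list[str]], sep: str = " / ") -> list[str]:
    """Join each column's header labels from top to bottom into one label.

    Empty cells are skipped and consecutive repeats collapse, so a parent
    label that spans several columns is kept once per column.
    """
    flattened = []
    for column in zip(*header_rows):
        path = []
        for label in column:
            label = label.strip()
            if label and (not path or path[-1] != label):
                path.append(label)
        flattened.append(sep.join(path))
    return flattened


# Hypothetical two-level header: "Segment" spans columns A and B.
top = ["Item", "Segment", "Segment", "Subtotal", "Total"]
bottom = ["", "A", "B", "", ""]
print(flatten_header_paths([top, bottom]))
```

A flat header built this way stays readable as plain markdown while still recording which parent each column belonged to.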

Photographed table titled 'Table 2.—Stand structure before and after cutting'.

On the scanned photographed table, Landing AI preserved the wide table layout and most row/column labels, but OCR introduced real numeric mistakes. Compared with the source, the '12-inch cut' row shows 'Trees cut per acre' as 7.0 instead of 114.0, the 'Clearcut' after-cut row shows a stray 9 where the source shows 0, and the 'Check area' before-cut values are distorted, with 255.0 appearing where the source shows 55.0 in that position. This makes the table inspectable, but not trustworthy without verification.
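
Because errors like these are silent, a cell-by-cell comparison against a hand-verified sample is a reasonable QA step for scanned tables. The following is a minimal sketch under assumptions: the dictionary layout, the function name `diff_cells`, and the tolerance are illustrative, and only the '12-inch cut' / 'Trees cut per acre' values are taken from the test described above.

```python
from math import isclose
from typing import Optional

Cell = tuple[str, str]  # (row label, column label)


def diff_cells(
    reference: dict[Cell, float], parsed: dict[Cell, float], tol: float = 1e-9
) -> list[tuple[Cell, float, Optional[float]]]:
    """List cells whose parsed value is missing or differs from the reference."""
    mismatches = []
    for cell, expected in reference.items():
        actual = parsed.get(cell)
        if actual is None or not isclose(expected, actual, abs_tol=tol):
            mismatches.append((cell, expected, actual))
    return mismatches


# Hand-verified source value vs. the value observed in the parsed output.
reference = {("12-inch cut", "Trees cut per acre"): 114.0}
parsed = {("12-inch cut", "Trees cut per acre"): 7.0}

for (row, column), expected, actual in diff_cells(reference, parsed):
    print(f"{row} / {column}: expected {expected}, got {actual}")
```

Spot-checking even a handful of cells per table this way gives an early signal of whether a scanned document needs full manual review.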

Bottom Line

A strong choice for born-digital financial tables, but scanned or photographed tables still need QA, especially when numeric accuracy matters.

Chart-to-text conversion

Consistently converts charts into detailed textual representations instead of dropping them.

Test Summary

Feature tested: Chart-to-text conversion

Result: Passed — Consistently converts charts into detailed textual representations instead of dropping them.

Expected behavior: Transforms charts into descriptive text blocks that retain titles, series/category labels, approximate values, and trend direction. This was tested on an SG&A waterfall chart from the hybrid earnings report and a scanned bar chart showing tree mortality by year and cut treatment.

Waterfall chart titled 'Selling, General and Administrative Expense Rate'.

Instead of keeping the original waterfall image, Landing AI translated it into a text summary that retained the chart title, year anchors, category labels, and the increase/decrease direction for each step. The output captures specific values such as 2013 SG&A Rate 20.2%, Cost Saving Initiatives (0.8)%, Technology 0.2%, Other 0.4%, 2014 SG&A Rate 20.0%, and 2015 SG&A Rate 19.6%.
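
The quoted values can be sanity-checked arithmetically, since a waterfall's intermediate steps should bridge its start and end anchors. The sketch below applies that check to the 2013-to-2014 portion using the figures from the extracted summary. The helper name `parse_rate` and the reading of parentheses as a decrease are assumptions; the steps between 2014 and 2015 are not quoted in this review, so they are left out.

```python
from decimal import Decimal


def parse_rate(text: str) -> Decimal:
    """Parse a percentage such as '20.2%' or '(0.8)%'; parentheses mean negative."""
    cleaned = text.strip().rstrip("%").strip()
    negative = cleaned.startswith("(") and cleaned.endswith(")")
    if negative:
        cleaned = cleaned[1:-1]
    value = Decimal(cleaned)
    return -value if negative else value


start = parse_rate("20.2%")  # 2013 SG&A Rate
# Cost Saving Initiatives, Technology, Other
steps = [parse_rate(s) for s in ("(0.8)%", "0.2%", "0.4%")]
end = parse_rate("20.0%")  # 2014 SG&A Rate

bridged = start + sum(steps)
print(bridged, bridged == end)
```

Using `Decimal` avoids binary floating-point rounding, so the comparison is exact: 20.2 - 0.8 + 0.2 + 0.4 = 20.0, which matches the extracted 2014 anchor.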

Scanned bar chart of tree mortality by year and cut treatment.

Landing AI converted the scanned bar chart into a structured text description with the legend, year groupings, approximate per-series values, and the vertical dashed 'CUT COMPLETED' marker. The output preserves the key analytical content, such as the high 1980 value for the check area and the treatment-by-treatment comparisons across 1979, 1980, and 1981.

Bottom Line

If your priority is keeping chart information in the markdown rather than preserving the original visual, this is one of Landing AI's clearest strengths in the report.

Signature and attestation region extraction

Captures signature regions as semantic attestations rather than dropping them.

Test Summary

Feature tested: Signature and attestation region extraction

Result: Passed — Captures signature regions as semantic attestations rather than dropping them.

Represents signature-heavy document regions as attestation-style elements that preserve the presence, role, and apparent legibility of signatures. This was exercised on the signatures page from the Target annual report.

Target annual report signatures page.

Landing AI did more than extract nearby text from the signature page: it emitted attestation blocks describing the signature regions themselves. In this example, it preserved the Target Corporation attestation for Catherine R. Smith and a second attestation for Brian C. Cornell, including whether the signature appeared legible or illegible and where it sat relative to the signature line and printed title.

Bottom Line

Useful for compliance-style documents where the existence of a signature block matters, even if you do not need handwriting recognition.

Pricing & Access

TESTED

Free

1000 credits available on signup

Pay-as-you-go

$1 for 100 credits

●Parse

●Field extraction

●Visual grounding

●Document splitting & classification

●Multilingual documents

Team

$250/month

●27.5k credits/month

●Team management and shared usage

●Email support

●Zero data retention available

●HIPAA-compliant processing with BAA available

Custom

Custom Pricing

Everything in Team, plus:

●SaaS, VPC, and on-prem deployments

●Custom processing pipeline

●SLAs and uptime guarantees

●Priority rate limits

●Snowflake integration support
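
The listed prices imply slightly different per-credit rates, which can be worked out directly. The sketch below uses only the figures above ($1 per 100 credits, $250/month for 27.5k credits, 1000 signup credits). How many credits a page or document consumes is not stated here, so `credits_needed` is a hypothetical input, and applying the free signup credits to pay-as-you-go usage is an assumption.

```python
from decimal import Decimal

PAYG_USD_PER_CREDIT = Decimal("1") / Decimal("100")      # $1 for 100 credits
TEAM_USD_PER_CREDIT = Decimal("250") / Decimal("27500")  # $250/month for 27.5k credits
FREE_SIGNUP_CREDITS = 1000


def payg_cost(credits_needed: int, free_credits: int = FREE_SIGNUP_CREDITS) -> Decimal:
    """Pay-as-you-go cost in USD after subtracting free credits (an assumption)."""
    billable = max(credits_needed - free_credits, 0)
    return billable * PAYG_USD_PER_CREDIT


print(PAYG_USD_PER_CREDIT)
print(TEAM_USD_PER_CREDIT.quantize(Decimal("0.0001")))
print(payg_cost(27500))
```

At these rates, 25,000 credits cost $250 on pay-as-you-go, so the Team plan only works out cheaper per credit for monthly usage above that level.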

Is This Right For You?

A side-by-side guide based on our hands-on testing.

✓ Use This If

●You need a hosted API that returns markdown automatically from hybrid PDFs with native text, tables, charts, and scanned pages.

●You care most about readable financial-table extraction from born-digital reports.

●You want charts preserved as text summaries with values and labels instead of being dropped.

●You need signature pages represented semantically in the output rather than ignored.

✕ Skip This If

●You need exact markdown heading semantics, because the report shows top-level headings being flattened into plain text.

●You need perfect preservation of nested table headers, because complex multi-level headers were compressed in at least one financial table.

●You need high-trust numeric OCR from photographed or scanned tables, because the scanned stand-structure table contained value errors.

●You need proven multilingual or degraded-scan performance, because those scenarios were not tested in this report.

In this research, Landing AI accepted each tested PDF and produced a parsed markdown file through a fully automated API flow. The report explicitly notes no manual correction, UI interaction, or post-processing was required to get the markdown output.

It performed well on clean financial tables. The Target 'Financial Summary' table and the Sumitomo segment comparison table were reconstructed with readable rows, columns, and values. The main limitation showed up on more complex cases: nested header levels were flattened in one multi-level financial table, and a scanned photographed table introduced numeric OCR errors.

Charts are handled, but mainly by converting them into text rather than preserving the original chart image as-is. In the hybrid earnings report, Landing AI turned an SG&A waterfall chart into a text summary with the chart title, labels, values, and increase/decrease direction. In the scanned research paper, it described a bar chart with legend entries, approximate values, and the 'CUT COMPLETED' marker.

Scanned documents are handled only partly well. On the scanned research paper, Landing AI OCR'd a two-column 'STUDY AREA' page into coherent text and kept the section content grouped logically. However, scanned-table accuracy was weaker, and the report also notes that opening-page structure on the scanned document was misinterpreted.

Heading and hierarchy preservation is mixed. Landing AI preserved useful section flow on several interior pages, including the Target commitments section and a Sumitomo operating-performance page. But it did not consistently retain heading semantics: the report shows at least one top-level heading flattened into plain text, and major headings were described as inconsistently distinguished in parts of the financial report.

Signatures are not simply ignored. On the Target signatures page, Landing AI generated attestation-style output that described the presence of signature regions, the associated signer names and titles, and whether the signature appeared legible or illegible.

Multilingual documents were not tested. Although multilingual PDFs were part of the broader research plan, this Landing AI section only documents testing on a hybrid earnings report, a table-heavy financial report, and a scanned English research paper.

Pricing is not covered by the underlying test report, which addresses API setup, documentation, and output behavior but does not state Landing AI pricing or plan details.