---
title: "Upstage AI"
type: "AI Tool"
url: "https://aidemos.com/tools/upstage-ai"
description: "Solid on native financial tables, but unreliable for multi-column and scanned-document structure in markdown conversion."
category: "text"
website: "https://www.upstage.ai?via=aidemos"
authors:
  - "Mahreen Fathima"
published: "2026-06-12T07:12:50.420Z"
updated: "2026-06-16T11:27:38.357Z"
---

# Upstage AI

Solid on native financial tables, but unreliable for multi-column and scanned-document structure in markdown conversion.

`3 PDFs tested` · `Strong native tables` · `Chart values extracted` · `Weak multi-column layout`

**Website:** [Visit Upstage AI](https://www.upstage.ai?via=aidemos)

> **Mixed result for complex PDF-to-markdown work**
>
> Upstage AI handled the API workflow cleanly and did its best work on native financial tables, where row/value placement stayed mostly intact. It also converted charts into text summaries with extracted values instead of dropping them outright. But across the broader use case, it was inconsistent: multi-column pages lost hierarchy, scanned signature pages flattened badly, and chart/table structure became less trustworthy once layouts got harder.

## Demo Recording

[Video: Upstage AI demo recording](https://d3epheqghktydj.cloudfront.net/upstage-ai-upstage-hybrid-pdf-tool-demo-2.mp4)
*Video — Hybrid earnings report conversion walkthrough.*

## Feature-by-Feature Breakdown

### API-based PDF-to-markdown conversion

**Verdict:** Reliable ingestion and export across the three tested PDFs.

Upstage accepted all three tested documents through an automated API workflow and returned downloadable markdown outputs: an 84-page hybrid earnings report, an 18-page table-heavy financial report, and a scanned research report. The researcher did not need manual correction, UI interaction, or post-processing to obtain the markdown files.

**Input:**

[Pdf: llamaparse-hybrid-earnings-pdf-1.pdf](https://d3epheqghktydj.cloudfront.net/llamaparse-hybrid-earnings-pdf-1.pdf)

**Output:**

[Markdown: upstage-ai-upstage-hybrid-earningspdf-output-1.md](https://d3epheqghktydj.cloudfront.net/upstage-ai-upstage-hybrid-earningspdf-output-1.md)

**Input:**

[Pdf: llamaparse-sumitomo-financial-pdf-1.pdf](https://d3epheqghktydj.cloudfront.net/llamaparse-sumitomo-financial-pdf-1.pdf)

**Output:**

[Markdown: upstage-ai-upstage-financialpdf-parsed.md](https://d3epheqghktydj.cloudfront.net/upstage-ai-upstage-financialpdf-parsed.md)

**Input:**

[Pdf: llamaparse-scanned-research-pdf-1.pdf](https://d3epheqghktydj.cloudfront.net/llamaparse-scanned-research-pdf-1.pdf)

**Output:**

[Markdown: upstage-ai-upstage-scannedpdf-output.md](https://d3epheqghktydj.cloudfront.net/upstage-ai-upstage-scannedpdf-output.md)

**Bottom line:** If your first question is simply whether the service will accept varied PDFs and give you markdown back through an API, Upstage passed that baseline cleanly in all three tests.
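
For readers who want to reproduce this kind of batch run, here is a minimal sketch of the loop involved. It is not Upstage's client code: `convert_batch` and the `convert` callable are hypothetical names, and `convert` stands in for whatever function sends the PDF bytes to the API and returns markdown text. The actual endpoint and parameters are not covered in this review, so check Upstage's own API reference before wiring it in.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable


def convert_batch(
    pdf_paths: Iterable[Path],
    convert: Callable[[bytes], str],
    out_dir: Path,
) -> Dict[str, Path]:
    """Run each PDF through `convert` and save the returned markdown.

    `convert` is a hypothetical stand-in for the real API call: it takes
    the raw PDF bytes and returns markdown text.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    written: Dict[str, Path] = {}
    for pdf_path in pdf_paths:
        markdown = convert(pdf_path.read_bytes())
        # Save next to the other outputs, keeping the source file's stem.
        target = out_dir / (pdf_path.stem + ".md")
        target.write_text(markdown, encoding="utf-8")
        written[pdf_path.name] = target
    return written
```

Keeping the API call behind a plain callable makes it easy to swap in a fake converter for testing, or to point the same loop at another parsing service for comparison.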

### Table reconstruction

**Verdict:** Strong on cleaner native tables; weaker on harder headers and scanned layouts.

Upstage reconstructs tables into a readable, markdown-like structure, but quality depends heavily on source complexity. It preserved the Target financial summary table from the hybrid earnings report with strong row/column fidelity and only minor symbol loss. On the Sumitomo quarterly balance sheet and the scanned forestry table, body values survived better than header structure: grouped headers became misaligned or duplicated.

**Input:**

![landing-ai-target-annual-report-financial-summary-table-2.png](https://d3epheqghktydj.cloudfront.net/landing-ai-target-annual-report-financial-summary-table-2.png)
*Image: landing-ai-target-annual-report-financial-summary-table-2.png*

**Output:**

![upstage-ai-target-2015-financial-results-parsed-table.png](https://d3epheqghktydj.cloudfront.net/upstage-ai-target-2015-financial-results-parsed-table.png)
*Image: upstage-ai-target-2015-financial-results-parsed-table.png*

**Input:**

![upstage-ai-sumitomo-quarterly-consolidated-balance-sheets.png](https://d3epheqghktydj.cloudfront.net/upstage-ai-sumitomo-quarterly-consolidated-balance-sheets.png)
*Image: upstage-ai-sumitomo-quarterly-consolidated-balance-sheets.png*

**Output:**

![upstage-ai-parsed-balance-sheet.png](https://d3epheqghktydj.cloudfront.net/upstage-ai-parsed-balance-sheet.png)
*Image: upstage-ai-parsed-balance-sheet.png*

**Input:**

![mistral-ai-scanned-treatment-diameter-table.png](https://d3epheqghktydj.cloudfront.net/mistral-ai-scanned-treatment-diameter-table.png)
*Image: mistral-ai-scanned-treatment-diameter-table.png*

**Output:**

![upstage-ai-parsed-diameter-treatment-table.png](https://d3epheqghktydj.cloudfront.net/upstage-ai-parsed-diameter-treatment-table.png)
*Image: upstage-ai-parsed-diameter-treatment-table.png*

**Bottom line:** Upstage is credible for readable extraction of simpler native financial tables, but once headers get more complex or the source is scanned, the markdown table structure becomes much less dependable.
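
A cheap way to triage table output before manual review is to check that every row has the same number of cells as its header. The sketch below assumes the output uses pipe-style markdown tables, which may not hold for every parser setting, and the function names are our own. It only catches rows whose cell count is off; it cannot tell whether a value sits under the wrong header.

```python
from typing import List, Tuple


def split_row(line: str) -> List[str]:
    """Split one pipe-table line into cells (escaped pipes are not handled)."""
    text = line.strip()
    if text.startswith("|"):
        text = text[1:]
    if text.endswith("|"):
        text = text[:-1]
    return [cell.strip() for cell in text.split("|")]


def is_separator(line: str) -> bool:
    """True for a header separator row such as | --- | :---: |."""
    cells = split_row(line)
    return all(cell and set(cell) <= set("-:") and "-" in cell for cell in cells)


def find_ragged_rows(markdown: str) -> List[Tuple[int, int, int]]:
    """Return (line_number, expected_cells, actual_cells) for mismatched rows."""
    problems = []
    lines = markdown.splitlines()
    expected = None  # column count of the table currently being scanned
    for number, line in enumerate(lines, start=1):
        if "|" not in line:
            expected = None  # a non-table line ends the current table
            continue
        if expected is None:
            # A table starts where a pipe line is followed by a separator row.
            if number < len(lines) and is_separator(lines[number]):
                expected = len(split_row(line))
            continue
        actual = len(split_row(line))
        if actual != expected:
            problems.append((number, expected, actual))
    return problems
```

An empty result does not prove the table is right, but a non-empty one is a quick signal that a page deserves a closer look.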

### Chart and figure extraction to text

**Verdict:** Better than dropping charts, but not clean enough to count as faithful chart preservation.

Upstage converts charts and figures into prose summaries plus extracted values instead of discarding them. In the hybrid earnings report, it turned an SG&A waterfall chart into a narrative explanation and category/value list. In the scanned research report, it summarized a multi-series line chart and produced year-by-year values. The tradeoff is organization: the researcher described the recovered data as raw or poorly structured rather than preserved in a clean, chart-like form.

**Input:**

![llamaparse-sga-rate-waterfall-chart-1.png](https://d3epheqghktydj.cloudfront.net/llamaparse-sga-rate-waterfall-chart-1.png)
*Image: llamaparse-sga-rate-waterfall-chart-1.png*

**Output:**

![upstage-ai-sgaa-rate-waterfall-text-description.png](https://d3epheqghktydj.cloudfront.net/upstage-ai-sgaa-rate-waterfall-text-description.png)
*Image: upstage-ai-sgaa-rate-waterfall-text-description.png*

**Input:**

![upstage-ai-figure-3-average-radial-growth-line-chart.png](https://d3epheqghktydj.cloudfront.net/upstage-ai-figure-3-average-radial-growth-line-chart.png)
*Image: upstage-ai-figure-3-average-radial-growth-line-chart.png*

**Output:**

![upstage-ai-parsed-figure-3-radial-growth-summary.png](https://d3epheqghktydj.cloudfront.net/upstage-ai-parsed-figure-3-radial-growth-summary.png)
*Image: upstage-ai-parsed-figure-3-radial-growth-summary.png*

**Bottom line:** Upstage does retain chart information in text form, which is better than a silent drop, but the output is still too loosely structured for users who need faithful markdown representations of figures.

### Reading order and document hierarchy preservation

**Verdict:** Inconsistent on digital sections and weak on multi-column pages.

Upstage preserved basic narrative flow on at least one straightforward digital section, but it struggled to keep hierarchy and navigation intact on harder layouts. The Mechatronics/Industrial Machinery section from the Sumitomo report remained readable with subsection ordering preserved. By contrast, the hybrid earnings report's two-column strategy page blurred headings into surrounding prose, the operating-performance page lost clear heading distinction, and the researcher reported paragraph segmentation and reading-order breakdown on the scanned multi-column research paper.

**Input:**

![upstage-ai-financial-section-bulletins-mechatronics-industrial-machinery.png](https://d3epheqghktydj.cloudfront.net/upstage-ai-financial-section-bulletins-mechatronics-industrial-machinery.png)
*Image: upstage-ai-financial-section-bulletins-mechatronics-industrial-machinery.png*

**Output:**

![upstage-ai-parsed-financial-mechatronics-industrial-machinery-dark.png](https://d3epheqghktydj.cloudfront.net/upstage-ai-parsed-financial-mechatronics-industrial-machinery-dark.png)
*Image: upstage-ai-parsed-financial-mechatronics-industrial-machinery-dark.png*

**Input:**

![upstage-ai-target-two-column-narrative-with-highlighted-section.png](https://d3epheqghktydj.cloudfront.net/upstage-ai-target-two-column-narrative-with-highlighted-section.png)
*Image: upstage-ai-target-two-column-narrative-with-highlighted-section.png*

**Output:**

![upstage-ai-target-earnings-parsed-strategy-and-merchandising.png](https://d3epheqghktydj.cloudfront.net/upstage-ai-target-earnings-parsed-strategy-and-merchandising.png)
*Image: upstage-ai-target-earnings-parsed-strategy-and-merchandising.png*

**Input:**

![upstage-ai-summary-operating-performance-quarterly-results.png](https://d3epheqghktydj.cloudfront.net/upstage-ai-summary-operating-performance-quarterly-results.png)
*Image: upstage-ai-summary-operating-performance-quarterly-results.png*

**Output:**

![upstage-ai-operating-performance-summary-annotated-callouts.png](https://d3epheqghktydj.cloudfront.net/upstage-ai-operating-performance-summary-annotated-callouts.png)
*Image: upstage-ai-operating-performance-summary-annotated-callouts.png*

**Input:**

![landing-ai-scanned-two-column-text-study-area.png](https://d3epheqghktydj.cloudfront.net/landing-ai-scanned-two-column-text-study-area.png)
*Image: landing-ai-scanned-two-column-text-study-area.png*

**Output:**

![upstage_scannedpdf_parsed_hierarchy.png](https://d3epheqghktydj.cloudfront.net/upstage_scannedpdf_parsed_hierarchy.png)
*Image: upstage_scannedpdf_parsed_hierarchy.png*

**Bottom line:** Upstage can preserve straightforward section flow, but it was not reliable enough on multi-column or hierarchy-sensitive pages to trust for full-document markdown fidelity.
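
One way to spot flattened hierarchy without reading the whole file is to pull the heading outline from the markdown and compare it with the headings visible in the PDF. The sketch below is our own helper, not part of any Upstage tooling. It assumes headings are emitted as ATX-style `#` lines and ignores anything inside fenced code blocks.

```python
import re
from typing import List, Tuple

# ATX heading: 1-6 hashes, whitespace, title, optional closing hashes.
HEADING = re.compile(r"^(#{1,6})[ \t]+(.+?)[ \t]*#*[ \t]*$")


def heading_outline(markdown: str) -> List[Tuple[int, str]]:
    """Return (level, title) for each ATX heading outside fenced code."""
    outline = []
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = HEADING.match(line)
        if match:
            outline.append((len(match.group(1)), match.group(2)))
    return outline


def level_jumps(outline: List[Tuple[int, str]]):
    """Pairs of consecutive headings where the level deepens by more than one."""
    return [
        (prev, cur)
        for prev, cur in zip(outline, outline[1:])
        if cur[0] - prev[0] > 1
    ]
```

An outline that is empty, or far shorter than the page's visible headings, matches the failure described above where headings blur into body text. Skipped levels are a weaker hint that the hierarchy was not carried over cleanly.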

### Scanned-page OCR for printed text

**Verdict:** Printed text was partly recovered, but signature blocks and page structure were not faithfully preserved.

Upstage can OCR printed text on scanned pages inside a mixed PDF, but the result is much less faithful once signatures and local structure matter. The tested example was the Target signatures page from the hybrid earnings report: surrounding printed text, date, and names were retained, but the handwritten signature itself was not meaningfully preserved and the section collapsed into a flat block.

**Input:**

![landing-ai-target-annual-report-signatures-page-2.png](https://d3epheqghktydj.cloudfront.net/landing-ai-target-annual-report-signatures-page-2.png)
*Image: landing-ai-target-annual-report-signatures-page-2.png*

**Output:**

![upstage-ai-target-signatures-ocr-extraction-1.png](https://d3epheqghktydj.cloudfront.net/upstage-ai-target-signatures-ocr-extraction-1.png)
*Image: upstage-ai-target-signatures-ocr-extraction-1.png*

**Bottom line:** For scanned pages with ordinary printed text, Upstage can recover usable content, but it is not a good fit when the exact structure of signature sections or handwriting-adjacent content matters.

## Pricing & Access

| Plan | Price | Notes |
| --- | --- | --- |
| Free (tested) | $0 | Upstage Studio offers free testing, limited to 10 runs per agent |
| Standard | $0.01 / page |  |
| Enhanced | $0.03 / page |  |
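
To turn the per-page rates into a budget figure, a rough estimate is pages multiplied by the plan's rate. The sketch below assumes strictly linear billing with no minimums, rounding rules, or volume discounts, none of which we verified. The function and plan keys are our own naming.

```python
from decimal import Decimal

# Per-page prices in USD, copied from the pricing table above.
PRICE_PER_PAGE = {
    "standard": Decimal("0.01"),
    "enhanced": Decimal("0.03"),
}


def estimate_cost(pages: int, plan: str) -> Decimal:
    """Estimate the cost in USD, assuming simple per-page billing."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    # Decimal avoids binary floating-point drift in currency arithmetic.
    return PRICE_PER_PAGE[plan] * pages
```

Under these assumptions, the 84-page hybrid earnings report would come to $0.84 on Standard and $2.52 on Enhanced, and the 18-page financial report to $0.18 and $0.54.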

## Is This Right For You?

A side-by-side guide based on our hands-on testing.

**✓ Use This If**
- You mainly need an API that will accept varied PDFs and return markdown files automatically.
- Your documents are table-heavy financial PDFs where readable row/value reconstruction matters more than perfect layout fidelity.
- You can live with charts being converted into text summaries and extracted values instead of preserved visual structure.

**✕ Skip This If**
- You need reliable reading order and hierarchy preservation on multi-column pages.
- You need scanned pages to stay structurally faithful, especially around signature blocks.
- You need complex table headers or chart outputs to come back in clean, confidently structured markdown without manual review.

## Use Case Track

| Rank | Use Case | Notes |
| --- | --- | --- |
| #7 | Convert a Complex PDF to Clean Markdown with API | Solid on native financial tables, but unreliable for multi-column and scanned-document structure in markdown conversion. |

## Related Pages

- [Best AI APIs to Convert Complex PDFs into Clean Markdown](https://aidemos.com/best/pdf-to-markdown-apis) — Ranking

## Related Reads

- **Best AI Tools to Convert Complex PDFs into Clean Markdown with an API** — RANKING

## Classification

- **Type:** text
- **Built for:** Founders

## Frequently Asked Questions

**Q: Can Upstage AI convert a mixed PDF to markdown through an API?**

Yes. In this research it accepted three different PDFs through an automated workflow: an 84-page hybrid earnings report, an 18-page table-heavy financial report, and a scanned research report. Each test produced a downloadable markdown file without a manual correction step.

**Q: How good is Upstage AI at extracting tables from PDFs?**

It performed best on the native Target financial summary table, where rows, columns, and values stayed mostly aligned and only some currency symbols were missed. It was weaker on harder tables: the Sumitomo balance sheet had header/data misalignment, and the scanned forestry table duplicated and split grouped headers awkwardly.

**Q: Does Upstage AI preserve charts in markdown output?**

Partially. It did not simply drop the tested charts. Instead, it converted them into prose descriptions plus extracted values, including a waterfall chart and a scanned line chart. The downside is that the recovered chart data was not cleanly structured enough to preserve the original visual organization.

**Q: How does Upstage AI handle scanned signature pages?**

It recovered much of the printed text on the tested Target signatures page, including dates and printed names, but it did not preserve the signature structure well. Handwritten signatures were not clearly identifiable, and the whole section flattened into a block of text.

**Q: Is Upstage AI good for multi-column PDFs?**

Not reliably. In the hybrid earnings report, multi-column strategy sections lost clear heading separation, and the researcher also reported paragraph segmentation and reading-order problems on a scanned multi-column research page.

**Q: Does Upstage AI keep document headings and section hierarchy intact?**

Only inconsistently. It preserved hierarchy reasonably well on the Mechatronics / Industrial Machinery section of the Sumitomo report, but on other pages headings were reduced to body-text-like output and no longer clearly separated from the content they introduced.

## Similar Tools

AI tools similar to Upstage AI:

- [LlamaParse](https://aidemos.com/tools/llamaparse) — LlamaParse Review: AI Resume Parser & Schema Extraction Tested (2026)
- [Landing AI](https://aidemos.com/tools/landing-ai) — A capable PDF-to-markdown API for complex financial and scanned PDFs, with strong table and chart extraction but inconsistent heading semantics.
- [Mistral AI](https://aidemos.com/tools/mistral-ai) — A strong hosted PDF-to-markdown API for mixed and scanned documents, with solid OCR, table recovery, and asset export but uneven structural fidelity.
- [Nutrient.io](https://aidemos.com/tools/nutrient-io) — A developer-first PDF-to-markdown API that handles straightforward OCR and hierarchy well, but loses fidelity on complex tables, charts, and handwritten visual content.
- [Extend AI](https://aidemos.com/tools/extend-ai) — A capable PDF-to-markdown API for mixed and scanned documents that keeps structure and most visuals, but stumbles on the hardest table headers.
