Outilo

PDF to Markdown converter for AI

Convert your PDFs into clean Markdown, ready to paste into ChatGPT, Claude or your AI agents. The converter rebuilds headings, tables and lists, runs local OCR on scanned pages and extracts images, all in your browser, with no server upload.

Edited by Outilo. Reviewed by the Outilo team. Last verified on 12/06/2026.

Go deeper

Why convert a PDF to Markdown for AI?

PDF files are built to freeze a fixed layout, which makes text extraction hard for large language models (LLMs). Pasting a PDF's contents as-is often breaks sentences and destroys tables. Markdown expresses the same document as lightweight plain-text markup that models such as ChatGPT or Claude handle well: headings, lists and tables are preserved, so the model can follow the document's logical structure without spending tokens on layout noise.

100% local and private processing

The whole process runs in your browser: pdf.js parses the PDF and Tesseract.js, a WebAssembly build of the Tesseract OCR engine, reads scanned pages. Not a single byte of your document is sent to a server, so your contracts, quotes and internal documents stay private. This makes the tool well suited to sensitive files you do not want to upload to an online service.

Full privacy

No file goes through a server. The analysis happens in your browser's memory.

Tables & structure

Aligned columns are rebuilt into native Markdown tables, and headings and lists are restored as well.

OCR for scans

Scanned pages or images are read by local OCR and converted into usable text.

Optimised for your prompts and agents

The tool estimates how many tokens the resulting Markdown will consume and lets you insert an AI instruction (summary, technical analysis, rewrite) directly at the top of the document. You then download a ZIP archive containing the Markdown file, the extracted images and a processing report, ready to drop into your favourite tool.
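To make the token estimate and the instruction insert concrete, here is a minimal TypeScript sketch. It is not Outilo's actual code: the function names are hypothetical, and the estimate assumes the common rule of thumb of roughly four characters per token, which varies by model and language.

```typescript
// Rough estimate only: assumes ~4 characters per token (an assumption,
// not a real tokenizer). Actual counts depend on the model and language.
function estimateTokens(markdown: string, charsPerToken = 4): number {
  return Math.ceil(markdown.length / charsPerToken);
}

// Hypothetical helper: place an instruction above the document,
// separated from it by a Markdown horizontal rule.
function insertInstruction(markdown: string, instruction: string): string {
  const trimmed = instruction.trim();
  if (trimmed === "") return markdown; // nothing to insert
  return `${trimmed}\n\n---\n\n${markdown}`;
}

const doc = "# Quote\n\nTotal: 1200 EUR";
const prompt = insertInstruction(doc, "Summarise this document in five bullet points.");
console.log(estimateTokens(prompt));
```

Because the instruction is counted along with the document, the estimate reflects what you would actually paste into the model.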

FAQ

Are my PDF files uploaded to a server?

No. The conversion runs entirely in your browser, using the pdf.js library to parse the PDF and Tesseract.js, which runs the Tesseract OCR engine as WebAssembly, to read scanned pages. Not a single byte of your document leaves your computer, which keeps your sensitive files private.

Does the tool handle scanned or image-based PDFs?

Yes. When a page contains little or no selectable text, you can enable automatic OCR: Tesseract.js recognises the text directly from the rendered image, in French, English, Spanish, German or Italian.
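The "little or no selectable text" check can be pictured with a short TypeScript sketch. The `PageText` shape, the function name and the 20-character threshold are all assumptions made for illustration; the tool's real criterion is not documented here.

```typescript
// Hypothetical shape: the selectable text already extracted for each page.
interface PageText {
  pageNumber: number;
  text: string;
}

// Return the pages whose selectable text has fewer than `minChars`
// non-whitespace characters; these are the candidates for OCR.
// The default threshold of 20 is an illustrative assumption.
function pagesNeedingOcr(pages: PageText[], minChars = 20): number[] {
  return pages
    .filter((p) => p.text.replace(/\s/g, "").length < minChars)
    .map((p) => p.pageNumber);
}

const pages: PageText[] = [
  { pageNumber: 1, text: "A long paragraph of selectable text here." },
  { pageNumber: 2, text: "  \n " },  // scanned page: whitespace only
  { pageNumber: 3, text: "p. 3" },   // scanned page with a stray page label
];
console.log(pagesNeedingOcr(pages)); // [2, 3]
```

Counting non-whitespace characters rather than raw length keeps pages that contain only spacing or a stray page number from being mistaken for native text.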

Are tables converted correctly?

Our engine analyses the horizontal distances between text blocks to rebuild columns and produce native Markdown tables, including merged-cell handling. Technical sheets and pricing grids therefore stay readable.
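As a rough illustration of rebuilding columns from horizontal positions, the TypeScript sketch below clusters the left edges of text blocks and emits a Markdown table. It is a simplification under stated assumptions, not the engine itself: the `TextBlock` shape, the function names and the 5-unit tolerance are hypothetical, rows are assumed to be grouped already, and merged cells are not handled.

```typescript
// Hypothetical input: text blocks already grouped into rows.
interface TextBlock {
  x: number;    // left edge, in page units
  text: string;
}

// Cluster left edges: a new column starts whenever the jump from the
// previous sorted edge exceeds `tolerance`.
function detectColumns(rows: TextBlock[][], tolerance = 5): number[] {
  const edges: number[] = [];
  for (const row of rows) for (const block of row) edges.push(block.x);
  edges.sort((a, b) => a - b);
  const columns: number[] = [];
  let previous = -Infinity;
  for (const x of edges) {
    if (x - previous > tolerance) columns.push(x);
    previous = x;
  }
  return columns;
}

// Index of the column whose left edge is closest to `x`.
function nearestColumn(columns: number[], x: number): number {
  let best = 0;
  for (let i = 1; i < columns.length; i++) {
    if (Math.abs(columns[i] - x) < Math.abs(columns[best] - x)) best = i;
  }
  return best;
}

// Build a Markdown table, treating the first row as the header.
function toMarkdownTable(rows: TextBlock[][], tolerance = 5): string {
  if (rows.length === 0) return "";
  const columns = detectColumns(rows, tolerance);
  const grid = rows.map((row) => {
    const cells: string[] = columns.map(() => "");
    for (const block of row) {
      const i = nearestColumn(columns, block.x);
      // Escape pipes so cell text cannot break the table syntax.
      const text = block.text.replace(/\|/g, "\\|");
      cells[i] = cells[i] ? `${cells[i]} ${text}` : text;
    }
    return cells;
  });
  const line = (cells: string[]) => `| ${cells.join(" | ")} |`;
  const [header, ...body] = grid;
  return [line(header), line(columns.map(() => "---")), ...body.map(line)].join("\n");
}

const priceGrid: TextBlock[][] = [
  [{ x: 50, text: "Item" }, { x: 200, text: "Qty" }, { x: 300, text: "Price" }],
  [{ x: 51, text: "Widget" }, { x: 202, text: "2" }, { x: 299, text: "10.00" }],
  [{ x: 50, text: "Gadget" }, { x: 300, text: "4.50" }],
];
console.log(toMarkdownTable(priceGrid));
```

In this example the third row has no quantity, so its middle cell stays empty rather than shifting the price into the wrong column, which is the main benefit of aligning on shared column positions.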

What does the downloaded ZIP archive contain?

The ZIP bundles your final Markdown document, an "images" folder with the extracted illustrations and a JSON report detailing the number of pages, the OCR-processed pages and the estimated AI tokens.
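For readers who want to consume the report programmatically, the TypeScript sketch below shows one plausible shape for it. The field names are assumptions chosen to mirror the description above (page count, OCR-processed pages, estimated tokens); the real report's keys may differ.

```typescript
// Hypothetical report shape; the actual JSON keys are not documented here.
interface ProcessingReport {
  pageCount: number;
  ocrPages: number[];       // page numbers that went through OCR
  estimatedTokens: number;
}

// Serialise a report as pretty-printed JSON, with OCR pages in ascending order.
function buildReport(pageCount: number, ocrPages: number[], estimatedTokens: number): string {
  const report: ProcessingReport = {
    pageCount,
    ocrPages: [...ocrPages].sort((a, b) => a - b),
    estimatedTokens,
  };
  return JSON.stringify(report, null, 2);
}

console.log(buildReport(3, [3, 2], 120));
```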
