Frequently asked questions
Why convert a PDF to Markdown before sending it to an LLM?
PDFs encode a lot of layout metadata that becomes noise to a language model. Converting to Markdown strips that overhead, preserves semantic structure (headings, lists, tables) and typically cuts token usage by a factor of 20–30 on text-heavy documents. The result is lower cost and better extraction.
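To check the saving on your own documents, you can compare a rough token estimate before and after conversion. The sketch below is illustrative only: it assumes about four characters per token, which is a common rule of thumb for English text and not a real tokenizer, and the names `estimateTokens` and `tokenReduction` are hypothetical helpers, not part of this tool.

```typescript
// Assumption: ~4 characters per token. This is a rough heuristic for
// English text; use your model's actual tokenizer for billing-grade numbers.
const CHARS_PER_TOKEN = 4;

// Estimate the token count of a string from its length.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Return how many times fewer estimated tokens the Markdown version uses
// compared with whatever representation you would otherwise send.
function tokenReduction(originalPayload: string, markdown: string): number {
  const after = estimateTokens(markdown);
  if (after === 0) {
    throw new RangeError("Markdown output is empty; nothing to compare");
  }
  return estimateTokens(originalPayload) / after;
}
```

For example, `tokenReduction(before, after)` returns 10 when `before` is estimated at ten times as many tokens as `after`.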
Is my PDF uploaded to a server?
No. Conversion runs entirely in your browser using client-side parsing. Your file never leaves your machine, which matters for NDAs, clinical documents, board packs and anything under GDPR.
Does it work on scanned PDFs?
Not reliably. OCR is not included, so scanned or image-only PDFs will not yield clean text. Run them through a dedicated OCR tool (Adobe Acrobat, Tesseract, or a cloud OCR service) first, then convert the resulting searchable PDF here.
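If you are unsure whether a PDF is scanned, one simple heuristic is to look at how much text each page yields. The sketch below assumes you already have the extracted text for each page as an array of strings. The function name `looksScanned`, the 50-character threshold and the "more than half the pages" rule are all illustrative assumptions, not how this tool decides.

```typescript
// Hypothetical threshold: a page with fewer extracted characters than this
// is treated as having no usable text layer.
const MIN_CHARS_PER_PAGE = 50;

// Return true when most pages produced almost no text, which suggests the
// PDF is scanned or image-only and needs OCR first.
function looksScanned(
  pageTexts: string[],
  minChars: number = MIN_CHARS_PER_PAGE
): boolean {
  if (pageTexts.length === 0) {
    return true; // nothing could be extracted at all
  }
  const nearlyEmpty = pageTexts.filter(
    (text) => text.trim().length < minChars
  ).length;
  // Flag the document when more than half of its pages are nearly empty.
  return nearlyEmpty / pageTexts.length > 0.5;
}
```

Documents with many intentionally sparse pages (title pages, full-page figures) can trigger a false positive, so treat the result as a hint to try OCR rather than a verdict.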
Is there a file size limit?
Only the one imposed by your browser's memory budget. In practice, text PDFs of around 100 MB work on most machines. For very large regulatory submissions, split the document first.
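A pre-flight size check can tell you when splitting is worth doing before you start. This is a minimal sketch: the name `adviseOnSize` and the return values are hypothetical, and the 100 MB soft limit is taken from the rough figure above, not from any hard limit in the tool.

```typescript
// Soft limit of roughly 100 MB, matching the rule of thumb quoted above.
// This is an assumption for illustration; real limits depend on the machine.
const SOFT_LIMIT_BYTES = 100 * 1024 * 1024;

type SizeAdvice = "ok" | "split-first";

// Suggest splitting when the file is larger than the soft limit.
function adviseOnSize(
  sizeBytes: number,
  softLimit: number = SOFT_LIMIT_BYTES
): SizeAdvice {
  return sizeBytes <= softLimit ? "ok" : "split-first";
}
```

In a browser you could pass the `size` property of a `File` chosen by the user, for example `adviseOnSize(file.size)`.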
Still have a question, or need a variant for your specific regulatory context? Ask us directly, or read our Insights for longer writeups.