Word to PDF converts .docx files into PDF entirely in your browser, with no server involvement. The tool ships two conversion pipelines internally and chooses between them based on the document's features. Both keep the file in tab memory the whole time; nothing is uploaded.
The direct OOXML pipeline parses the .docx ZIP archive (DOCX is XML-in-a-ZIP), reads the OOXML structure with a custom parser, runs deterministic layout, and writes PDF primitives via pdf-lib. This route preserves text as text (selectable, searchable, and copyable), keeps fonts as embedded subsets, and produces a small vector PDF, roughly 50–500 KB for a typical document. It is the preferred path when the document uses standard paragraphs, headings, lists, tables, hyperlinks, and simple images.
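Because a .docx is a ZIP archive, a converter can cheaply tell it apart from a legacy .doc before any parsing. The sketch below is an illustration only: the helper names (`sniffContainer`, `startsWith`) are hypothetical and nothing here is confirmed to be how the tool checks its input. It relies on two well-known signatures: ZIP local file headers start with the bytes `PK\x03\x04`, and the compound-file container used by Word 97–2003 starts with `D0 CF 11 E0 A1 B1 1A E1`.

```typescript
// Hypothetical helper: classify a file by its leading bytes.
type ContainerKind = "zip" | "ole-compound" | "unknown";

// "PK\x03\x04": signature of a ZIP local file header (.docx, .xlsx, plain .zip, ...).
const ZIP_SIGNATURE: number[] = [0x50, 0x4b, 0x03, 0x04];
// Compound File Binary header, used by legacy .doc/.xls/.ppt.
const OLE_SIGNATURE: number[] = [0xd0, 0xcf, 0x11, 0xe0, 0xa1, 0xb1, 0x1a, 0xe1];

function startsWith(bytes: Uint8Array, signature: number[]): boolean {
  if (bytes.length < signature.length) return false;
  return signature.every((value, index) => bytes[index] === value);
}

function sniffContainer(bytes: Uint8Array): ContainerKind {
  if (startsWith(bytes, ZIP_SIGNATURE)) return "zip";
  if (startsWith(bytes, OLE_SIGNATURE)) return "ole-compound";
  return "unknown";
}
```

In a browser this could be fed with `new Uint8Array(await file.arrayBuffer())`. A `"zip"` result only means the file is a candidate: any ZIP archive passes, so a real check would still need to confirm that the archive contains the expected OOXML parts.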
The fallback HTML render pipeline uses mammoth (DOCX → semantic HTML), renders the HTML inside an isolated iframe at 816×1056 pixels (US Letter at 96 DPI), captures each page with html2canvas, and embeds the resulting JPEGs into a PDF using pdf-lib. This route activates when the direct pipeline detects unsupported features. The output is a raster PDF — every page is an image, so text is no longer selectable and file size is significantly larger.
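The 816×1056 figure follows directly from the page size: US Letter is 8.5 × 11 inches, and CSS pixels are defined at 96 per inch. The same arithmetic at 72 units per inch gives the PDF page size in points. The sketch below works this out; `pageCount` additionally assumes that the rendered HTML is sliced into fixed-height pages, which is a guess at the pagination strategy rather than something stated above.

```typescript
const CSS_DPI = 96;             // CSS reference pixels per inch
const PDF_POINTS_PER_INCH = 72; // PDF user-space units per inch
const LETTER_INCHES = { width: 8.5, height: 11 };

// Size of a US Letter page at a given resolution, in whole units.
function letterPageAt(unitsPerInch: number): { width: number; height: number } {
  return {
    width: Math.round(LETTER_INCHES.width * unitsPerInch),
    height: Math.round(LETTER_INCHES.height * unitsPerInch),
  };
}

// Assumed pagination: cut the rendered content into fixed-height slices.
function pageCount(contentHeightPx: number, pageHeightPx: number): number {
  return Math.max(1, Math.ceil(contentHeightPx / pageHeightPx));
}

const iframePage = letterPageAt(CSS_DPI);          // { width: 816, height: 1056 }
const pdfPage = letterPageAt(PDF_POINTS_PER_INCH); // { width: 612, height: 792 }
const pages = pageCount(2500, iframePage.height);  // 3
```

Each 816×1056 capture is therefore placed on a 612×792 point PDF page, a scale factor of 0.75.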
The conversion tries to preserve heading levels, paragraph styles, bold/italic/underline runs, ordered and unordered lists, simple tables, inline images (PNG/JPEG embedded as PDF XObjects), and hyperlinks. It also detects title-like centered content via heuristics and applies the appropriate alignment.
Realistic limits: the legacy .doc binary format (Word 97–2003) is not supported; only .docx is. Documents with complex floating shapes, equations (OMath), embedded charts, footnotes that span pages, multi-column layouts, headers or footers containing field codes (such as PAGE and NUMPAGES, as in "Page X of Y"), or SmartArt may render incorrectly or fall back to the raster pipeline. For those documents, the most reliable conversion is still Microsoft Word's built-in Save As PDF or LibreOffice in headless mode.
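How the direct pipeline decides that a document is out of scope is not described here, so the following is only a sketch of one plausible approach: scan the main document XML for element names that typically accompany the features listed above and route to the raster pipeline if any are found. The marker table, `detectUnsupported`, and `choosePipeline` are all hypothetical, and the list is deliberately partial. It is also conservative: it flags any footnote reference, since whether a footnote spans pages is only known after layout, and it ignores headers and footers, which live in separate parts of the archive.

```typescript
type Pipeline = "direct" | "raster";

// Assumed markers: OOXML element names that usually signal each feature.
const UNSUPPORTED_MARKERS: Record<string, RegExp> = {
  math: /<m:oMath\b/,                 // Office Math
  floatingShape: /<wp:anchor\b/,      // anchored (floating) drawing
  chart: /<c:chart\b/,                // embedded chart reference
  smartArt: /<dgm:relIds\b/,          // SmartArt diagram reference
  footnote: /<w:footnoteReference\b/, // any footnote (conservative)
  multiColumn: /<w:cols\b[^>]*w:num="(?:[2-9]|\d{2,})"/, // two or more columns
};

// Names of every marker found in the XML, in table order.
function detectUnsupported(documentXml: string): string[] {
  return Object.entries(UNSUPPORTED_MARKERS)
    .filter(([, pattern]) => pattern.test(documentXml))
    .map(([name]) => name);
}

function choosePipeline(documentXml: string): Pipeline {
  return detectUnsupported(documentXml).length === 0 ? "direct" : "raster";
}
```

String matching keeps the sketch short, but a real parser would walk the XML tree and resolve namespace prefixes, since a document is free to bind the same namespaces to different prefixes.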
Image fidelity in the direct pipeline matches the source — embedded images are pulled from the .docx ZIP and re-embedded into the PDF without re-encoding, so quality is preserved. The HTML pipeline rasterizes everything at 96 DPI, which is fine for screen reading but soft when zoomed or printed at full Letter size.
Browser memory is the practical limit. Files up to about 30 MB and 200 pages convert comfortably. Documents with hundreds of high-resolution embedded images can strain the tab's heap during the html2canvas pass; in those cases, the direct pipeline is far lighter and usually succeeds where the raster pipeline fails.
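A rough back-of-envelope estimate shows why the raster pass is the heavy one. An uncompressed RGBA canvas costs 4 bytes per pixel, so the per-page cost follows from the 816×1056 capture size. The function names below are hypothetical, and the total assumes the worst case in which every page canvas is alive at once; a pipeline that encodes each page to JPEG and releases its canvas before moving on would peak far lower. Real browser memory use also includes the DOM, decoded source images, and the growing PDF, none of which is modelled here.

```typescript
const BYTES_PER_PIXEL = 4; // RGBA, 8 bits per channel

// Bytes of raw pixel data for one captured page at a given render scale.
function canvasBytes(widthPx: number, heightPx: number, scale: number = 1): number {
  return Math.round(widthPx * scale) * Math.round(heightPx * scale) * BYTES_PER_PIXEL;
}

function toMiB(bytes: number): number {
  return bytes / (1024 * 1024);
}

const perPage = canvasBytes(816, 1056);       // 3,446,784 bytes, about 3.3 MiB
const perPage2x = canvasBytes(816, 1056, 2);  // 13,787,136 bytes, about 13.1 MiB
const worstCase200 = toMiB(perPage * 200);    // about 657 MiB if nothing is released
```

Doubling the render scale to sharpen the 96 DPI output quadruples the per-page cost, which is the trade-off behind the softness noted for the HTML pipeline.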