You Should Use Pandoc

Jul 11, 2024

Pandoc bills itself as “a universal document converter.” That’s what it is. You can find installation instructions here.

To be more specific, Pandoc is a command-line program (i.e. you run it in the terminal) for converting documents between various file formats. Here are some things I use Pandoc for:

As you can see, I prefer to use Markdown as a source format and convert to other formats, but Markdown can be either the input or the output for most of the supported formats.

For people who write documents

(i.e., everyone.)

Without Pandoc, you have basically two options for writing serious documents that become PDFs or another typeset format:

  1. Write it in LaTeX and use a LaTeX engine to generate a PDF
  2. Write it in a word processor like Microsoft Word, Apple Pages, or Google Docs, and print a PDF

These are both fine most of the time. But if you happen to be a STEM student who needs to typeset mathematical or scientific writing, you pretty much have to use LaTeX. LaTeX is still a fine option, but it’s pretty ugly, archaic, and limited. What if you want to generate an HTML file for a blog post as well as a PDF? It’s pretty awkward to do so in LaTeX alone. Pandoc enables this: use LaTeX as a source format and HTML as an output. It also allows you to write in Markdown and generate LaTeX… or even skip the middleman and have Pandoc call your LaTeX engine to generate a PDF in one step. I find that Markdown is a better source format most of the time when you are dealing with technical content, but even when it isn’t, Pandoc is still a great tool. Pandoc lets you write in a number of nicer formats and still generate a beautifully typeset PDF.
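To make the LaTeX-to-HTML direction concrete, here is a minimal sketch. The file name paper.tex and its contents are made up for illustration, and the guard only skips the conversion when Pandoc is not installed.

```shell
# paper.tex is a hypothetical stand-in for your own LaTeX source.
cat > paper.tex <<'EOF'
\documentclass{article}
\begin{document}
Euler's identity: $e^{i\pi} + 1 = 0$.
\end{document}
EOF

# Read LaTeX, write HTML. --mathml asks Pandoc to render the math as MathML
# so the page does not need a JavaScript math library.
if command -v pandoc >/dev/null 2>&1; then
  pandoc --from latex --to html5 --mathml paper.tex -o paper.html
fi
```

Complex LaTeX (custom packages, heavy macros) may not survive the trip intact, so treat this as a starting point.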

How to use

You can try it online here or you can download it for your system and then run:

pandoc --from markdown --to html5 article.md -o article.html

where article.md is the filename of your Markdown source. This will generate an article.html file. (Without the -o option, Pandoc prints the HTML to the terminal instead.)
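One thing worth knowing: with --to html5 alone, Pandoc writes an HTML fragment rather than a complete page. The sketch below uses --standalone to get a full page and --toc to add a table of contents. The file sample.md and its contents are hypothetical, and the guard only skips the conversion when Pandoc is not installed.

```shell
# sample.md is a hypothetical stand-in for your own Markdown source.
cat > sample.md <<'EOF'
---
title: Sample Article
---

# First section

Some text.

# Second section

More text.
EOF

# --standalone wraps the output in a full HTML page (head, title, body);
# --toc adds a table of contents built from the headings.
if command -v pandoc >/dev/null 2>&1; then
  pandoc --from markdown --to html5 --standalone --toc sample.md -o sample.html
fi
```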

Or, to generate a PDF, assuming you have LaTeX already installed, try this:

pandoc --from markdown article.md -o article.pdf

Maybe you don’t like the look of the output. Indeed, I think the defaults are not great. Peruse the many options for more precise control over the generated document. Yes, I’m telling you to RTFM.

…or you can try my command and see if you like it.

Some defaults

Here’s what I used to generate the PDF of my short stories for my writing workshop, using a Markdown source.

Put this YAML header at the top of your Markdown file (or figure out how to put these directives in the command):

---
title: My Title
author: My Name
date: 11 July 2024
documentclass: article
fontfamily: mathptmx
linestretch: 2
indent: 4m
pointsize: 14p
geometry:
- margin=1in
header-includes:
- \renewcommand{\rule}[2]{\begin{center} * * * \end{center}}
---

Then generate the PDF using this command (assumes you have pdflatex; you could also use xelatex or lualatex but I cannot vouch that my YAML will work with those):

pandoc article.md -f markdown-latex_macros -t pdf --pdf-engine=pdflatex -o article.pdf
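If you would rather keep the Markdown file free of a YAML header, roughly the same settings can be passed on the command line. This is a sketch under assumptions, not a verified equivalent: --metadata carries the title fields, --variable carries the template settings, and the header line goes in a separate file (rule.tex is a made-up name) passed with --include-in-header. The sample article.md is only created if you don't already have one.

```shell
# Create a tiny sample source only if article.md does not exist yet.
[ -f article.md ] || printf '# Sample\n\nFirst scene.\n\n* * *\n\nSecond scene.\n' > article.md

# rule.tex (hypothetical name) holds the LaTeX line from header-includes:
# it redefines \rule so horizontal rules print as a centered "* * *".
cat > rule.tex <<'EOF'
\renewcommand{\rule}[2]{\begin{center} * * * \end{center}}
EOF

# Run only when both Pandoc and pdflatex are available.
if command -v pandoc >/dev/null 2>&1 && command -v pdflatex >/dev/null 2>&1; then
  pandoc article.md \
    --from markdown-latex_macros \
    --pdf-engine=pdflatex \
    --metadata title="My Title" \
    --metadata author="My Name" \
    --metadata date="11 July 2024" \
    --variable documentclass=article \
    --variable fontfamily=mathptmx \
    --variable linestretch=2 \
    --variable indent=4m \
    --variable pointsize=14p \
    --variable geometry:margin=1in \
    --include-in-header=rule.tex \
    --output article.pdf
fi
```

The YAML header is easier to keep next to the text it styles. The command-line form is handier when one set of options is shared across many files.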

That should get you started. Now play with it!

Markdown caveats

Markdown has many different “flavors,” which is a euphemism for saying nobody can agree on what syntactic forms are permitted in Markdown. Every Markdown-to-HTML converter obeys different rules. Fortunately, Pandoc supports pretty much everything through its extensions. Most of the time Pandoc will be tolerant, but you might have to manually enable features like special syntax for tables. Again, RTFM.
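As a small illustration of enabling an extension by hand, the sketch below starts from the strict flavor (markdown_strict, which has no table syntax) and switches on pipe_tables. The file table.md is a hypothetical sample, and the guard only skips the Pandoc calls when Pandoc is not installed.

```shell
# table.md is a hypothetical sample containing a pipe table.
cat > table.md <<'EOF'
| Format   | Extension |
|----------|-----------|
| Markdown | .md       |
| LaTeX    | .tex      |
EOF

if command -v pandoc >/dev/null 2>&1; then
  # List every extension the Markdown reader knows
  # (a leading + means on by default, - means off).
  pandoc --list-extensions=markdown

  # Extensions are added to a format name with + and removed with -.
  pandoc --from markdown_strict+pipe_tables --to html5 table.md -o table.html
fi
```

Pandoc's default markdown format already has pipe tables enabled, so the + only matters for stricter flavors like this one.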

For developers

Pandoc is also a Haskell library. You can do some fabulous things with it if you know Haskell.


  1. This page was written in Markdown and converted to HTML using Pandoc.