# pypandoc

[![Build Status](https://github.com/JessicaTegner/pypandoc/actions/workflows/ci.yaml/badge.svg)](https://github.com/JessicaTegner/pypandoc/actions/workflows/ci.yaml)
[![GitHub Releases](https://img.shields.io/github/tag/JessicaTegner/pypandoc.svg)](https://github.com/JessicaTegner/pypandoc/releases)
[![Pypandoc PyPI Version](https://img.shields.io/pypi/v/pypandoc?label=pypandoc+pypi+version)](https://pypi.org/project/pypandoc/)
[![Pypandoc Binary PyPI Version](https://img.shields.io/pypi/v/pypandoc_binary?label=pypandoc+binary+pypi+version)](https://pypi.org/project/pypandoc_binary/)
![PyPandoc PyPi Downloads](https://img.shields.io/pypi/dm/pypandoc)
![PyPandoc Binary PyPI Downloads](https://img.shields.io/pypi/dm/pypandoc_binary)
[![conda version](https://anaconda.org/conda-forge/pypandoc/badges/version.svg)](https://anaconda.org/conda-forge/pypandoc/)
[![Development Status](https://img.shields.io/pypi/status/pypandoc.svg)](https://pypi.python.org/pypi/pypandoc/)
[![PyPandoc Python version](https://img.shields.io/pypi/pyversions/pypandoc.svg)](https://pypi.python.org/pypi/pypandoc/)
[![PyPandoc Binary Python version](https://img.shields.io/pypi/pyversions/pypandoc_binary.svg)](https://pypi.python.org/pypi/pypandoc_binary/)
![License](https://img.shields.io/pypi/l/pypandoc.svg)

Pypandoc provides a thin wrapper for [pandoc](https://pandoc.org), a universal document converter.

## Installation

Pypandoc uses pandoc, so it needs an available installation of pandoc. Pypandoc provides two packages, "pypandoc" and "pypandoc_binary"; the second one includes pandoc out of the box. The two packages are otherwise identical: the only difference is that one includes pandoc and the other doesn't.

If pandoc is already installed (i.e. pandoc is in the `PATH`), pypandoc uses the version with the higher version number, and if both are the same, the already installed version.
See [Specifying the location of pandoc binaries](#specifying-the-location-of-pandoc-binaries) for more.

To use pandoc filters, you must have the relevant filters installed on your machine.

### Installing via pip

If you [want to install pandoc yourself](#installing-pandoc) or are on an unsupported platform, you'll need to install "pypandoc" and [install pandoc manually](#installing-pandoc):

```
pip install pypandoc
```

If you want pandoc included out of the box, you can use our pypandoc_binary package, which is identical to the "pypandoc" package, but with pandoc included.

```
pip install pypandoc_binary
```

Prebuilt [wheels for Windows and Mac OS X](https://pypi.python.org/pypi/pypandoc_binary/) are available.

If you use Linux and have [your own wheelhouse](https://wheel.readthedocs.org/en/latest/#usage), you can build a wheel which includes pandoc with `python setup_binary.py download_pandoc; python setup.py bdist_wheel`. Be aware that this works only on 64-bit Intel systems, as we only download it from the [official releases](https://github.com/jgm/pandoc/releases).

### Installing via conda

Pypandoc is included in [conda-forge](https://conda-forge.github.io/). The conda packages will also install the pandoc package, so pandoc is available in the installation.

Install via `conda install -c conda-forge pypandoc`.

You can also add the channel to your conda config via `conda config --add channels conda-forge`. This makes it possible to use `conda install pypandoc` directly and also lets you update via `conda update pypandoc`.

### Installing pandoc

If you don't already have pandoc on your system and haven't installed the pypandoc_binary package (which includes pandoc), you need to install pandoc yourself.

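
Before installing anything, it can help to check whether a pandoc executable is already reachable. The following is a minimal sketch using only the standard library. `locate_pandoc` is a hypothetical helper, not part of pypandoc, and it does not reproduce pypandoc's own search logic: it only looks at the `PYPANDOC_PANDOC` environment variable and the `PATH`.

```python
import os
import shutil


def locate_pandoc():
    """Return a candidate path to a pandoc executable, or None.

    Illustrative helper only; this is NOT pypandoc's own lookup.
    """
    # PYPANDOC_PANDOC, when set, is expected to hold the full path to the binary.
    override = os.environ.get("PYPANDOC_PANDOC")
    if override:
        return override
    # Otherwise fall back to a plain lookup on the PATH.
    return shutil.which("pandoc")


path = locate_pandoc()
if path is None:
    print("No pandoc found: install it, or use the pypandoc_binary package.")
else:
    print(f"pandoc candidate: {path}")
```

If this prints a candidate path, a separate pandoc installation may not be needed.
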
#### Installing pandoc via pypandoc

Installing via pypandoc is possible on Windows, Mac OS X or Linux (Intel-based, 64-bit):

```python
# expects an installed pypandoc: pip install pypandoc
from pypandoc.pandoc_download import download_pandoc
# see the documentation how to customize the installation path
# but be aware that you then need to include it in the `PATH`
download_pandoc()
```

The default install location is included in the search path for pandoc, so you don't need to add it to the `PATH`.

By default, the latest pandoc version is installed. If you want to specify your own version, say 1.19.1, use `download_pandoc(version='1.19.1')` instead.

#### Installing pandoc manually

Installing manually via the system mechanism is also possible. Such installation mechanisms make pandoc available on many more platforms:

- Ubuntu/Debian: `sudo apt-get install pandoc`
- Fedora/Red Hat: `sudo yum install pandoc`
- Arch: `sudo pacman -S pandoc`
- Mac OS X with Homebrew: `brew install pandoc pandoc-citeproc Caskroom/cask/mactex`
- Machine with Haskell: `cabal install pandoc`
- Windows: There is an installer available [here](https://pandoc.org/installing.html)
- [FreeBSD with pkg:](https://www.freshports.org/textproc/hs-pandoc/) `pkg install hs-pandoc`
- Or see [Pandoc - Installing pandoc](https://pandoc.org/installing.html)

Be aware that not all install mechanisms put pandoc in the `PATH`, so you either have to change the `PATH` yourself or set the full path to pandoc in `PYPANDOC_PANDOC`. See the next section for more information.

### Specifying the location of pandoc binaries

You can point to a specific pandoc version by setting the environment variable `PYPANDOC_PANDOC` to the full path to the pandoc binary (`PYPANDOC_PANDOC=/home/x/whatever/pandoc` or `PYPANDOC_PANDOC=c:\pandoc\pandoc.exe`). If this environment variable is set, this is the only place where pandoc is searched for.

In certain cases, e.g.
when pandoc is installed but a web server with its own user cannot find the binaries, it is useful to specify the location at runtime:

```python
import os
os.environ.setdefault('PYPANDOC_PANDOC', '/home/x/whatever/pandoc')
```

## Usage

There are two basic ways to use pypandoc: with input files or with input strings.

```python
import pypandoc

# With an input file: it will infer the input format from the filename
output = pypandoc.convert_file('somefile.md', 'rst')

# ...but you can override the format via the `format` argument:
output = pypandoc.convert_file('somefile.txt', 'rst', format='md')

# alternatively you could just pass some string. In this case you need to
# define the input format:
output = pypandoc.convert_text('# some title', 'rst', format='md')
# output == 'some title\r\n==========\r\n\r\n'
```

`convert_text` expects this string to be unicode or utf-8 encoded bytes. `convert_*` will always return a unicode string.

It's also possible to let pandoc write the output directly to a file. This is the only way to convert to some output formats (e.g. odt, docx, epub, epub3, pdf). In that case `convert_*()` will return an empty string.

```python
import pypandoc

output = pypandoc.convert_file('somefile.md', 'docx', outputfile="somefile.docx")
assert output == ""
```

It's also possible to specify multiple input files to pandoc, either as absolute paths, relative paths or file patterns.

```python
import pypandoc

# convert all markdown files in a chapters/ subdirectory.
pypandoc.convert_file('chapters/*.md', 'docx', outputfile="somefile.docx")

# convert all markdown files in the book1 and book2 directories.
pypandoc.convert_file(['book1/*.md', 'book2/*.md'], 'docx', outputfile="somefile.docx")

# convert the front from another drive, and all markdown files in the book2 directory.
pypandoc.convert_file(['D:/book_front.md', 'book2/*.md'], 'docx', outputfile="somefile.docx")
```

pathlib is also supported.

```python
import pypandoc
from pathlib import Path

# single file
input = Path('somefile.md')
output = input.with_suffix('.docx')
pypandoc.convert_file(input, 'docx', outputfile=output)

# convert all markdown files in a chapters/ subdirectory.
pypandoc.convert_file(Path('chapters').glob('*.md'), 'docx', outputfile="somefile.docx")

# convert all markdown files in the book1 and book2 directories.
pypandoc.convert_file([*Path('book1').glob('*.md'), *Path('book2').glob('*.md')], 'docx', outputfile="somefile.docx")
# pathlib globs must be unpacked if they are inside lists.
```

In addition to `format`, it is possible to pass `extra_args`. That makes it possible to access various pandoc options easily.

```python
output = pypandoc.convert_text(
    '