jbig2enc: Documentation

Adam Langley <agl@imperialviolet.org>

What is JBIG2

JBIG2 is an image compression standard from the same people who brought you the JPEG format. It compresses 1bpp (black and white) images only. These images consist of only black and white; there are no shades of gray - that would be a grayscale image. Any "gray" areas must, therefore, be simulated using black dots in a pattern called halftoning.

The JBIG2 standard has several major areas, which are covered in the sections below.

There are two major compression technologies which JBIG2 builds on: arithmetic encoding and Huffman encoding. You can choose between them, and even use both in the same JBIG2 file, but this is rare. Arithmetic encoding is slower, but compresses better. Huffman encoding was included in the standard because one of the (intended) users of JBIG2 was fax machines, which might not have the processing power for arithmetic coding.

jbig2enc only supports arithmetic encoding.

Generic region coding

Generic region coding is used to compress bitmaps. It is progressive and uses a context of nearby pixels to estimate the probability that the current pixel will be black. If the probability is 50%, it uses a single bit to encode that pixel. If the probability is 99%, then it takes less than a bit to encode a black pixel, but more than a bit to encode a white one.
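
To make that concrete: an ideal entropy coder spends about -log2(p) bits on an outcome of probability p. This tiny standalone check (not part of jbig2enc) prints the cost of each case:

  // Ideal code length, in bits, for an outcome with probability p is
  // -log2(p). At p = 0.99 the likely pixel is nearly free, while the
  // unlikely one costs several bits.
  #include <cmath>
  #include <cstdio>

  int main() {
    std::printf("likely pixel:   %.4f bits\n", -std::log2(0.99));  // ~0.0145
    std::printf("unlikely pixel: %.4f bits\n", -std::log2(0.01));  // ~6.64
    return 0;
  }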

The context can only refer to pixels above and to the left of the current pixel, because the decoder doesn't know the values of any of the other pixels yet (pixels are decoded left-to-right, top-to-bottom). Based on the values of these pixels it estimates a probability, and updates its estimate for that context based on the actual pixel found. All contexts start off with a 50% chance of being black.
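
Here is a rough sketch of this idea. The real JBIG2 templates use ten to sixteen context pixels, and the probability adaptation is done by the MQ arithmetic coder's state machine, so the neighbourhood and update rule below are purely illustrative:

  #include <cstddef>
  #include <cstdint>
  #include <vector>

  struct ContextModel {
    // One probability estimate per context, all starting at 50% black.
    std::vector<double> p_black;
    explicit ContextModel(std::size_t n_contexts) : p_black(n_contexts, 0.5) {}

    // Build a context from pixels above and to the left of (x, y); the
    // decoder has already decoded these, so it can form the same value.
    static uint32_t context(const std::vector<std::vector<int>> &img,
                            int x, int y) {
      auto px = [&](int cx, int cy) -> uint32_t {
        if (cy < 0 || cx < 0 || cy >= (int)img.size() ||
            cx >= (int)img[cy].size())
          return 0;  // pixels outside the image count as white
        return (uint32_t)img[cy][cx];
      };
      // A five-pixel causal neighbourhood, for illustration only.
      return px(x - 1, y) | (px(x - 2, y) << 1) | (px(x - 1, y - 1) << 2) |
             (px(x, y - 1) << 3) | (px(x + 1, y - 1) << 4);
    }

    // After coding a pixel, nudge this context's estimate towards the
    // value actually seen (a stand-in for the MQ coder's adaptation).
    void update(uint32_t ctx, int pixel) {
      const double rate = 0.05;  // illustrative adaptation rate
      p_black[ctx] += rate * ((double)pixel - p_black[ctx]);
    }
  };

With five context pixels there are 2^5 = 32 possible contexts, each adapting independently as the image is coded.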

You can encode whole pages with this and you will end up with a perfect reconstruction of the page. However, we can do better...

Symbol encoding

Most input images to JBIG2 encoders are scanned text. These have many repeating symbols (letters). The idea of symbol encoding is to encode what a letter “a” looks like and, for all the “a”s on the page, just give their locations. (This is lossy encoding.)

Unfortunately, all scanned images have noise in them: no two “a”s will look quite the same, so we have to cluster the symbols on a page into groups. Hopefully each member of a given group will be the same letter; otherwise we might place the wrong letter on the page! These very surprising errors are called cootoots.

However, assuming that we group the symbols correctly, we can get great compression this way. Remember that the stricter the classifier, the more symbol groups (classes) will be generated, leading to bigger files; but there is also a lower risk of cootoots (misclassification).
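
To illustrate the tradeoff, here is a deliberately naive greedy classifier. jbig2enc actually uses Leptonica's classifier; the fixed bitmap size, Hamming-distance test and threshold below are stand-ins:

  #include <bitset>
  #include <cstddef>
  #include <vector>

  constexpr int W = 32, H = 32;  // pretend every symbol fits this box
  using Bitmap = std::bitset<W * H>;

  // Assign `sym` to the first class whose exemplar is close enough,
  // otherwise start a new class. Returns the class id.
  int classify(const Bitmap &sym, std::vector<Bitmap> &exemplars,
               double max_diff_fraction /* lower = stricter */) {
    for (std::size_t i = 0; i < exemplars.size(); ++i) {
      // Count the pixels in which the two bitmaps differ.
      std::size_t diff = (sym ^ exemplars[i]).count();
      if ((double)diff <= max_diff_fraction * (W * H))
        return (int)i;             // close enough: reuse this class
    }
    exemplars.push_back(sym);      // no match: new class, bigger file
    return (int)exemplars.size() - 1;
  }

Lowering max_diff_fraction makes the classifier stricter: more classes and a bigger file, but less chance that two different letters end up in the same class.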

This is great, but we can do better...

Symbol retention

Symbol retention is the process of compressing multi-page documents by extracting the symbols from all the pages at once and classifying them all together. Thus we only have to encode a single letter “a” for the whole document (in an ideal world).

This is obviously slower, but generates smaller files (about half the size, on average, given a decent number of similarly typeset pages).
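
In terms of the classifier sketched above, symbol retention just means sharing the exemplar list across every page, so each distinct shape is encoded once. Page and extract_symbols() below are hypothetical stand-ins for jbig2enc's Leptonica-based segmentation, not real functions:

  // Hypothetical stand-ins; Bitmap and classify() are from the
  // previous sketch.
  struct Page { /* page image handle */ };
  std::vector<Bitmap> extract_symbols(const Page &) {
    return {};  // stub: a real implementation returns the page's glyphs
  }

  // Classify the symbols of all pages against one shared exemplar
  // list; `exemplars` would become the document-wide dictionary.
  std::vector<std::vector<int>>
  classify_document(const std::vector<Page> &pages) {
    std::vector<Bitmap> exemplars;               // shared across pages
    std::vector<std::vector<int>> page_classes;  // per-page class ids
    for (const Page &page : pages) {
      std::vector<int> ids;
      for (const Bitmap &sym : extract_symbols(page))
        ids.push_back(classify(sym, exemplars, 0.05));
      page_classes.push_back(std::move(ids));
    }
    return page_classes;
  }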

One downside you should be aware of: If you are generating JBIG2 streams for inclusion in a linearised PDF file, the PDF reader has to download all the symbols before it can display the first page. There is a solution to this involving multiple dictionaries and symbol importing, but that's not currently supported by jbig2enc.

Refinement

Symbol encoding is lossy, both because noise is classified away and because the symbol classifier is imperfect. Refinement allows us, when placing a symbol on the page, to encode the difference between the actual symbol at that location and what the classifier told us was “close enough”. We can choose to do this for each symbol on the page, so we don't have to refine when we are only a couple of pixels off. If we refine whenever any pixel is wrong, we have lossless encoding using symbols.
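
The per-symbol choice can be as simple as thresholding the pixel difference between the glyph and its exemplar. This sketch reuses Bitmap from the classifier example; the tolerance is an assumption, not a jbig2enc parameter:

  // Refine a placement only when the exemplar differs from the actual
  // glyph by more than `tolerance` pixels. A tolerance of zero refines
  // every imperfect match, giving lossless symbol encoding.
  bool should_refine(const Bitmap &actual, const Bitmap &exemplar,
                     std::size_t tolerance) {
    return (actual ^ exemplar).count() > tolerance;
  }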

Halftoning

jbig2enc doesn't support this at all, so I will only mention it quickly. The JBIG2 standard supports the efficient encoding of halftoning by building a dictionary of halftone blocks (like the dictionaries of symbols which we build for text pages). The lack of support for halftones in G4 (the old fax standard) was a major weakness.

Some numbers

My sample set is 90 scanned pages from the middle of a recent book. The scanned images are 300dpi grayscale and they are upsampled to 600dpi 1bpp for encoding.

Command line options

jbig2enc comes with a handy command line tool for encoding images.
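
For example, assuming the tool is installed as jbig2 and supports the symbol-coding (-s) and PDF-embedding (-p) options of the version this documentation describes:

  $ jbig2 -s -p *.jpg

(Option names may differ between versions; check your build's usage output.)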