proxygen
proxygen/folly/folly/docs/GroupVarint.md
Go to the documentation of this file.
1 `folly/GroupVarint.h`
2 ---------------------
3 
4 `folly/GroupVarint.h` is an implementation of variable-length encoding for 32-
5 and 64-bit integers using the Group Varint encoding scheme as described in
6 Jeff Dean's [WSDM 2009 talk][wsdm] and in [Information Retrieval: Implementing
7 and Evaluating Search Engines][irbook].
8 
9 [wsdm]: http://research.google.com/people/jeff/WSDM09-keynote.pdf
10 [irbook]: http://www.ir.uwaterloo.ca/book/addenda-06-index-compression.html
11 
12 Briefly, a group of four 32-bit integers is encoded as a sequence of variable
13 length, between 5 and 17 bytes; the first byte encodes the length (in bytes)
14 of each integer in the group. A group of five 64-bit integers is encoded as a
15 sequence of variable length, between 7 and 42 bytes; the first two bytes
16 encode the length (in bytes) of each integer in the group.
17 
18 `GroupVarint.h` defines a few classes:
19 
20 * `GroupVarint<T>`, where `T` is `uint32_t` or `uint64_t`:
21 
22  Basic encoding / decoding interface, mainly aimed at encoding / decoding
23  one group at a time.
24 
25 * `GroupVarintEncoder<T, Output>`, where `T` is `uint32_t` or `uint64_t`,
26  and `Output` is a functor that accepts `StringPiece` objects as arguments:
27 
28  Streaming encoder: add values one at a time, and they will be
29  flushed to the output one group at a time. Handles the case where
30  the last group is incomplete (the number of integers to encode isn't
31  a multiple of the group size)
32 
33 * `GroupVarintDecoder<T>`, where `T` is `uint32_t` or `uint64_t`:
34 
35  Streaming decoder: extract values one at a time. Handles the case where
36  the last group is incomplete.
37 
38 The 32-bit implementation is significantly faster than the 64-bit
39 implementation; on platforms supporting the SSSE3 instruction set, we
40 use the PSHUFB instruction to speed up lookup, as described in [SIMD-Based
41 Decoding of Posting Lists][cikmpaper] (CIKM 2011).
42 
43 [cikmpaper]: http://www.stepanovpapers.com/CIKM_2011.pdf
44 
45 For more details, see the header file `folly/GroupVarint.h` and the
46 associated test file `folly/test/GroupVarintTest.cpp`.