proxygen
|
#include <Huffman.h>
Public Member Functions | |
HuffTree (const uint32_t *codes, const uint8_t *bits) | |
HuffTree (HuffTree &&tree)=default | |
~HuffTree () | |
bool | decode (const uint8_t *buf, uint32_t size, folly::fbstring &literal) const |
uint32_t | encode (folly::StringPiece literal, folly::io::QueueAppender &buf) const |
uint32_t | getEncodeSize (folly::StringPiece literal) const |
std::pair< uint32_t, uint8_t > | getCode (uint8_t ch) const |
const uint32_t * | codesTable () const |
const uint8_t * | bitsTable () const |
Protected Member Functions | |
HuffTree (const HuffTree &tree) | |
Protected Attributes | |
SuperHuffNode | table_ [46] |
Private Member Functions | |
void | fillIndex (SuperHuffNode &snode, uint32_t code, uint8_t bits, uint8_t ch, uint8_t level) |
void | buildTree () |
void | insert (uint32_t code, uint8_t bits, uint8_t ch) |
Private Attributes | |
uint32_t | nodes_ {0} |
const uint32_t * | codes_ |
const uint8_t * | bits_ |
Immutable Huffman tree used in the process of decoding. Traditionally the huffman tree is binary, but using that approach leads to major inefficiencies since it's using per-bit level processing and needs to perform several memory accesses and bit operations for every single bit. This implementation is using 8-bit level indexing and uses aggregated nodes that link up to 256 other nodes. The complexity for lookup is reduced from O(bits) to O(Bytes) which is 1 or 2 for most of the printable characters. The tradeoff of using this approach is using more memory and generating the tree is more laborious since we need to fill all the subtrees denoted by a character code, which is an unique prefix.
Example
bit stream: 00001111 1111010
the constructor assumes the codes and bits tables will not be freed, ideally they are static
Definition at line 21 of file Huffman.cpp.
References buildTree().
|
explicitdefault |
|
inline |
Definition at line 92 of file Huffman.h.
References proxygen::huffman::HuffNode::ch, fizz::decode(), encode(), folly::size(), uint32_t, and uint8_t.
|
explicitprotected |
const uint8_t * proxygen::huffman::HuffTree::bitsTable | ( | ) | const |
|
private |
initializes and builds the huffman tree
Definition at line 126 of file Huffman.cpp.
References bits_, codes_, i, insert(), proxygen::huffman::kTableSize, and uint32_t.
Referenced by HuffTree().
const uint32_t * proxygen::huffman::HuffTree::codesTable | ( | ) | const |
bool proxygen::huffman::HuffTree::decode | ( | const uint8_t * | buf, |
uint32_t | size, | ||
folly::fbstring & | literal | ||
) | const |
decode bitstream into a string literal
buf | start of a huffman-encoded bit stream |
size | size of the buffer |
literal | where to append decoded characters |
Definition at line 31 of file Huffman.cpp.
References proxygen::huffman::HuffNode::bits, proxygen::huffman::HuffNode::ch, proxygen::huffman::HuffNode::data, i, proxygen::huffman::SuperHuffNode::index, proxygen::huffman::HuffNode::isLeaf(), proxygen::huffman::HuffNode::metadata, folly::basic_fbstring< E, T, A, Storage >::push_back(), proxygen::huffman::HuffNode::superNodeIndex, table_, uint32_t, and uint8_t.
Referenced by TEST_F().
uint32_t proxygen::huffman::HuffTree::encode | ( | folly::StringPiece | literal, |
folly::io::QueueAppender & | buf | ||
) | const |
encode string literal into huffman encoded bit stream
literal | string to encode |
buf | where to append the encoded binary data |
Definition at line 133 of file Huffman.cpp.
References bits_, ch, codes_, i, folly::io::detail::Writable< Derived >::push(), folly::Range< Iter >::size(), uint32_t, uint8_t, and folly::io::detail::Writable< Derived >::writeBE().
Referenced by TEST_F().
|
private |
recursive function for generating subtrees
Definition at line 109 of file Huffman.cpp.
References proxygen::huffman::HuffNode::bits, proxygen::huffman::HuffNode::ch, ch, proxygen::huffman::HuffNode::data, proxygen::huffman::SuperHuffNode::index, proxygen::huffman::HuffNode::metadata, and uint8_t.
Referenced by insert().
get the binary representation for a given character, as a 32-bit word and a number of bits is represented on (<32). The code is aligned to LSB.
ch | ASCII character |
Example: 'e' will be encoded as 1 using 4 bits: 0001
Definition at line 196 of file Huffman.cpp.
uint32_t proxygen::huffman::HuffTree::getEncodeSize | ( | folly::StringPiece | literal | ) | const |
get the encode size for a string literal, works as a dry-run for the encode useful to allocate enough buffer space before doing the actual encode
literal | string literal |
Definition at line 182 of file Huffman.cpp.
References bits_, ch, i, folly::size(), folly::Range< Iter >::size(), uint32_t, and uint8_t.
Referenced by TEST_F().
insert a new character into the tree, identified by an unique code, a number of bits to represent it. The code is aligned at LSB.
Definition at line 78 of file Huffman.cpp.
References proxygen::huffman::HuffNode::data, fillIndex(), proxygen::huffman::SuperHuffNode::index, proxygen::huffman::HuffNode::isLeaf(), proxygen::huffman::HuffNode::isSuperNode, proxygen::huffman::HuffNode::metadata, nodes_, proxygen::huffman::HuffNode::superNodeIndex, table_, uint32_t, and x.
Referenced by buildTree().
|
private |
Definition at line 148 of file Huffman.h.
Referenced by bitsTable(), buildTree(), encode(), getCode(), and getEncodeSize().
|
private |
Definition at line 147 of file Huffman.h.
Referenced by buildTree(), codesTable(), encode(), and getCode().
|
private |
|
protected |