A canonical Huffman code is a particular type of Huffman code whose properties allow it to be described very compactly. Data compressors generally work in one of two ways: either the decompressor can infer the codebook the compressor used from previous context, or the compressor must tell the decompressor what the codebook is. Because a canonical Huffman codebook can be stored especially efficiently, most compressors start by generating a "normal" Huffman codebook and then convert it to canonical form before using it.
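
The conversion to canonical form can be sketched from the code lengths alone. The following is a minimal illustration rather than a reference implementation: the function name `canonical_codes` is hypothetical, every symbol is assumed to have a positive code length, and ties between symbols of equal length are broken by symbol order, which is a common convention but an assumption here.

```python
def canonical_codes(lengths):
    """Assign canonical codes from a {symbol: code_length_in_bits} mapping.

    Assumption: every length is >= 1 and the lengths come from a valid
    prefix code (for example, one produced by an ordinary Huffman tree).
    """
    if not lengths:
        return {}

    # Sort by length first, then by symbol, so that all codes of one
    # length receive consecutive values.
    ordered = sorted(lengths.items(), key=lambda item: (item[1], item[0]))

    codes = {}
    code = 0
    prev_len = ordered[0][1]
    for symbol, length in ordered:
        # Moving to a longer length appends zero bits to the running value.
        code <<= length - prev_len
        codes[symbol] = format(code, f"0{length}b")
        code += 1
        prev_len = length
    return codes


if __name__ == "__main__":
    # Lengths as they might come out of a "normal" Huffman codebook.
    print(canonical_codes({"A": 2, "B": 1, "C": 3, "D": 3}))
```

With these lengths the sketch assigns `0` to B, `10` to A, and `110` and `111` to C and D. Only the four lengths would need to be transmitted for a decompressor to rebuild the same table.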

Property Value
dbo:abstract
• A canonical Huffman code is a particular type of Huffman code whose properties allow it to be described very compactly. Data compressors generally work in one of two ways: either the decompressor can infer the codebook the compressor used from previous context, or the compressor must tell the decompressor what the codebook is. Because a canonical Huffman codebook can be stored especially efficiently, most compressors start by generating a "normal" Huffman codebook and then convert it to canonical form before using it. In order for a scheme such as Huffman coding to be decompressed, the decoding algorithm must be given the same model that the encoding algorithm used to compress the source data. In standard Huffman coding this model takes the form of a tree of variable-length codes, with the most frequent symbols located nearest the root and represented by the fewest bits. However, this code tree introduces two critical inefficiencies into an implementation of the coding scheme. First, each node of the tree must store either references to its child nodes or the symbol that it represents. This is expensive in memory, and if the source data contains a high proportion of unique symbols, the code tree can account for a significant share of the overall encoded data. Second, traversing the tree is computationally costly, since the algorithm must jump between scattered memory locations as each bit of the encoded data is read. Canonical Huffman codes address both issues by generating the codes in a clear, standardized format: all the codes of a given length are assigned their values sequentially. As a result, only the lengths of the codes need to be stored for decompression instead of the structure of the code tree, which reduces the size of the encoded data. Additionally, because the codes are sequential, the decoding algorithm can be dramatically simplified so that it is computationally efficient. (en)
dbo:wikiPageExtracted
• 2019-12-25 12:17:16Z (xsd:date)
dbo:wikiPageID
• 6946171 (xsd:integer)
dbo:wikiPageLength
• 9170 (xsd:integer)
dbo:wikiPageModified
• 2019-12-25 12:17:13Z (xsd:date)
dbo:wikiPageOutDegree
• 20 (xsd:integer)
dbo:wikiPageRevisionID
• 932375970 (xsd:integer)
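
The abstract's claim that sequential codes simplify decoding can be illustrated with a decoder that never builds a tree. This is a sketch under stated assumptions, not a definitive implementation: the name `decode_canonical` is hypothetical, the input is a string of `0` and `1` characters, every symbol has a positive code length, and symbols of equal length are ordered by symbol value. It tracks the first code and the symbol count for each length, a technique used by some DEFLATE decoders.

```python
def decode_canonical(bits, lengths):
    """Decode a '0'/'1' string using only a {symbol: code_length} mapping."""
    # Symbols in canonical order: by code length, then by symbol.
    ordered = sorted(lengths, key=lambda s: (lengths[s], s))
    max_len = max(lengths.values())

    # count[n] is the number of symbols whose code is n bits long.
    count = [0] * (max_len + 1)
    for symbol in ordered:
        count[lengths[symbol]] += 1

    out = []
    code = first = index = length = 0
    for bit in bits:
        code |= int(bit)
        length += 1
        n = count[length]
        if code - first < n:
            # The bits read so far are one of the n codes of this length.
            out.append(ordered[index + code - first])
            code = first = index = length = 0
        else:
            if length == max_len:
                raise ValueError("bit pattern matches no code")
            index += n                 # skip the symbols of this length
            first = (first + n) << 1   # first code of the next length
            code <<= 1                 # make room for the next bit
    if length != 0:
        raise ValueError("input ends in the middle of a code")
    return out


if __name__ == "__main__":
    table = {"A": 2, "B": 1, "C": 3, "D": 3}
    print(decode_canonical("010110111", table))
```

Each input bit costs a comparison and a few integer updates on a small array, rather than a pointer dereference into a tree, which is the kind of simplification the abstract describes.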