00001 <?xml version="1.0" standalone="no"?> 00002 <!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN" 00003 "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [ 00004 00005 ]> 00006 00007 <section id="vorbis-spec-codebook"> 00008 <sectioninfo> 00009 <releaseinfo> 00010 $Id: 03-codebook.xml 7186 2004-07-20 07:19:25Z xiphmont $ 00011 </releaseinfo> 00012 </sectioninfo> 00013 <title>Probability Model and Codebooks</title> 00014 00015 <section> 00016 <title>Overview</title> 00017 00018 <para> 00019 Unlike practically every other mainstream audio codec, Vorbis has no 00020 statically configured probability model, instead packing all entropy 00021 decoding configuration, VQ and Huffman, into the bitstream itself in 00022 the third header, the codec setup header. This packed configuration 00023 consists of multiple 'codebooks', each containing a specific 00024 Huffman-equivalent representation for decoding compressed codewords as 00025 well as an optional lookup table of output vector values to which a 00026 decoded Huffman value is applied as an offset, generating the final 00027 decoded output corresponding to a given compressed codeword.</para> 00028 00029 <section><title>Bitwise operation</title> 00030 <para> 00031 The codebook mechanism is built on top of the vorbis bitpacker. Both 00032 the codebooks themselves and the codewords they decode are unrolled 00033 from a packet as a series of arbitrary-width values read from the 00034 stream according to <xref linkend="vorbis-spec-bitpacking"/>.</para> 00035 </section> 00036 00037 </section> 00038 00039 <section> 00040 <title>Packed codebook format</title> 00041 00042 <para> 00043 For purposes of the examples below, we assume that the storage 00044 system's native byte width is eight bits. This is not universally 00045 true; see <xref linkend="vorbis-spec-bitpacking"/> for discussion 00046 relating to non-eight-bit bytes.</para> 00047 00048 <section><title>codebook decode</title> 00049 00050 <para> 00051 A codebook begins with a 24 bit sync pattern, 0x564342: 00052 00053 <screen> 00054 byte 0: [ 0 1 0 0 0 0 1 0 ] (0x42) 00055 byte 1: [ 0 1 0 0 0 0 1 1 ] (0x43) 00056 byte 2: [ 0 1 0 1 0 1 1 0 ] (0x56) 00057 </screen></para> 00058 00059 <para> 00060 16 bit <varname>[codebook_dimensions]</varname> and 24 bit <varname>[codebook_entries]</varname> fields: 00061 00062 <screen> 00063 00064 byte 3: [ X X X X X X X X ] 00065 byte 4: [ X X X X X X X X ] [codebook_dimensions] (16 bit unsigned) 00066 00067 byte 5: [ X X X X X X X X ] 00068 byte 6: [ X X X X X X X X ] 00069 byte 7: [ X X X X X X X X ] [codebook_entries] (24 bit unsigned) 00070 00071 </screen></para> 00072 00073 <para> 00074 Next is the <varname>[ordered]</varname> bit flag: 00075 00076 <screen> 00077 00078 byte 8: [ X ] [ordered] (1 bit) 00079 00080 </screen></para> 00081 00082 <para> 00083 Each entry, numbering a 00084 total of <varname>[codebook_entries]</varname>, is assigned a codeword length. 00085 We now read the list of codeword lengths and store these lengths in 00086 the array <varname>[codebook_codeword_lengths]</varname>. Decode of lengths is 00087 according to whether the <varname>[ordered]</varname> flag is set or unset. 00088 00089 <itemizedlist> 00090 <listitem> 00091 <para>If the <varname>[ordered]</varname> flag is unset, the codeword list is not 00092 length ordered and the decoder needs to read each codeword length 00093 one-by-one.</para> 00094 00095 <para>The decoder first reads one additional bit flag, the 00096 <varname>[sparse]</varname> flag. This flag determines whether or not the 00097 codebook contains unused entries that are not to be included in the 00098 codeword decode tree: 00099 00100 <screen> 00101 byte 8: [ X 1 ] [sparse] flag (1 bit) 00102 </screen></para> 00103 00104 <para> 00105 The decoder now performs for each of the <varname>[codebook_entries]</varname> 00106 codebook entries: 00107 00108 <screen> 00109 00110 1) if([sparse] is set){ 00111 00112 2) [flag] = read one bit; 00113 3) if([flag] is set){ 00114 00115 4) [length] = read a five bit unsigned integer; 00116 5) codeword length for this entry is [length]+1; 00117 00118 } else { 00119 00120 6) this entry is unused. mark it as such. 00121 00122 } 00123 00124 } else the sparse flag is not set { 00125 00126 7) [length] = read a five bit unsigned integer; 00127 8) the codeword length for this entry is [length]+1; 00128 00129 } 00130 00131 </screen></para> 00132 </listitem> 00133 <listitem> 00134 <para>If the <varname>[ordered]</varname> flag is set, the codeword list for this 00135 codebook is encoded in ascending length order. Rather than reading 00136 a length for every codeword, the encoder reads the number of 00137 codewords per length. That is, beginning at entry zero: 00138 00139 <screen> 00140 1) [current_entry] = 0; 00141 2) [current_length] = read a five bit unsigned integer and add 1; 00142 3) [number] = read <link linkend="vorbis-spec-ilog">ilog</link>([codebook_entries] - [current_entry]) bits as an unsigned integer 00143 4) set the entries [current_entry] through [current_entry]+[number]-1, inclusive, 00144 of the [codebook_codeword_lengths] array to [current_length] 00145 5) set [current_entry] to [number] + [current_entry] 00146 6) increment [current_length] by 1 00147 7) if [current_entry] is greater than [codebook_entries] ERROR CONDITION; 00148 the decoder will not be able to read this stream. 00149 8) if [current_entry] is less than [codebook_entries], repeat process starting at 3) 00150 9) done. 00151 </screen></para> 00152 </listitem> 00153 </itemizedlist> 00154 00155 After all codeword lengths have been decoded, the decoder reads the 00156 vector lookup table. Vorbis I supports three lookup types: 00157 <orderedlist> 00158 <listitem> 00159 <simpara>No lookup</simpara> 00160 </listitem><listitem> 00161 <simpara>Implicitly populated value mapping (lattice VQ)</simpara> 00162 </listitem><listitem> 00163 <simpara>Explicitly populated value mapping (tessellated or 'foam' 00164 VQ)</simpara> 00165 </listitem> 00166 </orderedlist> 00167 </para> 00168 00169 <para> 00170 The lookup table type is read as a four bit unsigned integer: 00171 <screen> 00172 1) [codebook_lookup_type] = read four bits as an unsigned integer 00173 </screen></para> 00174 00175 <para> 00176 Codebook decode precedes according to <varname>[codebook_lookup_type]</varname>: 00177 <itemizedlist> 00178 <listitem> 00179 <para>Lookup type zero indicates no lookup to be read. Proceed past 00180 lookup decode.</para> 00181 </listitem><listitem> 00182 <para>Lookup types one and two are similar, differing only in the 00183 number of lookup values to be read. Lookup type one reads a list of 00184 values that are permuted in a set pattern to build a list of vectors, 00185 each vector of order <varname>[codebook_dimensions]</varname> scalars. Lookup 00186 type two builds the same vector list, but reads each scalar for each 00187 vector explicitly, rather than building vectors from a smaller list of 00188 possible scalar values. Lookup decode proceeds as follows: 00189 00190 <screen> 00191 1) [codebook_minimum_value] = <link linkend="vorbis-spec-float32_unpack">float32_unpack</link>( read 32 bits as an unsigned integer) 00192 2) [codebook_delta_value] = <link linkend="vorbis-spec-float32_unpack">float32_unpack</link>( read 32 bits as an unsigned integer) 00193 3) [codebook_value_bits] = read 4 bits as an unsigned integer and add 1 00194 4) [codebook_sequence_p] = read 1 bit as a boolean flag 00195 00196 if ( [codebook_lookup_type] is 1 ) { 00197 00198 5) [codebook_lookup_values] = <link linkend="vorbis-spec-lookup1_values">lookup1_values</link>(<varname>[codebook_entries]</varname>, <varname>[codebook_dimensions]</varname> ) 00199 00200 } else { 00201 00202 6) [codebook_lookup_values] = <varname>[codebook_entries]</varname> * <varname>[codebook_dimensions]</varname> 00203 00204 } 00205 00206 7) read a total of [codebook_lookup_values] unsigned integers of [codebook_value_bits] each; 00207 store these in order in the array [codebook_multiplicands] 00208 </screen></para> 00209 </listitem><listitem> 00210 <para>A <varname>[codebook_lookup_type]</varname> of greater than two is reserved 00211 and indicates a stream that is not decodable by the specification in this 00212 document.</para> 00213 </listitem> 00214 </itemizedlist> 00215 </para> 00216 00217 <para> 00218 An 'end of packet' during any read operation in the above steps is 00219 considered an error condition rendering the stream undecodable.</para> 00220 00221 <section><title>Huffman decision tree representation</title> 00222 00223 <para> 00224 The <varname>[codebook_codeword_lengths]</varname> array and 00225 <varname>[codebook_entries]</varname> value uniquely define the Huffman decision 00226 tree used for entropy decoding.</para> 00227 00228 <para> 00229 Briefly, each used codebook entry (recall that length-unordered 00230 codebooks support unused codeword entries) is assigned, in order, the 00231 lowest valued unused binary Huffman codeword possible. Assume the 00232 following codeword length list: 00233 00234 <screen> 00235 entry 0: length 2 00236 entry 1: length 4 00237 entry 2: length 4 00238 entry 3: length 4 00239 entry 4: length 4 00240 entry 5: length 2 00241 entry 6: length 3 00242 entry 7: length 3 00243 </screen></para> 00244 00245 <para> 00246 Assigning codewords in order (lowest possible value of the appropriate 00247 length to highest) results in the following codeword list: 00248 00249 <screen> 00250 entry 0: length 2 codeword 00 00251 entry 1: length 4 codeword 0100 00252 entry 2: length 4 codeword 0101 00253 entry 3: length 4 codeword 0110 00254 entry 4: length 4 codeword 0111 00255 entry 5: length 2 codeword 10 00256 entry 6: length 3 codeword 110 00257 entry 7: length 3 codeword 111 00258 </screen></para> 00259 00260 00261 <note> 00262 <para> 00263 Unlike most binary numerical values in this document, we 00264 intend the above codewords to be read and used bit by bit from left to 00265 right, thus the codeword '001' is the bit string 'zero, zero, one'. 00266 When determining 'lowest possible value' in the assignment definition 00267 above, the leftmost bit is the MSb.</para> 00268 </note> 00269 00270 <para> 00271 It is clear that the codeword length list represents a Huffman 00272 decision tree with the entry numbers equivalent to the leaves numbered 00273 left-to-right: 00274 00275 <mediaobject> 00276 <imageobject> 00277 <imagedata fileref="hufftree.png" format="PNG"/> 00278 </imageobject> 00279 <textobject> 00280 <phrase>[huffman tree illustration]</phrase> 00281 </textobject> 00282 </mediaobject> 00283 </para> 00284 00285 <para> 00286 As we assign codewords in order, we see that each choice constructs a 00287 new leaf in the leftmost possible position.</para> 00288 00289 <para> 00290 Note that it's possible to underspecify or overspecify a Huffman tree 00291 via the length list. In the above example, if codeword seven were 00292 eliminated, it's clear that the tree is unfinished: 00293 00294 <mediaobject> 00295 <imageobject> 00296 <imagedata fileref="hufftree-under.png" format="PNG"/> 00297 </imageobject> 00298 <textobject> 00299 <phrase>[underspecified huffman tree illustration]</phrase> 00300 </textobject> 00301 </mediaobject> 00302 </para> 00303 00304 <para> 00305 Similarly, in the original codebook, it's clear that the tree is fully 00306 populated and a ninth codeword is impossible. Both underspecified and 00307 overspecified trees are an error condition rendering the stream 00308 undecodable.</para> 00309 00310 <para> 00311 Codebook entries marked 'unused' are simply skipped in the assigning 00312 process. They have no codeword and do not appear in the decision 00313 tree, thus it's impossible for any bit pattern read from the stream to 00314 decode to that entry number.</para> 00315 00316 </section> 00317 00318 <section><title>VQ lookup table vector representation</title> 00319 00320 <para> 00321 Unpacking the VQ lookup table vectors relies on the following values: 00322 <programlisting> 00323 the [codebook_multiplicands] array 00324 [codebook_minimum_value] 00325 [codebook_delta_value] 00326 [codebook_sequence_p] 00327 [codebook_lookup_type] 00328 [codebook_entries] 00329 [codebook_dimensions] 00330 [codebook_lookup_values] 00331 </programlisting> 00332 </para> 00333 00334 <para> 00335 Decoding (unpacking) a specific vector in the vector lookup table 00336 proceeds according to <varname>[codebook_lookup_type]</varname>. The unpacked 00337 vector values are what a codebook would return during audio packet 00338 decode in a VQ context.</para> 00339 00340 <section><title>Vector value decode: Lookup type 1</title> 00341 00342 <para> 00343 Lookup type one specifies a lattice VQ lookup table built 00344 algorithmically from a list of scalar values. Calculate (unpack) the 00345 final values of a codebook entry vector from the entries in 00346 <varname>[codebook_multiplicands]</varname> as follows (<varname>[value_vector]</varname> 00347 is the output vector representing the vector of values for entry number 00348 <varname>[lookup_offset]</varname> in this codebook): 00349 00350 <screen> 00351 1) [last] = 0; 00352 2) [index_divisor] = 1; 00353 3) iterate [i] over the range 0 ... [codebook_dimensions]-1 (once for each scalar value in the value vector) { 00354 00355 4) [multiplicand_offset] = ( [lookup_offset] divided by [index_divisor] using integer 00356 division ) integer modulo [codebook_lookup_values] 00357 00358 5) vector [value_vector] element [i] = 00359 ( [codebook_multiplicands] array element number [multiplicand_offset] ) * 00360 [codebook_delta_value] + [codebook_minimum_value] + [last]; 00361 00362 6) if ( [codebook_sequence_p] is set ) then set [last] = vector [value_vector] element [i] 00363 00364 7) [index_divisor] = [index_divisor] * [codebook_lookup_values] 00365 00366 } 00367 00368 8) vector calculation completed. 00369 </screen></para> 00370 00371 </section> 00372 00373 <section><title>Vector value decode: Lookup type 2</title> 00374 00375 <para> 00376 Lookup type two specifies a VQ lookup table in which each scalar in 00377 each vector is explicitly set by the <varname>[codebook_multiplicands]</varname> 00378 array in a one-to-one mapping. Calculate [unpack] the 00379 final values of a codebook entry vector from the entries in 00380 <varname>[codebook_multiplicands]</varname> as follows (<varname>[value_vector]</varname> 00381 is the output vector representing the vector of values for entry number 00382 <varname>[lookup_offset]</varname> in this codebook): 00383 00384 <screen> 00385 1) [last] = 0; 00386 2) [multiplicand_offset] = [lookup_offset] * [codebook_dimensions] 00387 3) iterate [i] over the range 0 ... [codebook_dimensions]-1 (once for each scalar value in the value vector) { 00388 00389 4) vector [value_vector] element [i] = 00390 ( [codebook_multiplicands] array element number [multiplicand_offset] ) * 00391 [codebook_delta_value] + [codebook_minimum_value] + [last]; 00392 00393 5) if ( [codebook_sequence_p] is set ) then set [last] = vector [value_vector] element [i] 00394 00395 6) increment [multiplicand_offset] 00396 00397 } 00398 00399 7) vector calculation completed. 00400 </screen></para> 00401 00402 </section> 00403 00404 </section> 00405 00406 </section> 00407 00408 </section> 00409 00410 <section> 00411 <title>Use of the codebook abstraction</title> 00412 00413 <para> 00414 The decoder uses the codebook abstraction much as it does the 00415 bit-unpacking convention; a specific codebook reads a 00416 codeword from the bitstream, decoding it into an entry number, and then 00417 returns that entry number to the decoder (when used in a scalar 00418 entropy coding context), or uses that entry number as an offset into 00419 the VQ lookup table, returning a vector of values (when used in a context 00420 desiring a VQ value). Scalar or VQ context is always explicit; any call 00421 to the codebook mechanism requests either a scalar entry number or a 00422 lookup vector.</para> 00423 00424 <para> 00425 Note that VQ lookup type zero indicates that there is no lookup table; 00426 requesting decode using a codebook of lookup type 0 in any context 00427 expecting a vector return value (even in a case where a vector of 00428 dimension one) is forbidden. If decoder setup or decode requests such 00429 an action, that is an error condition rendering the packet 00430 undecodable.</para> 00431 00432 <para> 00433 Using a codebook to read from the packet bitstream consists first of 00434 reading and decoding the next codeword in the bitstream. The decoder 00435 reads bits until the accumulated bits match a codeword in the 00436 codebook. This process can be though of as logically walking the 00437 Huffman decode tree by reading one bit at a time from the bitstream, 00438 and using the bit as a decision boolean to take the 0 branch (left in 00439 the above examples) or the 1 branch (right in the above examples). 00440 Walking the tree finishes when the decode process hits a leaf in the 00441 decision tree; the result is the entry number corresponding to that 00442 leaf. Reading past the end of a packet propagates the 'end-of-stream' 00443 condition to the decoder.</para> 00444 00445 <para> 00446 When used in a scalar context, the resulting codeword entry is the 00447 desired return value.</para> 00448 00449 <para> 00450 When used in a VQ context, the codeword entry number is used as an 00451 offset into the VQ lookup table. The value returned to the decoder is 00452 the vector of scalars corresponding to this offset.</para> 00453 00454 </section> 00455 00456 </section> 00457 00458 <!-- end section of probablity model and codebooks -->