examples/SFExamples/oggvorbiscodec/src/libvorbis/doc/xml/03-codebook.xml

00001 <?xml version="1.0" standalone="no"?>
00002 <!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
00003                 "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [
00004 
00005 ]>
00006 
00007 <section id="vorbis-spec-codebook">
00008 <sectioninfo>
00009 <releaseinfo>
00010  $Id: 03-codebook.xml 7186 2004-07-20 07:19:25Z xiphmont $
00011 </releaseinfo>
00012 </sectioninfo>
00013 <title>Probability Model and Codebooks</title>
00014 
00015 <section>
00016 <title>Overview</title>
00017 
00018 <para>
00019 Unlike practically every other mainstream audio codec, Vorbis has no
00020 statically configured probability model, instead packing all entropy
00021 decoding configuration, VQ and Huffman, into the bitstream itself in
00022 the third header, the codec setup header.  This packed configuration
00023 consists of multiple 'codebooks', each containing a specific
00024 Huffman-equivalent representation for decoding compressed codewords as
00025 well as an optional lookup table of output vector values to which a
00026 decoded Huffman value is applied as an offset, generating the final
00027 decoded output corresponding to a given compressed codeword.</para>
00028 
00029 <section><title>Bitwise operation</title>
00030 <para>
00031 The codebook mechanism is built on top of the vorbis bitpacker. Both
00032 the codebooks themselves and the codewords they decode are unrolled 
00033 from a packet as a series of arbitrary-width values read from the 
00034 stream according to <xref linkend="vorbis-spec-bitpacking"/>.</para>
00035 </section>
00036 
00037 </section>
00038 
00039 <section>
00040 <title>Packed codebook format</title>
00041 
00042 <para>
00043 For purposes of the examples below, we assume that the storage
00044 system's native byte width is eight bits.  This is not universally
00045 true; see <xref linkend="vorbis-spec-bitpacking"/> for discussion 
00046 relating to non-eight-bit bytes.</para>
00047 
00048 <section><title>codebook decode</title>
00049 
00050 <para>
00051 A codebook begins with a 24 bit sync pattern, 0x564342:
00052 
00053 <screen>
00054 byte 0: [ 0 1 0 0 0 0 1 0 ] (0x42)
00055 byte 1: [ 0 1 0 0 0 0 1 1 ] (0x43)
00056 byte 2: [ 0 1 0 1 0 1 1 0 ] (0x56)
00057 </screen></para>
00058 
00059 <para>
00060 16 bit <varname>[codebook_dimensions]</varname> and 24 bit <varname>[codebook_entries]</varname> fields:
00061 
00062 <screen>
00063 
00064 byte 3: [ X X X X X X X X ] 
00065 byte 4: [ X X X X X X X X ] [codebook_dimensions] (16 bit unsigned)
00066 
00067 byte 5: [ X X X X X X X X ] 
00068 byte 6: [ X X X X X X X X ] 
00069 byte 7: [ X X X X X X X X ] [codebook_entries] (24 bit unsigned)
00070 
00071 </screen></para>
00072 
00073 <para>
00074 Next is the <varname>[ordered]</varname> bit flag:
00075 
00076 <screen>
00077 
00078 byte 8: [               X ] [ordered] (1 bit)
00079 
00080 </screen></para>
00081 
00082 <para>
00083 Each entry, numbering a
00084 total of <varname>[codebook_entries]</varname>, is assigned a codeword length.
00085 We now read the list of codeword lengths and store these lengths in
00086 the array <varname>[codebook_codeword_lengths]</varname>. Decode of lengths is
00087 according to whether the <varname>[ordered]</varname> flag is set or unset.
00088 
00089 <itemizedlist>
00090 <listitem>
00091   <para>If the <varname>[ordered]</varname> flag is unset, the codeword list is not
00092   length ordered and the decoder needs to read each codeword length
00093   one-by-one.</para> 
00094 
00095   <para>The decoder first reads one additional bit flag, the
00096   <varname>[sparse]</varname> flag.  This flag determines whether or not the
00097   codebook contains unused entries that are not to be included in the
00098   codeword decode tree:
00099 
00100 <screen>
00101 byte 8: [             X 1 ] [sparse] flag (1 bit)
00102 </screen></para>
00103 
00104 <para>
00105   The decoder now performs for each of the <varname>[codebook_entries]</varname>
00106   codebook entries:
00107 
00108 <screen>
00109   
00110   1) if([sparse] is set){
00111 
00112          2) [flag] = read one bit;
00113          3) if([flag] is set){
00114 
00115               4) [length] = read a five bit unsigned integer;
00116               5) codeword length for this entry is [length]+1;
00117 
00118             } else {
00119 
00120               6) this entry is unused.  mark it as such.
00121 
00122             }
00123 
00124      } else the sparse flag is not set {
00125 
00126         7) [length] = read a five bit unsigned integer;
00127         8) the codeword length for this entry is [length]+1;
00128         
00129      }
00130 
00131 </screen></para>
00132 </listitem>
00133 <listitem>
00134   <para>If the <varname>[ordered]</varname> flag is set, the codeword list for this
00135   codebook is encoded in ascending length order.  Rather than reading
00136   a length for every codeword, the encoder reads the number of
00137   codewords per length.  That is, beginning at entry zero:
00138 
00139 <screen>
00140   1) [current_entry] = 0;
00141   2) [current_length] = read a five bit unsigned integer and add 1;
00142   3) [number] = read <link linkend="vorbis-spec-ilog">ilog</link>([codebook_entries] - [current_entry]) bits as an unsigned integer
00143   4) set the entries [current_entry] through [current_entry]+[number]-1, inclusive, 
00144     of the [codebook_codeword_lengths] array to [current_length]
00145   5) set [current_entry] to [number] + [current_entry]
00146   6) increment [current_length] by 1
00147   7) if [current_entry] is greater than [codebook_entries] ERROR CONDITION; 
00148     the decoder will not be able to read this stream.
00149   8) if [current_entry] is less than [codebook_entries], repeat process starting at 3)
00150   9) done.
00151 </screen></para>
00152 </listitem>
00153 </itemizedlist>
00154 
00155 After all codeword lengths have been decoded, the decoder reads the
00156 vector lookup table.  Vorbis I supports three lookup types:
00157 <orderedlist>
00158 <listitem>
00159 <simpara>No lookup</simpara>
00160 </listitem><listitem>
00161 <simpara>Implicitly populated value mapping (lattice VQ)</simpara>
00162 </listitem><listitem>
00163 <simpara>Explicitly populated value mapping (tessellated or 'foam'
00164 VQ)</simpara>
00165 </listitem>
00166 </orderedlist>
00167 </para>
00168 
00169 <para>
00170 The lookup table type is read as a four bit unsigned integer:
00171 <screen>
00172   1) [codebook_lookup_type] = read four bits as an unsigned integer
00173 </screen></para>
00174 
00175 <para>
00176 Codebook decode precedes according to <varname>[codebook_lookup_type]</varname>:
00177 <itemizedlist>
00178 <listitem>
00179 <para>Lookup type zero indicates no lookup to be read.  Proceed past
00180 lookup decode.</para>
00181 </listitem><listitem>
00182 <para>Lookup types one and two are similar, differing only in the
00183 number of lookup values to be read.  Lookup type one reads a list of
00184 values that are permuted in a set pattern to build a list of vectors,
00185 each vector of order <varname>[codebook_dimensions]</varname> scalars.  Lookup
00186 type two builds the same vector list, but reads each scalar for each
00187 vector explicitly, rather than building vectors from a smaller list of
00188 possible scalar values.  Lookup decode proceeds as follows:
00189 
00190 <screen>
00191   1) [codebook_minimum_value] = <link linkend="vorbis-spec-float32_unpack">float32_unpack</link>( read 32 bits as an unsigned integer) 
00192   2) [codebook_delta_value] = <link linkend="vorbis-spec-float32_unpack">float32_unpack</link>( read 32 bits as an unsigned integer) 
00193   3) [codebook_value_bits] = read 4 bits as an unsigned integer and add 1
00194   4) [codebook_sequence_p] = read 1 bit as a boolean flag
00195 
00196   if ( [codebook_lookup_type] is 1 ) {
00197    
00198      5) [codebook_lookup_values] = <link linkend="vorbis-spec-lookup1_values">lookup1_values</link>(<varname>[codebook_entries]</varname>, <varname>[codebook_dimensions]</varname> )
00199 
00200   } else {
00201 
00202      6) [codebook_lookup_values] = <varname>[codebook_entries]</varname> * <varname>[codebook_dimensions]</varname>
00203 
00204   }
00205 
00206   7) read a total of [codebook_lookup_values] unsigned integers of [codebook_value_bits] each; 
00207      store these in order in the array [codebook_multiplicands]
00208 </screen></para>
00209 </listitem><listitem>
00210 <para>A <varname>[codebook_lookup_type]</varname> of greater than two is reserved
00211 and indicates a stream that is not decodable by the specification in this
00212 document.</para>
00213 </listitem>
00214 </itemizedlist>
00215 </para>
00216 
00217 <para>
00218 An 'end of packet' during any read operation in the above steps is
00219 considered an error condition rendering the stream undecodable.</para>
00220 
00221 <section><title>Huffman decision tree representation</title>
00222 
00223 <para>
00224 The <varname>[codebook_codeword_lengths]</varname> array and
00225 <varname>[codebook_entries]</varname> value uniquely define the Huffman decision
00226 tree used for entropy decoding.</para>
00227 
00228 <para>
00229 Briefly, each used codebook entry (recall that length-unordered
00230 codebooks support unused codeword entries) is assigned, in order, the
00231 lowest valued unused binary Huffman codeword possible.  Assume the
00232 following codeword length list:
00233 
00234 <screen>
00235 entry 0: length 2
00236 entry 1: length 4
00237 entry 2: length 4
00238 entry 3: length 4
00239 entry 4: length 4
00240 entry 5: length 2
00241 entry 6: length 3
00242 entry 7: length 3
00243 </screen></para>
00244 
00245 <para>
00246 Assigning codewords in order (lowest possible value of the appropriate
00247 length to highest) results in the following codeword list:
00248 
00249 <screen>
00250 entry 0: length 2 codeword 00
00251 entry 1: length 4 codeword 0100
00252 entry 2: length 4 codeword 0101
00253 entry 3: length 4 codeword 0110
00254 entry 4: length 4 codeword 0111
00255 entry 5: length 2 codeword 10
00256 entry 6: length 3 codeword 110
00257 entry 7: length 3 codeword 111
00258 </screen></para>
00259 
00260 
00261 <note>
00262 <para>
00263 Unlike most binary numerical values in this document, we
00264 intend the above codewords to be read and used bit by bit from left to
00265 right, thus the codeword '001' is the bit string 'zero, zero, one'.
00266 When determining 'lowest possible value' in the assignment definition
00267 above, the leftmost bit is the MSb.</para>
00268 </note>
00269 
00270 <para>
00271 It is clear that the codeword length list represents a Huffman
00272 decision tree with the entry numbers equivalent to the leaves numbered
00273 left-to-right:
00274 
00275 <mediaobject>
00276 <imageobject>
00277  <imagedata fileref="hufftree.png" format="PNG"/>
00278 </imageobject>
00279 <textobject>
00280  <phrase>[huffman tree illustration]</phrase>
00281 </textobject>
00282 </mediaobject>
00283 </para>
00284 
00285 <para>
00286 As we assign codewords in order, we see that each choice constructs a
00287 new leaf in the leftmost possible position.</para>
00288 
00289 <para>
00290 Note that it's possible to underspecify or overspecify a Huffman tree
00291 via the length list.  In the above example, if codeword seven were
00292 eliminated, it's clear that the tree is unfinished:
00293 
00294 <mediaobject>
00295 <imageobject>
00296  <imagedata fileref="hufftree-under.png" format="PNG"/>
00297 </imageobject>
00298 <textobject>
00299  <phrase>[underspecified huffman tree illustration]</phrase>
00300 </textobject>
00301 </mediaobject>
00302 </para>
00303 
00304 <para>
00305 Similarly, in the original codebook, it's clear that the tree is fully
00306 populated and a ninth codeword is impossible.  Both underspecified and
00307 overspecified trees are an error condition rendering the stream
00308 undecodable.</para>
00309 
00310 <para>
00311 Codebook entries marked 'unused' are simply skipped in the assigning
00312 process.  They have no codeword and do not appear in the decision
00313 tree, thus it's impossible for any bit pattern read from the stream to
00314 decode to that entry number.</para>
00315 
00316 </section>
00317 
00318 <section><title>VQ lookup table vector representation</title>
00319 
00320 <para>
00321 Unpacking the VQ lookup table vectors relies on the following values:
00322 <programlisting>
00323 the [codebook_multiplicands] array
00324 [codebook_minimum_value]
00325 [codebook_delta_value]
00326 [codebook_sequence_p]
00327 [codebook_lookup_type]
00328 [codebook_entries]
00329 [codebook_dimensions]
00330 [codebook_lookup_values]
00331 </programlisting>
00332 </para>
00333 
00334 <para>
00335 Decoding (unpacking) a specific vector in the vector lookup table
00336 proceeds according to <varname>[codebook_lookup_type]</varname>.  The unpacked
00337 vector values are what a codebook would return during audio packet
00338 decode in a VQ context.</para>
00339 
00340 <section><title>Vector value decode: Lookup type 1</title>
00341 
00342 <para>
00343 Lookup type one specifies a lattice VQ lookup table built
00344 algorithmically from a list of scalar values.  Calculate (unpack) the
00345 final values of a codebook entry vector from the entries in
00346 <varname>[codebook_multiplicands]</varname> as follows (<varname>[value_vector]</varname>
00347 is the output vector representing the vector of values for entry number
00348 <varname>[lookup_offset]</varname> in this codebook):
00349 
00350 <screen>
00351   1) [last] = 0;
00352   2) [index_divisor] = 1;
00353   3) iterate [i] over the range 0 ... [codebook_dimensions]-1 (once for each scalar value in the value vector) {
00354        
00355        4) [multiplicand_offset] = ( [lookup_offset] divided by [index_divisor] using integer 
00356           division ) integer modulo [codebook_lookup_values]
00357 
00358        5) vector [value_vector] element [i] = 
00359             ( [codebook_multiplicands] array element number [multiplicand_offset] ) *
00360             [codebook_delta_value] + [codebook_minimum_value] + [last];
00361 
00362        6) if ( [codebook_sequence_p] is set ) then set [last] = vector [value_vector] element [i]
00363 
00364        7) [index_divisor] = [index_divisor] * [codebook_lookup_values]
00365 
00366      }
00367  
00368   8) vector calculation completed.
00369 </screen></para>
00370 
00371 </section>
00372 
00373 <section><title>Vector value decode: Lookup type 2</title>
00374 
00375 <para>
00376 Lookup type two specifies a VQ lookup table in which each scalar in
00377 each vector is explicitly set by the <varname>[codebook_multiplicands]</varname>
00378 array in a one-to-one mapping.  Calculate [unpack] the
00379 final values of a codebook entry vector from the entries in
00380 <varname>[codebook_multiplicands]</varname> as follows (<varname>[value_vector]</varname>
00381 is the output vector representing the vector of values for entry number
00382 <varname>[lookup_offset]</varname> in this codebook):
00383 
00384 <screen>
00385   1) [last] = 0;
00386   2) [multiplicand_offset] = [lookup_offset] * [codebook_dimensions]
00387   3) iterate [i] over the range 0 ... [codebook_dimensions]-1 (once for each scalar value in the value vector) {
00388 
00389        4) vector [value_vector] element [i] = 
00390             ( [codebook_multiplicands] array element number [multiplicand_offset] ) *
00391             [codebook_delta_value] + [codebook_minimum_value] + [last];
00392 
00393        5) if ( [codebook_sequence_p] is set ) then set [last] = vector [value_vector] element [i] 
00394 
00395        6) increment [multiplicand_offset]
00396 
00397      }
00398  
00399   7) vector calculation completed.
00400 </screen></para>
00401 
00402 </section>
00403 
00404 </section>
00405 
00406 </section>
00407 
00408 </section>
00409 
00410 <section>
00411 <title>Use of the codebook abstraction</title>
00412 
00413 <para>
00414 The decoder uses the codebook abstraction much as it does the
00415 bit-unpacking convention; a specific codebook reads a
00416 codeword from the bitstream, decoding it into an entry number, and then
00417 returns that entry number to the decoder (when used in a scalar
00418 entropy coding context), or uses that entry number as an offset into
00419 the VQ lookup table, returning a vector of values (when used in a context
00420 desiring a VQ value). Scalar or VQ context is always explicit; any call
00421 to the codebook mechanism requests either a scalar entry number or a
00422 lookup vector.</para>
00423 
00424 <para>
00425 Note that VQ lookup type zero indicates that there is no lookup table;
00426 requesting decode using a codebook of lookup type 0 in any context
00427 expecting a vector return value (even in a case where a vector of
00428 dimension one) is forbidden.  If decoder setup or decode requests such
00429 an action, that is an error condition rendering the packet
00430 undecodable.</para>
00431 
00432 <para>
00433 Using a codebook to read from the packet bitstream consists first of
00434 reading and decoding the next codeword in the bitstream. The decoder
00435 reads bits until the accumulated bits match a codeword in the
00436 codebook.  This process can be though of as logically walking the
00437 Huffman decode tree by reading one bit at a time from the bitstream,
00438 and using the bit as a decision boolean to take the 0 branch (left in
00439 the above examples) or the 1 branch (right in the above examples).
00440 Walking the tree finishes when the decode process hits a leaf in the
00441 decision tree; the result is the entry number corresponding to that
00442 leaf.  Reading past the end of a packet propagates the 'end-of-stream'
00443 condition to the decoder.</para>
00444 
00445 <para>
00446 When used in a scalar context, the resulting codeword entry is the
00447 desired return value.</para>
00448 
00449 <para>
00450 When used in a VQ context, the codeword entry number is used as an
00451 offset into the VQ lookup table.  The value returned to the decoder is
00452 the vector of scalars corresponding to this offset.</para>
00453 
00454 </section>
00455 
00456 </section>
00457 
00458 <!-- end section of probablity model and codebooks -->

Generated by  doxygen 1.6.2