examples/SFExamples/oggvorbiscodec94/src/libvorbis/doc/xml/08-residue.xml

00001 <?xml version="1.0" standalone="no"?>
00002 <!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
00003                 "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [
00004 
00005 ]>
00006 
00007 <section id="vorbis-spec-residue">
00008 <sectioninfo>
00009  <releaseinfo>
00010   $Id: 08-residue.xml 10466 2005-11-28 00:34:44Z giles $
00011  </releaseinfo>
00012 </sectioninfo>
00013 <title>Residue setup and decode</title>
00014 
00015 
00016 <section>
00017 <title>Overview</title>
00018 
00019 <para>
00020 A residue vector represents the fine detail of the audio spectrum of
00021 one channel in an audio frame after the encoder subtracts the floor
00022 curve and performs any channel coupling.  A residue vector may
00023 represent spectral lines, spectral magnitude, spectral phase or
00024 hybrids as mixed by channel coupling.  The exact semantic content of
00025 the vector does not matter to the residue abstraction.</para>
00026 
00027 <para>
00028 Whatever the exact qualities, the Vorbis residue abstraction codes the
00029 residue vectors into the bitstream packet, and then reconstructs the
00030 vectors during decode.  Vorbis makes use of three different encoding
00031 variants (numbered 0, 1 and 2) of the same basic vector encoding
00032 abstraction.</para>
00033 
00034 </section>
00035 
00036 <section>
00037 <title>Residue format</title>
00038 
00039 <para>
00040 Residue format partitions each vector in the vector bundle into chunks,
00041 classifies each chunk, encodes the chunk classifications and finally
00042 encodes the chunks themselves using the the specific VQ arrangement
00043 defined for each selected classification.
00044 The exact interleaving and partitioning vary by residue encoding number,
00045 however the high-level process used to classify and encode the residue 
00046 vector is the same in all three variants.</para>
00047 
00048 <para>
00049 A set of coded residue vectors are all of the same length.  High level
00050 coding structure, ignoring for the moment exactly how a partition is
00051 encoded and simply trusting that it is, is as follows:</para>
00052 
00053 <para>
00054 <itemizedlist>
00055 <listitem><para>Each vector is partitioned into multiple equal sized chunks
00056 according to configuration specified.  If we have a vector size of
00057 <emphasis>n</emphasis>, a partition size <emphasis>residue_partition_size</emphasis>, and a total
00058 of <emphasis>ch</emphasis> residue vectors, the total number of partitioned chunks
00059 coded is <emphasis>n</emphasis>/<emphasis>residue_partition_size</emphasis>*<emphasis>ch</emphasis>.  It is
00060 important to note that the integer division truncates.  In the below
00061 example, we assume an example <emphasis>residue_partition_size</emphasis> of 8.</para></listitem>
00062 
00063 <listitem><para>Each partition in each vector has a classification number that
00064 specifies which of multiple configured VQ codebook setups are used to
00065 decode that partition.  The classification numbers of each partition
00066 can be thought of as forming a vector in their own right, as in the
00067 illustration below.  Just as the residue vectors are coded in grouped
00068 partitions to increase encoding efficiency, the classification vector
00069 is also partitioned into chunks.  The integer elements of each scalar
00070 in a classification chunk are built into a single scalar that
00071 represents the classification numbers in that chunk.  In the below
00072 example, the classification codeword encodes two classification
00073 numbers.</para></listitem>
00074 
00075 <listitem><para>The values in a residue vector may be encoded monolithically in a
00076 single pass through the residue vector, but more often efficient
00077 codebook design dictates that each vector is encoded as the additive
00078 sum of several passes through the residue vector using more than one
00079 VQ codebook.  Thus, each residue value potentially accumulates values
00080 from multiple decode passes.  The classification value associated with
00081 a partition is the same in each pass, thus the classification codeword
00082 is coded only in the first pass.</para></listitem>
00083 
00084 </itemizedlist>
00085 </para>
00086 
00087 <mediaobject>
00088 <imageobject>
00089  <imagedata fileref="residue-pack.png" format="PNG"/>
00090 </imageobject>
00091 <textobject>
00092  <phrase>[illustration of residue vector format]</phrase>
00093 </textobject>
00094 </mediaobject>
00095 
00096 </section>
00097 
00098 <section><title>residue 0</title>
00099 
00100 <para>
00101 Residue 0 and 1 differ only in the way the values within a residue
00102 partition are interleaved during partition encoding (visually treated
00103 as a black box--or cyan box or brown box--in the above figure).</para>
00104 
00105 <para>
00106 Residue encoding 0 interleaves VQ encoding according to the
00107 dimension of the codebook used to encode a partition in a specific
00108 pass.  The dimension of the codebook need not be the same in multiple
00109 passes, however the partition size must be an even multiple of the
00110 codebook dimension.</para>
00111 
00112 <para>
00113 As an example, assume a partition vector of size eight, to be encoded
00114 by residue 0 using codebook sizes of 8, 4, 2 and 1:</para>
00115 
00116 <programlisting>
00117 
00118             original residue vector: [ 0 1 2 3 4 5 6 7 ]
00119 
00120 codebook dimensions = 8  encoded as: [ 0 1 2 3 4 5 6 7 ]
00121 
00122 codebook dimensions = 4  encoded as: [ 0 2 4 6 ], [ 1 3 5 7 ]
00123 
00124 codebook dimensions = 2  encoded as: [ 0 4 ], [ 1 5 ], [ 2 6 ], [ 3 7 ]
00125 
00126 codebook dimensions = 1  encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ]
00127 
00128 </programlisting>
00129 
00130 <para>
00131 It is worth mentioning at this point that no configurable value in the
00132 residue coding setup is restricted to a power of two.</para>
00133 
00134 </section>
00135 
00136 <section><title>residue 1</title>
00137 
00138 <para>
00139 Residue 1 does not interleave VQ encoding.  It represents partition
00140 vector scalars in order.  As with residue 0, however, partition length
00141 must be an integer multiple of the codebook dimension, although
00142 dimension may vary from pass to pass.</para>
00143 
00144 <para>
00145 As an example, assume a partition vector of size eight, to be encoded
00146 by residue 0 using codebook sizes of 8, 4, 2 and 1:</para>
00147 
00148 <programlisting>
00149 
00150             original residue vector: [ 0 1 2 3 4 5 6 7 ]
00151 
00152 codebook dimensions = 8  encoded as: [ 0 1 2 3 4 5 6 7 ]
00153 
00154 codebook dimensions = 4  encoded as: [ 0 1 2 3 ], [ 4 5 6 7 ]
00155 
00156 codebook dimensions = 2  encoded as: [ 0 1 ], [ 2 3 ], [ 4 5 ], [ 6 7 ]
00157 
00158 codebook dimensions = 1  encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ]
00159 
00160 </programlisting>
00161 
00162 </section>
00163 
00164 <section><title>residue 2</title>
00165 
00166 <para>
00167 Residue type two can be thought of as a variant of residue type 1.
00168 Rather than encoding multiple passed-in vectors as in residue type 1,
00169 the <emphasis>ch</emphasis> passed in vectors of length <emphasis>n</emphasis> are first
00170 interleaved and flattened into a single vector of length
00171 <emphasis>ch</emphasis>*<emphasis>n</emphasis>.  Encoding then proceeds as in type 1. Decoding is
00172 as in type 1 with decode interleave reversed. If operating on a single
00173 vector to begin with, residue type 1 and type 2 are equivalent.</para>
00174 
00175 <mediaobject>
00176 <imageobject>
00177  <imagedata fileref="residue2.png" format="PNG"/>
00178 </imageobject>
00179 <textobject>
00180  <phrase>[illustration of residue type 2]</phrase>
00181 </textobject>
00182 </mediaobject>
00183 
00184 </section>
00185 
00186 <section>
00187 <title>Residue decode</title>
00188 
00189 <section><title>header decode</title>
00190 
00191 <para>
00192 Header decode for all three residue types is identical.</para>
00193 <programlisting>
00194   1) [residue_begin] = read 24 bits as unsigned integer
00195   2) [residue_end] = read 24 bits as unsigned integer
00196   3) [residue_partition_size] = read 24 bits as unsigned integer and add one
00197   4) [residue_classifications] = read 6 bits as unsigned integer and add one
00198   5) [residue_classbook] = read 8 bits as unsigned integer
00199 </programlisting>
00200 
00201 <para>
00202 <varname>[residue_begin]</varname> and <varname>[residue_end]</varname> select the specific
00203 sub-portion of each vector that is actually coded; it implements akin
00204 to a bandpass where, for coding purposes, the vector effectively
00205 begins at element <varname>[residue_begin]</varname> and ends at
00206 <varname>[residue_end]</varname>.  Preceding and following values in the unpacked
00207 vectors are zeroed.  Note that for residue type 2, these values as
00208 well as <varname>[residue_partition_size]</varname>apply to the interleaved
00209 vector, not the individual vectors before interleave.
00210 <varname>[residue_partition_size]</varname> is as explained above,
00211 <varname>[residue_classifications]</varname> is the number of possible
00212 classification to which a partition can belong and
00213 <varname>[residue_classbook]</varname> is the codebook number used to code
00214 classification codewords.  The number of dimensions in book
00215 <varname>[residue_classbook]</varname> determines how many classification values
00216 are grouped into a single classification codeword.</para>
00217 
00218 <para>
00219 Next we read a bitmap pattern that specifies which partition classes
00220 code values in which passes.</para>
00221 
00222 <programlisting>
00223   1) iterate [i] over the range 0 ... [residue_classifications]-1 {
00224   
00225        2) [high_bits] = 0
00226        3) [low_bits] = read 3 bits as unsigned integer
00227        4) [bitflag] = read one bit as boolean
00228        5) if ( [bitflag] is set ) then [high_bits] = read five bits as unsigned integer
00229        6) vector [residue_cascade] element [i] = [high_bits] * 8 + [low_bits]
00230      }
00231   7) done
00232 </programlisting>
00233 
00234 <para>
00235 Finally, we read in a list of book numbers, each corresponding to
00236 specific bit set in the cascade bitmap.  We loop over the possible
00237 codebook classifications and the maximum possible number of encoding
00238 stages (8 in Vorbis I, as constrained by the elements of the cascade
00239 bitmap being eight bits):</para>
00240 
00241 <programlisting>
00242   1) iterate [i] over the range 0 ... [residue_classifications]-1 {
00243   
00244        2) iterate [j] over the range 0 ... 7 {
00245   
00246             3) if ( vector [residue_cascade] element [i] bit [j] is set ) {
00247 
00248                  4) array [residue_books] element [i][j] = read 8 bits as unsigned integer
00249 
00250                } else {
00251 
00252                  5) array [residue_books] element [i][j] = unused
00253 
00254                }
00255           }
00256       }
00257 
00258   6) done
00259 </programlisting>
00260 
00261 <para>
00262 An end-of-packet condition at any point in header decode renders the
00263 stream undecodable.  In addition, any codebook number greater than the
00264 maximum numbered codebook set up in this stream also renders the
00265 stream undecodable.</para>
00266 
00267 </section>
00268 
00269 <section><title>packet decode</title>
00270 
00271 <para>
00272 Format 0 and 1 packet decode is identical except for specific
00273 partition interleave.  Format 2 packet decode can be built out of the
00274 format 1 decode process.  Thus we describe first the decode
00275 infrastructure identical to all three formats.</para>
00276 
00277 <para>
00278 In addition to configuration information, the residue decode process
00279 is passed the number of vectors in the submap bundle and a vector of
00280 flags indicating if any of the vectors are not to be decoded.  If the
00281 passed in number of vectors is 3 and vector number 1 is marked 'do not
00282 decode', decode skips vector 1 during the decode loop.  However, even
00283 'do not decode' vectors are allocated and zeroed.</para>
00284 
00285 <para>
00286 The following convenience values are conceptually useful to clarifying
00287 the decode process:</para>
00288 
00289 <programlisting>
00290   1) [classwords_per_codeword] = [codebook_dimensions] value of codebook [residue_classbook]
00291   2) [n_to_read] = [residue_end] - [residue_begin]
00292   3) [partitions_to_read] = [n_to_read] / [residue_partition_size]
00293 </programlisting>
00294 
00295 <para>
00296 Packet decode proceeds as follows, matching the description offered earlier in the document.  We assume that the number of vectors being encoded, <varname>[ch]</varname> is provided by the higher level decoding process.</para>
00297 <programlisting>
00298   1) allocate and zero all vectors that will be returned.
00299   2) iterate [pass] over the range 0 ... 7 {
00300 
00301        3) [partition_count] = 0
00302 
00303        4) while [partition_count] is less than [partitions_to_read]
00304 
00305             5) if ([pass] is zero) {
00306      
00307                  6) iterate [j] over the range 0 .. [ch]-1 {
00308 
00309                       7) if vector [j] is not marked 'do not decode' {
00310 
00311                            8) [temp] = read from packet using codebook [residue_classbook] in scalar context
00312                            9) iterate [i] descending over the range [classwords_per_codeword]-1 ... 0 {
00313 
00314                                10) array [classifications] element [j],([i]+[partition_count]) =
00315                                    [temp] integer modulo [residue_classifications]
00316                                11) [temp] = [temp] / [residue_classifications] using integer division
00317 
00318                               }
00319       
00320                          }
00321             
00322                     }
00323           
00324                }
00325 
00326            12) iterate [i] over the range 0 .. ([classwords_per_codeword] - 1) while [partition_count] 
00327                is also less than [partitions_to_read] {
00328 
00329                  13) iterate [j] over the range 0 .. [ch]-1 {
00330    
00331                       14) if vector [j] is not marked 'do not decode' {
00332    
00333                            15) [vqclass] = array [classifications] element [j],[partition_count]
00334                            16) [vqbook] = array [residue_books] element [vqclass],[pass]
00335                            17) if ([vqbook] is not 'unused') {
00336    
00337                                 18) decode partition into output vector number [j], starting at scalar 
00338                                     offset [residue_begin]+[partition_count]*[residue_partition_size] using 
00339                                     codebook number [vqbook] in VQ context
00340                           }
00341                      }
00342    
00343                  19) increment [partition_count] by one
00344 
00345                }
00346           }
00347      }
00348  
00349  20) done
00350 
00351 </programlisting>
00352 
00353 <para>
00354 An end-of-packet condition during packet decode is to be considered a
00355 nominal occurrence.  Decode returns the result of vector decode up to
00356 that point.</para>
00357 
00358 </section>
00359 
00360 <section><title>format 0 specifics</title>
00361 
00362 <para>
00363 Format zero decodes partitions exactly as described earlier in the
00364 'Residue Format: residue 0' section.  The following pseudocode
00365 presents the same algorithm. Assume:</para>
00366 
00367 <para>
00368 <itemizedlist>
00369 <listitem><simpara> <varname>[n]</varname> is the value in <varname>[residue_partition_size]</varname></simpara></listitem>
00370 <listitem><simpara><varname>[v]</varname> is the residue vector</simpara></listitem>
00371 <listitem><simpara><varname>[offset]</varname> is the beginning read offset in [v]</simpara></listitem>
00372 </itemizedlist>
00373 </para>
00374 
00375 <programlisting>
00376  1) [step] = [n] / [codebook_dimensions]
00377  2) iterate [i] over the range 0 ... [step]-1 {
00378 
00379       3) vector [entry_temp] = read vector from packet using current codebook in VQ context
00380       4) iterate [j] over the range 0 ... [codebook_dimensions]-1 {
00381 
00382            5) vector [v] element ([offset]+[i]+[j]*[step]) =
00383                 vector [v] element ([offset]+[i]+[j]*[step]) +
00384                 vector [entry_temp] element [j]
00385 
00386          }
00387 
00388     }
00389 
00390   6) done
00391 
00392 </programlisting>
00393 
00394 </section>
00395 
00396 <section><title>format 1 specifics</title>
00397 
00398 <para>
00399 Format 1 decodes partitions exactly as described earlier in the
00400 'Residue Format: residue 1' section.  The following pseudocode
00401 presents the same algorithm. Assume:</para>
00402 
00403 <para>
00404 <itemizedlist>
00405 <listitem><simpara> <varname>[n]</varname> is the value in
00406 <varname>[residue_partition_size]</varname></simpara></listitem>
00407 <listitem><simpara><varname>[v]</varname> is the residue vector</simpara></listitem>
00408 <listitem><simpara><varname>[offset]</varname> is the beginning read offset in [v]</simpara></listitem>
00409 </itemizedlist>
00410 </para>
00411 
00412 <programlisting>
00413  1) [i] = 0
00414  2) vector [entry_temp] = read vector from packet using current codebook in VQ context
00415  3) iterate [j] over the range 0 ... [codebook_dimensions]-1 {
00416 
00417       4) vector [v] element ([offset]+[i]) =
00418           vector [v] element ([offset]+[i]) +
00419           vector [entry_temp] element [j]
00420       5) increment [i]
00421 
00422     }
00423  
00424   6) if ( [i] is less than [n] ) continue at step 2
00425   7) done
00426 </programlisting>
00427 
00428 </section>
00429 
00430 <section><title>format 2 specifics</title>
00431  
00432 <para>
00433 Format 2 is reducible to format 1.  It may be implemented as an additional step prior to and an additional post-decode step after a normal format 1 decode.
00434 </para>
00435 
00436 <para>
00437 Format 2 handles 'do not decode' vectors differently than residue 0 or
00438 1; if all vectors are marked 'do not decode', no decode occurrs.
00439 However, if at least one vector is to be decoded, all the vectors are
00440 decoded.  We then request normal format 1 to decode a single vector
00441 representing all output channels, rather than a vector for each
00442 channel.  After decode, deinterleave the vector into independent vectors, one for each output channel.  That is:</para>
00443 
00444 <orderedlist>
00445  <listitem><simpara>If all vectors 0 through <emphasis>ch</emphasis>-1 are marked 'do not decode', allocate and clear a single vector <varname>[v]</varname>of length <emphasis>ch*n</emphasis> and skip step 2 below; proceed directly to the post-decode step.</simpara></listitem>
00446  <listitem><simpara>Rather than performing format 1 decode to produce <emphasis>ch</emphasis> vectors of length <emphasis>n</emphasis> each, call format 1 decode to produce a single vector <varname>[v]</varname> of length <emphasis>ch*n</emphasis>. </simpara></listitem>
00447  <listitem><para>Post decode: Deinterleave the single vector <varname>[v]</varname> returned by format 1 decode as described above into <emphasis>ch</emphasis> independent vectors, one for each outputchannel, according to:
00448   <programlisting>
00449   1) iterate [i] over the range 0 ... [n]-1 {
00450 
00451        2) iterate [j] over the range 0 ... [ch]-1 {
00452 
00453             3) output vector number [j] element [i] = vector [v] element ([i] * [ch] + [j])
00454 
00455           }
00456      }
00457 
00458   4) done
00459   </programlisting>
00460  </para></listitem>
00461 </orderedlist>
00462 
00463 </section>
00464 
00465 </section>
00466 
00467 </section>
00468 

Generated by  doxygen 1.6.2