00001 <?xml version="1.0" standalone="no"?> 00002 <!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN" 00003 "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [ 00004 00005 ]> 00006 00007 <section id="vorbis-spec-residue"> 00008 <sectioninfo> 00009 <releaseinfo> 00010 $Id: 08-residue.xml 10466 2005-11-28 00:34:44Z giles $ 00011 </releaseinfo> 00012 </sectioninfo> 00013 <title>Residue setup and decode</title> 00014 00015 00016 <section> 00017 <title>Overview</title> 00018 00019 <para> 00020 A residue vector represents the fine detail of the audio spectrum of 00021 one channel in an audio frame after the encoder subtracts the floor 00022 curve and performs any channel coupling. A residue vector may 00023 represent spectral lines, spectral magnitude, spectral phase or 00024 hybrids as mixed by channel coupling. The exact semantic content of 00025 the vector does not matter to the residue abstraction.</para> 00026 00027 <para> 00028 Whatever the exact qualities, the Vorbis residue abstraction codes the 00029 residue vectors into the bitstream packet, and then reconstructs the 00030 vectors during decode. Vorbis makes use of three different encoding 00031 variants (numbered 0, 1 and 2) of the same basic vector encoding 00032 abstraction.</para> 00033 00034 </section> 00035 00036 <section> 00037 <title>Residue format</title> 00038 00039 <para> 00040 Residue format partitions each vector in the vector bundle into chunks, 00041 classifies each chunk, encodes the chunk classifications and finally 00042 encodes the chunks themselves using the the specific VQ arrangement 00043 defined for each selected classification. 00044 The exact interleaving and partitioning vary by residue encoding number, 00045 however the high-level process used to classify and encode the residue 00046 vector is the same in all three variants.</para> 00047 00048 <para> 00049 A set of coded residue vectors are all of the same length. High level 00050 coding structure, ignoring for the moment exactly how a partition is 00051 encoded and simply trusting that it is, is as follows:</para> 00052 00053 <para> 00054 <itemizedlist> 00055 <listitem><para>Each vector is partitioned into multiple equal sized chunks 00056 according to configuration specified. If we have a vector size of 00057 <emphasis>n</emphasis>, a partition size <emphasis>residue_partition_size</emphasis>, and a total 00058 of <emphasis>ch</emphasis> residue vectors, the total number of partitioned chunks 00059 coded is <emphasis>n</emphasis>/<emphasis>residue_partition_size</emphasis>*<emphasis>ch</emphasis>. It is 00060 important to note that the integer division truncates. In the below 00061 example, we assume an example <emphasis>residue_partition_size</emphasis> of 8.</para></listitem> 00062 00063 <listitem><para>Each partition in each vector has a classification number that 00064 specifies which of multiple configured VQ codebook setups are used to 00065 decode that partition. The classification numbers of each partition 00066 can be thought of as forming a vector in their own right, as in the 00067 illustration below. Just as the residue vectors are coded in grouped 00068 partitions to increase encoding efficiency, the classification vector 00069 is also partitioned into chunks. The integer elements of each scalar 00070 in a classification chunk are built into a single scalar that 00071 represents the classification numbers in that chunk. In the below 00072 example, the classification codeword encodes two classification 00073 numbers.</para></listitem> 00074 00075 <listitem><para>The values in a residue vector may be encoded monolithically in a 00076 single pass through the residue vector, but more often efficient 00077 codebook design dictates that each vector is encoded as the additive 00078 sum of several passes through the residue vector using more than one 00079 VQ codebook. Thus, each residue value potentially accumulates values 00080 from multiple decode passes. The classification value associated with 00081 a partition is the same in each pass, thus the classification codeword 00082 is coded only in the first pass.</para></listitem> 00083 00084 </itemizedlist> 00085 </para> 00086 00087 <mediaobject> 00088 <imageobject> 00089 <imagedata fileref="residue-pack.png" format="PNG"/> 00090 </imageobject> 00091 <textobject> 00092 <phrase>[illustration of residue vector format]</phrase> 00093 </textobject> 00094 </mediaobject> 00095 00096 </section> 00097 00098 <section><title>residue 0</title> 00099 00100 <para> 00101 Residue 0 and 1 differ only in the way the values within a residue 00102 partition are interleaved during partition encoding (visually treated 00103 as a black box--or cyan box or brown box--in the above figure).</para> 00104 00105 <para> 00106 Residue encoding 0 interleaves VQ encoding according to the 00107 dimension of the codebook used to encode a partition in a specific 00108 pass. The dimension of the codebook need not be the same in multiple 00109 passes, however the partition size must be an even multiple of the 00110 codebook dimension.</para> 00111 00112 <para> 00113 As an example, assume a partition vector of size eight, to be encoded 00114 by residue 0 using codebook sizes of 8, 4, 2 and 1:</para> 00115 00116 <programlisting> 00117 00118 original residue vector: [ 0 1 2 3 4 5 6 7 ] 00119 00120 codebook dimensions = 8 encoded as: [ 0 1 2 3 4 5 6 7 ] 00121 00122 codebook dimensions = 4 encoded as: [ 0 2 4 6 ], [ 1 3 5 7 ] 00123 00124 codebook dimensions = 2 encoded as: [ 0 4 ], [ 1 5 ], [ 2 6 ], [ 3 7 ] 00125 00126 codebook dimensions = 1 encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ] 00127 00128 </programlisting> 00129 00130 <para> 00131 It is worth mentioning at this point that no configurable value in the 00132 residue coding setup is restricted to a power of two.</para> 00133 00134 </section> 00135 00136 <section><title>residue 1</title> 00137 00138 <para> 00139 Residue 1 does not interleave VQ encoding. It represents partition 00140 vector scalars in order. As with residue 0, however, partition length 00141 must be an integer multiple of the codebook dimension, although 00142 dimension may vary from pass to pass.</para> 00143 00144 <para> 00145 As an example, assume a partition vector of size eight, to be encoded 00146 by residue 0 using codebook sizes of 8, 4, 2 and 1:</para> 00147 00148 <programlisting> 00149 00150 original residue vector: [ 0 1 2 3 4 5 6 7 ] 00151 00152 codebook dimensions = 8 encoded as: [ 0 1 2 3 4 5 6 7 ] 00153 00154 codebook dimensions = 4 encoded as: [ 0 1 2 3 ], [ 4 5 6 7 ] 00155 00156 codebook dimensions = 2 encoded as: [ 0 1 ], [ 2 3 ], [ 4 5 ], [ 6 7 ] 00157 00158 codebook dimensions = 1 encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ] 00159 00160 </programlisting> 00161 00162 </section> 00163 00164 <section><title>residue 2</title> 00165 00166 <para> 00167 Residue type two can be thought of as a variant of residue type 1. 00168 Rather than encoding multiple passed-in vectors as in residue type 1, 00169 the <emphasis>ch</emphasis> passed in vectors of length <emphasis>n</emphasis> are first 00170 interleaved and flattened into a single vector of length 00171 <emphasis>ch</emphasis>*<emphasis>n</emphasis>. Encoding then proceeds as in type 1. Decoding is 00172 as in type 1 with decode interleave reversed. If operating on a single 00173 vector to begin with, residue type 1 and type 2 are equivalent.</para> 00174 00175 <mediaobject> 00176 <imageobject> 00177 <imagedata fileref="residue2.png" format="PNG"/> 00178 </imageobject> 00179 <textobject> 00180 <phrase>[illustration of residue type 2]</phrase> 00181 </textobject> 00182 </mediaobject> 00183 00184 </section> 00185 00186 <section> 00187 <title>Residue decode</title> 00188 00189 <section><title>header decode</title> 00190 00191 <para> 00192 Header decode for all three residue types is identical.</para> 00193 <programlisting> 00194 1) [residue_begin] = read 24 bits as unsigned integer 00195 2) [residue_end] = read 24 bits as unsigned integer 00196 3) [residue_partition_size] = read 24 bits as unsigned integer and add one 00197 4) [residue_classifications] = read 6 bits as unsigned integer and add one 00198 5) [residue_classbook] = read 8 bits as unsigned integer 00199 </programlisting> 00200 00201 <para> 00202 <varname>[residue_begin]</varname> and <varname>[residue_end]</varname> select the specific 00203 sub-portion of each vector that is actually coded; it implements akin 00204 to a bandpass where, for coding purposes, the vector effectively 00205 begins at element <varname>[residue_begin]</varname> and ends at 00206 <varname>[residue_end]</varname>. Preceding and following values in the unpacked 00207 vectors are zeroed. Note that for residue type 2, these values as 00208 well as <varname>[residue_partition_size]</varname>apply to the interleaved 00209 vector, not the individual vectors before interleave. 00210 <varname>[residue_partition_size]</varname> is as explained above, 00211 <varname>[residue_classifications]</varname> is the number of possible 00212 classification to which a partition can belong and 00213 <varname>[residue_classbook]</varname> is the codebook number used to code 00214 classification codewords. The number of dimensions in book 00215 <varname>[residue_classbook]</varname> determines how many classification values 00216 are grouped into a single classification codeword.</para> 00217 00218 <para> 00219 Next we read a bitmap pattern that specifies which partition classes 00220 code values in which passes.</para> 00221 00222 <programlisting> 00223 1) iterate [i] over the range 0 ... [residue_classifications]-1 { 00224 00225 2) [high_bits] = 0 00226 3) [low_bits] = read 3 bits as unsigned integer 00227 4) [bitflag] = read one bit as boolean 00228 5) if ( [bitflag] is set ) then [high_bits] = read five bits as unsigned integer 00229 6) vector [residue_cascade] element [i] = [high_bits] * 8 + [low_bits] 00230 } 00231 7) done 00232 </programlisting> 00233 00234 <para> 00235 Finally, we read in a list of book numbers, each corresponding to 00236 specific bit set in the cascade bitmap. We loop over the possible 00237 codebook classifications and the maximum possible number of encoding 00238 stages (8 in Vorbis I, as constrained by the elements of the cascade 00239 bitmap being eight bits):</para> 00240 00241 <programlisting> 00242 1) iterate [i] over the range 0 ... [residue_classifications]-1 { 00243 00244 2) iterate [j] over the range 0 ... 7 { 00245 00246 3) if ( vector [residue_cascade] element [i] bit [j] is set ) { 00247 00248 4) array [residue_books] element [i][j] = read 8 bits as unsigned integer 00249 00250 } else { 00251 00252 5) array [residue_books] element [i][j] = unused 00253 00254 } 00255 } 00256 } 00257 00258 6) done 00259 </programlisting> 00260 00261 <para> 00262 An end-of-packet condition at any point in header decode renders the 00263 stream undecodable. In addition, any codebook number greater than the 00264 maximum numbered codebook set up in this stream also renders the 00265 stream undecodable.</para> 00266 00267 </section> 00268 00269 <section><title>packet decode</title> 00270 00271 <para> 00272 Format 0 and 1 packet decode is identical except for specific 00273 partition interleave. Format 2 packet decode can be built out of the 00274 format 1 decode process. Thus we describe first the decode 00275 infrastructure identical to all three formats.</para> 00276 00277 <para> 00278 In addition to configuration information, the residue decode process 00279 is passed the number of vectors in the submap bundle and a vector of 00280 flags indicating if any of the vectors are not to be decoded. If the 00281 passed in number of vectors is 3 and vector number 1 is marked 'do not 00282 decode', decode skips vector 1 during the decode loop. However, even 00283 'do not decode' vectors are allocated and zeroed.</para> 00284 00285 <para> 00286 The following convenience values are conceptually useful to clarifying 00287 the decode process:</para> 00288 00289 <programlisting> 00290 1) [classwords_per_codeword] = [codebook_dimensions] value of codebook [residue_classbook] 00291 2) [n_to_read] = [residue_end] - [residue_begin] 00292 3) [partitions_to_read] = [n_to_read] / [residue_partition_size] 00293 </programlisting> 00294 00295 <para> 00296 Packet decode proceeds as follows, matching the description offered earlier in the document. We assume that the number of vectors being encoded, <varname>[ch]</varname> is provided by the higher level decoding process.</para> 00297 <programlisting> 00298 1) allocate and zero all vectors that will be returned. 00299 2) iterate [pass] over the range 0 ... 7 { 00300 00301 3) [partition_count] = 0 00302 00303 4) while [partition_count] is less than [partitions_to_read] 00304 00305 5) if ([pass] is zero) { 00306 00307 6) iterate [j] over the range 0 .. [ch]-1 { 00308 00309 7) if vector [j] is not marked 'do not decode' { 00310 00311 8) [temp] = read from packet using codebook [residue_classbook] in scalar context 00312 9) iterate [i] descending over the range [classwords_per_codeword]-1 ... 0 { 00313 00314 10) array [classifications] element [j],([i]+[partition_count]) = 00315 [temp] integer modulo [residue_classifications] 00316 11) [temp] = [temp] / [residue_classifications] using integer division 00317 00318 } 00319 00320 } 00321 00322 } 00323 00324 } 00325 00326 12) iterate [i] over the range 0 .. ([classwords_per_codeword] - 1) while [partition_count] 00327 is also less than [partitions_to_read] { 00328 00329 13) iterate [j] over the range 0 .. [ch]-1 { 00330 00331 14) if vector [j] is not marked 'do not decode' { 00332 00333 15) [vqclass] = array [classifications] element [j],[partition_count] 00334 16) [vqbook] = array [residue_books] element [vqclass],[pass] 00335 17) if ([vqbook] is not 'unused') { 00336 00337 18) decode partition into output vector number [j], starting at scalar 00338 offset [residue_begin]+[partition_count]*[residue_partition_size] using 00339 codebook number [vqbook] in VQ context 00340 } 00341 } 00342 00343 19) increment [partition_count] by one 00344 00345 } 00346 } 00347 } 00348 00349 20) done 00350 00351 </programlisting> 00352 00353 <para> 00354 An end-of-packet condition during packet decode is to be considered a 00355 nominal occurrence. Decode returns the result of vector decode up to 00356 that point.</para> 00357 00358 </section> 00359 00360 <section><title>format 0 specifics</title> 00361 00362 <para> 00363 Format zero decodes partitions exactly as described earlier in the 00364 'Residue Format: residue 0' section. The following pseudocode 00365 presents the same algorithm. Assume:</para> 00366 00367 <para> 00368 <itemizedlist> 00369 <listitem><simpara> <varname>[n]</varname> is the value in <varname>[residue_partition_size]</varname></simpara></listitem> 00370 <listitem><simpara><varname>[v]</varname> is the residue vector</simpara></listitem> 00371 <listitem><simpara><varname>[offset]</varname> is the beginning read offset in [v]</simpara></listitem> 00372 </itemizedlist> 00373 </para> 00374 00375 <programlisting> 00376 1) [step] = [n] / [codebook_dimensions] 00377 2) iterate [i] over the range 0 ... [step]-1 { 00378 00379 3) vector [entry_temp] = read vector from packet using current codebook in VQ context 00380 4) iterate [j] over the range 0 ... [codebook_dimensions]-1 { 00381 00382 5) vector [v] element ([offset]+[i]+[j]*[step]) = 00383 vector [v] element ([offset]+[i]+[j]*[step]) + 00384 vector [entry_temp] element [j] 00385 00386 } 00387 00388 } 00389 00390 6) done 00391 00392 </programlisting> 00393 00394 </section> 00395 00396 <section><title>format 1 specifics</title> 00397 00398 <para> 00399 Format 1 decodes partitions exactly as described earlier in the 00400 'Residue Format: residue 1' section. The following pseudocode 00401 presents the same algorithm. Assume:</para> 00402 00403 <para> 00404 <itemizedlist> 00405 <listitem><simpara> <varname>[n]</varname> is the value in 00406 <varname>[residue_partition_size]</varname></simpara></listitem> 00407 <listitem><simpara><varname>[v]</varname> is the residue vector</simpara></listitem> 00408 <listitem><simpara><varname>[offset]</varname> is the beginning read offset in [v]</simpara></listitem> 00409 </itemizedlist> 00410 </para> 00411 00412 <programlisting> 00413 1) [i] = 0 00414 2) vector [entry_temp] = read vector from packet using current codebook in VQ context 00415 3) iterate [j] over the range 0 ... [codebook_dimensions]-1 { 00416 00417 4) vector [v] element ([offset]+[i]) = 00418 vector [v] element ([offset]+[i]) + 00419 vector [entry_temp] element [j] 00420 5) increment [i] 00421 00422 } 00423 00424 6) if ( [i] is less than [n] ) continue at step 2 00425 7) done 00426 </programlisting> 00427 00428 </section> 00429 00430 <section><title>format 2 specifics</title> 00431 00432 <para> 00433 Format 2 is reducible to format 1. It may be implemented as an additional step prior to and an additional post-decode step after a normal format 1 decode. 00434 </para> 00435 00436 <para> 00437 Format 2 handles 'do not decode' vectors differently than residue 0 or 00438 1; if all vectors are marked 'do not decode', no decode occurrs. 00439 However, if at least one vector is to be decoded, all the vectors are 00440 decoded. We then request normal format 1 to decode a single vector 00441 representing all output channels, rather than a vector for each 00442 channel. After decode, deinterleave the vector into independent vectors, one for each output channel. That is:</para> 00443 00444 <orderedlist> 00445 <listitem><simpara>If all vectors 0 through <emphasis>ch</emphasis>-1 are marked 'do not decode', allocate and clear a single vector <varname>[v]</varname>of length <emphasis>ch*n</emphasis> and skip step 2 below; proceed directly to the post-decode step.</simpara></listitem> 00446 <listitem><simpara>Rather than performing format 1 decode to produce <emphasis>ch</emphasis> vectors of length <emphasis>n</emphasis> each, call format 1 decode to produce a single vector <varname>[v]</varname> of length <emphasis>ch*n</emphasis>. </simpara></listitem> 00447 <listitem><para>Post decode: Deinterleave the single vector <varname>[v]</varname> returned by format 1 decode as described above into <emphasis>ch</emphasis> independent vectors, one for each outputchannel, according to: 00448 <programlisting> 00449 1) iterate [i] over the range 0 ... [n]-1 { 00450 00451 2) iterate [j] over the range 0 ... [ch]-1 { 00452 00453 3) output vector number [j] element [i] = vector [v] element ([i] * [ch] + [j]) 00454 00455 } 00456 } 00457 00458 4) done 00459 </programlisting> 00460 </para></listitem> 00461 </orderedlist> 00462 00463 </section> 00464 00465 </section> 00466 00467 </section> 00468