00001 <?xml version="1.0" standalone="no"?>
00002 <!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
00003 "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [
00004
00005 ]>
00006
00007 <section id="vorbis-spec-residue">
00008 <sectioninfo>
00009 <releaseinfo>
00010 $Id: 08-residue.xml 10466 2005-11-28 00:34:44Z giles $
00011 </releaseinfo>
00012 </sectioninfo>
00013 <title>Residue setup and decode</title>
00014
00015
00016 <section>
00017 <title>Overview</title>
00018
00019 <para>
00020 A residue vector represents the fine detail of the audio spectrum of
00021 one channel in an audio frame after the encoder subtracts the floor
00022 curve and performs any channel coupling. A residue vector may
00023 represent spectral lines, spectral magnitude, spectral phase or
00024 hybrids as mixed by channel coupling. The exact semantic content of
00025 the vector does not matter to the residue abstraction.</para>
00026
00027 <para>
00028 Whatever the exact qualities, the Vorbis residue abstraction codes the
00029 residue vectors into the bitstream packet, and then reconstructs the
00030 vectors during decode. Vorbis makes use of three different encoding
00031 variants (numbered 0, 1 and 2) of the same basic vector encoding
00032 abstraction.</para>
00033
00034 </section>
00035
00036 <section>
00037 <title>Residue format</title>
00038
00039 <para>
00040 Residue format partitions each vector in the vector bundle into chunks,
00041 classifies each chunk, encodes the chunk classifications and finally
00042 encodes the chunks themselves using the the specific VQ arrangement
00043 defined for each selected classification.
00044 The exact interleaving and partitioning vary by residue encoding number,
00045 however the high-level process used to classify and encode the residue
00046 vector is the same in all three variants.</para>
00047
00048 <para>
00049 A set of coded residue vectors are all of the same length. High level
00050 coding structure, ignoring for the moment exactly how a partition is
00051 encoded and simply trusting that it is, is as follows:</para>
00052
00053 <para>
00054 <itemizedlist>
00055 <listitem><para>Each vector is partitioned into multiple equal sized chunks
00056 according to configuration specified. If we have a vector size of
00057 <emphasis>n</emphasis>, a partition size <emphasis>residue_partition_size</emphasis>, and a total
00058 of <emphasis>ch</emphasis> residue vectors, the total number of partitioned chunks
00059 coded is <emphasis>n</emphasis>/<emphasis>residue_partition_size</emphasis>*<emphasis>ch</emphasis>. It is
00060 important to note that the integer division truncates. In the below
00061 example, we assume an example <emphasis>residue_partition_size</emphasis> of 8.</para></listitem>
00062
00063 <listitem><para>Each partition in each vector has a classification number that
00064 specifies which of multiple configured VQ codebook setups are used to
00065 decode that partition. The classification numbers of each partition
00066 can be thought of as forming a vector in their own right, as in the
00067 illustration below. Just as the residue vectors are coded in grouped
00068 partitions to increase encoding efficiency, the classification vector
00069 is also partitioned into chunks. The integer elements of each scalar
00070 in a classification chunk are built into a single scalar that
00071 represents the classification numbers in that chunk. In the below
00072 example, the classification codeword encodes two classification
00073 numbers.</para></listitem>
00074
00075 <listitem><para>The values in a residue vector may be encoded monolithically in a
00076 single pass through the residue vector, but more often efficient
00077 codebook design dictates that each vector is encoded as the additive
00078 sum of several passes through the residue vector using more than one
00079 VQ codebook. Thus, each residue value potentially accumulates values
00080 from multiple decode passes. The classification value associated with
00081 a partition is the same in each pass, thus the classification codeword
00082 is coded only in the first pass.</para></listitem>
00083
00084 </itemizedlist>
00085 </para>
00086
00087 <mediaobject>
00088 <imageobject>
00089 <imagedata fileref="residue-pack.png" format="PNG"/>
00090 </imageobject>
00091 <textobject>
00092 <phrase>[illustration of residue vector format]</phrase>
00093 </textobject>
00094 </mediaobject>
00095
00096 </section>
00097
00098 <section><title>residue 0</title>
00099
00100 <para>
00101 Residue 0 and 1 differ only in the way the values within a residue
00102 partition are interleaved during partition encoding (visually treated
00103 as a black box--or cyan box or brown box--in the above figure).</para>
00104
00105 <para>
00106 Residue encoding 0 interleaves VQ encoding according to the
00107 dimension of the codebook used to encode a partition in a specific
00108 pass. The dimension of the codebook need not be the same in multiple
00109 passes, however the partition size must be an even multiple of the
00110 codebook dimension.</para>
00111
00112 <para>
00113 As an example, assume a partition vector of size eight, to be encoded
00114 by residue 0 using codebook sizes of 8, 4, 2 and 1:</para>
00115
00116 <programlisting>
00117
00118 original residue vector: [ 0 1 2 3 4 5 6 7 ]
00119
00120 codebook dimensions = 8 encoded as: [ 0 1 2 3 4 5 6 7 ]
00121
00122 codebook dimensions = 4 encoded as: [ 0 2 4 6 ], [ 1 3 5 7 ]
00123
00124 codebook dimensions = 2 encoded as: [ 0 4 ], [ 1 5 ], [ 2 6 ], [ 3 7 ]
00125
00126 codebook dimensions = 1 encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ]
00127
00128 </programlisting>
00129
00130 <para>
00131 It is worth mentioning at this point that no configurable value in the
00132 residue coding setup is restricted to a power of two.</para>
00133
00134 </section>
00135
00136 <section><title>residue 1</title>
00137
00138 <para>
00139 Residue 1 does not interleave VQ encoding. It represents partition
00140 vector scalars in order. As with residue 0, however, partition length
00141 must be an integer multiple of the codebook dimension, although
00142 dimension may vary from pass to pass.</para>
00143
00144 <para>
00145 As an example, assume a partition vector of size eight, to be encoded
00146 by residue 0 using codebook sizes of 8, 4, 2 and 1:</para>
00147
00148 <programlisting>
00149
00150 original residue vector: [ 0 1 2 3 4 5 6 7 ]
00151
00152 codebook dimensions = 8 encoded as: [ 0 1 2 3 4 5 6 7 ]
00153
00154 codebook dimensions = 4 encoded as: [ 0 1 2 3 ], [ 4 5 6 7 ]
00155
00156 codebook dimensions = 2 encoded as: [ 0 1 ], [ 2 3 ], [ 4 5 ], [ 6 7 ]
00157
00158 codebook dimensions = 1 encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ]
00159
00160 </programlisting>
00161
00162 </section>
00163
00164 <section><title>residue 2</title>
00165
00166 <para>
00167 Residue type two can be thought of as a variant of residue type 1.
00168 Rather than encoding multiple passed-in vectors as in residue type 1,
00169 the <emphasis>ch</emphasis> passed in vectors of length <emphasis>n</emphasis> are first
00170 interleaved and flattened into a single vector of length
00171 <emphasis>ch</emphasis>*<emphasis>n</emphasis>. Encoding then proceeds as in type 1. Decoding is
00172 as in type 1 with decode interleave reversed. If operating on a single
00173 vector to begin with, residue type 1 and type 2 are equivalent.</para>
00174
00175 <mediaobject>
00176 <imageobject>
00177 <imagedata fileref="residue2.png" format="PNG"/>
00178 </imageobject>
00179 <textobject>
00180 <phrase>[illustration of residue type 2]</phrase>
00181 </textobject>
00182 </mediaobject>
00183
00184 </section>
00185
00186 <section>
00187 <title>Residue decode</title>
00188
00189 <section><title>header decode</title>
00190
00191 <para>
00192 Header decode for all three residue types is identical.</para>
00193 <programlisting>
00194 1) [residue_begin] = read 24 bits as unsigned integer
00195 2) [residue_end] = read 24 bits as unsigned integer
00196 3) [residue_partition_size] = read 24 bits as unsigned integer and add one
00197 4) [residue_classifications] = read 6 bits as unsigned integer and add one
00198 5) [residue_classbook] = read 8 bits as unsigned integer
00199 </programlisting>
00200
00201 <para>
00202 <varname>[residue_begin]</varname> and <varname>[residue_end]</varname> select the specific
00203 sub-portion of each vector that is actually coded; it implements akin
00204 to a bandpass where, for coding purposes, the vector effectively
00205 begins at element <varname>[residue_begin]</varname> and ends at
00206 <varname>[residue_end]</varname>. Preceding and following values in the unpacked
00207 vectors are zeroed. Note that for residue type 2, these values as
00208 well as <varname>[residue_partition_size]</varname>apply to the interleaved
00209 vector, not the individual vectors before interleave.
00210 <varname>[residue_partition_size]</varname> is as explained above,
00211 <varname>[residue_classifications]</varname> is the number of possible
00212 classification to which a partition can belong and
00213 <varname>[residue_classbook]</varname> is the codebook number used to code
00214 classification codewords. The number of dimensions in book
00215 <varname>[residue_classbook]</varname> determines how many classification values
00216 are grouped into a single classification codeword.</para>
00217
00218 <para>
00219 Next we read a bitmap pattern that specifies which partition classes
00220 code values in which passes.</para>
00221
00222 <programlisting>
00223 1) iterate [i] over the range 0 ... [residue_classifications]-1 {
00224
00225 2) [high_bits] = 0
00226 3) [low_bits] = read 3 bits as unsigned integer
00227 4) [bitflag] = read one bit as boolean
00228 5) if ( [bitflag] is set ) then [high_bits] = read five bits as unsigned integer
00229 6) vector [residue_cascade] element [i] = [high_bits] * 8 + [low_bits]
00230 }
00231 7) done
00232 </programlisting>
00233
00234 <para>
00235 Finally, we read in a list of book numbers, each corresponding to
00236 specific bit set in the cascade bitmap. We loop over the possible
00237 codebook classifications and the maximum possible number of encoding
00238 stages (8 in Vorbis I, as constrained by the elements of the cascade
00239 bitmap being eight bits):</para>
00240
00241 <programlisting>
00242 1) iterate [i] over the range 0 ... [residue_classifications]-1 {
00243
00244 2) iterate [j] over the range 0 ... 7 {
00245
00246 3) if ( vector [residue_cascade] element [i] bit [j] is set ) {
00247
00248 4) array [residue_books] element [i][j] = read 8 bits as unsigned integer
00249
00250 } else {
00251
00252 5) array [residue_books] element [i][j] = unused
00253
00254 }
00255 }
00256 }
00257
00258 6) done
00259 </programlisting>
00260
00261 <para>
00262 An end-of-packet condition at any point in header decode renders the
00263 stream undecodable. In addition, any codebook number greater than the
00264 maximum numbered codebook set up in this stream also renders the
00265 stream undecodable.</para>
00266
00267 </section>
00268
00269 <section><title>packet decode</title>
00270
00271 <para>
00272 Format 0 and 1 packet decode is identical except for specific
00273 partition interleave. Format 2 packet decode can be built out of the
00274 format 1 decode process. Thus we describe first the decode
00275 infrastructure identical to all three formats.</para>
00276
00277 <para>
00278 In addition to configuration information, the residue decode process
00279 is passed the number of vectors in the submap bundle and a vector of
00280 flags indicating if any of the vectors are not to be decoded. If the
00281 passed in number of vectors is 3 and vector number 1 is marked 'do not
00282 decode', decode skips vector 1 during the decode loop. However, even
00283 'do not decode' vectors are allocated and zeroed.</para>
00284
00285 <para>
00286 The following convenience values are conceptually useful to clarifying
00287 the decode process:</para>
00288
00289 <programlisting>
00290 1) [classwords_per_codeword] = [codebook_dimensions] value of codebook [residue_classbook]
00291 2) [n_to_read] = [residue_end] - [residue_begin]
00292 3) [partitions_to_read] = [n_to_read] / [residue_partition_size]
00293 </programlisting>
00294
00295 <para>
00296 Packet decode proceeds as follows, matching the description offered earlier in the document. We assume that the number of vectors being encoded, <varname>[ch]</varname> is provided by the higher level decoding process.</para>
00297 <programlisting>
00298 1) allocate and zero all vectors that will be returned.
00299 2) iterate [pass] over the range 0 ... 7 {
00300
00301 3) [partition_count] = 0
00302
00303 4) while [partition_count] is less than [partitions_to_read]
00304
00305 5) if ([pass] is zero) {
00306
00307 6) iterate [j] over the range 0 .. [ch]-1 {
00308
00309 7) if vector [j] is not marked 'do not decode' {
00310
00311 8) [temp] = read from packet using codebook [residue_classbook] in scalar context
00312 9) iterate [i] descending over the range [classwords_per_codeword]-1 ... 0 {
00313
00314 10) array [classifications] element [j],([i]+[partition_count]) =
00315 [temp] integer modulo [residue_classifications]
00316 11) [temp] = [temp] / [residue_classifications] using integer division
00317
00318 }
00319
00320 }
00321
00322 }
00323
00324 }
00325
00326 12) iterate [i] over the range 0 .. ([classwords_per_codeword] - 1) while [partition_count]
00327 is also less than [partitions_to_read] {
00328
00329 13) iterate [j] over the range 0 .. [ch]-1 {
00330
00331 14) if vector [j] is not marked 'do not decode' {
00332
00333 15) [vqclass] = array [classifications] element [j],[partition_count]
00334 16) [vqbook] = array [residue_books] element [vqclass],[pass]
00335 17) if ([vqbook] is not 'unused') {
00336
00337 18) decode partition into output vector number [j], starting at scalar
00338 offset [residue_begin]+[partition_count]*[residue_partition_size] using
00339 codebook number [vqbook] in VQ context
00340 }
00341 }
00342
00343 19) increment [partition_count] by one
00344
00345 }
00346 }
00347 }
00348
00349 20) done
00350
00351 </programlisting>
00352
00353 <para>
00354 An end-of-packet condition during packet decode is to be considered a
00355 nominal occurrence. Decode returns the result of vector decode up to
00356 that point.</para>
00357
00358 </section>
00359
00360 <section><title>format 0 specifics</title>
00361
00362 <para>
00363 Format zero decodes partitions exactly as described earlier in the
00364 'Residue Format: residue 0' section. The following pseudocode
00365 presents the same algorithm. Assume:</para>
00366
00367 <para>
00368 <itemizedlist>
00369 <listitem><simpara> <varname>[n]</varname> is the value in <varname>[residue_partition_size]</varname></simpara></listitem>
00370 <listitem><simpara><varname>[v]</varname> is the residue vector</simpara></listitem>
00371 <listitem><simpara><varname>[offset]</varname> is the beginning read offset in [v]</simpara></listitem>
00372 </itemizedlist>
00373 </para>
00374
00375 <programlisting>
00376 1) [step] = [n] / [codebook_dimensions]
00377 2) iterate [i] over the range 0 ... [step]-1 {
00378
00379 3) vector [entry_temp] = read vector from packet using current codebook in VQ context
00380 4) iterate [j] over the range 0 ... [codebook_dimensions]-1 {
00381
00382 5) vector [v] element ([offset]+[i]+[j]*[step]) =
00383 vector [v] element ([offset]+[i]+[j]*[step]) +
00384 vector [entry_temp] element [j]
00385
00386 }
00387
00388 }
00389
00390 6) done
00391
00392 </programlisting>
00393
00394 </section>
00395
00396 <section><title>format 1 specifics</title>
00397
00398 <para>
00399 Format 1 decodes partitions exactly as described earlier in the
00400 'Residue Format: residue 1' section. The following pseudocode
00401 presents the same algorithm. Assume:</para>
00402
00403 <para>
00404 <itemizedlist>
00405 <listitem><simpara> <varname>[n]</varname> is the value in
00406 <varname>[residue_partition_size]</varname></simpara></listitem>
00407 <listitem><simpara><varname>[v]</varname> is the residue vector</simpara></listitem>
00408 <listitem><simpara><varname>[offset]</varname> is the beginning read offset in [v]</simpara></listitem>
00409 </itemizedlist>
00410 </para>
00411
00412 <programlisting>
00413 1) [i] = 0
00414 2) vector [entry_temp] = read vector from packet using current codebook in VQ context
00415 3) iterate [j] over the range 0 ... [codebook_dimensions]-1 {
00416
00417 4) vector [v] element ([offset]+[i]) =
00418 vector [v] element ([offset]+[i]) +
00419 vector [entry_temp] element [j]
00420 5) increment [i]
00421
00422 }
00423
00424 6) if ( [i] is less than [n] ) continue at step 2
00425 7) done
00426 </programlisting>
00427
00428 </section>
00429
00430 <section><title>format 2 specifics</title>
00431
00432 <para>
00433 Format 2 is reducible to format 1. It may be implemented as an additional step prior to and an additional post-decode step after a normal format 1 decode.
00434 </para>
00435
00436 <para>
00437 Format 2 handles 'do not decode' vectors differently than residue 0 or
00438 1; if all vectors are marked 'do not decode', no decode occurrs.
00439 However, if at least one vector is to be decoded, all the vectors are
00440 decoded. We then request normal format 1 to decode a single vector
00441 representing all output channels, rather than a vector for each
00442 channel. After decode, deinterleave the vector into independent vectors, one for each output channel. That is:</para>
00443
00444 <orderedlist>
00445 <listitem><simpara>If all vectors 0 through <emphasis>ch</emphasis>-1 are marked 'do not decode', allocate and clear a single vector <varname>[v]</varname>of length <emphasis>ch*n</emphasis> and skip step 2 below; proceed directly to the post-decode step.</simpara></listitem>
00446 <listitem><simpara>Rather than performing format 1 decode to produce <emphasis>ch</emphasis> vectors of length <emphasis>n</emphasis> each, call format 1 decode to produce a single vector <varname>[v]</varname> of length <emphasis>ch*n</emphasis>. </simpara></listitem>
00447 <listitem><para>Post decode: Deinterleave the single vector <varname>[v]</varname> returned by format 1 decode as described above into <emphasis>ch</emphasis> independent vectors, one for each outputchannel, according to:
00448 <programlisting>
00449 1) iterate [i] over the range 0 ... [n]-1 {
00450
00451 2) iterate [j] over the range 0 ... [ch]-1 {
00452
00453 3) output vector number [j] element [i] = vector [v] element ([i] * [ch] + [j])
00454
00455 }
00456 }
00457
00458 4) done
00459 </programlisting>
00460 </para></listitem>
00461 </orderedlist>
00462
00463 </section>
00464
00465 </section>
00466
00467 </section>
00468