<?xml version="1.0" standalone="no"?>
<!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
                "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [

]>

<section id="vorbis-spec-floor0">
<sectioninfo>
 <releaseinfo>
 $Id: 06-floor0.xml 10424 2005-11-23 08:44:18Z xiphmont $
 </releaseinfo>
</sectioninfo>
<title>Floor type 0 setup and decode</title>


<section>
<title>Overview</title>

<para>
Vorbis floor type zero uses Line Spectral Pair (LSP, also known as
Line Spectral Frequency or LSF) representation to encode a
smooth spectral envelope curve as the frequency response of the LSP
filter.  This representation is equivalent to a traditional all-pole
infinite impulse response filter as would be used in linear predictive
coding; LSP representation may be converted to LPC representation and
vice-versa.</para>

</section>

<section>
<title>Floor 0 format</title>

<para>
Floor zero configuration consists of six integer fields and a list of
VQ codebooks for use in coding/decoding the LSP filter coefficient
values used by each frame. </para>

<section><title>header decode</title>

<para>
Configuration information for instances of floor zero is decoded from the
codec setup header (third packet).
Configuration decode proceeds as
follows:</para>

<screen>
  1) [floor0_order] = read an unsigned integer of 8 bits
  2) [floor0_rate] = read an unsigned integer of 16 bits
  3) [floor0_bark_map_size] = read an unsigned integer of 16 bits
  4) [floor0_amplitude_bits] = read an unsigned integer of six bits
  5) [floor0_amplitude_offset] = read an unsigned integer of eight bits
  6) [floor0_number_of_books] = read an unsigned integer of four bits and add 1
  7) if any of [floor0_order], [floor0_rate], [floor0_bark_map_size], [floor0_amplitude_bits],
     [floor0_amplitude_offset] or [floor0_number_of_books] is less than zero, the stream is not decodable
  8) array [floor0_book_list] = read a list of [floor0_number_of_books] unsigned integers of eight bits each;
</screen>

<para>
An end-of-packet condition during any of these bitstream reads renders
this stream undecodable.  In addition, any element of the array
<varname>[floor0_book_list]</varname> that is greater than the maximum codebook
number for this bitstream is an error condition that also renders the
stream undecodable.</para>

</section>

<section id="vorbis-spec-floor0-decode">
<title>packet decode</title>

<para>
Extracting a floor0 curve from an audio packet consists of first
decoding the curve amplitude and <varname>[floor0_order]</varname> LSP
coefficient values from the bitstream, and then computing the floor
curve, which is defined as the frequency response of the decoded LSP
filter.</para>

<para>
Packet decode proceeds as follows:</para>
<screen>
  1) [amplitude] = read an unsigned integer of [floor0_amplitude_bits] bits
  2) if ( [amplitude] is greater than zero ) {
       3) [coefficients] is an empty, zero length vector
       4) [booknumber] = read an unsigned integer of <link
linkend="vorbis-spec-ilog">ilog</link>( [floor0_number_of_books] ) bits
       5) if ( [booknumber] is greater than the highest number decode codebook ) then packet is undecodable
       6) [last] = zero;
       7) vector [temp_vector] = read vector from bitstream using codebook number [floor0_book_list] element [booknumber] in VQ context.
       8) add the scalar value [last] to each scalar in vector [temp_vector]
       9) [last] = the value of the last scalar in vector [temp_vector]
      10) concatenate [temp_vector] onto the end of the [coefficients] vector
      11) if ( length of vector [coefficients] is less than [floor0_order] ), continue at step 6

     }

 12) done.

</screen>

<para>
Take note of the following properties of decode:
<itemizedlist>
 <listitem><simpara>An <varname>[amplitude]</varname> value of zero must result in a return code that indicates this channel is unused in this frame (the output of the channel will be all-zeroes in synthesis).  Several later stages of decode don't occur for an unused channel.</simpara></listitem>
 <listitem><simpara>An end-of-packet condition during decode should be considered a
nominal occurrence; if end-of-packet is reached during any read
operation above, floor decode is to return 'unused' status as if the
<varname>[amplitude]</varname> value had read zero at the beginning of decode.</simpara></listitem>

 <listitem><simpara>The book number used for decode
can, in fact, be stored in the bitstream in <link linkend="vorbis-spec-ilog">ilog</link>( <varname>[floor0_number_of_books]</varname> -
1 ) bits.
Nevertheless, the above specification is correct and values
greater than the maximum possible book value are reserved.</simpara></listitem>

 <listitem><simpara>The number of scalars read into the vector <varname>[coefficients]</varname>
may be greater than <varname>[floor0_order]</varname>, the number actually
required for curve computation.  For example, if the VQ codebook used
for the floor currently being decoded has a
<varname>[codebook_dimensions]</varname> value of three and
<varname>[floor0_order]</varname> is ten, the only way to fill all the needed
scalars in <varname>[coefficients]</varname> is to read a total of twelve
scalars as four vectors of three scalars each.  This is not an error
condition, and care must be taken not to allow a buffer overflow in
decode.  The extra values are not used and may be ignored or discarded.</simpara></listitem>
</itemizedlist>
</para>

</section>

<section id="vorbis-spec-floor0-synth">
<title>curve computation</title>

<para>
Given an <varname>[amplitude]</varname> integer and <varname>[coefficients]</varname>
vector from packet decode as well as the [floor0_order],
[floor0_rate], [floor0_bark_map_size], [floor0_amplitude_bits] and
[floor0_amplitude_offset] values from floor setup, and an output
vector size <varname>[n]</varname> specified by the decode process, we compute a
floor output vector.</para>

<para>
If the value <varname>[amplitude]</varname> is zero, the return value is a
length <varname>[n]</varname> vector with all-zero scalars.
Otherwise, begin by
assuming the following definitions for the given vector to be
synthesized:</para>

<informalequation>
 <mediaobject>
  <textobject><phrase>[lsp map equation]</phrase></textobject>
  <textobject role="tex"><phrase>
<![CDATA[
  \begin{math}
    \mathrm{map}_i = \left\{
      \begin{array}{ll}
        \min (
          \mathtt{floor0\_bark\_map\_size} - 1,
          foobar
        ) & \textrm{for } i \in [0,n-1] \\
        -1 & \textrm{for } i = n
      \end{array}
    \right.
  \end{math}

  where

  \begin{math}
  foobar =
    \left\lfloor
      \mathrm{bark}\left(\frac{\mathtt{floor0\_rate} \cdot i}{2n}\right) \cdot \frac{\mathtt{floor0\_bark\_map\_size}} {\mathrm{bark}(.5 \cdot \mathtt{floor0\_rate})}
    \right\rfloor
  \end{math}

  and

  \begin{math}
    \mathrm{bark}(x) = 13.1 \arctan (.00074x) + 2.24 \arctan (.0000000185x^2 + .0001x)
  \end{math}
]]>
  </phrase></textobject>
  <imageobject><imagedata fileref="lspmap.png"/></imageobject>
 </mediaobject>
</informalequation>

<para>
The above is used to synthesize the LSP curve on a Bark-scale frequency
axis, then map the result to a linear-scale frequency axis.
Similarly, the calculation below synthesizes the output LSP curve <varname>[output]</varname> on a log
(dB) amplitude scale, mapping it to linear amplitude in the last step.
In the equations, &#969; denotes &#960; * map element <varname>[i]</varname> / <varname>[floor0_bark_map_size]</varname>
for the current value of <varname>[i]</varname>, and c<subscript>k</subscript> denotes element k of
<varname>[coefficients]</varname>:</para>

<orderedlist>
 <listitem><simpara> <varname>[i]</varname> = 0 </simpara></listitem>
 <listitem><para>if ( <varname>[floor0_order]</varname> is odd ) {
  <orderedlist>
   <listitem><para>calculate <varname>[p]</varname> and <varname>[q]</varname> according to:
     <informalequation>
      <mediaobject>
       <textobject><phrase>[equation for odd lsp]</phrase></textobject>
       <textobject role="tex"><phrase>
<![CDATA[
\begin{eqnarray*}
  p & = & (1 - \cos^2\omega)\prod_{j=0}^{(\mathtt{order}-3)/2} 4 (\cos c_{2j+1} - \cos \omega)^2 \\
  q & = & \frac{1}{4} \prod_{j=0}^{(\mathtt{order}-1)/2} 4 (\cos c_{2j} - \cos \omega)^2
\end{eqnarray*}
]]>
       </phrase></textobject>
       <imageobject><imagedata fileref="oddlsp.png"/></imageobject>
      </mediaobject>
     </informalequation>
   </para></listitem>
  </orderedlist>
  } else <varname>[floor0_order]</varname> is even {
  <orderedlist>
   <listitem><para>calculate <varname>[p]</varname> and <varname>[q]</varname> according to:
     <informalequation>
      <mediaobject>
       <textobject><phrase>[equation for even lsp]</phrase></textobject>
       <textobject role="tex"><phrase>
<![CDATA[
\begin{eqnarray*}
  p & = & \frac{(1 - \cos\omega)}{2} \prod_{j=0}^{(\mathtt{order}-2)/2} 4 (\cos c_{2j+1} - \cos \omega)^2 \\
  q & = & \frac{(1 + \cos\omega)}{2} \prod_{j=0}^{(\mathtt{order}-2)/2} 4 (\cos c_{2j} - \cos \omega)^2
\end{eqnarray*}
]]>
       </phrase></textobject>
       <imageobject><imagedata fileref="evenlsp.png"/></imageobject>
      </mediaobject>
     </informalequation>
   </para></listitem>
  </orderedlist>
  }
 </para></listitem>
 <listitem><para>calculate
<varname>[linear_floor_value]</varname> according to:
   <informalequation>
    <mediaobject>
     <textobject><phrase>[expression for floorval]</phrase></textobject>
     <textobject role="tex"><phrase>
<![CDATA[
\begin{math}
  \exp \left( .11512925 \left(\frac{\mathtt{amplitude} \cdot \mathtt{floor0\_amplitude\_offset}}{(2^{\mathtt{floor0\_amplitude\_bits}}-1)\sqrt{p+q}}
    - \mathtt{floor0\_amplitude\_offset} \right) \right)
\end{math}
]]>
     </phrase></textobject>
     <imageobject><imagedata fileref="floorval.png"/></imageobject>
    </mediaobject>
   </informalequation>
 </para></listitem>
 <listitem><simpara><varname>[iteration_condition]</varname> = map element <varname>[i]</varname></simpara></listitem>
 <listitem><simpara><varname>[output]</varname> element <varname>[i]</varname> = <varname>[linear_floor_value]</varname></simpara></listitem>
 <listitem><simpara>increment <varname>[i]</varname></simpara></listitem>
 <listitem><simpara>if ( map element <varname>[i]</varname> is equal to <varname>[iteration_condition]</varname> ) continue at step 5</simpara></listitem>
 <listitem><simpara>if ( <varname>[i]</varname> is less than <varname>[n]</varname> ) continue at step 2</simpara></listitem>
 <listitem><simpara>done</simpara></listitem>
</orderedlist>

</section>

</section>

</section>