examples/SFExamples/oggvorbiscodec94/src/libvorbis/doc/vorbisenc/overview.html

00001 <html>
00002 
00003 <head>
00004 <title>libvorbisenc - API Overview</title>
00005 <link rel=stylesheet href="style.css" type="text/css">
00006 </head>
00007 
00008 <body bgcolor=white text=black link="#5555ff" alink="#5555ff" vlink="#5555ff">
00009 <table border=0 width=100%>
00010 <tr>
00011 <td><p class=tiny>libvorbisenc documentation</p></td>
00012 <td align=right><p class=tiny>libvorbisenc release 1.1 - 20040709</p></td>
00013 </tr>
00014 </table>
00015 
00016 <h1>Libvorbisenc API Overview</h1>
00017 
00018 <p>Libvorbisenc is an encoding convenience library intended to
00019 encapsulate the elaborate setup that libvorbis requires for encoding.
00020 Libvorbisenc gives easy access to all high-level adjustments an
00021 application may require when encoding and also exposes some low-level
00022 tuning parameters to allow applications to make detailed adjustments
00023 to the encoding process. <p>
00024 
00025 All the <b>libvorbisenc</b> routines are declared in "vorbis/vorbisenc.h".
00026 
00027 <em>Note: libvorbis and libvorbisenc always
00028 encode in a single pass. Thus, all possible encoding setups will work
00029 properly with live input and produce streams that decode properly when
00030 streamed.  See the subsection titled <a href="#BBR">"managed bitrate
00031 modes"</a> for details on setting limits on bitrate usage when Vorbis
00032 streams are used in a limited-bandwidth environment.</em>
00033 
00034 <h2>workflow</h2>
00035 
00036 <p>Libvorbisenc is used only during encoder setup; its function
00037 is to automate initialization of a multitude of settings in a
00038 <tt>vorbis_info</tt> structure which libvorbis then uses as a reference
00039 during the encoding process.  Libvorbisenc plays no part in the
00040 encoding process after setup.
00041 
00042 <p>Encode setup using libvorbisenc consists of three steps: 
00043 
00044 <ol>
00045 <li>high-level initialization of a <tt>vorbis_info</tt> structure by
00046 calling one of <a
00047 href="vorbis_encode_setup_vbr.html">vorbis_encode_setup_vbr()</a> or <a
00048 href="vorbis_encode_setup_managed.html">vorbis_encode_setup_managed()</a>
00049 with the basic input audio parameters (rate and channels) and the
00050 basic desired encoded audio output parameters (VBR quality or ABR/CBR
00051 bitrate)<p>
00052 
00053 <li>optional adjustment of the basic setup defaults using <a
00054 href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a><p>
00055 
00056 <li>calling <a
00057 href="vorbis_encode_setup_init.html">vorbis_encode_setup_init()</a> to
00058 finalize the high-level setup into the detailed low-level reference
00059 values needed by libvorbis to encode audio. The <tt>vorbis_info</tt>
00060 structure is then ready to use for encoding by libvorbis.<p>
00061 
00062 </ol>
00063 
00064 These three steps can be collapsed into a single call by using <a
00065 href="vorbis_encode_init_vbr.html">vorbis_encode_init_vbr</a> to set up a
00066 quality-based VBR stream or <a
00067 href="vorbis_encode_init.html">vorbis_encode_init</a> to set up a managed
00068 bitrate (ABR or CBR) stream.<p>
00069 
00070 <h2>adjustable encoding parameters</h2>
00071 
00072 <h3>input audio parameters</h3>
00073 
00074 <p>
00075 <table border=1 color=black width=50% cellspacing=0 cellpadding=7>
00076 <tr bgcolor=#cccccc>
00077         <td><b>parameter</b></td>
00078         <td><b>description</b></td>
00079 </tr>
00080 <tr valign=top>
00081 <td>sampling rate</td>
00082 <td>
00083 The sampling rate (in samples per second) of the input audio.  Common examples are 8000 for telephony, 44100 for CD audio and 48000 for DAT.  Note that a mono sample (one center value) and a stereo sample (one left value and one right value) each count as a single sample.
00084 
00085 </td>
00086 </tr>
00087 <tr valign=top>
00088 <td>channels</td>
00089 <td>
00090 
00091 The number of channels encoded in each input sample.  By default,
00092 stereo input modes (two channels) are 'coupled' by Vorbis 1.1 such
00093 that the stereo relationship between the samples is taken into account
00094 when encoding.  Stereo coupling may be disabled by using <a
00095 href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> with <a
00096 href="vorbis_encode_ctl.html#OV_ECTL_COUPLE_SET">OV_ECTL_COUPLE_SET</a>.
00097 
00098 </td>
00099 </tr>
00100 </table>
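<p>The sample-counting convention above can be sketched in C. This is a minimal illustration only; the helper names <tt>samples_for_duration</tt> and <tt>values_for_duration</tt> are hypothetical and are not part of libvorbis or libvorbisenc.</p>

```c
#include <assert.h>

/* Hypothetical helper (not a libvorbis call): number of samples in
   `seconds` of audio.  One sample holds one value per channel, so the
   result does not depend on the channel count. */
long samples_for_duration(long rate, long seconds)
{
    assert(rate > 0 && seconds >= 0);
    return rate * seconds;
}

/* Hypothetical helper: number of individual channel values (for
   example, entries in an interleaved PCM buffer) in the same span. */
long values_for_duration(long rate, long channels, long seconds)
{
    assert(channels > 0);
    return samples_for_duration(rate, seconds) * channels;
}
```

<p>Two seconds of CD audio is 88200 samples whether the input is mono or stereo; only the number of individual channel values doubles for stereo.</p>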
00101 
00102 <h3>quality and VBR modes</h3>
00103 
00104 Vorbis is natively a VBR codec; a user requests a given constant
00105 <em>quality</em> and the encoder keeps the encoding quality constant
00106 while allowing the bitrate to vary.  'Quality' modes (Variable BitRate)
00107 will always produce the most consistent encoding results as well as
00108 the highest quality for the amount of bits used.
00109 
00110 <p>
00111 <table border=1 color=black width=50% cellspacing=0 cellpadding=7>
00112 <tr bgcolor=#cccccc>
00113         <td><b>parameter</b></td>
00114         <td><b>description</b></td>
00115 </tr>
00116 <tr valign=top>
00117 <td>quality</td>
00118 <td>
00119 A decimal float value requesting a desired quality.  Libvorbisenc 1.1 allows quality requests in the range of -0.1 (lowest quality, smallest files) through +1.0 (highest quality, largest files). Quality -0.1 is intended as an ultra-low setting in which low bitrate is much more important than quality consistency.  Quality settings 0.0 and above are intended to produce consistent results at all times.
00120 
00121 </td>
00122 </tr>
00123 </table>
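<p>An application that accepts a quality value from a user may want to keep it inside the documented range before handing it to the encoder setup. The following is a minimal sketch under that assumption; <tt>clamp_quality</tt>, <tt>quality_from_scale10</tt> and the two macros are hypothetical names, and the -1 to 10 user-facing scale is an illustrative choice rather than something libvorbisenc defines.</p>

```c
#include <assert.h>

/* Range documented for libvorbisenc 1.1 quality requests. */
#define QUALITY_MIN (-0.1f)
#define QUALITY_MAX (1.0f)

/* Hypothetical helper: force a requested quality into the range. */
float clamp_quality(float q)
{
    assert(q == q);               /* reject NaN */
    if (q < QUALITY_MIN) return QUALITY_MIN;
    if (q > QUALITY_MAX) return QUALITY_MAX;
    return q;
}

/* Hypothetical helper: map a user-facing -1..10 level onto the
   -0.1..1.0 range (assumed linear mapping, level / 10). */
float quality_from_scale10(float level)
{
    return clamp_quality(level / 10.0f);
}
```

<p>The clamped value would then be passed as the quality argument during the high-level setup step.</p>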
00124 
00125 <a name="BBR"></a>
00126 <h3>managed bitrate modes</h3>
00127 
00128 Although the Vorbis codec is natively VBR, libvorbis includes
00129 infrastructure for 'managing' the bitrate of streams by setting
00130 minimum and maximum usage constraints, as well as functionality for
00131 nudging a stream toward a desired average value.  These features
00132 should <em>only</em> be used when there is a requirement to limit
00133 bitrate in some way.  Although the difference is usually slight,
00134 managed bitrate modes will always produce output inferior to VBR
00135 (given equal bitrate usage). Setting overly or impossibly tight
00136 bitrate management requirements can affect output quality dramatically
00137 for the worse.<p>
00138 
00139 Beginning in libvorbis 1.1, bitrate management is implemented using a
00140 <em>bit-reservoir</em> algorithm. The encoder has a fixed-size
00141 reservoir used as a 'savings account' in encoding.  When a frame is
00142 smaller than the target rate, the unused bits go into the reservoir so
00143 that they may be used by future frames.  When a frame is larger than
00144 target bitrate, it draws 'banked' bits out of the reservoir.  Encoding
00145 is managed so that the reservoir never goes negative (when a maximum
00146 bitrate is specified) or fills beyond a fixed limit (when a minimum
00147 bitrate is specified).  An 'average bitrate' request is used as the
00148 set-point in a long-range bitrate tracker which adjusts the encoder's
00149 aggressiveness up or down depending on whether frames are coming
00150 in larger or smaller than the requested average point.
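<p>The 'savings account' behavior described above can be illustrated with a toy model. This is a simplified sketch, not the libvorbis implementation: it assumes a single per-frame target that acts as both the minimum and the maximum rate, and the <tt>reservoir</tt> type and <tt>reservoir_frame</tt> function are hypothetical.</p>

```c
#include <assert.h>

/* Toy reservoir: counts banked bits, always kept within 0..size. */
typedef struct {
    long size;    /* fixed reservoir capacity, in bits */
    long banked;  /* bits currently banked */
} reservoir;

/* Account for one frame.  `target_bits` is the per-frame budget and
   `wanted_bits` is the size the encoder would like to emit.  Returns
   the size the frame is forced to, after keeping the reservoir from
   going negative or filling beyond its limit. */
long reservoir_frame(reservoir *r, long target_bits, long wanted_bits)
{
    long emitted = wanted_bits;
    long balance;

    assert(r->size >= 0 && r->banked >= 0 && r->banked <= r->size);
    balance = r->banked + (target_bits - wanted_bits);

    if (balance < 0) {
        /* Not enough banked bits: the frame must shrink. */
        emitted = target_bits + r->banked;
        balance = 0;
    } else if (balance > r->size) {
        /* Reservoir already full: the frame must grow. */
        emitted = target_bits - (r->size - r->banked);
        balance = r->size;
    }
    r->banked = balance;
    return emitted;
}
```

<p>With a 1000-bit reservoir and a 500-bit target, a 300-bit frame banks 200 bits; a following 900-bit request can draw only those 200 bits and is cut to 700 bits.</p>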
00151 
00152 <p>
00153 <table border=1 color=black width=50% cellspacing=0 cellpadding=7>
00154 <tr bgcolor=#cccccc>
00155         <td><b>parameter</b></td>
00156         <td><b>description</b></td>
00157 </tr>
00158 <tr valign=top>
00159 <td>maximum bitrate</td> <td> The maximum allowed bitrate, set in bits
00160 per second.  If the bitrate would otherwise rise such that oversized
00161 frames would underflow the bit-reservoir by consuming banked bits,
00162 bitrate management will force the encoder to use fewer bits per frame
00163 by encoding with a more aggressive psychoacoustic model.<p> This
00164 setting is a hard limit; the bitstream will never be allowed, under
00165 any circumstances, to increase above the specified bitrate over the
00166 average period set by the reservoir; it may momentarily rise over if
00167 inspected on a granularity much finer than the average period across
00168 the reservoir.  Normally, the encoder will conserve bits gracefully by
00169 using more aggressive psychoacoustics to shrink a frame when forced
00170 to.  However, if the encoder runs out of means of gracefully shrinking
00171 a frame, it will simply take the smallest frame it can otherwise
00172 generate and truncate it to the maximum allowed length.  Note that
00173 this is not an error and although it will obviously adversely affect
00174 audio quality, a Vorbis decoder will be able to decode a truncated
00175 frame into audio.
00176 
00177 </td>
00178 </tr>
00179 
00180 <tr valign=top>
00181 <td>average bitrate</td> 
00182 
00183 <td>
00184 
00185 The average desired bitrate of a stream, set
00186 in bits per second.  Average bitrate is tracked via a reservoir like
00187 minimum and maximum bitrate; however, the averaging reservoir does not
00188 impose a hard limit; it is used to nudge the bitrate toward the
00189 desired average by slowly adjusting the psychoacoustic aggressiveness.
00190 As such, the reservoir size does not affect the average bitrate
00191 behavior.  Because this setting alone is not used to impose hard
00192 bitrate limits, the bitrate of a stream produced using only the
00193 <tt>average bitrate</tt> constraint will track the average over time
00194 but not necessarily adhere strictly to that average for any given
00195 period.  Should a strict localized average be required, <tt>average
00196 bitrate</tt> should be used along with <tt>minimum bitrate</tt> and
00197 <tt>maximum bitrate</tt>.
00198 </td>
00199 
00200 </tr>
00201 
00202 <tr valign=top>
00203 <td>minimum bitrate</td>
00204 <td> 
00205  The minimum allowed bitrate, set in bits per second.  If
00206 the bitrate would otherwise fall such that undersized frames would
00207 overflow the bit-reservoir with unused bits, bitrate management will
00208 force the encoder to use more bits per frame by encoding with a less
00209 aggressive psychoacoustic model.<p> This setting is a hard limit; the
00210 bitstream will never be allowed, under any circumstances, to drop
00211 below the specified bitrate over the average period set by the
00212 reservoir; it may momentarily fall under if inspected on a granularity
00213 much finer than the average period across the reservoir.  Normally,
00214 the encoder will fill out undersized frames with additional useful
00215 coding information by increasing the perceived quality of the stream.
00216 If the encoder runs out of useful ways to consume more bits, it will
00217 pad frames out with zeroes.
00218 </td>
00219 </tr>
00220 
00221 <tr valign=top>
00222 <td>reservoir size</td> <td> The size of the minimum/maximum bitrate
00223 tracking reservoir, set in bits.  The reservoir is used as a 'bit
00224 bank' to average out localized surges and dips in bitrate while
00225 providing predictable, guaranteed buffering behavior for streams to be
00226 used in situations with constrained transport bandwidth.  The default
00227 setting is two seconds of average bitrate.<p>
00228 
00229 When a single frame is larger than the maximum allowed overall
00230 bitrate, the bits are 'borrowed' from the bitrate reservoir; if the
00231 reservoir contains insufficient bits to cover the deficit, the encoder
00232 must find some way to reduce the frame size. <p>
00233 
00234 When a frame is under the minimum limit, the surplus bits are placed
00235 into the reservoir, banking them for future use.  If the reservoir is
00236 already full of banked bits, the encoder is forced to find some way to
00237 make the frame larger.<p>
00238 
00239 If the frame size is between the minimum and maximum rates (thus
00240 implying the minimum and maximum allowed rates are different), the
00241 reservoir gravitates toward a fill point configured by the
00242 <tt>reservoir bias</tt> setting described next.  If the reservoir is
00243 fuller than the fill point (a 'surplus of surplus'), the encoder will
00244 consume a number of bits from the reservoir equal to the number of
00245 bits by which the frame exceeds minimum size.  If the reservoir is
00246 emptier than the fill point (a 'surplus of deficit'), bits are returned
00247 to the reservoir equaling the current frame's number of bits under the
00248 maximum frame size.  The idea of the fill point is to buffer against
00249 both underruns and overruns, by trying to hold the reservoir to a
00250 middle course.
00251 </td>
00252 </tr>
00253 
00254 <tr valign=top>
00255 <td>reservoir bias</td>
00256 
00257 <td>
00258 
00259 Reservoir bias is a setting between 0.0 and 1.0 that biases bitrate
00260 management toward smoothing bitrate spikes (0.0) or bitrate drops
00261 (1.0); the default setting is 0.1.<p>
00262 
00263 Using settings toward 0.0 causes the bitrate manager to hoard bits in
00264 the bit reservoir such that there is a large pool of banked surplus to
00265 draw upon during short spikes in bitrate.  As a result, the encoder
00266 will react less aggressively and less drastically to curtail frame size
00267 during brief surges in bitrate.<p>
00268 
00269 Using settings toward 1.0 causes the bitrate manager to empty the bit
00270 reservoir such that there is a large buffer available to store surplus
00271 bits during sudden drops in bitrate.  As a result, the encoder will
00272 react less aggressively and less drastically to support minimum frame
00273 sizes during drops in bitrate and will tend not to store any extra
00274 bits in the reservoir for future bitrate spikes.<p>
00275 
00276 </td>
00277 </tr>
00278 
00279 <tr valign=top>
00280 <td>average track damping</td>
00281 <td> 
00282 
00283 A decimal value, in seconds, that controls how quickly the average
00284 bitrate tracker is allowed to slew from enforcing minimum frame sizes
00285 to maximum frame sizes and vice versa.  Default value is 1.5
00286 seconds.<p>
00287 
00288 When the 'average bitrate' setting is in use, the average bitrate
00289 tracker uses an unbounded reservoir to track overall bitrate-to-date
00290 in the stream.  When bitrates are too low, the tracker will try to
00291 nudge bitrates up and when the bitrate is too high, nudge it down.
00292 The damping value regulates the maximum strength of the nudge; it
00293 describes, in seconds, how quickly the tracker may transition from an
00294 extreme nudge in one direction to an extreme nudge in the other.<p>
00295 
00296 </td>
00297 </tr>
00298 
00299 </table>
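<p>The reservoir defaults in the table can be worked through numerically. The sketch below is illustrative only: both helper names are hypothetical, and the linear mapping from <tt>reservoir bias</tt> to a target number of banked bits is an assumption drawn from the description (0.0 keeps the reservoir full of banked bits, 1.0 keeps it empty), not a statement of how libvorbis computes it internally.</p>

```c
#include <assert.h>

/* Default reservoir size: two seconds of average bitrate, in bits. */
long default_reservoir_bits(long average_bps)
{
    assert(average_bps > 0);
    return 2 * average_bps;
}

/* Assumed linear model of the fill point, expressed as the number of
   banked bits the manager steers toward: bias 0.0 hoards a full
   reservoir, bias 1.0 leaves it empty. */
long banked_fill_target(long reservoir_bits, double bias)
{
    assert(reservoir_bits >= 0);
    assert(bias >= 0.0 && bias <= 1.0);
    /* Round to the nearest whole bit. */
    return (long)((1.0 - bias) * (double)reservoir_bits + 0.5);
}
```

<p>For a 128000 bits-per-second average, the default reservoir is 256000 bits; under the assumed linear model, the default bias of 0.1 steers the reservoir toward about 230400 banked bits.</p>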
00300 
00301 <h3>encoding model adjustments</h3>
00302 
00303 The <a href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> call provides
00304 a generalized interface for making encoding setup adjustments to the
00305 basic high-level setup provided by <a
00306 href="vorbis_encode_setup_vbr.html">vorbis_encode_setup_vbr()</a> or <a
00307 href="vorbis_encode_setup_managed.html">vorbis_encode_setup_managed()</a>.
00308 In reality, these two calls use <a
00309 href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> internally, and <a
00310 href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> can be used to adjust
00311 most of the parameters set by other calls.<p>
00312 
00313 In Vorbis 1.1, <a href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> can
00314 adjust the following additional parameters not described elsewhere:
00315 
00316 <p>
00317 <table border=1 color=black width=50% cellspacing=0 cellpadding=7>
00318 <tr bgcolor=#cccccc>
00319         <td><b>parameter</b></td>
00320         <td><b>description</b></td>
00321 </tr>
00322 <tr valign=top>
00323 <td>management mode</td> <td> Configures whether bitrate
00324 management is in use.  Normally, this value is set implicitly
00325 during encoding setup; however, the supported means of selecting a
00326 quality mode by bitrate (that is, requesting a true VBR stream, but
00327 doing so by asking for an approximate bitrate) is to use <a
00328 href="vorbis_encode_setup_managed.html">vorbis_encode_setup_managed()</a>
00329 and then to explicitly turn off bitrate management by calling <a
00330 href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> with <a
00331 href="vorbis_encode_ctl.html#OV_ECTL_RATEMANAGE2_SET">OV_ECTL_RATEMANAGE2_SET</a>.
00332 </td>
00333 </tr>
00334 
00335 <tr valign=top>
00336 <td>coupling</td> <td> Stereo encodings (and, in the future, surround
00337 encodings) are normally encoded assuming the channels form a stereo
00338 image and that lossy-stereo modelling is appropriate; this is called
00339 'coupling'.  Stereo coupling may be explicitly enabled or disabled.
00340 </td>
00341 </tr>
00342 <tr valign=top>
00343 <td>lowpass</td> <td> Sets the hard lowpass of a given encoding mode;
00344 this may be used to conserve a few bits in high-rate audio that has
00345 limited bandwidth, or in testing of the encoder's acoustic model.  The
00346 encoder is generally already configured with ideal lowpasses (if any
00347 at all) for given modes; use of this parameter is strongly discouraged
00348 if the point is to try to 'improve' a given encoding mode for general
00349 encoding.
00350 </td>
00351 </tr>
00352 
00353 <tr valign=top>
00354 <td>impulse coding aggressiveness</td> <td>By default, libvorbis
00355 attempts to compromise between preventing wide bitrate swings and
00356 high-resolution impulse coding (which is required for the crispest
00357 possible attacks, but also requires a relatively large momentary
00358 bitrate increase).  This parameter allows an application to tune the
00359 compromise or eliminate it; a value of 0.0 indicates normal behavior
00360 while a value of -15.0 requests maximum possible impulse
00361 resolution.</td>
00362 </tr>
00363 
00364 </table>
00365 
00366 
00367 <br><br>
00368 <hr noshade>
00369 <table border=0 width=100%>
00370 <tr valign=top>
00371 <td><p class=tiny>copyright &copy; 2004 Vorbis team</p></td>
00372 <td align=right><p class=tiny><a href="http://www.xiph.org/ogg/vorbis/index.html">Ogg Vorbis</a><br><a href="mailto:team@vorbis.org">team@vorbis.org</a></p></td>
00373 </tr><tr>
00374 <td><p class=tiny>libvorbisenc documentation</p></td>
00375 <td align=right><p class=tiny>libvorbisenc release 1.1 - 20040709</p></td>
00376 </tr>
00377 </table>
00378 
00379 </body>
00380 
00381 </html>
00382 
