<html>

<head>
<title>libvorbisenc - API Overview</title>
<link rel=stylesheet href="style.css" type="text/css">
</head>

<body bgcolor=white text=black link="#5555ff" alink="#5555ff" vlink="#5555ff">
<table border=0 width=100%>
<tr>
<td><p class=tiny>libvorbisenc documentation</p></td>
<td align=right><p class=tiny>libvorbisenc release 1.1 - 20040709</p></td>
</tr>
</table>

<h1>Libvorbisenc API Overview</h1>

<p>Libvorbisenc is an encoding convenience library intended to
encapsulate the elaborate setup that libvorbis requires for encoding.
Libvorbisenc gives easy access to all high-level adjustments an
application may require when encoding and also exposes some low-level
tuning parameters to allow applications to make detailed adjustments
to the encoding process. <p>

All the <b>libvorbisenc</b> routines are declared in "vorbis/vorbisenc.h".

<em>Note: libvorbis and libvorbisenc always
encode in a single pass. Thus, all possible encoding setups will work
properly with live input and produce streams that decode properly when
streamed. See the subsection titled <a href="#BBR">"managed bitrate
modes"</a> for details on setting limits on bitrate usage when Vorbis
streams are used in a limited-bandwidth environment.</em>

<h2>workflow</h2>

<p>Libvorbisenc is used only during encoder setup; its function
is to automate initialization of a multitude of settings in a
<tt>vorbis_info</tt> structure which libvorbis then uses as a reference
during the encoding process. Libvorbisenc plays no part in the
encoding process after setup.

<p>Encode setup using libvorbisenc consists of three steps:

<ol>
<li>high-level initialization of a <tt>vorbis_info</tt> structure by
calling one of <a
href="vorbis_encode_setup_vbr.html">vorbis_encode_setup_vbr()</a> or <a
href="vorbis_encode_setup_managed.html">vorbis_encode_setup_managed()</a>
with the basic input audio parameters (rate and channels) and the
basic desired encoded audio output parameters (VBR quality or ABR/CBR
bitrate)<p>

<li>optional adjustment of the basic setup defaults using <a
href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a><p>

<li>calling <a
href="vorbis_encode_setup_init.html">vorbis_encode_setup_init()</a> to
finalize the high-level setup into the detailed low-level reference
values needed by libvorbis to encode audio. The <tt>vorbis_info</tt>
structure is then ready to use for encoding by libvorbis.<p>

</ol>

These three steps can be collapsed into a single call by using <a
href="vorbis_encode_init_vbr.html">vorbis_encode_init_vbr</a> to set up a
quality-based VBR stream or <a
href="vorbis_encode_init.html">vorbis_encode_init</a> to set up a managed
bitrate (ABR or CBR) stream.<p>

<h2>adjustable encoding parameters</h2>

<h3>input audio parameters</h3>

<p>
<table border=1 color=black width=50% cellspacing=0 cellpadding=7>
<tr bgcolor=#cccccc>
<td><b>parameter</b></td>
<td><b>description</b></td>
</tr>
<tr valign=top>
<td>sampling rate</td>
<td>
The sampling rate (in samples per second) of the input audio. Common examples are 8000 for telephony, 44100 for CD audio and 48000 for DAT. Note that a mono sample (one center value) and a stereo sample (one left value and one right value) each count as a single sample.

</td>
</tr>
<tr valign=top>
<td>channels</td>
<td>

The number of channels encoded in each input sample. By default,
stereo input modes (two channels) are 'coupled' by Vorbis 1.1 such
that the stereo relationship between the samples is taken into account
when encoding. Stereo coupling may be disabled by using <a
href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> with <a
href="vorbis_encode_ctl.html#OV_ECTL_COUPLE_SET">OV_ECTL_COUPLE_SET</a>.

</td>
</tr>
</table>

<h3>quality and VBR modes</h3>

Vorbis is natively a VBR codec; a user requests a given constant
<em>quality</em> and the encoder keeps the encoding quality constant
while allowing the bitrate to vary. 'Quality' modes (Variable BitRate)
will always produce the most consistent encoding results as well as
the highest quality for the number of bits used.

<p>
<table border=1 color=black width=50% cellspacing=0 cellpadding=7>
<tr bgcolor=#cccccc>
<td><b>parameter</b></td>
<td><b>description</b></td>
</tr>
<tr valign=top>
<td>quality</td>
<td>
A decimal float value requesting a desired quality. Libvorbisenc 1.1 allows quality requests in the range of -0.1 (lowest quality, smallest files) through +1.0 (highest quality, largest files). Quality -0.1 is intended as an ultra-low setting in which low bitrate is much more important than quality consistency. Quality settings 0.0 and above are intended to produce consistent results at all times.

</td>
</tr>
</table>

<a name="BBR">
<h3>managed bitrate modes</h3>

Although the Vorbis codec is natively VBR, libvorbis includes
infrastructure for 'managing' the bitrate of streams by setting
minimum and maximum usage constraints, as well as functionality for
nudging a stream toward a desired average value. These features
should <em>only</em> be used when there is a requirement to limit
bitrate in some way. Although the difference is usually slight,
managed bitrate modes will always produce output inferior to VBR
(given equal bitrate usage). Setting overly or impossibly tight
bitrate management requirements can affect output quality dramatically
for the worse.<p>

Beginning in libvorbis 1.1, bitrate management is implemented using a
<em>bit-reservoir</em> algorithm. The encoder has a fixed-size
reservoir used as a 'savings account' in encoding. When a frame is
smaller than the target rate, the unused bits go into the reservoir so
that they may be used by future frames. When a frame is larger than
the target rate, it draws 'banked' bits out of the reservoir. Encoding
is managed so that the reservoir never goes negative (when a maximum
bitrate is specified) or fills beyond a fixed limit (when a minimum
bitrate is specified). An 'average bitrate' request is used as the
set-point in a long-range bitrate tracker which adjusts the encoder's
aggressiveness up or down depending on whether frames are coming
in larger or smaller than the requested average point.

<p>
<table border=1 color=black width=50% cellspacing=0 cellpadding=7>
<tr bgcolor=#cccccc>
<td><b>parameter</b></td>
<td><b>description</b></td>
</tr>
<tr valign=top>
<td>maximum bitrate</td> <td> The maximum allowed bitrate, set in bits
per second.
If the bitrate would otherwise rise such that oversized
frames would underflow the bit-reservoir by consuming banked bits,
bitrate management will force the encoder to use fewer bits per frame
by encoding with a more aggressive psychoacoustic model.<p> This
setting is a hard limit; the bitstream will never be allowed, under
any circumstances, to increase above the specified bitrate over the
average period set by the reservoir; it may momentarily rise above it if
inspected at a granularity much finer than the average period across
the reservoir. Normally, the encoder will conserve bits gracefully by
using more aggressive psychoacoustics to shrink a frame when forced
to. However, if the encoder runs out of means of gracefully shrinking
a frame, it will simply take the smallest frame it can otherwise
generate and truncate it to the maximum allowed length. Note that
this is not an error and, although it will obviously adversely affect
audio quality, a Vorbis decoder will be able to decode a truncated
frame into audio.

</td>
</tr>

<tr valign=top>
<td>average bitrate</td>

<td>

The average desired bitrate of a stream, set
in bits per second. Average bitrate is tracked via a reservoir, like
minimum and maximum bitrate; however, the averaging reservoir does not
impose a hard limit; it is used to nudge the bitrate toward the
desired average by slowly adjusting the psychoacoustic aggressiveness.
As such, the reservoir size does not affect the average bitrate
behavior. Because this setting alone is not used to impose hard
bitrate limits, the bitrate of a stream produced using only the
<tt>average bitrate</tt> constraint will track the average over time
but not necessarily adhere strictly to that average for any given
period.
Should a strict localized average be required, <tt>average
bitrate</tt> should be used along with <tt>minimum bitrate</tt> and
<tt>maximum bitrate</tt>.
</td>

</tr>

<tr valign=top>
<td>minimum bitrate</td>
<td>
The minimum allowed bitrate, set in bits per second. If
the bitrate would otherwise fall such that undersized frames would
overflow the bit-reservoir with unused bits, bitrate management will
force the encoder to use more bits per frame by encoding with a less
aggressive psychoacoustic model.<p> This setting is a hard limit; the
bitstream will never be allowed, under any circumstances, to drop
below the specified bitrate over the average period set by the
reservoir; it may momentarily fall below it if inspected at a granularity
much finer than the average period across the reservoir. Normally,
the encoder will fill out undersized frames with additional useful
coding information by increasing the perceived quality of the stream.
If the encoder runs out of useful ways to consume more bits, it will
pad frames out with zeroes.
</td>
</tr>

<tr valign=top>
<td>reservoir size</td> <td> The size of the minimum/maximum bitrate
tracking reservoir, set in bits. The reservoir is used as a 'bit
bank' to average out localized surges and dips in bitrate while
providing predictable, guaranteed buffering behavior for streams to be
used in situations with constrained transport bandwidth. The default
setting is two seconds of average bitrate.<p>

When a single frame is larger than the maximum allowed overall
bitrate, the bits are 'borrowed' from the bitrate reservoir; if the
reservoir contains insufficient bits to cover the deficit, the encoder
must find some way to reduce the frame size.
<p>

When a frame is under the minimum limit, the surplus bits are placed
into the reservoir, banking them for future use. If the reservoir is
already full of banked bits, the encoder is forced to find some way to
make the frame larger.<p>

If the frame size is between the minimum and maximum rates (thus
implying the minimum and maximum allowed rates are different), the
reservoir gravitates toward a fill point configured by the
<tt>reservoir bias</tt> setting described next. If the reservoir is
fuller than the fill point (a 'surplus of surplus'), the encoder will
consume a number of bits from the reservoir equal to the number of
bits by which the frame exceeds minimum size. If the reservoir is
emptier than the fill point (a 'surplus of deficit'), bits are returned
to the reservoir equaling the current frame's number of bits under the
maximum frame size. The idea of the fill point is to buffer against
both underruns and overruns by trying to hold the reservoir to a
middle course.
</td>
</tr>

<tr valign=top>
<td>reservoir bias</td>

<td>

Reservoir bias is a setting between 0.0 and 1.0 that biases bitrate
management toward smoothing bitrate spikes (0.0) or bitrate dips
(1.0); the default setting is 0.1.<p>

Using settings toward 0.0 causes the bitrate manager to hoard bits in
the bit reservoir such that there is a large pool of banked surplus to
draw upon during short spikes in bitrate. As a result, the encoder
will react less aggressively and less drastically to curtail frame size
during brief surges in bitrate.<p>

Using settings toward 1.0 causes the bitrate manager to empty the bit
reservoir such that there is a large buffer available to store surplus
bits during sudden drops in bitrate.
As a result, the encoder will
react less aggressively and less drastically to support minimum frame
sizes during drops in bitrate and will tend not to store any extra
bits in the reservoir for future bitrate spikes.<p>

</td>
</tr>

<tr valign=top>
<td>average track damping</td>
<td>

A decimal value, in seconds, that controls how quickly the average
bitrate tracker is allowed to slew from enforcing minimum frame sizes
to maximum frame sizes and vice versa. The default value is 1.5
seconds.<p>

When the 'average bitrate' setting is in use, the average bitrate
tracker uses an unbounded reservoir to track overall bitrate-to-date
in the stream. When the bitrate is too low, the tracker will try to
nudge it up, and when the bitrate is too high, nudge it down.
The damping value regulates the maximum strength of the nudge; it
describes, in seconds, how quickly the tracker may transition from an
extreme nudge in one direction to an extreme nudge in the other.<p>

</td>
</tr>

</table>

<h3>encoding model adjustments</h3>

The <a href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> call provides
a generalized interface for making encoding setup adjustments to the
basic high-level setup provided by <a
href="vorbis_encode_setup_vbr.html">vorbis_encode_setup_vbr()</a> or <a
href="vorbis_encode_setup_managed.html">vorbis_encode_setup_managed()</a>.
In reality, these two calls use <a
href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> internally, and <a
href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> can be used to adjust
most of the parameters set by other calls.<p>

In Vorbis 1.1, <a href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> can
adjust the following additional parameters not described elsewhere:

<p>
<table border=1 color=black width=50% cellspacing=0 cellpadding=7>
<tr bgcolor=#cccccc>
<td><b>parameter</b></td>
<td><b>description</b></td>
</tr>
<tr valign=top>
<td>management mode</td> <td> Configures whether bitrate
management is in use. Normally, this value is set implicitly
during encoding setup; however, the supported means of selecting a
quality mode by bitrate (that is, requesting a true VBR stream, but
doing so by asking for an approximate bitrate) is to use <a
href="vorbis_encode_setup_managed.html">vorbis_encode_setup_managed()</a>
and then to explicitly turn off bitrate management by calling <a
href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> with <a
href="vorbis_encode_ctl.html#OV_ECTL_RATEMANAGE2_SET">OV_ECTL_RATEMANAGE2_SET</a>.
</td>
</tr>

<tr valign=top>
<td>coupling</td> <td> Stereo encodings (and, in the future, surround
encodings) are normally encoded assuming the channels form a stereo
image and that lossy-stereo modelling is appropriate; this is called
'coupling'. Stereo coupling may be explicitly enabled or disabled.
</td>
</tr>
<tr valign=top>
<td>lowpass</td> <td> Sets the hard lowpass of a given encoding mode;
this may be used to conserve a few bits in high-rate audio that has
limited bandwidth, or in testing of the encoder's acoustic model.
The
encoder is generally already configured with ideal lowpasses (if any
at all) for given modes; use of this parameter is strongly discouraged
if the point is to try to 'improve' a given encoding mode for general
encoding.
</td>
</tr>

<tr valign=top>
<td>impulse coding aggressiveness</td> <td>By default, libvorbis
attempts to compromise between preventing wide bitrate swings and
high-resolution impulse coding (which is required for the crispest
possible attacks, but also requires a relatively large momentary
bitrate increase). This parameter allows an application to tune the
compromise or eliminate it; a value of 0.0 indicates normal behavior
while a value of -15.0 requests maximum possible impulse
resolution.</td>
</tr>

</table>


<br><br>
<hr noshade>
<table border=0 width=100%>
<tr valign=top>
<td><p class=tiny>copyright © 2004 Vorbis team</p></td>
<td align=right><p class=tiny><a href="http://www.xiph.org/ogg/vorbis/index.html">Ogg Vorbis</a><br><a href="mailto:team@vorbis.org">team@vorbis.org</a></p></td>
</tr><tr>
<td><p class=tiny>libvorbisenc documentation</p></td>
<td align=right><p class=tiny>libvorbisenc release 1.1 - 20040709</p></td>
</tr>
</table>

</body>

</html>