1
2/* -----------------------------------------------------------------------------------------------------------
3Software License for The Fraunhofer FDK AAC Codec Library for Android
4
5� Copyright  1995 - 2013 Fraunhofer-Gesellschaft zur F�rderung der angewandten Forschung e.V.
6  All rights reserved.
7
8 1.    INTRODUCTION
9The Fraunhofer FDK AAC Codec Library for Android ("FDK AAC Codec") is software that implements
10the MPEG Advanced Audio Coding ("AAC") encoding and decoding scheme for digital audio.
11This FDK AAC Codec software is intended to be used on a wide variety of Android devices.
12
13AAC's HE-AAC and HE-AAC v2 versions are regarded as today's most efficient general perceptual
14audio codecs. AAC-ELD is considered the best-performing full-bandwidth communications codec by
15independent studies and is widely deployed. AAC has been standardized by ISO and IEC as part
16of the MPEG specifications.
17
18Patent licenses for necessary patent claims for the FDK AAC Codec (including those of Fraunhofer)
19may be obtained through Via Licensing (www.vialicensing.com) or through the respective patent owners
20individually for the purpose of encoding or decoding bit streams in products that are compliant with
21the ISO/IEC MPEG audio standards. Please note that most manufacturers of Android devices already license
22these patent claims through Via Licensing or directly from the patent owners, and therefore FDK AAC Codec
23software may already be covered under those patent licenses when it is used for those licensed purposes only.
24
25Commercially-licensed AAC software libraries, including floating-point versions with enhanced sound quality,
26are also available from Fraunhofer. Users are encouraged to check the Fraunhofer website for additional
27applications information and documentation.
28
292.    COPYRIGHT LICENSE
30
31Redistribution and use in source and binary forms, with or without modification, are permitted without
32payment of copyright license fees provided that you satisfy the following conditions:
33
34You must retain the complete text of this software license in redistributions of the FDK AAC Codec or
35your modifications thereto in source code form.
36
37You must retain the complete text of this software license in the documentation and/or other materials
38provided with redistributions of the FDK AAC Codec or your modifications thereto in binary form.
39You must make available free of charge copies of the complete source code of the FDK AAC Codec and your
40modifications thereto to recipients of copies in binary form.
41
42The name of Fraunhofer may not be used to endorse or promote products derived from this library without
43prior written permission.
44
45You may not charge copyright license fees for anyone to use, copy or distribute the FDK AAC Codec
46software or your modifications thereto.
47
48Your modified versions of the FDK AAC Codec must carry prominent notices stating that you changed the software
49and the date of any change. For modified versions of the FDK AAC Codec, the term
50"Fraunhofer FDK AAC Codec Library for Android" must be replaced by the term
51"Third-Party Modified Version of the Fraunhofer FDK AAC Codec Library for Android."
52
533.    NO PATENT LICENSE
54
55NO EXPRESS OR IMPLIED LICENSES TO ANY PATENT CLAIMS, including without limitation the patents of Fraunhofer,
56ARE GRANTED BY THIS SOFTWARE LICENSE. Fraunhofer provides no warranty of patent non-infringement with
57respect to this software.
58
59You may use this FDK AAC Codec software or modifications thereto only for purposes that are authorized
60by appropriate patent licenses.
61
624.    DISCLAIMER
63
64This FDK AAC Codec software is provided by Fraunhofer on behalf of the copyright holders and contributors
65"AS IS" and WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, including but not limited to the implied warranties
66of merchantability and fitness for a particular purpose. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR
67CONTRIBUTORS BE LIABLE for any direct, indirect, incidental, special, exemplary, or consequential damages,
68including but not limited to procurement of substitute goods or services; loss of use, data, or profits,
69or business interruption, however caused and on any theory of liability, whether in contract, strict
70liability, or tort (including negligence), arising in any way out of the use of this software, even if
71advised of the possibility of such damage.
72
735.    CONTACT INFORMATION
74
75Fraunhofer Institute for Integrated Circuits IIS
76Attention: Audio and Multimedia Departments - FDK AAC LL
77Am Wolfsmantel 33
7891058 Erlangen, Germany
79
80www.iis.fraunhofer.de/amm
81amm-info@iis.fraunhofer.de
82----------------------------------------------------------------------------------------------------------- */
83
84/**************************** MPEG-4 HE-AAC Encoder **************************
85
86  Initial author:       M. Lohwasser
87******************************************************************************/
88
89/**
90 * \file   aacenc_lib.h
91 * \brief  FDK AAC Encoder library interface header file.
92 *
93\mainpage  Introduction
94
95\section Scope
96
97This document describes the high-level interface and usage of the ISO/MPEG-2/4 AAC Encoder
98library developed by the Fraunhofer Institute for Integrated Circuits (IIS).
99
100The library implements encoding on the basis of the MPEG-2 and MPEG-4 AAC Low-Complexity
101standard, and depending on the library's configuration, MPEG-4 High-Efficiency AAC v2 and/or AAC-ELD standard.
102
103All references to SBR (Spectral Band Replication) are only applicable to HE-AAC or AAC-ELD versions
104of the library. All references to PS (Parametric Stereo) are only applicable to HE-AAC v2
105versions of the library.
106
107\section encBasics Encoder Basics
108
109This document can only give a rough overview about the ISO/MPEG-2 and ISO/MPEG-4 AAC audio coding
110standard. To understand all the terms in this document, you are encouraged to read the following documents.
111
112- ISO/IEC 13818-7 (MPEG-2 AAC), which defines the syntax of MPEG-2 AAC audio bitstreams.
113- ISO/IEC 14496-3 (MPEG-4 AAC, subparts 1 and 4), which defines the syntax of MPEG-4 AAC audio bitstreams.
114- Lutzky, Schuller, Gayer, Krämer, Wabnik, "A guideline to audio codec delay", 116th AES Convention, May 8, 2004
115
116MPEG Advanced Audio Coding is based on a time-to-frequency mapping of the signal. The signal is
117partitioned into overlapping portions and transformed into frequency domain. The spectral components
118are then quantized and coded. \n
119An MPEG-2 or MPEG-4 AAC audio bitstream is composed of frames. Contrary to MPEG-1/2 Layer-3 (mp3), the
120length of individual frames is not restricted to a fixed number of bytes, but can take on any length
121between 1 and 768 bytes.
122
123
124\page LIBUSE Library Usage
125
126\section InterfaceDescription API Files
127
128All API header files are located in the folder /include of the release package. All header files
129are provided for usage in C/C++ programs. The AAC encoder library API functions are located at
130aacenc_lib.h.
131
132In binary releases the encoder core resides in statically linkable libraries called for example
133libAACenc.a/libFDK.a (LINUX) or FDK_fastaaclib.lib (MS Visual C++) for the plain AAC-LC core encoder
134and libSBRenc.a (LINUX) or FDK_sbrEncLib.lib (MS Visual C++) for the SBR (Spectral Band
135Replication) and PS (Parametric Stereo) modules.
136
137\section CallingSequence Calling Sequence
138
139For encoding of ISO/MPEG-2/4 AAC bitstreams the following sequence is mandatory. Input read and output
140write functions as well as the corresponding open and close functions are left out, since they may be
141implemented differently according to the user's specific requirements. The example implementation in
142main.cpp uses file-based input/output.
143
144-# Call aacEncOpen() to allocate encoder instance with required \ref encOpen "configuration".\n
145\dontinclude main.cpp
146\skipline hAacEncoder =
147\skipline aacEncOpen
148-# Call aacEncoder_SetParam() for each parameter to be set. AOT, samplingrate, channelMode, bitrate and transport type are \ref encParams "mandatory".
149\code
150    ErrorStatus = aacEncoder_SetParam(hAacEncoder, parameter, value);
151\endcode
152-# Call aacEncEncode() with NULL parameters to \ref encReconf "initialize" encoder instance with present parameter set.
153\skipline aacEncEncode
154-# Call aacEncInfo() to retrieve a configuration data block to be transmitted out of band. This is required when using RFC3640 or RFC3016 like transport.
155\dontinclude main.cpp
156\skipline encInfo
157\skipline aacEncInfo
158-# Encode input audio data in loop.
159\skip Encode as long as
160\skipline do
161\until {
162Feed \ref feedInBuf "input buffer" with new audio data and provide input/output \ref bufDes "arguments" to aacEncEncode().
163\skipline aacEncEncode
164\until ;
165Write \ref writeOutData "output data" to file or audio device. \skipline while
166-# Call aacEncClose() and destroy encoder instance.
167\skipline aacEncClose
168
169\section encOpen Encoder Instance Allocation
170
171The assignment of the aacEncOpen() function is very flexible and can be used in the following way.
172- If the amount of memory consumption is not an issue, the encoder instance can be allocated
173for the maximum number of possible audio channels (for example 6 or 8) with the full functional range supported by the library.
174This is the default open procedure for the AAC encoder if memory consumption does not need to be minimized.
175\code aacEncOpen(&hAacEncoder,0,0) \endcode
176- If the required MPEG-4 AOTs do not call for the full functional range of the library, encoder modules can be allocated selectively.
177\verbatim
178------------------------------------------------------
179 AAC | SBR |  PS | MD |         FLAGS         | value
180-----+-----+-----+----+-----------------------+-------
181  X  |  -  |  -  |  - | (0x01)                |  0x01
182  X  |  X  |  -  |  - | (0x01|0x02)           |  0x03
183  X  |  X  |  X  |  - | (0x01|0x02|0x04)      |  0x07
184  X  |  -  |  -  |  X | (0x01          |0x10) |  0x11
185  X  |  X  |  -  |  X | (0x01|0x02     |0x10) |  0x13
186  X  |  X  |  X  |  X | (0x01|0x02|0x04|0x10) |  0x17
187------------------------------------------------------
188 - AAC: Allocate AAC Core Encoder module.
189 - SBR: Allocate Spectral Band Replication module.
190 - PS: Allocate Parametric Stereo module.
191 - MD: Allocate Meta Data module within AAC encoder.
192\endverbatim
193\code aacEncOpen(&hAacEncoder,value,0) \endcode
194- Specifying the maximum number of channels to be supported in the encoder instance can be done as follows.
195 - For example allocate an encoder instance which supports 2 channels for all supported AOTs.
196   The library itself may be capable of encoding up to 6 or 8 channels but in this example only 2 channel encoding is required and thus only buffers for 2 channels are allocated to save data memory.
197\code aacEncOpen(&hAacEncoder,0,2) \endcode
198 - Additionally the maximum number of supported channels in the SBR module can be denoted separately.\n
199   In this example the encoder instance provides a maximum of 6 channels out of which up to 2 channels support SBR.
200   This encoder instance can produce for example 5.1 channel AAC-LC streams or stereo HE-AAC (v2) streams.
201   HE-AAC 5.1 multi channel is not possible since only 2 out of 6 channels support SBR, which saves data memory.
202\code aacEncOpen(&hAacEncoder,0,6|(2<<8)) \endcode
203\n
204
205\section bufDes Input/Output Arguments
206
207\subsection allocIOBufs Provide Buffer Descriptors
208In the present encoder API, the input and output buffers are described with \ref AACENC_BufDesc "buffer descriptors". This mechanism allows a flexible handling
209of input and output buffers without impact to the actual encoding call. Optional buffers are necessary e.g. for ancillary data, meta data input or additional output
210buffers describing superframing data in DAB+ or DRM+.\n
211At least one input buffer for audio input data and one output buffer for bitstream data must be allocated. The input buffer size can be a user defined multiple
212of the number of input channels. PCM input data will be copied from the user defined PCM buffer to an internal input buffer and so input data can be less than one AAC audio frame.
213The output buffer size should be 6144 bits per channel excluding the LFE channel.
214If the output data does not fit into the provided buffer, an AACENC_ERROR will be returned by aacEncEncode().
215\dontinclude main.cpp
216\skipline inputBuffer
217\until outputBuffer
218All input and output buffer must be clustered in input and output buffer arrays.
219\skipline inBuffer
220\until outBufferElSize
221Allocate buffer descriptors
222\skipline AACENC_BufDesc
223\skipline AACENC_BufDesc
224Initialize input buffer descriptor
225\skipline inBufDesc
226\until bufElSizes
227Initialize output buffer descriptor
228\skipline outBufDesc
229\until bufElSizes
230
231\subsection argLists Provide Input/Output Argument Lists
232The input and output arguments of an aacEncEncode() call are described in argument structures.
233\dontinclude main.cpp
234\skipline AACENC_InArgs
235\skipline AACENC_OutArgs
236
237\section feedInBuf Feed Input Buffer
238The input buffer should be handled as a modulo buffer. New audio data in the form of pulse-code-
239modulated samples (PCM) must be read from external and be fed to the input buffer depending on its
240fill level. The required sample bitrate (represented by the data type INT_PCM which is 16, 24 or 32
241bits wide) is fixed and depends on library configuration (usually 16 bit).
242
243\dontinclude main.cpp
244\skipline WAV_InputRead
245\until ;
246After the encoder's internal buffer is fed with incoming audio samples, and aacEncEncode()
247processed the new input data, update/move remaining samples in input buffer, simulating a modulo buffer:
248\skipline outargs.numInSamples>0
249\until }
250
251\section writeOutData Output Bitstream Data
252If any AAC bitstream data is available, write it to output file or device. This can be done once the
253following condition is true:
254\dontinclude main.cpp
255\skip Valid bitstream available
256\skipline outargs
257
258\skipline outBytes>0
259
260If you use file I/O then for example call mpegFileWrite_Write() from the library libMpegFileWrite
261
262\dontinclude main.cpp
263\skipline mpegFileWrite_Write
264
265\section cfgMetaData Meta Data Configuration
266
267If the present library is configured with Metadata support, it is possible to insert meta data side info into the generated
268audio bitstream while encoding.
269
270To work with meta data the encoder instance has to be \ref encOpen "allocated" with meta data support. The meta data mode must be be configured with
271the ::AACENC_METADATA_MODE parameter and aacEncoder_SetParam() function.
272\code aacEncoder_SetParam(hAacEncoder, AACENC_METADATA_MODE, 0-2); \endcode
273
274This configuration indicates how to embed meta data into bitstrem. Either no insertion, MPEG or ETSI style.
275The meta data itself must be specified within the meta data setup structure AACENC_MetaData.
276
277Changing one of the AACENC_MetaData setup parameters can be achieved from outside the library within ::IN_METADATA_SETUP input
278buffer. There is no need to supply meta data setup structure every frame. If there is no new meta setup data available, the
279encoder uses the previous setup or the default configuration in initial state.
280
281In general the audio compressor and limiter within the encoder library can be configured with the ::AACENC_METADATA_DRC_PROFILE parameter
282AACENC_MetaData::drc_profile and and AACENC_MetaData::comp_profile.
283\n
284
285\section encReconf Encoder Reconfiguration
286
287The encoder library allows reconfiguration of the encoder instance with new settings
288continuously between encoding frames. Each parameter to be changed must be set with
289a single aacEncoder_SetParam() call. The internal status of each parameter can be
290retrieved with an aacEncoder_GetParam() call.\n
291There is no stand-alone reconfiguration function available. When parameters were
292modified from outside the library, an internal control mechanism triggers the necessary
293reconfiguration process which will be applied at the beginning of the following
294aacEncEncode() call. This state can be observed from external via the AACENC_INIT_STATUS
295and aacEncoder_GetParam() function. The reconfiguration process can also be applied
296immediately when all parameters of an aacEncEncode() call are NULL with a valid encoder
297handle.\n\n
298The internal reconfiguration process can be controlled from extern with the following access.
299\code aacEncoder_SetParam(hAacEncoder, AACENC_CONTROL_STATE, AACENC_CTRLFLAGS); \endcode
300
301
302\section encParams Encoder Parametrization
303
304All parameteres listed in ::AACENC_PARAM can be modified within an encoder instance.
305
306\subsection encMandatory Mandatory Encoder Parameters
307The following parameters must be specified when the encoder instance is initialized.
308\code
309aacEncoder_SetParam(hAacEncoder, AACENC_AOT, value);
310aacEncoder_SetParam(hAacEncoder, AACENC_BITRATE, value);
311aacEncoder_SetParam(hAacEncoder, AACENC_SAMPLERATE, value);
312aacEncoder_SetParam(hAacEncoder, AACENC_CHANNELMODE, value);
313\endcode
314Beyond that is an internal auto mode which preinitizializes the ::AACENC_BITRATE parameter
315if the parameter was not set from extern. The bitrate depends on the number of effective
316channels and sampling rate and is determined as follows.
317\code
318AAC-LC (AOT_AAC_LC): 1.5 bits per sample
319HE-AAC (AOT_SBR): 0.625 bits per sample (dualrate sbr)
320HE-AAC (AOT_SBR): 1.125 bits per sample (downsampled sbr)
321HE-AAC v2 (AOT_PS): 0.5 bits per sample
322\endcode
323
324\subsection channelMode Channel Mode Configuration
325The input audio data is described with the ::AACENC_CHANNELMODE parameter in the
326aacEncoder_SetParam() call. It is not possible to use the encoder instance with a 'number of
327input channels' argument. Instead, the channelMode must be set as follows.
328\code aacEncoder_SetParam(hAacEncoder, AACENC_CHANNELMODE, value); \endcode
329The parameter is specified in ::CHANNEL_MODE and can be mapped from the number of input channels
330in the following way.
331\dontinclude main.cpp
332\skip CHANNEL_MODE chMode = MODE_INVALID;
333\until return
334
335\subsection encQual Audio Quality Considerations
336The default encoder configuration is suggested to be used. Encoder tools such as TNS and PNS
337are activated by default and are internally controlled (see \ref BEHAVIOUR_TOOLS).
338
339There is an additional quality parameter called ::AACENC_AFTERBURNER. In the default
340configuration this quality switch is deactivated because it would cause a workload
341increase which might be significant. If workload is not an issue in the application
342we recommended to activate this feature.
343\code aacEncoder_SetParam(hAacEncoder, AACENC_AFTERBURNER, 1); \endcode
344
345\subsection encELD ELD Auto Configuration Mode
346For ELD configuration a so called auto configurator is available which configures SBR and the SBR ratio by itself.
347The configurator is used when the encoder parameter ::AACENC_SBR_MODE and ::AACENC_SBR_RATIO are not set explicitely.
348
349Based on sampling rate and chosen bitrate per channel a reasonable SBR configuration will be used.
350\verbatim
351------------------------------------------------------------
352  Sampling Rate  | Channel Bitrate |  SBR |       SBR Ratio
353-----------------+-----------------+------+-----------------
354 ]min, 16] kHz   |     min - 27999 |   on | downsampled SBR
355                 |   28000 -   max |  off |             ---
356-----------------+-----------------+------+-----------------
357 ]16 - 24] kHz   |     min - 39999 |   on | downsampled SBR
358                 |   40000 -   max |  off |             ---
359-----------------+-----------------+------+-----------------
360 ]24 - 32] kHz   |     min - 27999 |   on |    dualrate SBR
361                 |   28000 - 55999 |   on | downsampled SBR
362                 |   56000 -   max |  off |             ---
363-----------------+-----------------+------+-----------------
364 ]32 - 44.1] kHz |     min - 63999 |   on |    dualrate SBR
365                 |   64000 -   max |  off |             ---
366-----------------+-----------------+------+-----------------
367 ]44.1 - 48] kHz |     min - 63999 |   on |    dualrate SBR
368                 |   64000 - max   |  off |             ---
369------------------------------------------------------------
370\endverbatim
371
372
373\section audiochCfg Audio Channel Configuration
374The MPEG standard refers often to the so-called Channel Configuration. This Channel Configuration is used for a fixed Channel
375Mapping. The configurations 1-7 are predefined in MPEG standard and used for implicit signalling within the encoded bitstream.
376For user defined Configurations the Channel Configuration is set to 0 and the Channel Mapping must be explecitly described with an appropriate
377Program Config Element. The present Encoder implementation does not allow the user to configure this Channel Configuration from
378extern. The Encoder implementation supports fixed Channel Modes which are mapped to Channel Configuration as follow.
379\verbatim
380-------------------------------------------------------------------------------
381 ChannelMode           | ChCfg  | front_El      | side_El  | back_El  | lfe_El
382-----------------------+--------+---------------+----------+----------+--------
383MODE_1                 |      1 | SCE           |          |          |
384MODE_2                 |      2 | CPE           |          |          |
385MODE_1_2               |      3 | SCE, CPE      |          |          |
386MODE_1_2_1             |      4 | SCE, CPE      |          | SCE      |
387MODE_1_2_2             |      5 | SCE, CPE      |          | CPE      |
388MODE_1_2_2_1           |      6 | SCE, CPE      |          | CPE      | LFE
389MODE_1_2_2_2_1         |      7 | SCE, CPE, CPE |          | CPE      | LFE
390-----------------------+--------+---------------+----------+----------+--------
391MODE_7_1_REAR_SURROUND |      0 | SCE, CPE      |          | CPE, CPE | LFE
392MODE_7_1_FRONT_CENTER  |      0 | SCE, CPE, CPE |          | CPE      | LFE
393-------------------------------------------------------------------------------
394 - SCE: Single Channel Element.
395 - CPE: Channel Pair.
396 - SCE: Low Frequency Element.
397\endverbatim
398
399Moreover, the Table describes all fixed Channel Elements for each Channel Mode which are assigned to a speaker arrangement. The
400arrangement includes front, side, back and lfe Audio Channel Elements.\n
401This mapping of Audio Channel Elements is defined in MPEG standard for Channel Config 1-7. The Channel assignment for MODE_1_1,
402MODE_2_2 and MODE_2_1 is used from the ARIB standard. All other configurations are defined as suggested in MPEG.\n
403In case of Channel Config 0 or writing matrix mixdown coefficients, the encoder enables the writing of Program Config Element
404itself as described in \ref encPCE. The configuration used in Program Config Element refers to the denoted Table.\n
405Beside the Channel Element assignment the Channel Modes are resposible for audio input data channel mapping. The Channel Mapping
406of the audio data depends on the selected ::AACENC_CHANNELORDER which can be MPEG or WAV like order.\n
407Following Table describes the complete channel mapping for both Channel Order configurations.
408\verbatim
409---------------------------------------------------------------------------------------
410ChannelMode            |  MPEG-Channelorder            |  WAV-Channelorder
411-----------------------+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---
412MODE_1                 | 0 |   |   |   |   |   |   |   | 0 |   |   |   |   |   |   |
413MODE_2                 | 0 | 1 |   |   |   |   |   |   | 0 | 1 |   |   |   |   |   |
414MODE_1_2               | 0 | 1 | 2 |   |   |   |   |   | 2 | 0 | 1 |   |   |   |   |
415MODE_1_2_1             | 0 | 1 | 2 | 3 |   |   |   |   | 2 | 0 | 1 | 3 |   |   |   |
416MODE_1_2_2             | 0 | 1 | 2 | 3 | 4 |   |   |   | 2 | 0 | 1 | 3 | 4 |   |   |
417MODE_1_2_2_1           | 0 | 1 | 2 | 3 | 4 | 5 |   |   | 2 | 0 | 1 | 4 | 5 | 3 |   |
418MODE_1_2_2_2_1         | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 2 | 6 | 7 | 0 | 1 | 4 | 5 | 3
419-----------------------+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---
420MODE_7_1_REAR_SURROUND | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 2 | 0 | 1 | 6 | 7 | 4 | 5 | 3
421MODE_7_1_FRONT_CENTER  | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 2 | 6 | 7 | 0 | 1 | 4 | 5 | 3
422---------------------------------------------------------------------------------------
423\endverbatim
424
425The denoted mapping is important for correct audio channel assignment when using MPEG or WAV ordering. The incoming audio
426channels are distributed MPEG like starting at the front channels and ending at the back channels. The distribution is used as
427described in Table concering Channel Config and fix channel elements. Please see the following example for clarification.
428
429\verbatim
430Example: MODE_1_2_2_1 - WAV-Channelorder 5.1
431------------------------------------------
432 Input Channel      | Coder Channel
433--------------------+---------------------
434 2 (front center)   | 0 (SCE channel)
435 0 (left center)    | 1 (1st of 1st CPE)
436 1 (right center)   | 2 (2nd of 1st CPE)
437 4 (left surround)  | 3 (1st of 2nd CPE)
438 5 (right surround) | 4 (2nd of 2nd CPE)
439 3 (LFE)            | 5 (LFE)
440------------------------------------------
441\endverbatim
442
443
444\section suppBitrates Supported Bitrates
445
446The FDK AAC Encoder provides a wide range of supported bitrates.
447The minimum and maximum allowed bitrate depends on the Audio Object Type. For AAC-LC the minimum
448bitrate is the bitrate that is required to write the most basic and minimal valid bitstream.
449It consists of the bitstream format header information and other static/mandatory information
450within the AAC payload. The maximum AAC framesize allowed by the MPEG-4 standard
451determines the maximum allowed bitrate for AAC-LC. For HE-AAC and HE-AAC v2 a library internal
452look-up table is used.
453
454A good working point in terms of audio quality, sampling rate and bitrate, is at 1 to 1.5
455bits/audio sample for AAC-LC, 0.625 bits/audio sample for dualrate HE-AAC, 1.125 bits/audio sample
456for downsampled HE-AAC and 0.5 bits/audio sample for HE-AAC v2.
457For example for one channel with a sampling frequency of 48 kHz, the range from
45848 kbit/s to 72 kbit/s achieves reasonable audio quality for AAC-LC.
459
460For HE-AAC and HE-AAC v2 the lowest possible audio input sampling frequency is 16 kHz because then the
461AAC-LC core encoder operates in dual rate mode at its lowest possible sampling frequency, which is 8 kHz.
462HE-AAC v2 requires stereo input audio data.
463
464Please note that in HE-AAC or HE-AAC v2 mode the encoder supports much higher bitrates than are
465appropriate for HE-AAC or HE-AAC v2. For example, at a bitrate of more than 64 kbit/s for a stereo
466audio signal at 44.1 kHz it usually makes sense to use AAC-LC, which will produce better audio
467quality at that bitrate than HE-AAC or HE-AAC v2.
468
469\section reommendedConfig Recommended Sampling Rate and Bitrate Combinations
470
471The following table provides an overview of recommended encoder configuration parameters
472which we determined by virtue of numerous listening tests.
473
474\subsection reommendedConfigLC AAC-LC, HE-AAC, HE-AACv2 in Dualrate SBR mode.
475\verbatim
476-----------------------------------------------------------------------------------
477Audio Object Type  |  Bit Rate Range  |            Supported  | Preferred  | No. of
478                   |         [bit/s]  |       Sampling Rates  |    Sampl.  |  Chan.
479                   |                  |                [kHz]  |      Rate  |
480                   |                  |                       |     [kHz]  |
481-------------------+------------------+-----------------------+------------+-------
482AAC LC + SBR + PS  |   8000 -  11999  |         22.05, 24.00  |     24.00  |      2
483AAC LC + SBR + PS  |  12000 -  17999  |                32.00  |     32.00  |      2
484AAC LC + SBR + PS  |  18000 -  39999  |  32.00, 44.10, 48.00  |     44.10  |      2
485AAC LC + SBR + PS  |  40000 -  56000  |  32.00, 44.10, 48.00  |     48.00  |      2
486-------------------+------------------+-----------------------+------------+-------
487AAC LC + SBR       |   8000 -  11999  |         22.05, 24.00  |     24.00  |      1
488AAC LC + SBR       |  12000 -  17999  |                32.00  |     32.00  |      1
489AAC LC + SBR       |  18000 -  39999  |  32.00, 44.10, 48.00  |     44.10  |      1
490AAC LC + SBR       |  40000 -  56000  |  32.00, 44.10, 48.00  |     48.00  |      1
491AAC LC + SBR       |  16000 -  27999  |  32.00, 44.10, 48.00  |     32.00  |      2
492AAC LC + SBR       |  28000 -  63999  |  32.00, 44.10, 48.00  |     44.10  |      2
493AAC LC + SBR       |  64000 - 128000  |  32.00, 44.10, 48.00  |     48.00  |      2
494-------------------+------------------+-----------------------+------------+-------
495AAC LC + SBR       |  64000 -  69999  |  32.00, 44.10, 48.00  |     32.00  | 5, 5.1
496AAC LC + SBR       |  70000 - 159999  |  32.00, 44.10, 48.00  |     44.10  | 5, 5.1
497AAC LC + SBR       | 160000 - 245999  |  32.00, 44.10, 48.00  |     48.00  |      5
498AAC LC + SBR       | 160000 - 265999  |  32.00, 44.10, 48.00  |     48.00  |    5.1
499-------------------+------------------+-----------------------+------------+-------
500AAC LC             |   8000 -  15999  | 11.025, 12.00, 16.00  |     12.00  |      1
501AAC LC             |  16000 -  23999  |                16.00  |     16.00  |      1
502AAC LC             |  24000 -  31999  |  16.00, 22.05, 24.00  |     24.00  |      1
503AAC LC             |  32000 -  55999  |                32.00  |     32.00  |      1
504AAC LC             |  56000 - 160000  |  32.00, 44.10, 48.00  |     44.10  |      1
505AAC LC             | 160001 - 288000  |                48.00  |     48.00  |      1
506-------------------+------------------+-----------------------+------------+-------
507AAC LC             |  16000 -  23999  | 11.025, 12.00, 16.00  |     12.00  |      2
508AAC LC             |  24000 -  31999  |                16.00  |     16.00  |      2
509AAC LC             |  32000 -  39999  |  16.00, 22.05, 24.00  |     22.05  |      2
510AAC LC             |  40000 -  95999  |                32.00  |     32.00  |      2
511AAC LC             |  96000 - 111999  |  32.00, 44.10, 48.00  |     32.00  |      2
512AAC LC             | 112000 - 320001  |  32.00, 44.10, 48.00  |     44.10  |      2
513AAC LC             | 320002 - 576000  |                48.00  |     48.00  |      2
514-------------------+------------------+-----------------------+------------+-------
515AAC LC             | 160000 - 239999  |                32.00  |     32.00  | 5, 5.1
516AAC LC             | 240000 - 279999  |  32.00, 44.10, 48.00  |     32.00  | 5, 5.1
517AAC LC             | 280000 - 800000  |  32.00, 44.10, 48.00  |     44.10  | 5, 5.1
518-----------------------------------------------------------------------------------
519\endverbatim \n
520
521\subsection reommendedConfigLD AAC-LD, AAC-ELD, AAC-ELD with SBR in Dualrate SBR mode.
522\verbatim
523-----------------------------------------------------------------------------------
524Audio Object Type  |  Bit Rate Range  |            Supported  | Preferred  | No. of
525                   |         [bit/s]  |       Sampling Rates  |    Sampl.  |  Chan.
526                   |                  |                [kHz]  |      Rate  |
527                   |                  |                       |     [kHz]  |
528-------------------+------------------+-----------------------+------------+-------
529ELD + SBR          |  18000 -  24999  |        32.00 - 44.10  |     32.00  |      1
530ELD + SBR          |  25000 -  31999  |        32.00 - 48.00  |     32.00  |      1
531ELD + SBR          |  32000 -  64000  |        32.00 - 48.00  |     48.00  |      1
532-------------------+------------------+-----------------------+------------+-------
533ELD + SBR          |  32000 -  51999  |        32.00 - 48.00  |     44.10  |      2
534ELD + SBR          |  52000 - 128000  |        32.00 - 48.00  |     48.00  |      2
535-------------------+------------------+-----------------------+------------+-------
536ELD + SBR          |  72000 - 160000  |        44.10 - 48.00  |     48.00  |      3
537-------------------+------------------+-----------------------+------------+-------
538ELD + SBR          |  96000 - 212000  |        44.10 - 48.00  |     48.00  |      4
539-------------------+------------------+-----------------------+------------+-------
540ELD + SBR          | 120000 - 246000  |        44.10 - 48.00  |     48.00  |      5
541-------------------+------------------+-----------------------+------------+-------
542ELD + SBR          | 120000 - 266000  |        44.10 - 48.00  |     48.00  |    5.1
543-------------------+------------------+-----------------------+------------+-------
544LD, ELD            |  16000 -  19999  |        16.00 - 24.00  |     16.00  |      1
545LD, ELD            |  20000 -  39999  |        16.00 - 32.00  |     24.00  |      1
546LD, ELD            |  40000 -  49999  |        22.05 - 32.00  |     32.00  |      1
547LD, ELD            |  50000 -  61999  |        24.00 - 44.10  |     32.00  |      1
548LD, ELD            |  62000 -  84999  |        32.00 - 48.00  |     44.10  |      1
549LD, ELD            |  85000 - 192000  |        44.10 - 48.00  |     48.00  |      1
550-------------------+------------------+-----------------------+------------+-------
551LD, ELD            |  64000 -  75999  |        24.00 - 32.00  |     32.00  |      2
552LD, ELD            |  76000 -  97999  |        24.00 - 44.10  |     32.00  |      2
553LD, ELD            |  98000 - 135999  |        32.00 - 48.00  |     44.10  |      2
554LD, ELD            | 136000 - 384000  |        44.10 - 48.00  |     48.00  |      2
555-------------------+------------------+-----------------------+------------+-------
556LD, ELD            |  96000 - 113999  |        24.00 - 32.00  |     32.00  |      3
557LD, ELD            | 114000 - 146999  |        24.00 - 44.10  |     32.00  |      3
558LD, ELD            | 147000 - 203999  |        32.00 - 48.00  |     44.10  |      3
559LD, ELD            | 204000 - 576000  |        44.10 - 48.00  |     48.00  |      3
560-------------------+------------------+-----------------------+------------+-------
561LD, ELD            | 128000 - 151999  |        24.00 - 32.00  |     32.00  |      4
562LD, ELD            | 152000 - 195999  |        24.00 - 44.10  |     32.00  |      4
563LD, ELD            | 196000 - 271999  |        32.00 - 48.00  |     44.10  |      4
564LD, ELD            | 272000 - 768000  |        44.10 - 48.00  |     48.00  |      4
565-------------------+------------------+-----------------------+------------+-------
566LD, ELD            | 160000 - 189999  |        24.00 - 32.00  |     32.00  |      5
567LD, ELD            | 190000 - 244999  |        24.00 - 44.10  |     32.00  |      5
568LD, ELD            | 245000 - 339999  |        32.00 - 48.00  |     44.10  |      5
569LD, ELD            | 340000 - 960000  |        44.10 - 48.00  |     48.00  |      5
570-----------------------------------------------------------------------------------
571\endverbatim \n
572
573\subsection reommendedConfigELD AAC-ELD with SBR in Downsampled SBR mode.
574\verbatim
575-----------------------------------------------------------------------------------
576Audio Object Type  |  Bit Rate Range  |            Supported  | Preferred  | No. of
577                   |         [bit/s]  |       Sampling Rates  |    Sampl.  |  Chan.
578                   |                  |                [kHz]  |      Rate  |
579                   |                  |                       |     [kHz]  |
580-------------------+------------------+-----------------------+------------+-------
581ELD + SBR          |  18000 -  24999  |        16.00 - 22.05  |     22.05  |      1
582(downsampled SBR)  |  25000 -  35999  |        22.05 - 32.00  |     24.00  |      1
583                   |  36000 -  64000  |        32.00 - 48.00  |     32.00  |      1
584-----------------------------------------------------------------------------------
585\endverbatim \n
586
587
588\page ENCODERBEHAVIOUR Encoder Behaviour
589
590\section BEHAVIOUR_BANDWIDTH Bandwidth
591
592The FDK AAC encoder usually does not use the full frequency range of the input signal, but restricts the bandwidth
593according to certain library-internal settings. They can be changed in the table "bandWidthTable" in the
594file bandwidth.cpp (if available).
595
596The encoder API provides the ::AACENC_BANDWIDTH parameter to adjust the bandwidth explicitly.
597\code
598aacEncoder_SetParam(hAacEncoder, AACENC_BANDWIDTH, value);
599\endcode
600
601However it is not recommended to change these settings, because they are based on numerious listening
602tests and careful tweaks to ensure the best overall encoding quality.
603
604Theoretically a signal of for example 48 kHz can contain frequencies up to 24 kHz, but to use this full range
605in an audio encoder usually does not make sense. Usually the encoder has a very limited amount of
606bits to spend (typically 128 kbit/s for stereo 48 kHz content) and to allow full range bandwidth would
607waste a lot of these bits for frequencies the human ear is hardly able to perceive anyway, if at all. Hence it
608is wise to use the available bits for the really important frequency range and just skip the rest.
609At lower bitrates (e. g. <= 80 kbit/s for stereo 48 kHz content) the encoder will choose an even smaller
610bandwidth, because an encoded signal with smaller bandwidth and hence less artifacts sounds better than a signal
611with higher bandwidth but then more coding artefacts across all frequencies. These artefacts would occur if
612small bitrates and high bandwidths are chosen because the available bits are just not enough to encode all
613frequencies well.
614
615Unfortunately some people evaluate encoding quality based on possible bandwidth as well, but it is a two-sided
616sword considering the trade-off described above.
617
618Another aspect is workload consumption. The higher the allowed bandwidth, the more frequency lines have to be
619processed, which in turn increases the workload.
620
621\section FRAMESIZES_AND_BIT_RESERVOIR Frame Sizes & Bit Reservoir
622
623For AAC there is a difference between constant bit rate and constant frame
624length due to the so-called bit reservoir technique, which allows the encoder to use less
625bits in an AAC frame for those audio signal sections which are easy to encode,
626and then spend them at a later point in
627time for more complex audio sections. The extent to which this "bit exchange"
628is done is limited to allow for reliable and relatively low delay real time
629streaming.
630Over a longer period in time the bitrate will be constant in the AAC constant
631bitrate mode, e.g. for ISDN transmission. This means that in AAC each bitstream
632frame will in general have a different length in bytes but over time it
633will reach the target bitrate. One could also make an MPEG compliant
634AAC encoder which always produces constant length packages for each AAC frame,
635but the audio quality would be considerably worse since the bit reservoir
636technique would have to be switched off completely. A higher bit rate would have
637to be used to get the same audio quality as with an enabled bit reservoir.
638
639The maximum AAC frame length, regardless of the available bit reservoir, is defined
640as 6144 bits per channel.
641
642For mp3 by the way, the same bit reservoir technique exists, but there each bit
643stream frame has a constant length for a given bit rate (ignoring the
644padding byte). In mp3 there is a so-called "back pointer" which tells
645the decoder which bits belong to the current mp3 frame - and in general some or
646many bits have been transmitted in an earlier mp3 frame. Basically this leads to
647the same "bit exchange between mp3 frames" as in AAC but with virtually constant
648length frames.
649
650This variable frame length at "constant bit rate" is not something special
651in this Fraunhofer IIS AAC encoder. AAC has been designed in that way.
652
653\subsection BEHAVIOUR_ESTIM_AVG_FRAMESIZES Estimating Average Frame Sizes
654
655A HE-AAC v1 or v2 audio frame contains 2048 PCM samples per channel (there is
656also one mode with 1920 samples per channel but this is only for special purposes
657such as DAB+ digital radio).
658
659The number of HE-AAC frames \f$N\_FRAMES\f$ per second at 44.1 kHz is:
660
661\f[
662N\_FRAMES = 44100 / 2048 = 21.5332
663\f]
664
665At a bit rate of 8 kbps the average number of bits per frame \f$N\_BITS\_PER\_FRAME\f$ is:
666
667\f[
668N\_BITS\_PER\_FRAME = 8000 / 21.5332 = 371.52
669\f]
670
671which is about 46.44 bytes per encoded frame.
672
673At a bit rate of 32 kbps, which is quite high for single channel HE-AAC v1, it is:
674
675\f[
676N\_BITS\_PER\_FRAME = 32000 / 21.5332 = 1486
677\f]
678
679which is about 185.76 bytes per encoded frame.
680
681These bits/frame figures are average figures where each AAC frame generally has a different
682size in bytes. To calculate the same for AAC-LC just use 1024 instead of 2048 PCM samples per
683frame and channel.
684For AAC-LD/ELD it is either 480 or 512 PCM samples per frame and channel.
685
686
687\section BEHAVIOUR_TOOLS Encoder Tools
688
689The AAC encoder supports TNS, PNS, MS, Intensity and activates these tools depending on the audio signal and
690the encoder configuration (i.e. bitrate or AOT). It is not required to configure these tools manually.
691
692PNS improves encoding quality only for certain bitrates. Therefore it makes sense to activate PNS only for
693these bitrates and save the processing power required for PNS (about 10 % of the encoder) when using other
694bitrates. This is done automatically inside the encoder library. PNS is disabled inside the encoder library if
695an MPEG-2 AOT is choosen since PNS is an MPEG-4 AAC feature.
696
697If SBR is activated, the encoder automatically deactivates PNS internally. If TNS is disabled but PNS is allowed,
698the encoder deactivates PNS calculation internally.
699
700*/
701
702#ifndef _AAC_ENC_LIB_H_
703#define _AAC_ENC_LIB_H_
704
705#include "machine_type.h"
706#include "FDK_audio.h"
707
708
709/**
710 *  AAC encoder error codes.
711 */
712typedef enum {
713    AACENC_OK                     = 0x0000,  /*!< No error happened. All fine. */
714
715    AACENC_INVALID_HANDLE         = 0x0020,  /*!< Handle passed to function call was invalid. */
716    AACENC_MEMORY_ERROR           = 0x0021,  /*!< Memory allocation failed. */
717    AACENC_UNSUPPORTED_PARAMETER  = 0x0022,  /*!< Parameter not available. */
718    AACENC_INVALID_CONFIG         = 0x0023,  /*!< Configuration not provided. */
719
720    AACENC_INIT_ERROR             = 0x0040,  /*!< General initialization error. */
721    AACENC_INIT_AAC_ERROR         = 0x0041,  /*!< AAC library initialization error. */
722    AACENC_INIT_SBR_ERROR         = 0x0042,  /*!< SBR library initialization error. */
723    AACENC_INIT_TP_ERROR          = 0x0043,  /*!< Transport library initialization error. */
724    AACENC_INIT_META_ERROR        = 0x0044,  /*!< Meta data library initialization error. */
725
726    AACENC_ENCODE_ERROR           = 0x0060,  /*!< The encoding process was interrupted by an unexpected error. */
727
728    AACENC_ENCODE_EOF             = 0x0080   /*!< End of file reached. */
729
730} AACENC_ERROR;
731
732
733/**
734 *  AAC encoder buffer descriptors identifier.
735 *  This identifier are used within buffer descriptors AACENC_BufDesc::bufferIdentifiers.
736 */
737typedef enum {
738    /* Input buffer identifier. */
739    IN_AUDIO_DATA      = 0,                  /*!< Audio input buffer, interleaved INT_PCM samples. */
740    IN_ANCILLRY_DATA   = 1,                  /*!< Ancillary data to be embedded into bitstream. */
741    IN_METADATA_SETUP  = 2,                  /*!< Setup structure for embedding meta data. */
742
743    /* Output buffer identifier. */
744    OUT_BITSTREAM_DATA = 3,                  /*!< Buffer holds bitstream output data. */
745    OUT_AU_SIZES       = 4                   /*!< Buffer contains sizes of each access unit. This information
746                                                  is necessary for superframing. */
747
748} AACENC_BufferIdentifier;
749
750
751/**
752 *  AAC encoder handle.
753 */
754typedef struct AACENCODER *HANDLE_AACENCODER;
755
756
757/**
758 *  Provides some info about the encoder configuration.
759 */
760typedef struct {
761
762    UINT                maxOutBufBytes;      /*!< Maximum number of encoder bitstream bytes within one frame.
763                                                  Size depends on maximum number of supported channels in encoder instance.
764                                                  For superframing (as used for example in DAB+), size has to be a multiple accordingly. */
765
766    UINT                maxAncBytes;         /*!< Maximum number of ancillary data bytes which can be inserted into
767                                                  bitstream within one frame. */
768
769    UINT                inBufFillLevel;      /*!< Internal input buffer fill level in samples per channel. This parameter
770                                                  will automatically be cleared if samplingrate or channel(Mode/Order) changes. */
771
772    UINT                inputChannels;       /*!< Number of input channels expected in encoding process. */
773
774    UINT                frameLength;         /*!< Amount of input audio samples consumed each frame per channel, depending
775                                                  on audio object type configuration. */
776
777    UINT                encoderDelay;        /*!< Codec delay in PCM samples/channel. Depends on framelength and AOT. Does not
778                                                  include framing delay for filling up encoder PCM input buffer. */
779
780    UCHAR               confBuf[64];         /*!< Configuration buffer in binary format as an AudioSpecificConfig
781                                                  or StreamMuxConfig according to the selected transport type. */
782
783    UINT                confSize;            /*!< Number of valid bytes in confBuf. */
784
785} AACENC_InfoStruct;
786
787
788/**
789 *  Describes the input and output buffers for an aacEncEncode() call.
790 */
791typedef struct {
792    INT                 numBufs;             /*!< Number of buffers. */
793    void              **bufs;                /*!< Pointer to vector containing buffer addresses. */
794    INT                *bufferIdentifiers;   /*!< Identifier of each buffer element. See ::AACENC_BufferIdentifier. */
795    INT                *bufSizes;            /*!< Size of each buffer in 8-bit bytes. */
796    INT                *bufElSizes;          /*!< Size of each buffer element in bytes. */
797
798} AACENC_BufDesc;
799
800
801/**
802 *  Defines the input arguments for an aacEncEncode() call.
803 */
804typedef struct {
805    INT                 numInSamples;        /*!< Number of valid input audio samples (multiple of input channels). */
806    INT                 numAncBytes;         /*!< Number of ancillary data bytes to be encoded. */
807
808} AACENC_InArgs;
809
810
811/**
812 *  Defines the output arguments for an aacEncEncode() call.
813 */
814typedef struct {
815    INT                 numOutBytes;         /*!< Number of valid bitstream bytes generated during aacEncEncode(). */
816    INT                 numInSamples;        /*!< Number of input audio samples consumed by the encoder. */
817    INT                 numAncBytes;         /*!< Number of ancillary data bytes consumed by the encoder. */
818
819} AACENC_OutArgs;
820
821
822/**
823 *  Meta Data Compression Profiles.
824 */
825typedef enum {
826    AACENC_METADATA_DRC_NONE          = 0,   /*!< None. */
827    AACENC_METADATA_DRC_FILMSTANDARD  = 1,   /*!< Film standard. */
828    AACENC_METADATA_DRC_FILMLIGHT     = 2,   /*!< Film light. */
829    AACENC_METADATA_DRC_MUSICSTANDARD = 3,   /*!< Music standard. */
830    AACENC_METADATA_DRC_MUSICLIGHT    = 4,   /*!< Music light. */
831    AACENC_METADATA_DRC_SPEECH        = 5    /*!< Speech. */
832
833} AACENC_METADATA_DRC_PROFILE;
834
835
836/**
837 *  Meta Data setup structure.
838 */
839typedef struct {
840
841  AACENC_METADATA_DRC_PROFILE drc_profile;             /*!< MPEG DRC compression profile. See ::AACENC_METADATA_DRC_PROFILE. */
842  AACENC_METADATA_DRC_PROFILE comp_profile;            /*!< ETSI heavy compression profile. See ::AACENC_METADATA_DRC_PROFILE. */
843
844  INT                         drc_TargetRefLevel;      /*!< Used to define expected level to:
845                                                            Scaled with 16 bit. x*2^16. */
846  INT                         comp_TargetRefLevel;     /*!< Adjust limiter to avoid overload.
847                                                            Scaled with 16 bit. x*2^16. */
848
849  INT                         prog_ref_level_present;  /*!< Flag, if prog_ref_level is present */
850  INT                         prog_ref_level;          /*!< Programme Reference Level = Dialogue Level:
851                                                            -31.75dB .. 0 dB ; stepsize: 0.25dB
852                                                            Scaled with 16 bit. x*2^16.*/
853
854  UCHAR                       PCE_mixdown_idx_present; /*!< Flag, if dmx-idx should be written in programme config element */
855  UCHAR                       ETSI_DmxLvl_present;     /*!< Flag, if dmx-lvl should be written in ETSI-ancData */
856
857  SCHAR                       centerMixLevel;          /*!< Center downmix level (0...7, according to table) */
858  SCHAR                       surroundMixLevel;        /*!< Surround downmix level (0...7, according to table) */
859
860  UCHAR                       dolbySurroundMode;       /*!< Indication for Dolby Surround Encoding Mode.
861                                                            - 0: Dolby Surround mode not indicated
862                                                            - 1: 2-ch audio part is not Dolby surround encoded
863                                                            - 2: 2-ch audio part is Dolby surround encoded */
864} AACENC_MetaData;
865
866
867/**
868 * AAC encoder control flags.
869 *
870 * In interaction with the ::AACENC_CONTROL_STATE parameter it is possible to get information about the internal
871 * initialization process. It is also possible to overwrite the internal state from extern when necessary.
872 */
873typedef enum
874{
875    AACENC_INIT_NONE              = 0x0000,  /*!< Do not trigger initialization. */
876    AACENC_INIT_CONFIG            = 0x0001,  /*!< Initialize all encoder modules configuration. */
877    AACENC_INIT_STATES            = 0x0002,  /*!< Reset all encoder modules history buffer. */
878    AACENC_INIT_TRANSPORT         = 0x1000,  /*!< Initialize transport lib with new parameters. */
879    AACENC_RESET_INBUFFER         = 0x2000,  /*!< Reset fill level of internal input buffer. */
880    AACENC_INIT_ALL               = 0xFFFF   /*!< Initialize all. */
881}
882AACENC_CTRLFLAGS;
883
884
885/**
886 * \brief  AAC encoder setting parameters.
887 *
888 * Use aacEncoder_SetParam() function to configure, or use aacEncoder_GetParam() function to read
889 * the internal status of the following parameters.
890 */
891typedef enum
892{
893  AACENC_AOT                      = 0x0100,  /*!< Audio object type. See ::AUDIO_OBJECT_TYPE in FDK_audio.h.
894                                                  - 2: MPEG-4 AAC Low Complexity.
895                                                  - 5: MPEG-4 AAC Low Complexity with Spectral Band Replication (HE-AAC).
896                                                  - 29: MPEG-4 AAC Low Complexity with Spectral Band Replication and Parametric Stereo (HE-AAC v2).
897                                                        This configuration can be used only with stereo input audio data.
898                                                  - 23: MPEG-4 AAC Low-Delay.
899                                                  - 39: MPEG-4 AAC Enhanced Low-Delay. Since there is no ::AUDIO_OBJECT_TYPE for ELD in
900                                                        combination with SBR defined, enable SBR explicitely by ::AACENC_SBR_MODE parameter.
901                                                  - 129: MPEG-2 AAC Low Complexity.
902                                                  - 132: MPEG-2 AAC Low Complexity with Spectral Band Replication (HE-AAC).
903                                                  - 156: MPEG-2 AAC Low Complexity with Spectral Band Replication and Parametric Stereo (HE-AAC v2).
904                                                         This configuration can be used only with stereo input audio data. */
905
906  AACENC_BITRATE                  = 0x0101,  /*!< Total encoder bitrate. This parameter is mandatory and interacts with ::AACENC_BITRATEMODE.
907                                                  - CBR: Bitrate in bits/second.
908                                                    See \ref suppBitrates for details. */
909
910  AACENC_BITRATEMODE              = 0x0102,  /*!< Bitrate mode. Configuration can be different kind of bitrate configurations:
911                                                  - 0: Constant bitrate, use bitrate according to ::AACENC_BITRATE. (default)
912                                                       Within none LD/ELD ::AUDIO_OBJECT_TYPE, the CBR mode makes use of full allowed bitreservoir.
913                                                       In contrast, at Low-Delay ::AUDIO_OBJECT_TYPE the bitreservoir is kept very small.
914                                                  - 8: LD/ELD full bitreservoir for packet based transmission. */
915
916  AACENC_SAMPLERATE               = 0x0103,  /*!< Audio input data sampling rate. Encoder supports following sampling rates:
917                                                  8000, 11025, 12000, 16000, 22050, 24000, 32000, 44100, 48000, 64000, 88200, 96000 */
918
919  AACENC_SBR_MODE                 = 0x0104,  /*!< Configure SBR independently of the chosen Audio Object Type ::AUDIO_OBJECT_TYPE.
920                                                  This parameter is for ELD audio object type only.
921                                                  - -1: Use ELD SBR auto configurator (default).
922                                                  - 0: Disable Spectral Band Replication.
923                                                  - 1: Enable Spectral Band Replication. */
924
925  AACENC_GRANULE_LENGTH           = 0x0105,  /*!< Core encoder (AAC) audio frame length in samples:
926                                                  - 1024: Default configuration.
927                                                  - 512: Default LD/ELD configuration.
928                                                  - 480: Optional length in LD/ELD configuration. */
929
930  AACENC_CHANNELMODE              = 0x0106,  /*!< Set explicit channel mode. Channel mode must match with number of input channels.
931                                                  - 1-7 and 33,34: MPEG channel modes supported, see ::CHANNEL_MODE in FDK_audio.h. */
932
933  AACENC_CHANNELORDER             = 0x0107,  /*!< Input audio data channel ordering scheme:
934                                                  - 0: MPEG channel ordering (e. g. 5.1: C, L, R, SL, SR, LFE). (default)
935                                                  - 1: WAVE file format channel ordering (e. g. 5.1: L, R, C, LFE, SL, SR). */
936
937  AACENC_SBR_RATIO                = 0x0108,  /*!<  Controls activation of downsampled SBR. With downsampled SBR, the delay will be
938                                                   shorter. On the other hand, for achieving the same quality level, downsampled SBR
939                                                   needs more bits than dual-rate SBR.
940                                                   With downsampled SBR, the AAC encoder will work at the same sampling rate as the
941                                                   SBR encoder (single rate).
942                                                   Downsampled SBR is supported for AAC-ELD and HE-AACv1.
943                                                   - 1: Downsampled SBR (default for ELD).
944                                                   - 2: Dual-rate SBR   (default for HE-AAC). */
945
946  AACENC_AFTERBURNER              = 0x0200,  /*!< This parameter controls the use of the afterburner feature.
947                                                  The afterburner is a type of analysis by synthesis algorithm which increases the
948                                                  audio quality but also the required processing power. It is recommended to always
949                                                  activate this if additional memory consumption and processing power consumption
950                                                  is not a problem. If increased MHz and memory consumption are an issue then the MHz
951                                                  and memory cost of this optional module need to be evaluated against the improvement
952                                                  in audio quality on a case by case basis.
953                                                  - 0: Disable afterburner (default).
954                                                  - 1: Enable afterburner. */
955
956  AACENC_BANDWIDTH                = 0x0203,  /*!< Core encoder audio bandwidth:
957                                                  - 0: Determine bandwidth internally (default, see chapter \ref BEHAVIOUR_BANDWIDTH).
958                                                  - 1 to fs/2: Frequency bandwidth in Hertz. (Experts only, better do not
959                                                               touch this value to avoid degraded audio quality) */
960
961  AACENC_TRANSMUX                 = 0x0300,  /*!< Transport type to be used. See ::TRANSPORT_TYPE in FDK_audio.h. Following
962                                                  types can be configured in encoder library:
963                                                  - 0: raw access units
964                                                  - 1: ADIF bitstream format
965                                                  - 2: ADTS bitstream format
966                                                  - 6: Audio Mux Elements (LATM) with muxConfigPresent = 1
967                                                  - 7: Audio Mux Elements (LATM) with muxConfigPresent = 0, out of band StreamMuxConfig
968                                                  - 10: Audio Sync Stream (LOAS) */
969
970  AACENC_HEADER_PERIOD            = 0x0301,  /*!< Frame count period for sending in-band configuration buffers within LATM/LOAS
971                                                  transport layer. Additionally this parameter configures the PCE repetition period
972                                                  in raw_data_block(). See \ref encPCE.
973                                                  - 0xFF: auto-mode default 10 for TT_MP4_ADTS, TT_MP4_LOAS and TT_MP4_LATM_MCP1, otherwise 0.
974                                                  - n: Frame count period. */
975
976  AACENC_SIGNALING_MODE           = 0x0302,  /*!< Signaling mode of the extension AOT:
977                                                  - 0: Implicit backward compatible signaling (default for non-MPEG-4 based
978                                                       AOT's and for the transport formats ADIF and ADTS)
979                                                       - A stream that uses implicit signaling can be decoded by every AAC decoder, even AAC-LC-only decoders
980                                                       - An AAC-LC-only decoder will only decode the low-frequency part of the stream, resulting in a band-limited output
981                                                       - This method works with all transport formats
982                                                       - This method does not work with downsampled SBR
983                                                  - 1: Explicit backward compatible signaling
984                                                       - A stream that uses explicit backward compatible signaling can be decoded by every AAC decoder, even AAC-LC-only decoders
985                                                       - An AAC-LC-only decoder will only decode the low-frequency part of the stream, resulting in a band-limited output
986                                                       - A decoder not capable of decoding PS will only decode the AAC-LC+SBR part.
987                                                         If the stream contained PS, the result will be a a decoded mono downmix
988                                                       - This method does not work with ADIF or ADTS. For LOAS/LATM, it only works with AudioMuxVersion==1
989                                                       - This method does work with downsampled SBR
990                                                  - 2: Explicit hierarchical signaling (default for MPEG-4 based AOT's and for all transport formats excluding ADIF and ADTS)
991                                                       - A stream that uses explicit hierarchical signaling can be decoded only by HE-AAC decoders
992                                                       - An AAC-LC-only decoder will not decode a stream that uses explicit hierarchical signaling
993                                                       - A decoder not capable of decoding PS will not decode the stream at all if it contained PS
994                                                       - This method does not work with ADIF or ADTS. It works with LOAS/LATM and the MPEG-4 File format
995                                                       - This method does work with downsampled SBR
996
997                                                   For making sure that the listener always experiences the best audio quality,
998                                                   explicit hierarchical signaling should be used.
999                                                   This makes sure that only a full HE-AAC-capable decoder will decode those streams.
1000                                                   The audio is played at full bandwidth.
1001                                                   For best backwards compatibility, it is recommended to encode with implicit SBR signaling.
1002                                                   A decoder capable of AAC-LC only will then only decode the AAC part, which means the decoded
1003                                                   audio will sound band-limited.
1004
1005                                                   For MPEG-2 transport types (ADTS,ADIF), only implicit signaling is possible.
1006
1007                                                   For LOAS and LATM, explicit backwards compatible signaling only works together with AudioMuxVersion==1.
1008                                                   The reason is that, for explicit backwards compatible signaling, additional information will be appended to the ASC.
1009                                                   A decoder that is only capable of decoding AAC-LC will skip this part.
1010                                                   Nevertheless, for jumping to the end of the ASC, it needs to know the ASC length.
1011                                                   Transmitting the length of the ASC is a feature of AudioMuxVersion==1, it is not possible to transmit the
1012                                                   length of the ASC with AudioMuxVersion==0, therefore an AAC-LC-only decoder will not be able to parse a
1013                                                   LOAS/LATM stream that was being encoded with AudioMuxVersion==0.
1014
1015                                                   For downsampled SBR, explicit signaling is mandatory. The reason for this is that the
1016                                                   extension sampling frequency (which is in case of SBR the sampling frequqncy of the SBR part)
1017                                                   can only be signaled in explicit mode.
1018
1019                                                   For AAC-ELD, the SBR information is transmitted in the ELDSpecific Config, which is part of the
1020                                                   AudioSpecificConfig. Therefore, the settings here will have no effect on AAC-ELD.*/
1021
1022  AACENC_TPSUBFRAMES              = 0x0303,  /*!< Number of sub frames in a transport frame for LOAS/LATM or ADTS (default 1).
1023                                                  - ADTS: Maximum number of sub frames restricted to 4.
1024                                                  - LOAS/LATM: Maximum number of sub frames restricted to 2.*/
1025
1026  AACENC_PROTECTION               = 0x0306,  /*!< Configure protection in tranpsort layer:
1027                                                  - 0: No protection. (default)
1028                                                  - 1: CRC active for ADTS bitstream format. */
1029
1030  AACENC_ANCILLARY_BITRATE        = 0x0500,  /*!< Constant ancillary data bitrate in bits/second.
1031                                                  - 0: Either no ancillary data or insert exact number of bytes, denoted via
1032                                                       input parameter, numAncBytes in AACENC_InArgs.
1033                                                  - else: Insert ancillary data with specified bitrate. */
1034
1035  AACENC_METADATA_MODE            = 0x0600,  /*!< Configure Meta Data. See ::AACENC_MetaData for further details:
1036                                                  - 0: Do not embed any metadata.
1037                                                  - 1: Embed MPEG defined metadata only.
1038                                                  - 2: Embed all metadata. */
1039
1040  AACENC_CONTROL_STATE            = 0xFF00,  /*!< There is an automatic process which internally reconfigures the encoder instance
1041                                                  when a configuration parameter changed or an error occured. This paramerter allows
1042                                                  overwriting or getting the control status of this process. See ::AACENC_CTRLFLAGS. */
1043
1044  AACENC_NONE                     = 0xFFFF   /*!< ------ */
1045
1046} AACENC_PARAM;
1047
1048
1049#ifdef __cplusplus
1050extern "C" {
1051#endif
1052
1053/**
1054 * \brief  Open an instance of the encoder.
1055 *
1056 * Allocate memory for an encoder instance with a functional range denoted by the function parameters.
1057 * Preinitialize encoder instance with default configuration.
1058 *
1059 * \param phAacEncoder  A pointer to an encoder handle. Initialized on return.
1060 * \param encModules    Specify encoder modules to be supported in this encoder instance:
1061 *                      - 0x0: Allocate memory for all available encoder modules.
1062 *                      - else: Select memory allocation regarding encoder modules. Following flags are possible and can be combined.
1063 *                              - 0x01: AAC module.
1064 *                              - 0x02: SBR module.
1065 *                              - 0x04: PS module.
1066 *                              - 0x10: Metadata module.
1067 *                              - example: (0x01|0x02|0x04|0x10) allocates all modules and is equivalent to default configuration denotet by 0x0.
1068 * \param maxChannels   Number of channels to be allocated. This parameter can be used in different ways:
1069 *                      - 0: Allocate maximum number of AAC and SBR channels as supported by the library.
1070 *                      - nChannels: Use same maximum number of channels for allocating memory in AAC and SBR module.
1071 *                      - nChannels | (nSbrCh<<8): Number of SBR channels can be different to AAC channels to save data memory.
1072 *
1073 * \return
1074 *          - AACENC_OK, on succes.
1075 *          - AACENC_INVALID_HANDLE, AACENC_MEMORY_ERROR, AACENC_INVALID_CONFIG, on failure.
1076 */
1077AACENC_ERROR aacEncOpen(
1078        HANDLE_AACENCODER        *phAacEncoder,
1079        const UINT                encModules,
1080        const UINT                maxChannels
1081        );
1082
1083
1084/**
1085 * \brief  Close the encoder instance.
1086 *
1087 * Deallocate encoder instance and free whole memory.
1088 *
1089 * \param phAacEncoder  Pointer to the encoder handle to be deallocated.
1090 *
1091 * \return
1092 *          - AACENC_OK, on success.
1093 *          - AACENC_INVALID_HANDLE, on failure.
1094 */
1095AACENC_ERROR aacEncClose(
1096        HANDLE_AACENCODER        *phAacEncoder
1097        );
1098
1099
1100/**
1101 * \brief Encode audio data.
1102 *
1103 * This function is mainly for encoding audio data. In addition the function can be used for an encoder (re)configuration
1104 * process.
1105 * - PCM input data will be retrieved from external input buffer until the fill level allows encoding a single frame.
1106 *   This functionality allows an external buffer with reduced size in comparison to the AAC or HE-AAC audio frame length.
1107 * - If the value of the input samples argument is zero, just internal reinitialization will be applied if it is
1108 *   requested.
1109 * - At the end of a file the flushing process can be triggerd via setting the value of the input samples argument to -1.
1110 *   The encoder delay lines are fully flushed when the encoder returns no valid bitstream data AACENC_OutArgs::numOutBytes.
1111 *   Furthermore the end of file is signaled by the return value AACENC_ENCODE_EOF.
1112 * - If an error occured in the previous frame or any of the encoder parameters changed, an internal reinitialization
1113 *   process will be applied before encoding the incoming audio samples.
1114 * - The function can also be used for an independent reconfiguration process without encoding. The first parameter has to be a
1115 *   valid encoder handle and all other parameters can be set to NULL.
1116 * - If the size of the external bitbuffer in outBufDesc is not sufficient for writing the whole bitstream, an internal
1117 *   error will be the return value and a reconfiguration will be triggered.
1118 *
1119 * \param hAacEncoder           A valid AAC encoder handle.
1120 * \param inBufDesc             Input buffer descriptor, see AACENC_BufDesc:
1121 *                              - At least one input buffer with audio data is expected.
1122 *                              - Optionally a second input buffer with ancillary data can be fed.
1123 * \param outBufDesc            Output buffer descriptor, see AACENC_BufDesc:
1124 *                              - Provide one output buffer for the encoded bitstream.
1125 * \param inargs                Input arguments, see AACENC_InArgs.
1126 * \param outargs               Output arguments, AACENC_OutArgs.
1127 *
1128 * \return
1129 *          - AACENC_OK, on success.
1130 *          - AACENC_INVALID_HANDLE, AACENC_ENCODE_ERROR, on failure in encoding process.
1131 *          - AACENC_INVALID_CONFIG, AACENC_INIT_ERROR, AACENC_INIT_AAC_ERROR, AACENC_INIT_SBR_ERROR, AACENC_INIT_TP_ERROR,
1132 *            AACENC_INIT_META_ERROR, on failure in encoder initialization.
1133 *          - AACENC_ENCODE_EOF, when flushing fully concluded.
1134 */
1135AACENC_ERROR aacEncEncode(
1136        const HANDLE_AACENCODER   hAacEncoder,
1137        const AACENC_BufDesc     *inBufDesc,
1138        const AACENC_BufDesc     *outBufDesc,
1139        const AACENC_InArgs      *inargs,
1140        AACENC_OutArgs           *outargs
1141        );
1142
1143
1144/**
1145 * \brief  Acquire info about present encoder instance.
1146 *
1147 * This function retrieves information of the encoder configuration. In addition to informative internal states,
1148 * a configuration data block of the current encoder settings will be returned. The format is either Audio Specific Config
1149 * in case of Raw Packets transport format or StreamMuxConfig in case of LOAS/LATM transport format. The configuration
1150 * data block is binary coded as specified in ISO/IEC 14496-3 (MPEG-4 audio), to be used directly for MPEG-4 File Format
1151 * or RFC3016 or RFC3640 applications.
1152 *
1153 * \param hAacEncoder           A valid AAC encoder handle.
1154 * \param pInfo                 Pointer to AACENC_InfoStruct. Filled on return.
1155 *
1156 * \return
1157 *          - AACENC_OK, on succes.
1158 *          - AACENC_INIT_ERROR, on failure.
1159 */
1160AACENC_ERROR aacEncInfo(
1161        const HANDLE_AACENCODER   hAacEncoder,
1162        AACENC_InfoStruct        *pInfo
1163        );
1164
1165
1166/**
1167 * \brief  Set one single AAC encoder parameter.
1168 *
1169 * This function allows configuration of all encoder parameters specified in ::AACENC_PARAM. Each parameter must be
1170 * set with a separate function call. An internal validation of the configuration value range will be done and an
1171 * internal reconfiguration will be signaled. The actual configuration adoption is part of the subsequent aacEncEncode() call.
1172 *
1173 * \param hAacEncoder           A valid AAC encoder handle.
1174 * \param param                 Parameter to be set. See ::AACENC_PARAM.
1175 * \param value                 Parameter value. See parameter description in ::AACENC_PARAM.
1176 *
1177 * \return
1178 *          - AACENC_OK, on success.
1179 *          - AACENC_INVALID_HANDLE, AACENC_UNSUPPORTED_PARAMETER, AACENC_INVALID_CONFIG, on failure.
1180 */
1181AACENC_ERROR aacEncoder_SetParam(
1182        const HANDLE_AACENCODER   hAacEncoder,
1183        const AACENC_PARAM        param,
1184        const UINT                value
1185        );
1186
1187
1188/**
1189 * \brief  Get one single AAC encoder parameter.
1190 *
1191 * This function is the complement to aacEncoder_SetParam(). After encoder reinitialization with user defined settings,
1192 * the internal status can be obtained of each parameter, specified with ::AACENC_PARAM.
1193 *
1194 * \param hAacEncoder           A valid AAC encoder handle.
1195 * \param param                 Parameter to be returned. See ::AACENC_PARAM.
1196 *
1197 * \return  Internal configuration value of specifed parameter ::AACENC_PARAM.
1198 */
1199UINT aacEncoder_GetParam(
1200        const HANDLE_AACENCODER   hAacEncoder,
1201        const AACENC_PARAM        param
1202        );
1203
1204
1205/**
1206 * \brief  Get information about encoder library build.
1207 *
1208 * Fill a given LIB_INFO structure with library version information.
1209 *
1210 * \param info  Pointer to an allocated LIB_INFO struct.
1211 *
1212 * \return
1213 *          - AACENC_OK, on success.
1214 *          - AACENC_INVALID_HANDLE, AACENC_INIT_ERROR, on failure.
1215 */
1216AACENC_ERROR aacEncGetLibInfo(
1217        LIB_INFO                 *info
1218        );
1219
1220
1221#ifdef __cplusplus
1222}
1223#endif
1224
1225#endif   /* _AAC_ENC_LIB_H_ */
1226