| /* ----------------------------------------------------------------------------- |
| Software License for The Fraunhofer FDK AAC Codec Library for Android |
| |
| © Copyright 1995 - 2021 Fraunhofer-Gesellschaft zur Förderung der angewandten |
| Forschung e.V. All rights reserved. |
| |
| 1. INTRODUCTION |
| The Fraunhofer FDK AAC Codec Library for Android ("FDK AAC Codec") is software |
| that implements the MPEG Advanced Audio Coding ("AAC") encoding and decoding |
| scheme for digital audio. This FDK AAC Codec software is intended to be used on |
| a wide variety of Android devices. |
| |
| AAC's HE-AAC and HE-AAC v2 versions are regarded as today's most efficient |
| general perceptual audio codecs. AAC-ELD is considered the best-performing |
| full-bandwidth communications codec by independent studies and is widely |
| deployed. AAC has been standardized by ISO and IEC as part of the MPEG |
| specifications. |
| |
| Patent licenses for necessary patent claims for the FDK AAC Codec (including |
| those of Fraunhofer) may be obtained through Via Licensing |
| (www.vialicensing.com) or through the respective patent owners individually for |
| the purpose of encoding or decoding bit streams in products that are compliant |
| with the ISO/IEC MPEG audio standards. Please note that most manufacturers of |
| Android devices already license these patent claims through Via Licensing or |
| directly from the patent owners, and therefore FDK AAC Codec software may |
| already be covered under those patent licenses when it is used for those |
| licensed purposes only. |
| |
| Commercially-licensed AAC software libraries, including floating-point versions |
| with enhanced sound quality, are also available from Fraunhofer. Users are |
| encouraged to check the Fraunhofer website for additional applications |
| information and documentation. |
| |
| 2. COPYRIGHT LICENSE |
| |
| Redistribution and use in source and binary forms, with or without modification, |
| are permitted without payment of copyright license fees provided that you |
| satisfy the following conditions: |
| |
| You must retain the complete text of this software license in redistributions of |
| the FDK AAC Codec or your modifications thereto in source code form. |
| |
| You must retain the complete text of this software license in the documentation |
| and/or other materials provided with redistributions of the FDK AAC Codec or |
| your modifications thereto in binary form. You must make available free of |
| charge copies of the complete source code of the FDK AAC Codec and your |
| modifications thereto to recipients of copies in binary form. |
| |
| The name of Fraunhofer may not be used to endorse or promote products derived |
| from this library without prior written permission. |
| |
| You may not charge copyright license fees for anyone to use, copy or distribute |
| the FDK AAC Codec software or your modifications thereto. |
| |
| Your modified versions of the FDK AAC Codec must carry prominent notices stating |
| that you changed the software and the date of any change. For modified versions |
| of the FDK AAC Codec, the term "Fraunhofer FDK AAC Codec Library for Android" |
| must be replaced by the term "Third-Party Modified Version of the Fraunhofer FDK |
| AAC Codec Library for Android." |
| |
| 3. NO PATENT LICENSE |
| |
| NO EXPRESS OR IMPLIED LICENSES TO ANY PATENT CLAIMS, including without |
| limitation the patents of Fraunhofer, ARE GRANTED BY THIS SOFTWARE LICENSE. |
| Fraunhofer provides no warranty of patent non-infringement with respect to this |
| software. |
| |
| You may use this FDK AAC Codec software or modifications thereto only for |
| purposes that are authorized by appropriate patent licenses. |
| |
| 4. DISCLAIMER |
| |
| This FDK AAC Codec software is provided by Fraunhofer on behalf of the copyright |
| holders and contributors "AS IS" and WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, |
| including but not limited to the implied warranties of merchantability and |
| fitness for a particular purpose. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR |
| CONTRIBUTORS BE LIABLE for any direct, indirect, incidental, special, exemplary, |
| or consequential damages, including but not limited to procurement of substitute |
| goods or services; loss of use, data, or profits, or business interruption, |
| however caused and on any theory of liability, whether in contract, strict |
| liability, or tort (including negligence), arising in any way out of the use of |
| this software, even if advised of the possibility of such damage. |
| |
| 5. CONTACT INFORMATION |
| |
| Fraunhofer Institute for Integrated Circuits IIS |
| Attention: Audio and Multimedia Departments - FDK AAC LL |
| Am Wolfsmantel 33 |
| 91058 Erlangen, Germany |
| |
| www.iis.fraunhofer.de/amm |
| amm-info@iis.fraunhofer.de |
| ----------------------------------------------------------------------------- */ |
| |
| /**************************** AAC encoder library ****************************** |
| |
| Author(s): M. Lohwasser |
| |
| Description: |
| |
| *******************************************************************************/ |
| |
| /** |
| * \file aacenc_lib.h |
| * \brief FDK AAC Encoder library interface header file. |
| * |
| \mainpage Introduction |
| |
| \section Scope |
| |
| This document describes the high-level interface and usage of the ISO/MPEG-2/4 |
| AAC Encoder library developed by the Fraunhofer Institute for Integrated |
| Circuits (IIS). |
| |
| The library implements encoding on the basis of the MPEG-2 and MPEG-4 AAC |
| Low-Complexity standard, and depending on the library's configuration, MPEG-4 |
| High-Efficiency AAC v2 and/or AAC-ELD standard. |
| |
| All references to SBR (Spectral Band Replication) are only applicable to HE-AAC |
| or AAC-ELD versions of the library. All references to PS (Parametric Stereo) are |
| only applicable to HE-AAC v2 versions of the library. |
| |
| \section encBasics Encoder Basics |
| |
| This document can only give a rough overview about the ISO/MPEG-2 and ISO/MPEG-4 |
| AAC audio coding standard. To understand all the terms in this document, you are |
| encouraged to read the following documents. |
| |
| - ISO/IEC 13818-7 (MPEG-2 AAC), which defines the syntax of MPEG-2 AAC audio |
| bitstreams. |
| - ISO/IEC 14496-3 (MPEG-4 AAC, subparts 1 and 4), which defines the syntax of |
| MPEG-4 AAC audio bitstreams. |
| - Lutzky, Schuller, Gayer, Krämer, Wabnik, "A guideline to audio codec |
| delay", 116th AES Convention, May 8, 2004 |
| |
| MPEG Advanced Audio Coding is based on a time-to-frequency mapping of the |
| signal. The signal is partitioned into overlapping portions and transformed into |
| frequency domain. The spectral components are then quantized and coded. \n An |
| MPEG-2 or MPEG-4 AAC audio bitstream is composed of frames. Contrary to MPEG-1/2 |
| Layer-3 (mp3), the length of individual frames is not restricted to a fixed |
| number of bytes, but can take on any length between 1 and 768 bytes. |
| |
| |
| \page LIBUSE Library Usage |
| |
| \section InterfaceDescription API Files |
| |
| All API header files are located in the folder /include of the release package. |
| All header files are provided for usage in C/C++ programs. The AAC encoder |
| library API functions are located in aacenc_lib.h. |
| |
| \section CallingSequence Calling Sequence |
| |
| For encoding of ISO/MPEG-2/4 AAC bitstreams the following sequence is mandatory. |
| Input read and output write functions as well as the corresponding open and |
| close functions are left out, since they may be implemented differently |
| according to the user's specific requirements. The example implementation uses |
| file-based input/output. |
| |
| -# Call aacEncOpen() to allocate encoder instance with required \ref encOpen |
| "configuration". \code HANDLE_AACENCODER hAacEncoder = NULL; if ( (ErrorStatus = |
| aacEncOpen(&hAacEncoder,0,0)) != AACENC_OK ) { \endcode |
| -# Call aacEncoder_SetParam() for each parameter to be set. AOT, samplingrate, |
| channelMode, bitrate and transport type are \ref encParams "mandatory". \code |
| ErrorStatus = aacEncoder_SetParam(hAacEncoder, parameter, value); |
| \endcode |
| -# Call aacEncEncode() with NULL parameters to \ref encReconf "initialize" |
| encoder instance with present parameter set. \code ErrorStatus = |
| aacEncEncode(hAacEncoder, NULL, NULL, NULL, NULL); \endcode |
| -# Call aacEncInfo() to retrieve a configuration data block to be transmitted |
| out of band. This is required when using RFC3640 or RFC3016 like transport. |
| \code |
| AACENC_InfoStruct encInfo; |
| aacEncInfo(hAacEncoder, &encInfo); |
| \endcode |
| -# Encode input audio data in loop. |
| \code |
| do |
| { |
| \endcode |
| Feed \ref feedInBuf "input buffer" with new audio data and provide input/output |
| \ref bufDes "arguments" to aacEncEncode(). \code ErrorStatus = |
| aacEncEncode(hAacEncoder, &inBufDesc, &outBufDesc, &inargs, &outargs); \endcode |
| Write \ref writeOutData "output data" to file or audio device. |
| \code |
| } while (ErrorStatus==AACENC_OK); |
| \endcode |
| -# Call aacEncClose() and destroy encoder instance. |
| \code |
| aacEncClose(&hAacEncoder); |
| \endcode |
| |
| |
| \section encOpen Encoder Instance Allocation |
| |
| The assignment of the aacEncOpen() function is very flexible and can be used in |
| the following way. |
| - If the amount of memory consumption is not an issue, the encoder instance can |
| be allocated for the maximum number of possible audio channels (for example 6 or |
| 8) with the full functional range supported by the library. This is the default |
| open procedure for the AAC encoder if memory consumption does not need to be |
| minimized. \code aacEncOpen(&hAacEncoder,0,0) \endcode |
| - If the required MPEG-4 AOTs do not call for the full functional range of the |
| library, encoder modules can be allocated selectively. \verbatim |
| ------------------------------------------------------ |
| AAC | SBR | PS | MD | FLAGS | value |
| -----+-----+-----+----+-----------------------+------- |
| X | - | - | - | (0x01) | 0x01 |
| X | X | - | - | (0x01|0x02) | 0x03 |
| X | X | X | - | (0x01|0x02|0x04) | 0x07 |
| X | - | - | X | (0x01 |0x10) | 0x11 |
| X | X | - | X | (0x01|0x02 |0x10) | 0x13 |
| X | X | X | X | (0x01|0x02|0x04|0x10) | 0x17 |
| ------------------------------------------------------ |
| - AAC: Allocate AAC Core Encoder module. |
| - SBR: Allocate Spectral Band Replication module. |
| - PS: Allocate Parametric Stereo module. |
| - MD: Allocate Meta Data module within AAC encoder. |
| \endverbatim |
| \code aacEncOpen(&hAacEncoder,value,0) \endcode |
| - Specifying the maximum number of channels to be supported in the encoder |
| instance can be done as follows. |
| - For example allocate an encoder instance which supports 2 channels for all |
| supported AOTs. The library itself may be capable of encoding up to 6 or 8 |
| channels but in this example only 2 channel encoding is required and thus only |
| buffers for 2 channels are allocated to save data memory. \code |
| aacEncOpen(&hAacEncoder,0,2) \endcode |
| - Additionally the maximum number of supported channels in the SBR module can |
| be denoted separately.\n In this example the encoder instance provides a maximum |
| of 6 channels out of which up to 2 channels support SBR. This encoder instance |
| can produce for example 5.1 channel AAC-LC streams or stereo HE-AAC (v2) |
| streams. HE-AAC 5.1 multi channel is not possible since only 2 out of 6 channels |
| support SBR, which saves data memory. \code aacEncOpen(&hAacEncoder,0,6|(2<<8)) |
| \endcode \n |
| |
| \section bufDes Input/Output Arguments |
| |
| \subsection allocIOBufs Provide Buffer Descriptors |
| In the present encoder API, the input and output buffers are described with \ref |
| AACENC_BufDesc "buffer descriptors". This mechanism allows a flexible handling |
| of input and output buffers without impact to the actual encoding call. Optional |
| buffers are necessary e.g. for ancillary data, meta data input or additional |
| output buffers describing superframing data in DAB+ or DRM+.\n At least one |
| input buffer for audio input data and one output buffer for bitstream data must |
| be allocated. The input buffer size can be a user defined multiple of the number |
| of input channels. PCM input data will be copied from the user defined PCM |
| buffer to an internal input buffer and so input data can be less than one AAC |
| audio frame. The output buffer size should be 6144 bits per channel excluding |
| the LFE channel. If the output data does not fit into the provided buffer, an |
| AACENC_ERROR will be returned by aacEncEncode(). \code static INT_PCM |
| inputBuffer[8*2048]; static UCHAR ancillaryBuffer[50]; static |
| AACENC_MetaData metaDataSetup; static UCHAR outputBuffer[8192]; |
| \endcode |
| |
| All input and output buffer must be clustered in input and output buffer arrays. |
| \code |
| static void* inBuffer[] = { inputBuffer, ancillaryBuffer, &metaDataSetup |
| }; static INT inBufferIds[] = { IN_AUDIO_DATA, IN_ANCILLRY_DATA, |
| IN_METADATA_SETUP }; static INT inBufferSize[] = { sizeof(inputBuffer), |
| sizeof(ancillaryBuffer), sizeof(metaDataSetup) }; static INT inBufferElSize[] |
| = { sizeof(INT_PCM), sizeof(UCHAR), sizeof(AACENC_MetaData) }; |
| |
| static void* outBuffer[] = { outputBuffer }; |
| static INT outBufferIds[] = { OUT_BITSTREAM_DATA }; |
| static INT outBufferSize[] = { sizeof(outputBuffer) }; |
| static INT outBufferElSize[] = { sizeof(UCHAR) }; |
| \endcode |
| |
| Allocate buffer descriptors |
| \code |
| AACENC_BufDesc inBufDesc; |
| AACENC_BufDesc outBufDesc; |
| \endcode |
| |
| Initialize input buffer descriptor |
| \code |
| inBufDesc.numBufs = sizeof(inBuffer)/sizeof(void*); |
| inBufDesc.bufs = (void**)&inBuffer; |
| inBufDesc.bufferIdentifiers = inBufferIds; |
| inBufDesc.bufSizes = inBufferSize; |
| inBufDesc.bufElSizes = inBufferElSize; |
| \endcode |
| |
| Initialize output buffer descriptor |
| \code |
| outBufDesc.numBufs = sizeof(outBuffer)/sizeof(void*); |
| outBufDesc.bufs = (void**)&outBuffer; |
| outBufDesc.bufferIdentifiers = outBufferIds; |
| outBufDesc.bufSizes = outBufferSize; |
| outBufDesc.bufElSizes = outBufferElSize; |
| \endcode |
| |
| \subsection argLists Provide Input/Output Argument Lists |
| The input and output arguments of an aacEncEncode() call are described in |
| argument structures. \code AACENC_InArgs inargs; AACENC_OutArgs outargs; |
| \endcode |
| |
| \section feedInBuf Feed Input Buffer |
| The input buffer should be handled as a modulo buffer. New audio data in the |
| form of pulse-code- modulated samples (PCM) must be read from external and be |
| fed to the input buffer depending on its fill level. The required sample bitrate |
| (represented by the data type INT_PCM which is 16, 24 or 32 bits wide) is fixed |
| and depends on library configuration (usually 16 bit). \code inargs.numInSamples |
| += WAV_InputRead ( wavIn, &inputBuffer[inargs.numInSamples], |
| FDKmin(encInfo.inputChannels*encInfo.frameLength, |
| sizeof(inputBuffer) / |
| sizeof(INT_PCM)-inargs.numInSamples), |
| SAMPLE_BITS |
| ); |
| \endcode |
| |
| After the encoder's internal buffer is fed with incoming audio samples, and |
| aacEncEncode() processed the new input data, update/move remaining samples in |
| input buffer, simulating a modulo buffer: \code if (outargs.numInSamples>0) { |
| FDKmemmove( inputBuffer, |
| &inputBuffer[outargs.numInSamples], |
| sizeof(INT_PCM)*(inargs.numInSamples-outargs.numInSamples) ); |
| inargs.numInSamples -= outargs.numInSamples; |
| } |
| \endcode |
| |
| \section writeOutData Output Bitstream Data |
| If any AAC bitstream data is available, write it to output file or device as |
| follows. \code if (outargs.numOutBytes>0) { FDKfwrite(outputBuffer, |
| outargs.numOutBytes, 1, pOutFile); |
| } |
| \endcode |
| |
| \section cfgMetaData Meta Data Configuration |
| |
| If the present library is configured with Metadata support, it is possible to |
| insert meta data side info into the generated audio bitstream while encoding. |
| |
| To work with meta data the encoder instance has to be \ref encOpen "allocated" |
| with meta data support. The meta data mode must be be configured with the |
| ::AACENC_METADATA_MODE parameter and aacEncoder_SetParam() function. \code |
| aacEncoder_SetParam(hAacEncoder, AACENC_METADATA_MODE, 0-3); \endcode |
| |
| This configuration indicates how to embed meta data into bitstrem. Either no |
| insertion, MPEG or ETSI style. The meta data itself must be specified within the |
| meta data setup structure AACENC_MetaData. |
| |
| Changing one of the AACENC_MetaData setup parameters can be achieved from |
| outside the library within ::IN_METADATA_SETUP input buffer. There is no need to |
| supply meta data setup structure every frame. If there is no new meta setup data |
| available, the encoder uses the previous setup or the default configuration in |
| initial state. |
| |
| In general the audio compressor and limiter within the encoder library can be |
| configured with the ::AACENC_METADATA_DRC_PROFILE parameter |
| AACENC_MetaData::drc_profile and and AACENC_MetaData::comp_profile. |
| \n |
| |
| \section encReconf Encoder Reconfiguration |
| |
| The encoder library allows reconfiguration of the encoder instance with new |
| settings continuously between encoding frames. Each parameter to be changed must |
| be set with a single aacEncoder_SetParam() call. The internal status of each |
| parameter can be retrieved with an aacEncoder_GetParam() call.\n There is no |
| stand-alone reconfiguration function available. When parameters were modified |
| from outside the library, an internal control mechanism triggers the necessary |
| reconfiguration process which will be applied at the beginning of the following |
| aacEncEncode() call. This state can be observed from external via the |
| AACENC_INIT_STATUS and aacEncoder_GetParam() function. The reconfiguration |
| process can also be applied immediately when all parameters of an aacEncEncode() |
| call are NULL with a valid encoder handle.\n\n The internal reconfiguration |
| process can be controlled from extern with the following access. \code |
| aacEncoder_SetParam(hAacEncoder, AACENC_CONTROL_STATE, AACENC_CTRLFLAGS); |
| \endcode |
| |
| |
| \section encParams Encoder Parametrization |
| |
| All parameteres listed in ::AACENC_PARAM can be modified within an encoder |
| instance. |
| |
| \subsection encMandatory Mandatory Encoder Parameters |
| The following parameters must be specified when the encoder instance is |
| initialized. \code aacEncoder_SetParam(hAacEncoder, AACENC_AOT, value); |
| aacEncoder_SetParam(hAacEncoder, AACENC_BITRATE, value); |
| aacEncoder_SetParam(hAacEncoder, AACENC_SAMPLERATE, value); |
| aacEncoder_SetParam(hAacEncoder, AACENC_CHANNELMODE, value); |
| \endcode |
| Beyond that is an internal auto mode which preinitizializes the ::AACENC_BITRATE |
| parameter if the parameter was not set from extern. The bitrate depends on the |
| number of effective channels and sampling rate and is determined as follows. |
| \code |
| AAC-LC (AOT_AAC_LC): 1.5 bits per sample |
| HE-AAC (AOT_SBR): 0.625 bits per sample (dualrate sbr) |
| HE-AAC (AOT_SBR): 1.125 bits per sample (downsampled sbr) |
| HE-AAC v2 (AOT_PS): 0.5 bits per sample |
| \endcode |
| |
| \subsection channelMode Channel Mode Configuration |
| The input audio data is described with the ::AACENC_CHANNELMODE parameter in the |
| aacEncoder_SetParam() call. It is not possible to use the encoder instance with |
| a 'number of input channels' argument. Instead, the channelMode must be set as |
| follows. \code aacEncoder_SetParam(hAacEncoder, AACENC_CHANNELMODE, value); |
| \endcode The parameter is specified in ::CHANNEL_MODE and can be mapped from the |
| number of input channels in the following way. \code CHANNEL_MODE chMode = |
| MODE_INVALID; |
| |
| switch (nChannels) { |
| case 1: chMode = MODE_1; break; |
| case 2: chMode = MODE_2; break; |
| case 3: chMode = MODE_1_2; break; |
| case 4: chMode = MODE_1_2_1; break; |
| case 5: chMode = MODE_1_2_2; break; |
| case 6: chMode = MODE_1_2_2_1; break; |
| case 7: chMode = MODE_6_1; break; |
| case 8: chMode = MODE_7_1_BACK; break; |
| default: |
| chMode = MODE_INVALID; |
| } |
| return chMode; |
| \endcode |
| |
| \subsection peakbitrate Peak Bitrate Configuration |
| In AAC, the default bitreservoir configuration depends on the chosen bitrate per |
| frame and the number of effective channels. The size can be determined as below. |
| \f[ |
| bitreservoir = nEffChannels*6144 - (bitrate*framelength/samplerate) |
| \f] |
| Due to audio quality concerns it is not recommended to change the bitreservoir |
| size to a lower value than the default setting! However, for minimizing the |
| delay for streaming applications or for achieving a constant size of the |
| bitstream packages in each frame, it may be necessaray to limit the maximum bits |
| per frame size. This can be done with the ::AACENC_PEAK_BITRATE parameter. \code |
| aacEncoder_SetParam(hAacEncoder, AACENC_PEAK_BITRATE, value); |
| \endcode |
| |
| To achieve acceptable audio quality with a reduced bitreservoir size setting at |
| least 1000 bits per audio channel is recommended. For a multichannel audio file |
| with 5.1 channels the bitreservoir reduced to 5000 bits results in acceptable |
| audio quality. |
| |
| |
| \subsection vbrmode Variable Bitrate Mode |
| The variable bitrate (VBR) mode coding adapts the bit consumption to the |
| psychoacoustic requirements of the signal. The encoder ignores the user-defined |
| bit rate and selects a suitable pre-defined configuration based on the provided |
| AOT. The VBR mode 1 is tuned for HE-AACv2, for VBR mode 2, HE-AACv1 should be |
| used. VBR modes 3-5 should be used with Low-Complexity AAC. When encoding |
| AAC-ELD, the best mode is selected automatically. |
| |
| The bitrates given in the table are averages over time and different encoder |
| settings. They strongly depend on the type of audio signal. The VBR |
| configurations can be adjusted with the ::AACENC_BITRATEMODE encoder parameter. |
| \verbatim |
| ----------------------------------------------- |
| VBR_MODE | Approx. Bitrate in kbps for stereo |
| | AAC-LC | AAC-ELD |
| ----------+---------------+-------------------- |
| VBR_1 | 32 (HE-AACv2) | 48 |
| VBR_2 | 72 (HE-AACv1) | 56 |
| VBR_3 | 112 | 72 |
| VBR_4 | 148 | 148 |
| VBR_5 | 228 | 224 |
| -------------------------------------------- |
| \endverbatim |
| Note that these figures are valid for stereo encoding only. VBR modes 2-5 will |
| yield much lower bit rates when encoding single-channel input. For |
| configurations which are making use of downmix modules the AAC core channels |
| respectively downmix channels shall be considered. |
| |
| \subsection encQual Audio Quality Considerations |
| The default encoder configuration is suggested to be used. Encoder tools such as |
| TNS and PNS are activated by default and are internally controlled (see \ref |
| BEHAVIOUR_TOOLS). |
| |
| There is an additional quality parameter called ::AACENC_AFTERBURNER. In the |
| default configuration this quality switch is deactivated because it would cause |
| a workload increase which might be significant. If workload is not an issue in |
| the application we recommended to activate this feature. \code |
| aacEncoder_SetParam(hAacEncoder, AACENC_AFTERBURNER, 0/1); \endcode |
| |
| \subsection encELD ELD Auto Configuration Mode |
| For ELD configuration a so called auto configurator is available which |
| configures SBR and the SBR ratio by itself. The configurator is used when the |
| encoder parameter ::AACENC_SBR_MODE and ::AACENC_SBR_RATIO are not set |
| explicitly. |
| |
| Based on sampling rate and chosen bitrate a reasonable SBR configuration will be |
| used. \verbatim |
| ------------------------------------------------------------------ |
| Sampling Rate | Total Bitrate | No. of | SBR | SBR Ratio |
| [kHz] | [bit/s] | Chan | | |
| | | | | |
| ---------------+-----------------+--------+-----+----------------- |
| ]min, 16[ | min - max | 1 | off | --- |
| ---------------+-----------------+--------------+----------------- |
| [16] | min - 27999 | 1 | on | downsampled SBR |
| | 28000 - max | 1 | off | --- |
| ---------------+-----------------+--------------+----------------- |
| ]16 - 24] | min - 39999 | 1 | on | downsampled SBR |
| | 40000 - max | 1 | off | --- |
| ---------------+-----------------+--------------+----------------- |
| ]24 - 32] | min - 27999 | 1 | on | dualrate SBR |
| | 28000 - 55999 | 1 | on | downsampled SBR |
| | 56000 - max | 1 | off | --- |
| ---------------+-----------------+--------------+----------------- |
| ]32 - 44.1] | min - 63999 | 1 | on | dualrate SBR |
| | 64000 - max | 1 | off | --- |
| ---------------+-----------------+--------------+----------------- |
| ]44.1 - 48] | min - 63999 | 1 | on | dualrate SBR |
| | 64000 - max | 1 | off | --- |
| | | | | |
| ---------------+-----------------+--------+-----+----------------- |
| ]min, 16[ | min - max | 2 | off | --- |
| ---------------+-----------------+--------------+----------------- |
| [16] | min - 31999 | 2 | on | downsampled SBR |
| | 32000 - 63999 | 2 | on | downsampled SBR |
| | 64000 - max | 2 | off | --- |
| ---------------+-----------------+--------------+----------------- |
| ]16 - 24] | min - 47999 | 2 | on | downsampled SBR |
| | 48000 - 79999 | 2 | on | downsampled SBR |
| | 80000 - max | 2 | off | --- |
| ---------------+-----------------+--------------+----------------- |
| ]24 - 32] | min - 31999 | 2 | on | dualrate SBR |
| | 32000 - 67999 | 2 | on | dualrate SBR |
| | 68000 - 95999 | 2 | on | downsampled SBR |
| | 96000 - max | 2 | off | --- |
| ---------------+-----------------+--------------+----------------- |
| ]32 - 44.1] | min - 43999 | 2 | on | dualrate SBR |
| | 44000 - 127999 | 2 | on | dualrate SBR |
| | 128000 - max | 2 | off | --- |
| ---------------+-----------------+--------------+----------------- |
| ]44.1 - 48] | min - 43999 | 2 | on | dualrate SBR |
| | 44000 - 127999 | 2 | on | dualrate SBR |
| | 128000 - max | 2 | off | --- |
| | | | |
| ------------------------------------------------------------------ |
| \endverbatim |
| |
| \subsection encDsELD Reduced Delay (Downscaled) Mode |
| The downscaled mode of AAC-ELD reduces the algorithmic delay of AAC-ELD by |
| virtually increasing the sampling rate. When using the downscaled mode, the |
| bitrate should be increased for keeping the same audio quality level. For common |
| signals, the bitrate should be increased by 25% for a downscale factor of 2. |
| |
| Currently, downscaling factors 2 and 4 are supported. |
| To enable the downscaled mode in the encoder, the framelength parameter |
| AACENC_GRANULE_LENGTH must be set accordingly to 256 or 240 for a downscale |
| factor of 2 or 128 or 120 for a downscale factor of 4. The default values of 512 |
| or 480 mean that no downscaling is applied. \code |
| aacEncoder_SetParam(hAacEncoder, AACENC_GRANULE_LENGTH, 256); |
| aacEncoder_SetParam(hAacEncoder, AACENC_GRANULE_LENGTH, 128); |
| \endcode |
| |
| Downscaled bitstreams are fully backwards compatible. However, the legacy |
| decoder needs to support high sample rate, e.g. 96kHz. The signaled sampling |
| rate is multiplied by the downscale factor. Although not required, downscaling |
| should be applied when decoding downscaled bitstreams. It reduces CPU workload |
| and the output will have the same sampling rate as the input. In an ideal |
| configuration both encoder and decoder should run with the same downscale |
| factor. |
| |
| The following table shows approximate filter bank delays in ms for common |
| sampling rates(sr) at framesize(fs), and downscale factor(dsf), based on this |
| formula: \f[ 1000 * fs / (dsf * sr) \f] |
| |
| \verbatim |
| -------------------------------------- |
| | 512/2 | 512/4 | 480/2 | 480/4 |
| ------+-------+-------+-------+------- |
| 22050 | 17.41 | 8.71 | 16.33 | 8.16 |
| 32000 | 12.00 | 6.00 | 11.25 | 5.62 |
| 44100 | 8.71 | 4.35 | 8.16 | 4.08 |
| 48000 | 8.00 | 4.00 | 7.50 | 3.75 |
| -------------------------------------- |
| \endverbatim |
| |
| \section audiochCfg Audio Channel Configuration |
| The MPEG standard refers often to the so-called Channel Configuration. This |
| Channel Configuration is used for a fixed Channel Mapping. The configurations |
| 1-7 and 11,12,14 are predefined in MPEG standard and used for implicit |
| signalling within the encoded bitstream. For user defined Configurations the |
| Channel Configuration is set to 0 and the Channel Mapping must be explecitly |
| described with an appropriate Program Config Element. The present Encoder |
| implementation does not allow the user to configure this Channel Configuration |
| from extern. The Encoder implementation supports fixed Channel Modes which are |
| mapped to Channel Configuration as follow. \verbatim |
| ---------------------------------------------------------------------------------------- |
| ChannelMode | ChCfg | Height | front_El | side_El | back_El | |
| lfe_El |
| -----------------------+-------+--------+---------------+----------+----------+--------- |
| MODE_1 | 1 | NORM | SCE | | | |
| MODE_2 | 2 | NORM | CPE | | | |
| MODE_1_2 | 3 | NORM | SCE, CPE | | | |
| MODE_1_2_1 | 4 | NORM | SCE, CPE | | SCE | |
| MODE_1_2_2 | 5 | NORM | SCE, CPE | | CPE | |
| MODE_1_2_2_1 | 6 | NORM | SCE, CPE | | CPE | |
| LFE MODE_1_2_2_2_1 | 7 | NORM | SCE, CPE, CPE | | CPE |
| | LFE MODE_6_1 | 11 | NORM | SCE, CPE | | CPE, |
| SCE | LFE MODE_7_1_BACK | 12 | NORM | SCE, CPE | | |
| CPE, CPE | LFE |
| -----------------------+-------+--------+---------------+----------+----------+--------- |
| MODE_7_1_TOP_FRONT | 14 | NORM | SCE, CPE | | CPE | |
| LFE | | TOP | CPE | | | |
| -----------------------+-------+--------+---------------+----------+----------+--------- |
| MODE_7_1_REAR_SURROUND | 0 | NORM | SCE, CPE | | CPE, CPE | |
| LFE MODE_7_1_FRONT_CENTER | 0 | NORM | SCE, CPE, CPE | | CPE |
| | LFE |
| ---------------------------------------------------------------------------------------- |
| - NORM: Normal Height Layer. - TOP: Top Height Layer. - BTM: Bottom Height |
| Layer. |
| - SCE: Single Channel Element. - CPE: Channel Pair. - LFE: Low Frequency |
| Element. \endverbatim |
| |
| The Table describes all fixed Channel Elements for each Channel Mode which are |
| assigned to a speaker arrangement. The arrangement includes front, side, back |
| and lfe Audio Channel Elements in the normal height layer, possibly followed by |
| front, side, and back elements in the top and bottom layer (Channel |
| Configuration 14). \n This mapping of Audio Channel Elements is defined in MPEG |
| standard for Channel Config 1-7 and 11,12,14.\n In case of Channel Config 0 or |
| writing matrix mixdown coefficients, the encoder enables the writing of Program |
| Config Element itself as described in \ref encPCE. The configuration used in |
| Program Config Element refers to the denoted Table.\n Beside the Channel Element |
| assignment the Channel Modes are resposible for audio input data channel |
| mapping. The Channel Mapping of the audio data depends on the selected |
| ::AACENC_CHANNELORDER which can be MPEG or WAV like order.\n Following table |
| describes the complete channel mapping for both Channel Order configurations. |
| \verbatim |
| --------------------------------------------------------------------------------------- |
| ChannelMode | MPEG-Channelorder | WAV-Channelorder |
| -----------------------+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+--- |
| MODE_1 | 0 | | | | | | | | 0 | | | | | | |
| | MODE_2 | 0 | 1 | | | | | | | 0 | 1 | | | | |
| | | MODE_1_2 | 0 | 1 | 2 | | | | | | 2 | 0 | 1 | | |
| | | | MODE_1_2_1 | 0 | 1 | 2 | 3 | | | | | 2 | 0 | 1 | 3 |
| | | | | MODE_1_2_2 | 0 | 1 | 2 | 3 | 4 | | | | 2 | 0 | 1 |
| | 3 | 4 | | | MODE_1_2_2_1 | 0 | 1 | 2 | 3 | 4 | 5 | | | 2 | 0 |
| | 1 | 4 | 5 | 3 | | MODE_1_2_2_2_1 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 2 |
| | 6 | 7 | 0 | 1 | 4 | 5 | 3 MODE_6_1 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | |
| | 2 | 0 | 1 | 4 | 5 | 6 | 3 | MODE_7_1_BACK | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
| | 7 | 2 | 0 | 1 | 6 | 7 | 4 | 5 | 3 MODE_7_1_TOP_FRONT | 0 | 1 | 2 | 3 | 4 | |
| 5 | 6 | 7 | 2 | 0 | 1 | 4 | 5 | 3 | 6 | 7 |
| -----------------------+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+--- |
| MODE_7_1_REAR_SURROUND | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 2 | 0 | 1 | 6 | 7 | 4 | |
| 5 | 3 MODE_7_1_FRONT_CENTER | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 2 | 6 | 7 | 0 | 1 |
| | 4 | 5 | 3 |
| --------------------------------------------------------------------------------------- |
| \endverbatim |
| |
| The denoted mapping is important for correct audio channel assignment when using |
| MPEG or WAV ordering. The incoming audio channels are distributed MPEG like |
| starting at the front channels and ending at the back channels. The distribution |
| is used as described in Table concering Channel Config and fix channel elements. |
| Please see the following example for clarification. |
| |
| \verbatim |
| Example: MODE_1_2_2_1 - WAV-Channelorder 5.1 |
| ------------------------------------------ |
| Input Channel | Coder Channel |
| --------------------+--------------------- |
| 2 (front center) | 0 (SCE channel) |
| 0 (left center) | 1 (1st of 1st CPE) |
| 1 (right center) | 2 (2nd of 1st CPE) |
| 4 (left surround) | 3 (1st of 2nd CPE) |
| 5 (right surround) | 4 (2nd of 2nd CPE) |
| 3 (LFE) | 5 (LFE) |
| ------------------------------------------ |
| \endverbatim |
| |
| |
| \section suppBitrates Supported Bitrates |
| |
| The FDK AAC Encoder provides a wide range of supported bitrates. |
| The minimum and maximum allowed bitrate depends on the Audio Object Type. For |
| AAC-LC the minimum bitrate is the bitrate that is required to write the most |
| basic and minimal valid bitstream. It consists of the bitstream format header |
| information and other static/mandatory information within the AAC payload. The |
| maximum AAC framesize allowed by the MPEG-4 standard determines the maximum |
| allowed bitrate for AAC-LC. For HE-AAC and HE-AAC v2 a library internal look-up |
| table is used. |
| |
| A good working point in terms of audio quality, sampling rate and bitrate, is at |
| 1 to 1.5 bits/audio sample for AAC-LC, 0.625 bits/audio sample for dualrate |
| HE-AAC, 1.125 bits/audio sample for downsampled HE-AAC and 0.5 bits/audio sample |
| for HE-AAC v2. For example for one channel with a sampling frequency of 48 kHz, |
| the range from 48 kbit/s to 72 kbit/s achieves reasonable audio quality for |
| AAC-LC. |
| |
| For HE-AAC and HE-AAC v2 the lowest possible audio input sampling frequency is |
| 16 kHz because then the AAC-LC core encoder operates in dual rate mode at its |
| lowest possible sampling frequency, which is 8 kHz. HE-AAC v2 requires stereo |
| input audio data. |
| |
| Please note that in HE-AAC or HE-AAC v2 mode the encoder supports much higher |
| bitrates than are appropriate for HE-AAC or HE-AAC v2. For example, at a bitrate |
| of more than 64 kbit/s for a stereo audio signal at 44.1 kHz it usually makes |
| sense to use AAC-LC, which will produce better audio quality at that bitrate |
| than HE-AAC or HE-AAC v2. |
| |
| \section reommendedConfig Recommended Sampling Rate and Bitrate Combinations |
| |
| The following table provides an overview of recommended encoder configuration |
| parameters which we determined by virtue of numerous listening tests. |
| |
| \subsection reommendedConfigLC AAC-LC, HE-AAC, HE-AACv2 in Dualrate SBR mode. |
| \verbatim |
| ----------------------------------------------------------------------------------- |
| Audio Object Type | Bit Rate Range | Supported | Preferred | No. |
| of | [bit/s] | Sampling Rates | Sampl. | Chan. | |
| | [kHz] | Rate | | | |
| | [kHz] | |
| -------------------+------------------+-----------------------+------------+------- |
| AAC LC + SBR + PS | 8000 - 11999 | 22.05, 24.00 | 24.00 | 2 |
| AAC LC + SBR + PS | 12000 - 17999 | 32.00 | 32.00 | 2 |
| AAC LC + SBR + PS | 18000 - 39999 | 32.00, 44.10, 48.00 | 44.10 | 2 |
| AAC LC + SBR + PS | 40000 - 64000 | 32.00, 44.10, 48.00 | 48.00 | 2 |
| -------------------+------------------+-----------------------+------------+------- |
| AAC LC + SBR | 8000 - 11999 | 22.05, 24.00 | 24.00 | 1 |
| AAC LC + SBR | 12000 - 17999 | 32.00 | 32.00 | 1 |
| AAC LC + SBR | 18000 - 39999 | 32.00, 44.10, 48.00 | 44.10 | 1 |
| AAC LC + SBR | 40000 - 64000 | 32.00, 44.10, 48.00 | 48.00 | 1 |
| -------------------+------------------+-----------------------+------------+------- |
| AAC LC + SBR | 16000 - 27999 | 32.00, 44.10, 48.00 | 32.00 | 2 |
| AAC LC + SBR | 28000 - 63999 | 32.00, 44.10, 48.00 | 44.10 | 2 |
| AAC LC + SBR | 64000 - 128000 | 32.00, 44.10, 48.00 | 48.00 | 2 |
| -------------------+------------------+-----------------------+------------+------- |
| AAC LC + SBR | 64000 - 69999 | 32.00, 44.10, 48.00 | 32.00 | |
| 5, 5.1 AAC LC + SBR | 70000 - 239999 | 32.00, 44.10, 48.00 | 44.10 |
| | 5, 5.1 AAC LC + SBR | 240000 - 319999 | 32.00, 44.10, 48.00 | |
| 48.00 | 5, 5.1 |
| -------------------+------------------+-----------------------+------------+------- |
| AAC LC | 8000 - 15999 | 11.025, 12.00, 16.00 | 12.00 | 1 |
| AAC LC | 16000 - 23999 | 16.00 | 16.00 | 1 |
| AAC LC | 24000 - 31999 | 16.00, 22.05, 24.00 | 24.00 | 1 |
| AAC LC | 32000 - 55999 | 32.00 | 32.00 | 1 |
| AAC LC | 56000 - 160000 | 32.00, 44.10, 48.00 | 44.10 | 1 |
| AAC LC | 160001 - 288000 | 48.00 | 48.00 | 1 |
| -------------------+------------------+-----------------------+------------+------- |
| AAC LC | 16000 - 23999 | 11.025, 12.00, 16.00 | 12.00 | 2 |
| AAC LC | 24000 - 31999 | 16.00 | 16.00 | 2 |
| AAC LC | 32000 - 39999 | 16.00, 22.05, 24.00 | 22.05 | 2 |
| AAC LC | 40000 - 95999 | 32.00 | 32.00 | 2 |
| AAC LC | 96000 - 111999 | 32.00, 44.10, 48.00 | 32.00 | 2 |
| AAC LC | 112000 - 320001 | 32.00, 44.10, 48.00 | 44.10 | 2 |
| AAC LC | 320002 - 576000 | 48.00 | 48.00 | 2 |
| -------------------+------------------+-----------------------+------------+------- |
| AAC LC | 160000 - 239999 | 32.00 | 32.00 | |
| 5, 5.1 AAC LC | 240000 - 279999 | 32.00, 44.10, 48.00 | 32.00 |
| | 5, 5.1 AAC LC | 280000 - 800000 | 32.00, 44.10, 48.00 | |
| 44.10 | 5, 5.1 |
| ----------------------------------------------------------------------------------- |
| \endverbatim \n |
| |
| \subsection reommendedConfigLD AAC-LD, AAC-ELD, AAC-ELD with SBR in Dualrate SBR |
| mode. Unlike to HE-AAC configuration the SBR is not covered by ELD audio object |
| type and needs to be enabled explicitly. Use ::AACENC_SBR_MODE to configure SBR |
| and its samplingrate ratio with ::AACENC_SBR_RATIO parameter. \verbatim |
| ----------------------------------------------------------------------------------- |
| Audio Object Type | Bit Rate Range | Supported | Preferred | No. |
| of | [bit/s] | Sampling Rates | Sampl. | Chan. | |
| | [kHz] | Rate | | | |
| | [kHz] | |
| -------------------+------------------+-----------------------+------------+------- |
| ELD + SBR | 18000 - 24999 | 32.00 - 44.10 | 32.00 | 1 |
| ELD + SBR | 25000 - 31999 | 32.00 - 48.00 | 32.00 | 1 |
| ELD + SBR | 32000 - 64000 | 32.00 - 48.00 | 48.00 | 1 |
| -------------------+------------------+-----------------------+------------+------- |
| ELD + SBR | 32000 - 51999 | 32.00 - 48.00 | 44.10 | 2 |
| ELD + SBR | 52000 - 128000 | 32.00 - 48.00 | 48.00 | 2 |
| -------------------+------------------+-----------------------+------------+------- |
| ELD + SBR | 78000 - 160000 | 32.00 - 48.00 | 48.00 | 3 |
| -------------------+------------------+-----------------------+------------+------- |
| ELD + SBR | 104000 - 212000 | 32.00 - 48.00 | 48.00 | 4 |
| -------------------+------------------+-----------------------+------------+------- |
| ELD + SBR | 130000 - 246000 | 32.00 - 48.00 | 48.00 | |
| 5, 5.1 |
| -------------------+------------------+-----------------------+------------+------- |
| LD, ELD | 16000 - 19999 | 16.00 - 24.00 | 16.00 | 1 |
| LD, ELD | 20000 - 39999 | 16.00 - 32.00 | 24.00 | 1 |
| LD, ELD | 40000 - 49999 | 22.05 - 32.00 | 32.00 | 1 |
| LD, ELD | 50000 - 61999 | 24.00 - 44.10 | 32.00 | 1 |
| LD, ELD | 62000 - 84999 | 32.00 - 48.00 | 44.10 | 1 |
| LD, ELD | 85000 - 192000 | 44.10 - 48.00 | 48.00 | 1 |
| -------------------+------------------+-----------------------+------------+------- |
| LD, ELD | 64000 - 75999 | 24.00 - 32.00 | 32.00 | 2 |
| LD, ELD | 76000 - 97999 | 24.00 - 44.10 | 32.00 | 2 |
| LD, ELD | 98000 - 135999 | 32.00 - 48.00 | 44.10 | 2 |
| LD, ELD | 136000 - 384000 | 44.10 - 48.00 | 48.00 | 2 |
| -------------------+------------------+-----------------------+------------+------- |
| LD, ELD | 96000 - 113999 | 24.00 - 32.00 | 32.00 | 3 |
| LD, ELD | 114000 - 146999 | 24.00 - 44.10 | 32.00 | 3 |
| LD, ELD | 147000 - 203999 | 32.00 - 48.00 | 44.10 | 3 |
| LD, ELD | 204000 - 576000 | 44.10 - 48.00 | 48.00 | 3 |
| -------------------+------------------+-----------------------+------------+------- |
| LD, ELD | 128000 - 151999 | 24.00 - 32.00 | 32.00 | 4 |
| LD, ELD | 152000 - 195999 | 24.00 - 44.10 | 32.00 | 4 |
| LD, ELD | 196000 - 271999 | 32.00 - 48.00 | 44.10 | 4 |
| LD, ELD | 272000 - 768000 | 44.10 - 48.00 | 48.00 | 4 |
| -------------------+------------------+-----------------------+------------+------- |
| LD, ELD | 160000 - 189999 | 24.00 - 32.00 | 32.00 | |
| 5, 5.1 LD, ELD | 190000 - 244999 | 24.00 - 44.10 | 32.00 |
| | 5, 5.1 LD, ELD | 245000 - 339999 | 32.00 - 48.00 | |
| 44.10 | 5, 5.1 LD, ELD | 340000 - 960000 | 44.10 - 48.00 | |
| 48.00 | 5, 5.1 |
| ----------------------------------------------------------------------------------- |
| \endverbatim \n |
| |
| \subsection reommendedConfigELD AAC-ELD with SBR in Downsampled SBR mode. |
| \verbatim |
| ----------------------------------------------------------------------------------- |
| Audio Object Type | Bit Rate Range | Supported | Preferred | No. |
| of | [bit/s] | Sampling Rates | Sampl. | Chan. | |
| | [kHz] | Rate | | | |
| | [kHz] | |
| -------------------+------------------+-----------------------+------------+------- |
| ELD + SBR | 18000 - 24999 | 16.00 - 22.05 | 22.05 | 1 |
| (downsampled SBR) | 25000 - 31999 | 16.00 - 24.00 | 24.00 | 1 |
| | 32000 - 47999 | 22.05 - 32.00 | 32.00 | 1 |
| | 48000 - 64000 | 22.05 - 48.00 | 32.00 | 1 |
| -------------------+------------------+-----------------------+------------+------- |
| ELD + SBR | 32000 - 51999 | 16.00 - 24.00 | 24.00 | 2 |
| (downsampled SBR) | 52000 - 59999 | 22.05 - 24.00 | 24.00 | 2 |
| | 60000 - 95999 | 22.05 - 32.00 | 32.00 | 2 |
| | 96000 - 128000 | 22.05 - 48.00 | 32.00 | 2 |
| -------------------+------------------+-----------------------+------------+------- |
| ELD + SBR | 78000 - 99999 | 22.05 - 24.00 | 24.00 | 3 |
| (downsampled SBR) | 100000 - 143999 | 22.05 - 32.00 | 32.00 | 3 |
| | 144000 - 159999 | 22.05 - 48.00 | 32.00 | 3 |
| | 160000 - 192000 | 32.00 - 48.00 | 32.00 | 3 |
| -------------------+------------------+-----------------------+------------+------- |
| ELD + SBR | 104000 - 149999 | 22.05 - 24.00 | 24.00 | 4 |
| (downsampled SBR) | 150000 - 191999 | 22.05 - 32.00 | 32.00 | 4 |
| | 192000 - 211999 | 22.05 - 48.00 | 32.00 | 4 |
| | 212000 - 256000 | 32.00 - 48.00 | 32.00 | 4 |
| -------------------+------------------+-----------------------+------------+------- |
| ELD + SBR | 130000 - 171999 | 22.05 - 24.00 | 24.00 | |
| 5, 5.1 (downsampled SBR) | 172000 - 239999 | 22.05 - 32.00 | 32.00 |
| | 5, 5.1 | 240000 - 320000 | 32.00 - 48.00 | 32.00 | 5, 5.1 |
| ----------------------------------------------------------------------------------- |
| \endverbatim \n |
| |
| \subsection reommendedConfigELDv2 AAC-ELD v2, AAC-ELD v2 with SBR. |
| The ELD v2 212 configuration must be configured explicitly with |
| ::AACENC_CHANNELMODE parameter according MODE_212 value. SBR can be configured |
| separately through ::AACENC_SBR_MODE and ::AACENC_SBR_RATIO parameter. Following |
| configurations shall apply to both framelengths 480 and 512. For ELD v2 |
| configuration without SBR and framelength 480 the supported sampling rate is |
| restricted to the range from 16 kHz up to 24 kHz. \verbatim |
| ----------------------------------------------------------------------------------- |
| Audio Object Type | Bit Rate Range | Supported | Preferred | No. |
| of | [bit/s] | Sampling Rates | Sampl. | Chan. | |
| | [kHz] | Rate | | | |
| | [kHz] | |
| -------------------+------------------+-----------------------+------------+------- |
| ELD-212 | 16000 - 19999 | 16.00 - 24.00 | 16.00 | 2 |
| (without SBR) | 20000 - 39999 | 16.00 - 32.00 | 24.00 | 2 |
| | 40000 - 49999 | 22.05 - 32.00 | 32.00 | 2 |
| | 50000 - 61999 | 24.00 - 44.10 | 32.00 | 2 |
| | 62000 - 84999 | 32.00 - 48.00 | 44.10 | 2 |
| | 85000 - 192000 | 44.10 - 48.00 | 48.00 | 2 |
| -------------------+------------------+-----------------------+------------+------- |
| ELD-212 + SBR | 18000 - 20999 | 32.00 | 32.00 | 2 |
| (dualrate SBR) | 21000 - 25999 | 32.00 - 44.10 | 32.00 | 2 |
| | 26000 - 31999 | 32.00 - 48.00 | 44.10 | 2 |
| | 32000 - 64000 | 32.00 - 48.00 | 48.00 | 2 |
| -------------------+------------------+-----------------------+------------+------- |
| ELD-212 + SBR | 18000 - 19999 | 16.00 - 22.05 | 22.05 | 2 |
| (downsampled SBR) | 20000 - 24999 | 16.00 - 24.00 | 22.05 | 2 |
| | 25000 - 31999 | 16.00 - 24.00 | 24.00 | 2 |
| | 32000 - 64000 | 24.00 - 24.00 | 24.00 | 2 |
| -------------------+------------------+-----------------------+------------+------- |
| \endverbatim \n |
| |
| \page ENCODERBEHAVIOUR Encoder Behaviour |
| |
| \section BEHAVIOUR_BANDWIDTH Bandwidth |
| |
| The FDK AAC encoder usually does not use the full frequency range of the input |
| signal, but restricts the bandwidth according to certain library-internal |
| settings. They can be changed in the table "bandWidthTable" in the file |
| bandwidth.cpp (if available). |
| |
| The encoder API provides the ::AACENC_BANDWIDTH parameter to adjust the |
| bandwidth explicitly. \code aacEncoder_SetParam(hAacEncoder, AACENC_BANDWIDTH, |
| value); \endcode |
| |
| However it is not recommended to change these settings, because they are based |
| on numerous listening tests and careful tweaks to ensure the best overall |
| encoding quality. Also, the maximum bandwidth that can be set manually by the |
| user is 20kHz or fs/2, whichever value is smaller. |
| |
| Theoretically a signal of for example 48 kHz can contain frequencies up to 24 |
| kHz, but to use this full range in an audio encoder usually does not make sense. |
| Usually the encoder has a very limited amount of bits to spend (typically 128 |
| kbit/s for stereo 48 kHz content) and to allow full range bandwidth would waste |
| a lot of these bits for frequencies the human ear is hardly able to perceive |
| anyway, if at all. Hence it is wise to use the available bits for the really |
| important frequency range and just skip the rest. At lower bitrates (e. g. <= 80 |
| kbit/s for stereo 48 kHz content) the encoder will choose an even smaller |
| bandwidth, because an encoded signal with smaller bandwidth and hence less |
| artifacts sounds better than a signal with higher bandwidth but then more coding |
| artefacts across all frequencies. These artefacts would occur if small bitrates |
| and high bandwidths are chosen because the available bits are just not enough to |
| encode all frequencies well. |
| |
| Unfortunately some people evaluate encoding quality based on possible bandwidth |
| as well, but it is a double-edged sword considering the trade-off described |
| above. |
| |
| Another aspect is workload consumption. The higher the allowed bandwidth, the |
| more frequency lines have to be processed, which in turn increases the workload. |
| |
| \section FRAMESIZES_AND_BIT_RESERVOIR Frame Sizes & Bit Reservoir |
| |
| For AAC there is a difference between constant bit rate and constant frame |
| length due to the so-called bit reservoir technique, which allows the encoder to |
| use less bits in an AAC frame for those audio signal sections which are easy to |
| encode, and then spend them at a later point in time for more complex audio |
| sections. The extent to which this "bit exchange" is done is limited to allow |
| for reliable and relatively low delay real time streaming. Therefore, for |
| AAC-ELD, the bitreservoir is limited. It varies between 500 and 4000 bits/frame, |
| depending on the bitrate/channel. |
| - For a bitrate of 12kbps/channel and below, the AAC-ELD bitreservoir is 500 |
| bits/frame. |
| - For a bitrate of 70kbps/channel and above, the AAC-ELD bitreservoir is 4000 |
| bits/frame. |
| - Between 12kbps/channel and 70kbps/channel, the AAC-ELD bitrervoir is increased |
| linearly. |
| - For AAC-LC, the bitrate is only limited by the maximum AAC frame length. It |
| is, regardless of the available bit reservoir, defined as 6144 bits per channel. |
| |
| Over a longer period in time the bitrate will be constant in the AAC constant |
| bitrate mode, e.g. for ISDN transmission. This means that in AAC each bitstream |
| frame will in general have a different length in bytes but over time it |
| will reach the target bitrate. |
| |
| |
| One could also make an MPEG compliant |
| AAC encoder which always produces constant length packages for each AAC frame, |
| but the audio quality would be considerably worse since the bit reservoir |
| technique would have to be switched off completely. A higher bit rate would have |
| to be used to get the same audio quality as with an enabled bit reservoir. |
| |
| For mp3 by the way, the same bit reservoir technique exists, but there each bit |
| stream frame has a constant length for a given bit rate (ignoring the |
| padding byte). In mp3 there is a so-called "back pointer" which tells |
| the decoder which bits belong to the current mp3 frame - and in general some or |
| many bits have been transmitted in an earlier mp3 frame. Basically this leads to |
| the same "bit exchange between mp3 frames" as in AAC but with virtually constant |
| length frames. |
| |
| This variable frame length at "constant bit rate" is not something special |
| in this Fraunhofer IIS AAC encoder. AAC has been designed in that way. |
| |
| \subsection BEHAVIOUR_ESTIM_AVG_FRAMESIZES Estimating Average Frame Sizes |
| |
| A HE-AAC v1 or v2 audio frame contains 2048 PCM samples per channel. |
| |
| The number of HE-AAC frames \f$N\_FRAMES\f$ per second at 44.1 kHz is: |
| |
| \f[ |
| N\_FRAMES = 44100 / 2048 = 21.5332 |
| \f] |
| |
| At a bit rate of 8 kbps the average number of bits per frame |
| \f$N\_BITS\_PER\_FRAME\f$ is: |
| |
| \f[ |
| N\_BITS\_PER\_FRAME = 8000 / 21.5332 = 371.52 |
| \f] |
| |
| which is about 46.44 bytes per encoded frame. |
| |
| At a bit rate of 32 kbps, which is quite high for single channel HE-AAC v1, it |
| is: |
| |
| \f[ |
| N\_BITS\_PER\_FRAME = 32000 / 21.5332 = 1486 |
| \f] |
| |
| which is about 185.76 bytes per encoded frame. |
| |
| These bits/frame figures are average figures where each AAC frame generally has |
| a different size in bytes. To calculate the same for AAC-LC just use 1024 |
| instead of 2048 PCM samples per frame and channel. For AAC-LD/ELD it is either |
| 480 or 512 PCM samples per frame and channel. |
| |
| |
| \section BEHAVIOUR_TOOLS Encoder Tools |
| |
| The AAC encoder supports TNS, PNS, MS, Intensity and activates these tools |
| depending on the audio signal and the encoder configuration (i.e. bitrate or |
| AOT). It is not required to configure these tools manually. |
| |
| PNS improves encoding quality only for certain bitrates. Therefore it makes |
| sense to activate PNS only for these bitrates and save the processing power |
| required for PNS (about 10 % of the encoder) when using other bitrates. This is |
| done automatically inside the encoder library. PNS is disabled inside the |
| encoder library if an MPEG-2 AOT is choosen since PNS is an MPEG-4 AAC feature. |
| |
| If SBR is activated, the encoder automatically deactivates PNS internally. If |
| TNS is disabled but PNS is allowed, the encoder deactivates PNS calculation |
| internally. |
| |
| */ |
| |
| #ifndef AACENC_LIB_H |
| #define AACENC_LIB_H |
| |
| #include "machine_type.h" |
| #include "FDK_audio.h" |
| |
| /** |
| * AAC encoder error codes. |
| */ |
| typedef enum { |
| AACENC_OK = 0x0000, /*!< No error happened. All fine. */ |
| |
| AACENC_INVALID_HANDLE = |
| 0x0020, /*!< Handle passed to function call was invalid. */ |
| AACENC_MEMORY_ERROR = 0x0021, /*!< Memory allocation failed. */ |
| AACENC_UNSUPPORTED_PARAMETER = 0x0022, /*!< Parameter not available. */ |
| AACENC_INVALID_CONFIG = 0x0023, /*!< Configuration not provided. */ |
| |
| AACENC_INIT_ERROR = 0x0040, /*!< General initialization error. */ |
| AACENC_INIT_AAC_ERROR = 0x0041, /*!< AAC library initialization error. */ |
| AACENC_INIT_SBR_ERROR = 0x0042, /*!< SBR library initialization error. */ |
| AACENC_INIT_TP_ERROR = 0x0043, /*!< Transport library initialization error. */ |
| AACENC_INIT_META_ERROR = |
| 0x0044, /*!< Meta data library initialization error. */ |
| AACENC_INIT_MPS_ERROR = 0x0045, /*!< MPS library initialization error. */ |
| |
| AACENC_ENCODE_ERROR = 0x0060, /*!< The encoding process was interrupted by an |
| unexpected error. */ |
| |
| AACENC_ENCODE_EOF = 0x0080 /*!< End of file reached. */ |
| |
| } AACENC_ERROR; |
| |
| /** |
| * AAC encoder buffer descriptors identifier. |
| * This identifier are used within buffer descriptors |
| * AACENC_BufDesc::bufferIdentifiers. |
| */ |
| typedef enum { |
| /* Input buffer identifier. */ |
| IN_AUDIO_DATA = 0, /*!< Audio input buffer, interleaved INT_PCM samples. */ |
| IN_ANCILLRY_DATA = 1, /*!< Ancillary data to be embedded into bitstream. */ |
| IN_METADATA_SETUP = 2, /*!< Setup structure for embedding meta data. */ |
| |
| /* Output buffer identifier. */ |
| OUT_BITSTREAM_DATA = 3, /*!< Buffer holds bitstream output data. */ |
| OUT_AU_SIZES = |
| 4 /*!< Buffer contains sizes of each access unit. This information |
| is necessary for superframing. */ |
| |
| } AACENC_BufferIdentifier; |
| |
| /** |
| * AAC encoder handle. |
| */ |
| typedef struct AACENCODER *HANDLE_AACENCODER; |
| |
| /** |
| * Provides some info about the encoder configuration. |
| */ |
| typedef struct { |
| UINT maxOutBufBytes; /*!< Maximum number of encoder bitstream bytes within one |
| frame. Size depends on maximum number of supported |
| channels in encoder instance. */ |
| |
| UINT maxAncBytes; /*!< Maximum number of ancillary data bytes which can be |
| inserted into bitstream within one frame. */ |
| |
| UINT inBufFillLevel; /*!< Internal input buffer fill level in samples per |
| channel. This parameter will automatically be cleared |
| if samplingrate or channel(Mode/Order) changes. */ |
| |
| UINT inputChannels; /*!< Number of input channels expected in encoding |
| process. */ |
| |
| UINT frameLength; /*!< Amount of input audio samples consumed each frame per |
| channel, depending on audio object type configuration. */ |
| |
| UINT nDelay; /*!< Codec delay in PCM samples/channel. Depends on framelength |
| and AOT. Does not include framing delay for filling up encoder |
| PCM input buffer. */ |
| |
| UINT nDelayCore; /*!< Codec delay in PCM samples/channel, w/o delay caused by |
| the decoder SBR module. This delay is needed to correctly |
| write edit lists for gapless playback. The decoder may not |
| know how much delay is introdcued by SBR, since it may not |
| know if SBR is active at all (implicit signaling), |
| therefore the deocder must take into account any delay |
| caused by the SBR module. */ |
| |
| UCHAR confBuf[64]; /*!< Configuration buffer in binary format as an |
| AudioSpecificConfig or StreamMuxConfig according to the |
| selected transport type. */ |
| |
| UINT confSize; /*!< Number of valid bytes in confBuf. */ |
| |
| } AACENC_InfoStruct; |
| |
| /** |
| * Describes the input and output buffers for an aacEncEncode() call. |
| */ |
| typedef struct { |
| INT numBufs; /*!< Number of buffers. */ |
| void **bufs; /*!< Pointer to vector containing buffer addresses. */ |
| INT *bufferIdentifiers; /*!< Identifier of each buffer element. See |
| ::AACENC_BufferIdentifier. */ |
| INT *bufSizes; /*!< Size of each buffer in 8-bit bytes. */ |
| INT *bufElSizes; /*!< Size of each buffer element in bytes. */ |
| |
| } AACENC_BufDesc; |
| |
| /** |
| * Defines the input arguments for an aacEncEncode() call. |
| */ |
| typedef struct { |
| INT numInSamples; /*!< Number of valid input audio samples (multiple of input |
| channels). */ |
| INT numAncBytes; /*!< Number of ancillary data bytes to be encoded. */ |
| |
| } AACENC_InArgs; |
| |
| /** |
| * Defines the output arguments for an aacEncEncode() call. |
| */ |
| typedef struct { |
| INT numOutBytes; /*!< Number of valid bitstream bytes generated during |
| aacEncEncode(). */ |
| INT numInSamples; /*!< Number of input audio samples consumed by the encoder. |
| */ |
| INT numAncBytes; /*!< Number of ancillary data bytes consumed by the encoder. |
| */ |
| INT bitResState; /*!< State of the bit reservoir in bits. */ |
| |
| } AACENC_OutArgs; |
| |
| /** |
| * Meta Data Compression Profiles. |
| */ |
| typedef enum { |
| AACENC_METADATA_DRC_NONE = 0, /*!< None. */ |
| AACENC_METADATA_DRC_FILMSTANDARD = 1, /*!< Film standard. */ |
| AACENC_METADATA_DRC_FILMLIGHT = 2, /*!< Film light. */ |
| AACENC_METADATA_DRC_MUSICSTANDARD = 3, /*!< Music standard. */ |
| AACENC_METADATA_DRC_MUSICLIGHT = 4, /*!< Music light. */ |
| AACENC_METADATA_DRC_SPEECH = 5, /*!< Speech. */ |
| AACENC_METADATA_DRC_NOT_PRESENT = |
| 256 /*!< Disable writing gain factor (used for comp_profile only). */ |
| |
| } AACENC_METADATA_DRC_PROFILE; |
| |
| /** |
| * Meta Data setup structure. |
| */ |
| typedef struct { |
| AACENC_METADATA_DRC_PROFILE |
| drc_profile; /*!< MPEG DRC compression profile. See |
| ::AACENC_METADATA_DRC_PROFILE. */ |
| AACENC_METADATA_DRC_PROFILE |
| comp_profile; /*!< ETSI heavy compression profile. See |
| ::AACENC_METADATA_DRC_PROFILE. */ |
| |
| INT drc_TargetRefLevel; /*!< Used to define expected level to: |
| Scaled with 16 bit. x*2^16. */ |
| INT comp_TargetRefLevel; /*!< Adjust limiter to avoid overload. |
| Scaled with 16 bit. x*2^16. */ |
| |
| INT prog_ref_level_present; /*!< Flag, if prog_ref_level is present */ |
| INT prog_ref_level; /*!< Programme Reference Level = Dialogue Level: |
| -31.75dB .. 0 dB ; stepsize: 0.25dB |
| Scaled with 16 bit. x*2^16.*/ |
| |
| UCHAR PCE_mixdown_idx_present; /*!< Flag, if dmx-idx should be written in |
| programme config element */ |
| UCHAR ETSI_DmxLvl_present; /*!< Flag, if dmx-lvl should be written in |
| ETSI-ancData */ |
| |
| SCHAR centerMixLevel; /*!< Center downmix level (0...7, according to table) */ |
| SCHAR surroundMixLevel; /*!< Surround downmix level (0...7, according to |
| table) */ |
| |
| UCHAR |
| dolbySurroundMode; /*!< Indication for Dolby Surround Encoding Mode. |
| - 0: Dolby Surround mode not indicated |
| - 1: 2-ch audio part is not Dolby surround encoded |
| - 2: 2-ch audio part is Dolby surround encoded */ |
| |
| UCHAR drcPresentationMode; /*!< Indicatin for DRC Presentation Mode. |
| - 0: Presentation mode not inticated |
| - 1: Presentation mode 1 |
| - 2: Presentation mode 2 */ |
| |
| struct { |
| /* extended ancillary data */ |
| UCHAR extAncDataEnable; /*< Indicates if MPEG4_ext_ancillary_data() exists. |
| - 0: No MPEG4_ext_ancillary_data(). |
| - 1: Insert MPEG4_ext_ancillary_data(). */ |
| |
| UCHAR |
| extDownmixLevelEnable; /*< Indicates if ext_downmixing_levels() exists. |
| - 0: No ext_downmixing_levels(). |
| - 1: Insert ext_downmixing_levels(). */ |
| UCHAR extDownmixLevel_A; /*< Downmix level index A (0...7, according to |
| table) */ |
| UCHAR extDownmixLevel_B; /*< Downmix level index B (0...7, according to |
| table) */ |
| |
| UCHAR dmxGainEnable; /*< Indicates if ext_downmixing_global_gains() exists. |
| - 0: No ext_downmixing_global_gains(). |
| - 1: Insert ext_downmixing_global_gains(). */ |
| INT dmxGain5; /*< Gain factor for downmix to 5 channels. |
| -15.75dB .. -15.75dB; stepsize: 0.25dB |
| Scaled with 16 bit. x*2^16.*/ |
| INT dmxGain2; /*< Gain factor for downmix to 2 channels. |
| -15.75dB .. -15.75dB; stepsize: 0.25dB |
| Scaled with 16 bit. x*2^16.*/ |
| |
| UCHAR lfeDmxEnable; /*< Indicates if ext_downmixing_lfe_level() exists. |
| - 0: No ext_downmixing_lfe_level(). |
| - 1: Insert ext_downmixing_lfe_level(). */ |
| UCHAR lfeDmxLevel; /*< Downmix level index for LFE (0..15, according to |
| table) */ |
| |
| } ExtMetaData; |
| |
| } AACENC_MetaData; |
| |
| /** |
| * AAC encoder control flags. |
| * |
| * In interaction with the ::AACENC_CONTROL_STATE parameter it is possible to |
| * get information about the internal initialization process. It is also |
| * possible to overwrite the internal state from extern when necessary. |
| */ |
| typedef enum { |
| AACENC_INIT_NONE = 0x0000, /*!< Do not trigger initialization. */ |
| AACENC_INIT_CONFIG = |
| 0x0001, /*!< Initialize all encoder modules configuration. */ |
| AACENC_INIT_STATES = 0x0002, /*!< Reset all encoder modules history buffer. */ |
| AACENC_INIT_TRANSPORT = |
| 0x1000, /*!< Initialize transport lib with new parameters. */ |
| AACENC_RESET_INBUFFER = |
| 0x2000, /*!< Reset fill level of internal input buffer. */ |
| AACENC_INIT_ALL = 0xFFFF /*!< Initialize all. */ |
| } AACENC_CTRLFLAGS; |
| |
| /** |
| * \brief AAC encoder setting parameters. |
| * |
| * Use aacEncoder_SetParam() function to configure, or use aacEncoder_GetParam() |
| * function to read the internal status of the following parameters. |
| */ |
| typedef enum { |
| AACENC_AOT = |
| 0x0100, /*!< Audio object type. See ::AUDIO_OBJECT_TYPE in FDK_audio.h. |
| - 2: MPEG-4 AAC Low Complexity. |
| - 5: MPEG-4 AAC Low Complexity with Spectral Band Replication |
| (HE-AAC). |
| - 29: MPEG-4 AAC Low Complexity with Spectral Band |
| Replication and Parametric Stereo (HE-AAC v2). This |
| configuration can be used only with stereo input audio data. |
| - 23: MPEG-4 AAC Low-Delay. |
| - 39: MPEG-4 AAC Enhanced Low-Delay. Since there is no |
| ::AUDIO_OBJECT_TYPE for ELD in combination with SBR defined, |
| enable SBR explicitely by ::AACENC_SBR_MODE parameter. The ELD |
| v2 212 configuration can be configured by ::AACENC_CHANNELMODE |
| parameter. |
| - 129: MPEG-2 AAC Low Complexity. |
| - 132: MPEG-2 AAC Low Complexity with Spectral Band |
| Replication (HE-AAC). |
| |
| Please note that the virtual MPEG-2 AOT's basically disables |
| non-existing Perceptual Noise Substitution tool in AAC encoder |
| and controls the MPEG_ID flag in adts header. The virtual |
| MPEG-2 AOT doesn't prohibit specific transport formats. */ |
| |
| AACENC_BITRATE = 0x0101, /*!< Total encoder bitrate. This parameter is |
| mandatory and interacts with ::AACENC_BITRATEMODE. |
| - CBR: Bitrate in bits/second. |
| - VBR: Variable bitrate. Bitrate argument will |
| be ignored. See \ref suppBitrates for details. */ |
| |
| AACENC_BITRATEMODE = 0x0102, /*!< Bitrate mode. Configuration can be different |
| kind of bitrate configurations: |
| - 0: Constant bitrate, use bitrate according |
| to ::AACENC_BITRATE. (default) Within none |
| LD/ELD ::AUDIO_OBJECT_TYPE, the CBR mode makes |
| use of full allowed bitreservoir. In contrast, |
| at Low-Delay ::AUDIO_OBJECT_TYPE the |
| bitreservoir is kept very small. |
| - 1: Variable bitrate mode, \ref vbrmode |
| "very low bitrate". |
| - 2: Variable bitrate mode, \ref vbrmode |
| "low bitrate". |
| - 3: Variable bitrate mode, \ref vbrmode |
| "medium bitrate". |
| - 4: Variable bitrate mode, \ref vbrmode |
| "high bitrate". |
| - 5: Variable bitrate mode, \ref vbrmode |
| "very high bitrate". */ |
| |
| AACENC_SAMPLERATE = 0x0103, /*!< Audio input data sampling rate. Encoder |
| supports following sampling rates: 8000, 11025, |
| 12000, 16000, 22050, 24000, 32000, 44100, |
| 48000, 64000, 88200, 96000 */ |
| |
| AACENC_SBR_MODE = 0x0104, /*!< Configure SBR independently of the chosen Audio |
| Object Type ::AUDIO_OBJECT_TYPE. This parameter |
| is for ELD audio object type only. |
| - -1: Use ELD SBR auto configurator (default). |
| - 0: Disable Spectral Band Replication. |
| - 1: Enable Spectral Band Replication. */ |
| |
| AACENC_GRANULE_LENGTH = |
| 0x0105, /*!< Core encoder (AAC) audio frame length in samples: |
| - 1024: Default configuration. |
| - 512: Default length in LD/ELD configuration. |
| - 480: Length in LD/ELD configuration. |
| - 256: Length for ELD reduced delay mode (x2). |
| - 240: Length for ELD reduced delay mode (x2). |
| - 128: Length for ELD reduced delay mode (x4). |
| - 120: Length for ELD reduced delay mode (x4). */ |
| |
| AACENC_CHANNELMODE = 0x0106, /*!< Set explicit channel mode. Channel mode must |
| match with number of input channels. |
| - 1-7, 11,12,14 and 33,34: MPEG channel |
| modes supported, see ::CHANNEL_MODE in |
| FDK_audio.h. */ |
| |
| AACENC_CHANNELORDER = |
| 0x0107, /*!< Input audio data channel ordering scheme: |
| - 0: MPEG channel ordering (e. g. 5.1: C, L, R, SL, SR, LFE). |
| (default) |
| - 1: WAVE file format channel ordering (e. g. 5.1: L, R, C, |
| LFE, SL, SR). */ |
| |
| AACENC_SBR_RATIO = |
| 0x0108, /*!< Controls activation of downsampled SBR. With downsampled |
| SBR, the delay will be shorter. On the other hand, for |
| achieving the same quality level, downsampled SBR needs more |
| bits than dual-rate SBR. With downsampled SBR, the AAC encoder |
| will work at the same sampling rate as the SBR encoder (single |
| rate). Downsampled SBR is supported for AAC-ELD and HE-AACv1. |
| - 1: Downsampled SBR (default for ELD). |
| - 2: Dual-rate SBR (default for HE-AAC). */ |
| |
| AACENC_AFTERBURNER = |
| 0x0200, /*!< This parameter controls the use of the afterburner feature. |
| The afterburner is a type of analysis by synthesis algorithm |
| which increases the audio quality but also the required |
| processing power. It is recommended to always activate this if |
| additional memory consumption and processing power consumption |
| is not a problem. If increased MHz and memory consumption are |
| an issue then the MHz and memory cost of this optional module |
| need to be evaluated against the improvement in audio quality |
| on a case by case basis. |
| - 0: Disable afterburner (default). |
| - 1: Enable afterburner. */ |
| |
| AACENC_BANDWIDTH = 0x0203, /*!< Core encoder audio bandwidth: |
| - 0: Determine audio bandwidth internally |
| (default, see chapter \ref BEHAVIOUR_BANDWIDTH). |
| - 1 to fs/2: Audio bandwidth in Hertz. Limited |
| to 20kHz max. Not usable if SBR is active. This |
| setting is for experts only, better do not touch |
| this value to avoid degraded audio quality. */ |
| |
| AACENC_PEAK_BITRATE = |
| 0x0207, /*!< Peak bitrate configuration parameter to adjust maximum bits |
| per audio frame. Bitrate is in bits/second. The peak bitrate |
| will internally be limited to the chosen bitrate |
| ::AACENC_BITRATE as lower limit and the |
| number_of_effective_channels*6144 bit as upper limit. |
| |
| Setting the peak bitrate equal to ::AACENC_BITRATE does not |
| necessarily mean that the audio frames will be of constant |
| size. Since the peak bitate is in bits/second, the frame sizes |
| can vary by one byte in one or the other direction over various |
| frames. However, it is not recommended to reduce the peak |
| pitrate to ::AACENC_BITRATE - it would disable the |
| bitreservoir, which would affect the audio quality by a large |
| amount. */ |
| |
| AACENC_TRANSMUX = 0x0300, /*!< Transport type to be used. See ::TRANSPORT_TYPE |
| in FDK_audio.h. Following types can be configured |
| in encoder library: |
| - 0: raw access units |
| - 1: ADIF bitstream format |
| - 2: ADTS bitstream format |
| - 6: Audio Mux Elements (LATM) with |
| muxConfigPresent = 1 |
| - 7: Audio Mux Elements (LATM) with |
| muxConfigPresent = 0, out of band StreamMuxConfig |
| - 10: Audio Sync Stream (LOAS) */ |
| |
| AACENC_HEADER_PERIOD = |
| 0x0301, /*!< Frame count period for sending in-band configuration buffers |
| within LATM/LOAS transport layer. Additionally this parameter |
| configures the PCE repetition period in raw_data_block(). See |
| \ref encPCE. |
| - 0xFF: auto-mode default 10 for TT_MP4_ADTS, TT_MP4_LOAS and |
| TT_MP4_LATM_MCP1, otherwise 0. |
| - n: Frame count period. */ |
| |
| AACENC_SIGNALING_MODE = |
| 0x0302, /*!< Signaling mode of the extension AOT: |
| - 0: Implicit backward compatible signaling (default for |
| non-MPEG-4 based AOT's and for the transport formats ADIF and |
| ADTS) |
| - A stream that uses implicit signaling can be decoded |
| by every AAC decoder, even AAC-LC-only decoders |
| - An AAC-LC-only decoder will only decode the |
| low-frequency part of the stream, resulting in a band-limited |
| output |
| - This method works with all transport formats |
| - This method does not work with downsampled SBR |
| - 1: Explicit backward compatible signaling |
| - A stream that uses explicit backward compatible |
| signaling can be decoded by every AAC decoder, even AAC-LC-only |
| decoders |
| - An AAC-LC-only decoder will only decode the |
| low-frequency part of the stream, resulting in a band-limited |
| output |
| - A decoder not capable of decoding PS will only decode |
| the AAC-LC+SBR part. If the stream contained PS, the result |
| will be a a decoded mono downmix |
| - This method does not work with ADIF or ADTS. For |
| LOAS/LATM, it only works with AudioMuxVersion==1 |
| - This method does work with downsampled SBR |
| - 2: Explicit hierarchical signaling (default for MPEG-4 |
| based AOT's and for all transport formats excluding ADIF and |
| ADTS) |
| - A stream that uses explicit hierarchical signaling can |
| be decoded only by HE-AAC decoders |
| - An AAC-LC-only decoder will not decode a stream that |
| uses explicit hierarchical signaling |
| - A decoder not capable of decoding PS will not decode |
| the stream at all if it contained PS |
| - This method does not work with ADIF or ADTS. It works |
| with LOAS/LATM and the MPEG-4 File format |
| - This method does work with downsampled SBR |
| |
| For making sure that the listener always experiences the |
| best audio quality, explicit hierarchical signaling should be |
| used. This makes sure that only a full HE-AAC-capable decoder |
| will decode those streams. The audio is played at full |
| bandwidth. For best backwards compatibility, it is recommended |
| to encode with implicit SBR signaling. A decoder capable of |
| AAC-LC only will then only decode the AAC part, which means the |
| decoded audio will sound band-limited. |
| |
| For MPEG-2 transport types (ADTS,ADIF), only implicit |
| signaling is possible. |
| |
| For LOAS and LATM, explicit backwards compatible signaling |
| only works together with AudioMuxVersion==1. The reason is |
| that, for explicit backwards compatible signaling, additional |
| information will be appended to the ASC. A decoder that is only |
| capable of decoding AAC-LC will skip this part. Nevertheless, |
| for jumping to the end of the ASC, it needs to know the ASC |
| length. Transmitting the length of the ASC is a feature of |
| AudioMuxVersion==1, it is not possible to transmit the length |
| of the ASC with AudioMuxVersion==0, therefore an AAC-LC-only |
| decoder will not be able to parse a LOAS/LATM stream that was |
| being encoded with AudioMuxVersion==0. |
| |
| For downsampled SBR, explicit signaling is mandatory. The |
| reason for this is that the extension sampling frequency (which |
| is in case of SBR the sampling frequqncy of the SBR part) can |
| only be signaled in explicit mode. |
| |
| For AAC-ELD, the SBR information is transmitted in the |
| ELDSpecific Config, which is part of the AudioSpecificConfig. |
| Therefore, the settings here will have no effect on AAC-ELD.*/ |
| |
| AACENC_TPSUBFRAMES = |
| 0x0303, /*!< Number of sub frames in a transport frame for LOAS/LATM or |
| ADTS (default 1). |
| - ADTS: Maximum number of sub frames restricted to 4. |
| - LOAS/LATM: Maximum number of sub frames restricted to 2.*/ |
| |
| AACENC_AUDIOMUXVER = |
| 0x0304, /*!< AudioMuxVersion to be used for LATM. (AudioMuxVersionA, |
| currently not implemented): |
| - 0: Default, no transmission of tara Buffer fullness, no ASC |
| length and including actual latm Buffer fullnes. |
| - 1: Transmission of tara Buffer fullness, ASC length and |
| actual latm Buffer fullness. |
| - 2: Transmission of tara Buffer fullness, ASC length and |
| maximum level of latm Buffer fullness. */ |
| |
| AACENC_PROTECTION = 0x0306, /*!< Configure protection in transport layer: |
| - 0: No protection. (default) |
| - 1: CRC active for ADTS transport format. */ |
| |
| AACENC_ANCILLARY_BITRATE = |
| 0x0500, /*!< Constant ancillary data bitrate in bits/second. |
| - 0: Either no ancillary data or insert exact number of |
| bytes, denoted via input parameter, numAncBytes in |
| AACENC_InArgs. |
| - else: Insert ancillary data with specified bitrate. */ |
| |
| AACENC_METADATA_MODE = 0x0600, /*!< Configure Meta Data. See ::AACENC_MetaData |
| for further details: |
| - 0: Do not embed any metadata. |
| - 1: Embed dynamic_range_info metadata. |
| - 2: Embed dynamic_range_info and |
| ancillary_data metadata. |
| - 3: Embed ancillary_data metadata. */ |
| |
| AACENC_CONTROL_STATE = |
| 0xFF00, /*!< There is an automatic process which internally reconfigures |
| the encoder instance when a configuration parameter changed or |
| an error occured. This paramerter allows overwriting or getting |
| the control status of this process. See ::AACENC_CTRLFLAGS. */ |
| |
| AACENC_NONE = 0xFFFF /*!< ------ */ |
| |
| } AACENC_PARAM; |
| |
| #ifdef __cplusplus |
| extern "C" { |
| #endif |
| |
| /** |
| * \brief Open an instance of the encoder. |
| * |
| * Allocate memory for an encoder instance with a functional range denoted by |
| * the function parameters. Preinitialize encoder instance with default |
| * configuration. |
| * |
| * \param phAacEncoder A pointer to an encoder handle. Initialized on return. |
| * \param encModules Specify encoder modules to be supported in this encoder |
| * instance: |
| * - 0x0: Allocate memory for all available encoder |
| * modules. |
| * - else: Select memory allocation regarding encoder |
| * modules. Following flags are possible and can be combined. |
| * - 0x01: AAC module. |
| * - 0x02: SBR module. |
| * - 0x04: PS module. |
| * - 0x08: MPS module. |
| * - 0x10: Metadata module. |
| * - example: (0x01|0x02|0x04|0x08|0x10) allocates |
| * all modules and is equivalent to default configuration denotet by 0x0. |
| * \param maxChannels Number of channels to be allocated. This parameter can |
| * be used in different ways: |
| * - 0: Allocate maximum number of AAC and SBR channels as |
| * supported by the library. |
| * - nChannels: Use same maximum number of channels for |
| * allocating memory in AAC and SBR module. |
| * - nChannels | (nSbrCh<<8): Number of SBR channels can be |
| * different to AAC channels to save data memory. |
| * |
| * \return |
| * - AACENC_OK, on succes. |
| * - AACENC_INVALID_HANDLE, AACENC_MEMORY_ERROR, AACENC_INVALID_CONFIG, |
| * on failure. |
| */ |
| AACENC_ERROR aacEncOpen(HANDLE_AACENCODER *phAacEncoder, const UINT encModules, |
| const UINT maxChannels); |
| |
| /** |
| * \brief Close the encoder instance. |
| * |
| * Deallocate encoder instance and free whole memory. |
| * |
| * \param phAacEncoder Pointer to the encoder handle to be deallocated. |
| * |
| * \return |
| * - AACENC_OK, on success. |
| * - AACENC_INVALID_HANDLE, on failure. |
| */ |
| AACENC_ERROR aacEncClose(HANDLE_AACENCODER *phAacEncoder); |
| |
| /** |
| * \brief Encode audio data. |
| * |
| * This function is mainly for encoding audio data. In addition the function can |
| * be used for an encoder (re)configuration process. |
| * - PCM input data will be retrieved from external input buffer until the fill |
| * level allows encoding a single frame. This functionality allows an external |
| * buffer with reduced size in comparison to the AAC or HE-AAC audio frame |
| * length. |
| * - If the value of the input samples argument is zero, just internal |
| * reinitialization will be applied if it is requested. |
| * - At the end of a file the flushing process can be triggerd via setting the |
| * value of the input samples argument to -1. The encoder delay lines are fully |
| * flushed when the encoder returns no valid bitstream data |
| * AACENC_OutArgs::numOutBytes. Furthermore the end of file is signaled by the |
| * return value AACENC_ENCODE_EOF. |
| * - If an error occured in the previous frame or any of the encoder parameters |
| * changed, an internal reinitialization process will be applied before encoding |
| * the incoming audio samples. |
| * - The function can also be used for an independent reconfiguration process |
| * without encoding. The first parameter has to be a valid encoder handle and |
| * all other parameters can be set to NULL. |
| * - If the size of the external bitbuffer in outBufDesc is not sufficient for |
| * writing the whole bitstream, an internal error will be the return value and a |
| * reconfiguration will be triggered. |
| * |
| * \param hAacEncoder A valid AAC encoder handle. |
| * \param inBufDesc Input buffer descriptor, see AACENC_BufDesc: |
| * - At least one input buffer with audio data is |
| * expected. |
| * - Optionally a second input buffer with |
| * ancillary data can be fed. |
| * \param outBufDesc Output buffer descriptor, see AACENC_BufDesc: |
| * - Provide one output buffer for the encoded |
| * bitstream. |
| * \param inargs Input arguments, see AACENC_InArgs. |
| * \param outargs Output arguments, AACENC_OutArgs. |
| * |
| * \return |
| * - AACENC_OK, on success. |
| * - AACENC_INVALID_HANDLE, AACENC_ENCODE_ERROR, on failure in encoding |
| * process. |
| * - AACENC_INVALID_CONFIG, AACENC_INIT_ERROR, AACENC_INIT_AAC_ERROR, |
| * AACENC_INIT_SBR_ERROR, AACENC_INIT_TP_ERROR, AACENC_INIT_META_ERROR, |
| * AACENC_INIT_MPS_ERROR, on failure in encoder initialization. |
| * - AACENC_UNSUPPORTED_PARAMETER, on incorrect input or output buffer |
| * descriptor initialization. |
| * - AACENC_ENCODE_EOF, when flushing fully concluded. |
| */ |
| AACENC_ERROR aacEncEncode(const HANDLE_AACENCODER hAacEncoder, |
| const AACENC_BufDesc *inBufDesc, |
| const AACENC_BufDesc *outBufDesc, |
| const AACENC_InArgs *inargs, AACENC_OutArgs *outargs); |
| |
| /** |
| * \brief Acquire info about present encoder instance. |
| * |
| * This function retrieves information of the encoder configuration. In addition |
| * to informative internal states, a configuration data block of the current |
| * encoder settings will be returned. The format is either Audio Specific Config |
| * in case of Raw Packets transport format or StreamMuxConfig in case of |
| * LOAS/LATM transport format. The configuration data block is binary coded as |
| * specified in ISO/IEC 14496-3 (MPEG-4 audio), to be used directly for MPEG-4 |
| * File Format or RFC3016 or RFC3640 applications. |
| * |
| * \param hAacEncoder A valid AAC encoder handle. |
| * \param pInfo Pointer to AACENC_InfoStruct. Filled on return. |
| * |
| * \return |
| * - AACENC_OK, on succes. |
| * - AACENC_INVALID_HANDLE, AACENC_INIT_ERROR, on failure. |
| */ |
| AACENC_ERROR aacEncInfo(const HANDLE_AACENCODER hAacEncoder, |
| AACENC_InfoStruct *pInfo); |
| |
| /** |
| * \brief Set one single AAC encoder parameter. |
| * |
| * This function allows configuration of all encoder parameters specified in |
| * ::AACENC_PARAM. Each parameter must be set with a separate function call. An |
| * internal validation of the configuration value range will be done and an |
| * internal reconfiguration will be signaled. The actual configuration adoption |
| * is part of the subsequent aacEncEncode() call. |
| * |
| * \param hAacEncoder A valid AAC encoder handle. |
| * \param param Parameter to be set. See ::AACENC_PARAM. |
| * \param value Parameter value. See parameter description in |
| * ::AACENC_PARAM. |
| * |
| * \return |
| * - AACENC_OK, on success. |
| * - AACENC_INVALID_HANDLE, AACENC_UNSUPPORTED_PARAMETER, |
| * AACENC_INVALID_CONFIG, on failure. |
| */ |
| AACENC_ERROR aacEncoder_SetParam(const HANDLE_AACENCODER hAacEncoder, |
| const AACENC_PARAM param, const UINT value); |
| |
| /** |
| * \brief Get one single AAC encoder parameter. |
| * |
| * This function is the complement to aacEncoder_SetParam(). After encoder |
| * reinitialization with user defined settings, the internal status can be |
| * obtained of each parameter, specified with ::AACENC_PARAM. |
| * |
| * \param hAacEncoder A valid AAC encoder handle. |
| * \param param Parameter to be returned. See ::AACENC_PARAM. |
| * |
| * \return Internal configuration value of specifed parameter ::AACENC_PARAM. |
| */ |
| UINT aacEncoder_GetParam(const HANDLE_AACENCODER hAacEncoder, |
| const AACENC_PARAM param); |
| |
| /** |
| * \brief Get information about encoder library build. |
| * |
| * Fill a given LIB_INFO structure with library version information. |
| * |
| * \param info Pointer to an allocated LIB_INFO struct. |
| * |
| * \return |
| * - AACENC_OK, on success. |
| * - AACENC_INVALID_HANDLE, AACENC_INIT_ERROR, on failure. |
| */ |
| AACENC_ERROR aacEncGetLibInfo(LIB_INFO *info); |
| |
| #ifdef __cplusplus |
| } |
| #endif |
| |
| #endif /* AACENC_LIB_H */ |