Original address: https://www.jianshu.com/p/839b11e0638b
AAC is an audio coding format. Its compression ratio is usually quoted as about 18:1, and some sources give 20:1, which is considerably better than MP3.
AAC audio formats include ADIF and ADTS:
ADIF: Audio Data Interchange Format. In this format the start of the audio data can be located deterministically, but decoding cannot begin in the middle of the stream; it must start at the clearly defined beginning. For this reason the format is typically used for files on disk.
ADTS: Audio Data Transport Stream. This format is a bit stream with sync words, so decoding can start at any frame in the stream. In this respect it resembles the MP3 stream format.
In short, ADTS can be decoded starting from any frame because every frame carries its own header. ADIF has a single header for the whole stream, so decoding has to start from the beginning and work through all the data. The two header formats also differ. At present, encoded and extracted audio streams are generally in ADTS format.
ADTS is a frame sequence with stream characteristics, which is more suitable for audio stream transmission and processing.
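Because every frame starts with a sync word, a reader that joins a stream at an arbitrary byte can scan forward until it finds one. The following is a minimal sketch of that scan; the class and method names (`AdtsSync`, `findSyncWord`) are hypothetical and not from the original article. It checks the 12 sync bits plus the Layer bits, which are always 0.

```java
final class AdtsSync {
    /**
     * Returns the offset of the first plausible ADTS header at or after 'from',
     * or -1 if none is found. Hypothetical helper for illustration only.
     */
    static int findSyncWord(byte[] data, int from) {
        for (int i = Math.max(from, 0); i + 1 < data.length; i++) {
            boolean syncHigh = (data[i] & 0xFF) == 0xFF;      // first 8 sync bits
            boolean syncLow = (data[i + 1] & 0xF0) == 0xF0;   // last 4 sync bits
            boolean layerZero = (data[i + 1] & 0x06) == 0;    // Layer field is always 0
            if (syncHigh && syncLow && layerZero) {
                return i;
            }
        }
        return -1;
    }
}
```

The same bit pattern can occur by chance inside the compressed payload, so a more careful reader would probably confirm a candidate by checking that another sync word follows at the offset given by the frame length.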
Let's analyze ADTS:
ADTS AAC | | | | | | |
---|---|---|---|---|---|---|
ADTS_header | AAC ES | ADTS_header | AAC ES | ... | ADTS_header | AAC ES |
As shown, every ADTS frame begins with a header, ADTS_header. The most useful fields in the header are the sampling rate, the number of channels, and the frame length. The header is normally 7 bytes long, or 9 bytes when a CRC is present.
ADTS frame header structure:
No. | Field | Length (bits) | Description |
---|---|---|---|
1 | Syncword | 12 | all bits must be 1 |
2 | MPEG version | 1 | 0 for MPEG-4, 1 for MPEG-2 |
3 | Layer | 2 | always 0 |
4 | Protection Absent | 1 | set to 1 if there is no CRC and 0 if there is CRC |
5 | Profile | 2 | the MPEG-4 Audio Object Type minus 1 |
6 | MPEG-4 Sampling Frequency Index | 4 | MPEG-4 Sampling Frequency Index (15 is forbidden) |
7 | Private Stream | 1 | set to 0 when encoding, ignore when decoding |
8 | MPEG-4 Channel Configuration | 3 | MPEG-4 Channel Configuration (in the case of 0, the channel configuration is sent via an inband PCE) |
9 | Originality | 1 | set to 0 when encoding, ignore when decoding |
10 | Home | 1 | set to 0 when encoding, ignore when decoding |
11 | Copyrighted Stream | 1 | set to 0 when encoding, ignore when decoding |
12 | Copyrighted Start | 1 | set to 0 when encoding, ignore when decoding |
13 | Frame Length | 13 | this value must include 7 or 9 bytes of header length: FrameLength = (ProtectionAbsent == 1 ? 7 : 9) + size(AACFrame) |
14 | Buffer Fullness | 11 | buffer fullness (0x7FF signals a variable-bitrate stream) |
15 | Number of AAC Frames | 2 | number of AAC frames (RDBs) in ADTS frame minus 1, for maximum compatibility always use 1 AAC frame per ADTS frame |
16 | CRC | 16 | CRC if protection absent is 0 |
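The fields in the table can be read back out of the 7 fixed header bytes with shifts and masks. The sketch below extracts only the fields called out as most useful (sampling frequency index, channel configuration, frame length) plus Protection Absent and Profile; the class name `AdtsHeaderFields` is hypothetical and the bit positions follow the field widths listed in the table.

```java
final class AdtsHeaderFields {
    final int protectionAbsent;        // 1 = no CRC, 0 = CRC present
    final int profile;                 // as stored: Audio Object Type minus 1
    final int samplingFrequencyIndex;  // 4-bit index into the sampling rate table
    final int channelConfiguration;    // 3-bit channel configuration
    final int frameLength;             // header + payload, in bytes

    AdtsHeaderFields(byte[] h) {
        if (h.length < 7 || (h[0] & 0xFF) != 0xFF || (h[1] & 0xF0) != 0xF0) {
            throw new IllegalArgumentException("not an ADTS header");
        }
        protectionAbsent = h[1] & 0x01;
        profile = (h[2] >> 6) & 0x03;
        samplingFrequencyIndex = (h[2] >> 2) & 0x0F;
        // Channel configuration straddles bytes 2 and 3 (1 bit + 2 bits).
        channelConfiguration = ((h[2] & 0x01) << 2) | ((h[3] >> 6) & 0x03);
        // Frame length straddles bytes 3, 4 and 5 (2 + 8 + 3 bits).
        frameLength = ((h[3] & 0x03) << 11) | ((h[4] & 0xFF) << 3) | ((h[5] >> 5) & 0x07);
    }

    /** 7 bytes without CRC, 9 bytes with CRC. */
    int headerSize() {
        return protectionAbsent == 1 ? 7 : 9;
    }

    /** Size of the raw AAC data that follows the header. */
    int payloadSize() {
        return frameLength - headerSize();
    }
}
```

For the header bytes `FF F9 4C 80 2F 5F FC`, this gives sampling frequency index 3, channel configuration 2, a frame length of 378 bytes, and a payload of 371 bytes.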
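Generating the ADTS header:
```java
/**
 * Writes a 7-byte ADTS header (no CRC) into the first 7 bytes of the packet.
 *
 * @param packet    buffer for the ADTS frame; the header occupies bytes 0..6
 * @param packetLen length of the whole frame in bytes, including the 7-byte header
 */
private void addADTStoPacket(byte[] packet, int packetLen) {
    int profile = 2; // AAC LC (MPEG-4 Audio Object Type; the header stores profile - 1)
    int freqIdx = 3; // 48000 Hz
    int chanCfg = 2; // 2 channels

    packet[0] = (byte) 0xFF; // syncword, high 8 bits
    packet[1] = (byte) 0xF9; // syncword low 4 bits, MPEG version = 1, layer = 0, protection absent = 1
    packet[2] = (byte) (((profile - 1) << 6) + (freqIdx << 2) + (chanCfg >> 2));
    packet[3] = (byte) (((chanCfg & 3) << 6) + (packetLen >> 11)); // frame length bits 12..11
    packet[4] = (byte) ((packetLen & 0x7FF) >> 3);                 // frame length bits 10..3
    packet[5] = (byte) (((packetLen & 7) << 5) + 0x1F);            // frame length bits 2..0, buffer fullness high bits
    packet[6] = (byte) 0xFC;                                       // buffer fullness low bits, 1 AAC frame per ADTS frame
}
```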
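addADTStoPacket hard-codes the profile, sampling rate, and channel count. As a hedged variation, the sketch below takes them as parameters and validates their ranges; the class name `AdtsHeaderWriter`, the method name, and the parameter names are assumptions for illustration. The byte layout is the same as in addADTStoPacket.

```java
final class AdtsHeaderWriter {
    /**
     * Writes a 7-byte ADTS header (no CRC) into packet[0..6].
     * Hypothetical parameterized variant of addADTStoPacket.
     *
     * @param audioObjectType MPEG-4 Audio Object Type, 1..4 (2 = AAC LC)
     * @param freqIdx         sampling frequency index, 0..12
     * @param chanCfg         channel configuration, 0..7
     * @param packetLen       frame length including the 7 header bytes
     */
    static void writeAdtsHeader(byte[] packet, int audioObjectType,
                                int freqIdx, int chanCfg, int packetLen) {
        if (packet.length < 7 || packetLen < 7 || packetLen > 0x1FFF) {
            throw new IllegalArgumentException("frame length must fit in 13 bits");
        }
        if (audioObjectType < 1 || audioObjectType > 4
                || freqIdx < 0 || freqIdx > 12
                || chanCfg < 0 || chanCfg > 7) {
            throw new IllegalArgumentException("field value out of range");
        }
        packet[0] = (byte) 0xFF;
        packet[1] = (byte) 0xF9; // same fixed byte as addADTStoPacket
        packet[2] = (byte) (((audioObjectType - 1) << 6) | (freqIdx << 2) | (chanCfg >> 2));
        packet[3] = (byte) (((chanCfg & 3) << 6) | (packetLen >> 11));
        packet[4] = (byte) ((packetLen >> 3) & 0xFF);
        packet[5] = (byte) (((packetLen & 7) << 5) | 0x1F);
        packet[6] = (byte) 0xFC;
    }
}
```

As a worked example, a 371-byte AAC frame gives packetLen = 378. With audioObjectType = 2, freqIdx = 3, and chanCfg = 2, the header bytes come out as `FF F9 4C 80 2F 5F FC`.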
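The profile indicates which AAC profile is used. MPEG-2 AAC defines three, listed here by the value stored in the 2-bit Profile field of the header (the profile variable in the code above is this value plus 1):
- 0: Main profile
- 1: Low Complexity (LC) profile
- 2: Scalable Sampling Rate (SSR) profile
- 3: Reserved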
freqIdx is the index of the sampling rate in use. The actual sampling rate is found by looking up this index in the Sampling Frequencies [] array:
- 0: 96000 Hz
- 1: 88200 Hz
- 2: 64000 Hz
- 3: 48000 Hz
- 4: 44100 Hz
- 5: 32000 Hz
- 6: 24000 Hz
- 7: 22050 Hz
- 8: 16000 Hz
- 9: 12000 Hz
- 10: 11025 Hz
- 11: 8000 Hz
- 12: 7350 Hz
- 13: Reserved
- 14: Reserved
- 15: frequency is written explicitly (forbidden in ADTS headers)
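The list above maps directly onto an array lookup. This is a small sketch with hypothetical names (`SamplingFrequencies`, `rateForIndex`, `indexForRate`); it returns -1 for the reserved indexes and for index 15 instead of throwing.

```java
final class SamplingFrequencies {
    // Position in the array is the sampling frequency index from the list above.
    private static final int[] SAMPLING_FREQUENCIES = {
        96000, 88200, 64000, 48000, 44100, 32000, 24000,
        22050, 16000, 12000, 11025, 8000, 7350
    };

    /** Returns the sampling rate in Hz, or -1 for reserved or invalid indexes. */
    static int rateForIndex(int index) {
        if (index < 0 || index >= SAMPLING_FREQUENCIES.length) {
            return -1;
        }
        return SAMPLING_FREQUENCIES[index];
    }

    /** Returns the index for a sampling rate in Hz, or -1 if the rate is not in the table. */
    static int indexForRate(int rateHz) {
        for (int i = 0; i < SAMPLING_FREQUENCIES.length; i++) {
            if (SAMPLING_FREQUENCIES[i] == rateHz) {
                return i;
            }
        }
        return -1;
    }
}
```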
chanCfg indicates the number of channels:
- 0: Defined in AOT Specific Config
- 1: 1 channel: front-center
- 2: 2 channels: front-left, front-right
- 3: 3 channels: front-center, front-left, front-right
- 4: 4 channels: front-center, front-left, front-right, back-center
- 5: 5 channels: front-center, front-left, front-right, back-left, back-right
- 6: 6 channels: front-center, front-left, front-right, back-left, back-right, LFE-channel
- 7: 8 channels: front-center, front-left, front-right, side-left, side-right, back-left, back-right, LFE-channel
- 8-15: Reserved
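Note that the channel configuration is not always equal to the channel count: value 7 means 8 channels. A lookup along these lines makes that explicit. The names are hypothetical, and returning 0 for configuration 0 and -1 for reserved values is just one possible convention.

```java
final class ChannelConfigurations {
    // Position in the array is the channel configuration from the list above.
    // Entry 0 is a placeholder: the layout is defined elsewhere in the stream.
    private static final int[] CHANNEL_COUNTS = {0, 1, 2, 3, 4, 5, 6, 8};

    /**
     * Returns the number of channels for a channel configuration,
     * 0 when the layout is signalled elsewhere (configuration 0),
     * or -1 for reserved values.
     */
    static int channelCount(int chanCfg) {
        if (chanCfg < 0 || chanCfg >= CHANNEL_COUNTS.length) {
            return -1;
        }
        return CHANNEL_COUNTS[chanCfg];
    }
}
```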
Parsing AAC:
```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

public class AACHelper {
    // Sampling frequency lookup table, in both directions: rate -> index and index -> rate.
    // The two key ranges (0..12 and 7350..96000) do not overlap, so one map can hold both.
    private static final Map<Integer, Integer> samplingFrequencyIndexMap = new HashMap<>();

    static {
        samplingFrequencyIndexMap.put(96000, 0);
        samplingFrequencyIndexMap.put(88200, 1);
        samplingFrequencyIndexMap.put(64000, 2);
        samplingFrequencyIndexMap.put(48000, 3);
        samplingFrequencyIndexMap.put(44100, 4);
        samplingFrequencyIndexMap.put(32000, 5);
        samplingFrequencyIndexMap.put(24000, 6);
        samplingFrequencyIndexMap.put(22050, 7);
        samplingFrequencyIndexMap.put(16000, 8);
        samplingFrequencyIndexMap.put(12000, 9);
        samplingFrequencyIndexMap.put(11025, 10);
        samplingFrequencyIndexMap.put(8000, 11);
        samplingFrequencyIndexMap.put(7350, 12);
        samplingFrequencyIndexMap.put(0x0, 96000);
        samplingFrequencyIndexMap.put(0x1, 88200);
        samplingFrequencyIndexMap.put(0x2, 64000);
        samplingFrequencyIndexMap.put(0x3, 48000);
        samplingFrequencyIndexMap.put(0x4, 44100);
        samplingFrequencyIndexMap.put(0x5, 32000);
        samplingFrequencyIndexMap.put(0x6, 24000);
        samplingFrequencyIndexMap.put(0x7, 22050);
        samplingFrequencyIndexMap.put(0x8, 16000);
        samplingFrequencyIndexMap.put(0x9, 12000);
        samplingFrequencyIndexMap.put(0xa, 11025);
        samplingFrequencyIndexMap.put(0xb, 8000);
        samplingFrequencyIndexMap.put(0xc, 7350);
    }

    private AdtsHeader mAdtsHeader = new AdtsHeader();
    private BitReader mHeaderBitReader = new BitReader(new byte[7]);
    private byte[] mSkipTwoBytes = new byte[2];
    private FileInputStream mFileInputStream;
    // The frame length field is 13 bits wide, so a frame never exceeds 8191 bytes.
    private byte[] mBytes = new byte[8192];

    /**
     * Creates an input stream for the given file path.
     *
     * @param aacFilePath path of the AAC file
     * @throws FileNotFoundException if the file cannot be opened
     */
    public AACHelper(String aacFilePath) throws FileNotFoundException {
        mFileInputStream = new FileInputStream(aacFilePath);
    }

    /**
     * Reads the next sample (the raw AAC data of one ADTS frame).
     *
     * @param byteBuffer ByteBuffer that receives the sample data
     * @return size in bytes of the current sample, or -1 if there is no more data
     * @throws IOException on read errors or malformed headers
     */
    public int getSample(ByteBuffer byteBuffer) throws IOException {
        if (readADTSHeader(mAdtsHeader, mFileInputStream)) {
            int length = mFileInputStream.read(mBytes, 0, mAdtsHeader.frameLength - mAdtsHeader.getSize());
            if (length < 0) {
                return -1; // file ended right after the header
            }
            byteBuffer.clear();
            byteBuffer.put(mBytes, 0, length);
            byteBuffer.position(0);
            byteBuffer.limit(length);
            return length;
        }
        return -1;
    }

    /**
     * Reads one ADTS header from the AAC file stream.
     *
     * @param adtsHeader      receives the parsed header fields
     * @param fileInputStream AAC file stream
     * @return true if a header was read, false at end of file
     * @throws IOException on read errors or malformed headers
     */
    private boolean readADTSHeader(AdtsHeader adtsHeader, FileInputStream fileInputStream) throws IOException {
        if (fileInputStream.read(mHeaderBitReader.buffer) < 7) {
            return false;
        }
        mHeaderBitReader.position = 0;
        int syncWord = mHeaderBitReader.readBits(12); // A
        if (syncWord != 0xfff) {
            throw new IOException("Expected Start Word 0xfff");
        }
        adtsHeader.mpegVersion = mHeaderBitReader.readBits(1); // B
        adtsHeader.layer = mHeaderBitReader.readBits(2); // C
        adtsHeader.protectionAbsent = mHeaderBitReader.readBits(1); // D
        adtsHeader.profile = mHeaderBitReader.readBits(2) + 1; // E
        adtsHeader.sampleFrequencyIndex = mHeaderBitReader.readBits(4);
        Integer sampleRate = samplingFrequencyIndexMap.get(adtsHeader.sampleFrequencyIndex);
        if (sampleRate == null) {
            // Reserved or forbidden index; avoids a NullPointerException on unboxing.
            throw new IOException("Unsupported sampling frequency index " + adtsHeader.sampleFrequencyIndex);
        }
        adtsHeader.sampleRate = sampleRate; // F
        mHeaderBitReader.readBits(1); // G
        adtsHeader.channelconfig = mHeaderBitReader.readBits(3); // H
        adtsHeader.original = mHeaderBitReader.readBits(1); // I
        adtsHeader.home = mHeaderBitReader.readBits(1); // J
        adtsHeader.copyrightedStream = mHeaderBitReader.readBits(1); // K
        adtsHeader.copyrightStart = mHeaderBitReader.readBits(1); // L
        adtsHeader.frameLength = mHeaderBitReader.readBits(13); // M
        adtsHeader.bufferFullness = mHeaderBitReader.readBits(11); // 54
        adtsHeader.numAacFramesPerAdtsFrame = mHeaderBitReader.readBits(2) + 1; // 56
        if (adtsHeader.numAacFramesPerAdtsFrame != 1) {
            throw new IOException("This muxer can only work with 1 AAC frame per ADTS frame");
        }
        if (adtsHeader.protectionAbsent == 0) {
            fileInputStream.read(mSkipTwoBytes); // skip the 16-bit CRC
        }
        return true;
    }

    /**
     * Releases resources.
     *
     * @throws IOException if closing the stream fails
     */
    public void release() throws IOException {
        mFileInputStream.close();
    }

    /** ADTS header fields. */
    private class AdtsHeader {
        int getSize() {
            return 7 + (protectionAbsent == 0 ? 2 : 0);
        }

        int sampleFrequencyIndex;
        int mpegVersion;
        int layer;
        int protectionAbsent;
        int profile;
        int sampleRate;
        int channelconfig;
        int original;
        int home;
        int copyrightedStream;
        int copyrightStart;
        int frameLength;
        int bufferFullness;
        int numAacFramesPerAdtsFrame;
    }

    /**
     * Minimal MSB-first bit reader; position is counted in bits.
     * The original post does not show its BitReader class, so this is an assumed stand-in.
     */
    private static class BitReader {
        final byte[] buffer;
        int position;

        BitReader(byte[] buffer) {
            this.buffer = buffer;
        }

        int readBits(int count) {
            int value = 0;
            for (int i = 0; i < count; i++) {
                int bit = (buffer[position >> 3] >> (7 - (position & 7))) & 1;
                value = (value << 1) | bit;
                position++;
            }
            return value;
        }
    }
}
```
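AACHelper walks a file frame by frame; the same walk can be done on a buffer that is already in memory by hopping from header to header using the frame length. The following is a sketch under the assumption that the buffer starts exactly at a frame boundary. The class and method names (`AdtsFrameWalker`, `frameOffsets`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class AdtsFrameWalker {
    /**
     * Returns the start offset of every complete ADTS frame in 'data',
     * stopping at the first position that is not a valid, complete frame.
     */
    static List<Integer> frameOffsets(byte[] data) {
        List<Integer> offsets = new ArrayList<>();
        int pos = 0;
        while (pos + 7 <= data.length) {
            // Every frame must start with the 12-bit sync word 0xFFF.
            if ((data[pos] & 0xFF) != 0xFF || (data[pos + 1] & 0xF0) != 0xF0) {
                break;
            }
            // 13-bit frame length: 2 bits in byte 3, 8 bits in byte 4, 3 bits in byte 5.
            int frameLength = ((data[pos + 3] & 0x03) << 11)
                    | ((data[pos + 4] & 0xFF) << 3)
                    | ((data[pos + 5] >> 5) & 0x07);
            int headerSize = (data[pos + 1] & 0x01) == 1 ? 7 : 9;
            if (frameLength < headerSize || pos + frameLength > data.length) {
                break; // corrupt header or truncated final frame
            }
            offsets.add(pos);
            pos += frameLength; // the frame length includes the header
        }
        return offsets;
    }
}
```

The raw AAC data of the frame at offset `o` then starts at `o + headerSize`, which is what getSample hands to the caller in the file-based version.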