Travel of ffmpeg Development (4): Analysis of MP3 Encoding Format and Compilation and Packaging of lame Library

Posted by lostboy on Wed, 29 May 2019 19:54:14 +0200


If you repost this article, please credit the source: http://blog.csdn.net/andrexpert/article/77683776

I. Analysis of the MP3 Encoding Format

MP3, in full MPEG Audio Layer 3, is an efficient audio coding scheme. It compresses audio at a high ratio (about 10:1 to 12:1) while largely preserving the sound quality of the original. Take a 4-minute CD-quality WAV file sampled at 44.1kHz, in stereo, with 16 bits (2 bytes) per sample. It occupies 44100*2(channels)*2(bytes)*60(seconds)*4(minutes) = 42,336,000 bytes, about 40.4MB, while the same audio in MP3 format occupies only about 4MB, which makes it much easier to store and to transmit over a network.
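The arithmetic above can be checked with a few lines of Java. This is only a sketch: the class and method names are made up for illustration, and the 128 kbit/s MP3 bit rate is an assumption chosen because it lands in the 10:1 to 12:1 range mentioned above.

```java
public class PcmSize {
    /** Size in bytes of uncompressed PCM audio. */
    public static long pcmBytes(int sampleRateHz, int channels, int bytesPerSample, int seconds) {
        return (long) sampleRateHz * channels * bytesPerSample * seconds;
    }

    /** Size in bytes of a constant-bit-rate stream (bit rate given in bits per second). */
    public static long cbrBytes(int bitsPerSecond, int seconds) {
        return (long) bitsPerSecond / 8 * seconds;
    }

    public static void main(String[] args) {
        int seconds = 4 * 60;
        long wav = pcmBytes(44100, 2, 2, seconds);   // CD quality: 44.1 kHz, stereo, 16-bit
        long mp3 = cbrBytes(128_000, seconds);       // assumed bit rate: 128 kbit/s
        System.out.println("WAV bytes: " + wav);     // 42336000
        System.out.println("MP3 bytes: " + mp3);     // 3840000
        System.out.println("WAV MiB:   " + wav / (1024.0 * 1024.0));
        System.out.println("ratio:     " + (double) wav / mp3);
    }
}
```

With these numbers the ratio comes out at roughly 11:1, consistent with the range quoted for MP3.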
1. MP3 file structure
MP3 files are composed of frames, which are the smallest units of an MP3 file. The MP3 audio stream itself has no file header. To read information about the audio, you read the header of the first frame, which is also why any part of an MP3 file can be cut out and still play correctly. The whole file consists of three parts, in this order: TAG_V2 (the ID3v2 tag) at the start of the file, the audio frames, and TAG_V1 (the ID3v1 tag) in the last 128 bytes of the file.

2. MP3 Frame Format
Each frame is independent. It consists of a frame header, side information, and audio data, and its length varies with the bit rate. A frame holds 1152 samples, so at 44.1kHz each frame plays for about 0.026 seconds (1152 / 44100).

Each frame has a header of 4 bytes (32 bits). The header may be followed by a 2-byte CRC checksum. Whether these two bytes exist depends on the protection bit, the 16th bit of the header: if it is 0, a CRC follows the header; if it is 1, there is no CRC. The frame header structure is as follows:
typedef struct _tagHeader {
    unsigned int sync:              11;  // Synchronization bits, all set to 1
    unsigned int version:           2;   // MPEG version
    unsigned int layer:             2;   // Layer
    unsigned int error_protection:  1;   // CRC protection bit
    unsigned int bit_rate_index:    4;   // Bit rate index
    unsigned int sample_rate_index: 2;   // Sampling rate index
    unsigned int padding:           1;   // Padding bit
    unsigned int extension:         1;   // Private bit
    unsigned int channel_mode:      2;   // Channel mode
    unsigned int modeextension:     2;   // Mode extension
    unsigned int copyright:         1;   // Copyright flag
    unsigned int original:          1;   // Original media flag
    unsigned int emphasis:          2;   // Emphasis mode
} HEADER;
Among these fields, sync is the synchronization word, 11 bits, all set to 1. channel_mode is the channel mode, 2 bits; the value 11 means single channel (mono). For the other fields, see the referenced article.
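To make the bit layout concrete, here is a minimal Java sketch that decodes the fields of a 32-bit frame header. The class name, the method names, and the sample header value 0xFFFB9064 are illustrative assumptions, not part of the original project. The lookup tables and the frame length formula (144 * bit rate / sampling rate + padding) apply to MPEG-1 Layer III only, so the sketch rejects anything else.

```java
public class Mp3FrameHeader {
    // MPEG-1 Layer III bit rates in kbit/s, indexed by bit_rate_index (0 = free format, 15 = invalid)
    static final int[] BITRATE_KBPS =
            {0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, -1};
    // MPEG-1 sampling rates in Hz, indexed by sample_rate_index (3 = reserved)
    static final int[] SAMPLE_RATE_HZ = {44100, 48000, 32000, -1};

    public final int version, layer, protectionBit, bitrateIndex, sampleRateIndex, padding, channelMode;

    /** @param header the 4 header bytes packed big-endian into an int */
    public Mp3FrameHeader(int header) {
        if ((header >>> 21) != 0x7FF) {
            throw new IllegalArgumentException("sync bits are not all 1");
        }
        version = (header >>> 19) & 0x3;          // 3 = MPEG-1
        layer = (header >>> 17) & 0x3;            // 1 = Layer III
        protectionBit = (header >>> 16) & 0x1;    // 0 = a CRC follows the header
        bitrateIndex = (header >>> 12) & 0xF;
        sampleRateIndex = (header >>> 10) & 0x3;
        padding = (header >>> 9) & 0x1;
        channelMode = (header >>> 6) & 0x3;       // 3 (binary 11) = mono
    }

    public boolean hasCrc() { return protectionBit == 0; }

    public boolean isMono() { return channelMode == 3; }

    public int bitrateKbps() { requireMpeg1Layer3(); return BITRATE_KBPS[bitrateIndex]; }

    public int sampleRateHz() { requireMpeg1Layer3(); return SAMPLE_RATE_HZ[sampleRateIndex]; }

    /** Frame length in bytes, header included. */
    public int frameLengthBytes() {
        int kbps = bitrateKbps();
        int hz = sampleRateHz();
        if (kbps <= 0 || hz <= 0) {
            throw new IllegalStateException("free-format or reserved index");
        }
        return 144 * kbps * 1000 / hz + padding;
    }

    /** Playback time of one frame: 1152 samples at the frame's sampling rate. */
    public double frameDurationSeconds() { return 1152.0 / sampleRateHz(); }

    private void requireMpeg1Layer3() {
        if (version != 3 || layer != 1) {
            throw new UnsupportedOperationException("tables cover MPEG-1 Layer III only");
        }
    }

    public static void main(String[] args) {
        Mp3FrameHeader h = new Mp3FrameHeader(0xFFFB9064);
        System.out.println(h.bitrateKbps());       // 128
        System.out.println(h.sampleRateHz());      // 44100
        System.out.println(h.frameLengthBytes());  // 417
        System.out.println(h.hasCrc());            // false
        System.out.println(h.frameDurationSeconds());
    }
}
```

For the sample header, the duration works out to about 0.026 seconds per frame, matching the figure given earlier.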
II. lame Compilation and Encapsulation
1. Introduction to Lame Library

Lame is an open source project started by Mike Cheng in 1998 and is widely considered the best MP3 encoding engine currently available. MP3 audio encoded with Lame has a clean tone, a wide sound stage, clear bass, and good detail. Its psychoacoustic model keeps the result faithful to the CD source, and with the VBR and ABR modes the sound quality comes close to CD audio while the files stay very small.

Download the latest version: https://sourceforge.net/projects/lame/files/lame/3.99/
2. Lame Library Compilation and Encapsulation

(1) Porting the Lame library to an Android project
a. Unzip lame-3.99.5 and copy the libmp3lame directory from the source tree into the cpp directory of the Android project;
b. Rename libmp3lame to lame, and delete the i386 and vector directories and the files depcomp, lame.rc, logoe.ico, Makefile.am, and Makefile.in;
c. Copy lame.h from the include directory of the source tree into the lame directory under the Android project's cpp directory; lame.h declares all the functions that will be called;
d. Configure the CMakeLists.txt file:
    set(SRC_DIR src/main/cpp/lame)
    include_directories(src/main/cpp/lame)
    aux_source_directory(src/main/cpp/lame SRC_LIST)
    add_library(...... ${SRC_LIST})
(2) LameMp3.java: declare the native methods that call the lame library functions
/** Calls the lame library through JNI to encode PCM audio as MP3
 * Created by Jiangdg on 2017/6/9.
 */
public class LameMp3 {
    // Load the shared library LameMp3 in a static initializer
    static {
        System.loadLibrary("LameMp3");
    }
    /** Initialize the lame library and configure the encoder
     *
     * @param inSampleRate sampling rate of the input PCM audio
     * @param outChannel number of channels in the PCM audio
     * @param outSampleRate sampling rate of the output MP3 audio
     * @param outBitRate bit rate of the output MP3 audio
     * @param quality encoding quality, 0-9: 0 is best but slowest, 9 is worst but fastest
     */
    public native static void lameInit(int inSampleRate, int outChannel,int outSampleRate, int outBitRate, int quality);


    /** Encode PCM data as MP3
     *
     * @param letftBuf  left-channel PCM data
     * @param rightBuf right-channel PCM data; for mono, the same as the left channel
     * @param sampleRate number of PCM samples per channel in the input (not a sampling rate, despite the name)
     * @param mp3Buf buffer that receives the MP3 data
     * @return number of encoded bytes
     */
    public native static int lameEncode(short[] letftBuf, short[] rightBuf,int sampleRate, byte[] mp3Buf);


    /** Flush the encoder and fetch the final MP3 frames so they can be written to the file
     *
     * @param mp3buf buffer that receives the MP3 data
     * @return number of bytes written to mp3buf
     */
    public native static int lameFlush(byte[] mp3buf);


    /**
     * Release lame library resources
     */
    public native static void lameClose();
}
Explanation: The API document of the Lame library (lame-3.99.5\API) shows that producing an MP3 file with Lame involves four steps: initializing the lame engine, encoding PCM into MP3 data frames, writing the frames to a file, and releasing the lame engine's resources. LameMp3.java therefore defines one native method for each step, which the Java layer calls to produce the final MP3 file.
(3) LameMp3.c
// Local implementation
// Created by jianddongguo on 2017/6/14.
#include <jni.h>
#include "LameMp3.h"
#include "lame/lame.h"
// Pointer to the lame_global_flags structure (struct lame_global_struct);
// think of it as the global encoder context
static lame_global_flags *gfp = NULL;


JNIEXPORT void JNICALL
Java_com_teligen_lametomp3_LameMp3_lameInit(JNIEnv *env, jclass type, jint inSampleRate,
jint outChannelNum, jint outSampleRate, jint outBitRate,
        jint quality) {
    if(gfp != NULL){
        lame_close(gfp);
        gfp = NULL;
    }
    // Initialize the encoder; returns a lame_global_flags pointer,
    // or NULL if the memory needed for encoding could not be allocated
    gfp = lame_init();
    if (gfp == NULL) {
        LOGI("lame_init failed");
        return;
    }
    LOGI("lame library initialized");

    // Sampling rate of the input stream in Hz (default 44100)
    lame_set_in_samplerate(gfp, inSampleRate);
    // Number of channels in the input stream (default 2)
    lame_set_num_channels(gfp, outChannelNum);
    // Sampling rate of the output stream in Hz (default 0: LAME chooses)
    lame_set_out_samplerate(gfp, outSampleRate);
    // MPEG_mode is an enum type, so pass one of its values
    lame_set_mode(gfp, outChannelNum == 1 ? MONO : STEREO);
    // Bit rate in kbit/s (if unset, LAME uses a compression ratio of 11)
    lame_set_brate(gfp, outBitRate);
    // Encoding quality, recommended values: 2, 5, 7
    lame_set_quality(gfp, quality);
    // Apply the configuration; a negative return value indicates an error
    if (lame_init_params(gfp) < 0) {
        LOGI("lame_init_params failed");
        lame_close(gfp);
        gfp = NULL;
        return;
    }
    LOGI("lame parameters configured");
}


JNIEXPORT jint JNICALL
        Java_com_teligen_lametomp3_LameMp3_lameFlush(JNIEnv *env, jclass type, jbyteArray mp3buf_) {
    jbyte *mp3buf = (*env)->GetByteArrayElements(env, mp3buf_, NULL);
    jsize len = (*env)->GetArrayLength(env,mp3buf_);
    // Flush the internal PCM buffers, padding with zeros so the final frames are complete,
    // then flush the internal MP3 buffers and return the last frames in mp3buf
    int result = lame_encode_flush(gfp,                      // Global context
                                   (unsigned char *) mp3buf, // Output MP3 buffer
                                   len);                     // Size of mp3buf in bytes (at least 7200)
    (*env)->ReleaseByteArrayElements(env, mp3buf_, mp3buf, 0);
    LOG_I("Flushed the final mp3 data, bytes returned=%d",result);
    return result;
}


JNIEXPORT void JNICALL
Java_com_teligen_lametomp3_LameMp3_lameClose(JNIEnv *env, jclass type) {
    // Release occupied memory resources
    lame_close(gfp);
    gfp = NULL;
    LOGI("release lame Resources");
}


JNIEXPORT jint JNICALL
Java_com_teligen_lametomp3_LameMp3_lameEncode(JNIEnv *env, jclass type, jshortArray letftBuf_,
                                              jshortArray rightBuf_, jint sampleRate,
                                              jbyteArray mp3Buf_) {
    if(letftBuf_ == NULL || mp3Buf_ == NULL){
        LOGI("letftBuf and mp3Buf must not be NULL");
        return -1;
    }
    // letftBuf_ was checked above; rightBuf_ may be NULL for mono input
    jshort *letftBuf = (*env)->GetShortArrayElements(env, letftBuf_, NULL);
    jshort *rightBuf = NULL;
    if(rightBuf_ != NULL){
        rightBuf = (*env)->GetShortArrayElements(env, rightBuf_, NULL);
    }
    jbyte *mp3Buf = (*env)->GetByteArrayElements(env, mp3Buf_, NULL);
    jsize readSizes = (*env)->GetArrayLength(env,mp3Buf_);
    // Encode the PCM data as MP3
    int result = lame_encode_buffer(gfp,         // Global context
                                    letftBuf,    // Left channel PCM data
                                    rightBuf,    // Right channel PCM data
                                    sampleRate,  // Number of samples per channel (not a sampling rate)
                                    (unsigned char *) mp3Buf, // Start of the MP3 output buffer
                                    readSizes);  // Size of the MP3 output buffer in bytes
    // Release the arrays; the PCM input was not modified, so JNI_ABORT skips the copy-back
    (*env)->ReleaseShortArrayElements(env, letftBuf_, letftBuf, JNI_ABORT);
    if(rightBuf_ != NULL){
        (*env)->ReleaseShortArrayElements(env, rightBuf_, rightBuf, JNI_ABORT);
    }
    (*env)->ReleaseByteArrayElements(env, mp3Buf_, mp3Buf, 0);
    LOG_I("Encoded pcm as mp3, data length=%d",result);
    return  result;
}
Explanation: In the lame source code, lame_global_flags is a typedef for struct lame_global_struct, so gfp is a pointer to that structure. The lame_global_struct structure holds the parameters required for encoding, as follows:
lame_global_flags *gfp = NULL;
typedef struct lame_global_struct lame_global_flags;
struct lame_global_struct {
    unsigned int class_id;
    unsigned long num_samples;
    int     num_channels;
    int     samplerate_in;
    int     samplerate_out;
    int     brate;
    float   compression_ratio;
    /* ..... */
};
In addition, the lame encoding engine is configured with a lame_set_quality function that sets the encoding quality. You may ask: isn't audio quality generally determined by the bit rate, so why is this setting needed? It is true that the bit rate determines the quality of the encoded audio. The quality parameter instead selects the algorithms used during encoding, which differ in both results and speed. With quality 0 the best algorithms are selected but processing is slowest; with quality 9 the worst algorithms are selected but processing is fastest. Three settings are officially recommended:
quality=2: near-best quality, not too slow.
quality=5: good quality, fast.
quality=7: acceptable quality, very fast.
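One more detail of the native calls above is easy to miss: the size of the MP3 output buffer. The comments in lame.h give a worst-case estimate of 1.25 * num_samples + 7200 bytes for lame_encode_buffer, and ask for at least 7200 bytes for lame_encode_flush. The helper below is a small sketch of that rule; the class and method names are hypothetical and not part of the LameMp3 wrapper.

```java
public class Mp3BufferSize {
    /** Minimum buffer size that lame.h asks for when flushing. */
    public static final int FLUSH_MIN_BYTES = 7200;

    /**
     * Worst-case MP3 output size for one encode call, following the
     * estimate in lame.h: 1.25 * samples per channel + 7200 bytes.
     */
    public static int worstCaseBytes(int samplesPerChannel) {
        if (samplesPerChannel < 0) {
            throw new IllegalArgumentException("samplesPerChannel must not be negative");
        }
        return (int) Math.ceil(1.25 * samplesPerChannel) + FLUSH_MIN_BYTES;
    }

    public static void main(String[] args) {
        // A 2048-byte buffer of 16-bit mono PCM holds 1024 samples
        int samples = 2048 / 2;
        System.out.println(worstCaseBytes(samples)); // 8480
    }
}
```

Sizing the Java-side byte[] this way keeps a single buffer large enough for both the encode and the flush calls.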
(4) CMakeLists.txt
#Specify the minimum version of Cmake required
cmake_minimum_required(VERSION 3.4.1)


#Specify the source path, assigning the src/main/cpp/lame path to SRC_DIR 
set(SRC_DIR src/main/cpp/lame)
# Specify header file path
include_directories(src/main/cpp/lame)
# Assign all file names in the src/main/cpp/lame directory to SRC_LIST
aux_source_directory(src/main/cpp/lame SRC_LIST)


# add_library: defines the library to build, with three parameters:
# LameMp3 is the name of the library; SHARED makes it a dynamic link library;
# src/main/cpp/LameMp3.c and ${SRC_LIST} are the source files the library is built from,
# where ${SRC_LIST} expands to all source files in the src/main/cpp/lame directory
add_library(
             LameMp3
             SHARED
             src/main/cpp/LameMp3.c ${SRC_LIST})
#Search the library log in the specified directory and save its path to the variable log-lib
find_library( # Sets the name of the path variable.
              log-lib
              # Specifies the name of the NDK library that
              # you want CMake to locate.
              log )
# Link the library ${log-lib} to the LameMp3 dynamic library with two parameters
#LameMp3 is the target library
# ${log-lib} is the library to link to
target_link_libraries( # Specifies the target library.
                       LameMp3


                       # Links the target library to the log library
                       # included in the NDK.
                       ${log-lib} )
Explanation: CMake is a cross-platform build tool. It describes the build process for all platforms with simple statements and generates Makefiles or project files of various types. All CMake statements and commands are written in the CMakeLists.txt file. The main rules are as follows:
a. A comment begins with the # character and runs to the end of the line.
b. Command names are case-insensitive, while arguments and variable names are case-sensitive.
c. A command consists of a command name followed by a parenthesized argument list, with arguments separated by whitespace.
(5) build.gradle(Module app), select compilation platform
android {
   
    defaultConfig {
        // ... Code omission
        externalNativeBuild {
            cmake {
                cppFlags ""
            }
        }
        // Select the target ABIs to build for
        ndk{
            abiFilters 'x86', 'x86_64', 'armeabi', 'armeabi-v7a','arm64-v8a'
        }
    }
    // ... Code omission
    externalNativeBuild {
        cmake {
            path "CMakeLists.txt"
        }
    }
}
III. Open Source Project: Lame4Mp3
Lame4Mp3 is an open source project built on the Lame library. It combines Lame with the MediaCodec API provided by Android to encode a PCM data stream as AAC, as MP3, or as both at the same time. It is suitable for recording mp3/aac files locally and for recording (mp3) while live streaming on Android. The usage and a source code analysis follow:
1. Adding dependencies
(1) Add to the project-level build.gradle
allprojects {
   repositories {
    ...
   maven { url 'https://jitpack.io' }
  }
}
(2) Add to the module's build.gradle
dependencies {
   compile 'com.github.jiangdongguo:Lame4Mp3:v1.0.0'
}
2. Use of Lame4Mp3
(1) Configuration parameters
 Mp3Recorder mMp3Recorder = Mp3Recorder.getInstance();
   // Configure AudioRecord parameters
   mMp3Recorder.setAudioSource(Mp3Recorder.AUDIO_SOURCE_MIC);
   mMp3Recorder.setAudioSampleRare(Mp3Recorder.SMAPLE_RATE_8000HZ);
   mMp3Recorder.setAudioChannelConfig(Mp3Recorder.AUDIO_CHANNEL_MONO);
   mMp3Recorder.setAduioFormat(Mp3Recorder.AUDIO_FORMAT_16Bit);
   // Configuring Lame parameters
   mMp3Recorder.setLameBitRate(Mp3Recorder.LAME_BITRATE_32);
   mMp3Recorder.setLameOutChannel(Mp3Recorder.LAME_OUTCHANNEL_1);
   // Configure MediaCodec parameters
   mMp3Recorder.setMediaCodecBitRate(Mp3Recorder.ENCODEC_BITRATE_1600HZ);
   mMp3Recorder.setMediaCodecSampleRate(Mp3Recorder.SMAPLE_RATE_8000HZ);
   // Setup mode
   //  Mp3Recorder.MODE_AAC only encodes AAC data stream
   //  Mp3Recorder.MODE_MP3 only encodes Mp3 files
   //  Mp3Recorder.MODE_BOTH Simultaneous Coding
   mMp3Recorder.setMode(Mp3Recorder.MODE_BOTH);
(2) Start coding
   mMp3Recorder.start(filePath, fileName, new Mp3Recorder.OnAACStreamResultListener() {
       @Override
       public void onEncodeResult(byte[] data, int offset, int length, long timestamp) {
              Log.i("MainActivity","AAC data stream length:"+data.length);
          }
       });
(3) Stop coding
mMp3Recorder.stop();
3. Lame4Mp3 source code parsing
Mp3Recorder.java has three main functional blocks: PCM data acquisition, AAC encoding, and MP3 encoding. PCM data acquisition and AAC encoding were analyzed in detail in previous posts, so this section focuses only on MP3 encoding. The core code is as follows:
public void start(final String filePath, final String fileName,final OnAACStreamResultListener listener){
        this.listener = listener;
        new Thread(new Runnable() {
            @Override
            public void run() {
                try {
                    if(!isRecording){
                        // Step 1: Initialize the lame engine
                        initLameMp3();
                        initAudioRecord();
                        initMediaCodec();
                    }
                    int readBytes = 0;
                    byte[] audioBuffer = new byte[2048];
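                    // lame.h suggests a worst-case MP3 buffer of 1.25 * samples + 7200 bytes;
                    // lame_encode_flush also needs at least 7200 bytes, so 1024 is too small
                    byte[] mp3Buffer = new byte[7200 + (int) (1.25 * audioBuffer.length / 2)];
                    // Abort if the file path or file name is empty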
                    if(TextUtils.isEmpty(filePath) || TextUtils.isEmpty(fileName)){
                        Log.i(TAG,"File path or file name is empty");
                        return;
                    }
                    File file = new File(filePath);
                    if(! file.exists()){
                        file.mkdirs();
                    }
                    String mp3Path = file.getAbsoluteFile().toString()+File.separator+fileName+".mp3";
                    FileOutputStream fops = null;
                    try {
                        while(isRecording){
                            readBytes = mAudioRecord.read(audioBuffer,0,bufferSizeInBytes);
                            Log.i(TAG,"read pcm Data stream, size:"+readBytes);
                            if(readBytes >0 ){
                                if(mode == MODE_AAC || mode == MODE_BOTH){
                                    // Coding PCM to AAC
                                    encodeBytes(audioBuffer,readBytes);
                                }
                                if(mode == MODE_MP3 || mode == MODE_BOTH){
                                    // Open mp3 file output stream
                                    if(fops == null){
                                        try {
                                            fops = new FileOutputStream(mp3Path);
                                        } catch (FileNotFoundException e) {
                                            e.printStackTrace();
                                        }
                                    }
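                                    // Convert byte[] to short[],
                                    // then encode the PCM as MP3 and write it to the file
                                    short[] data = transferByte2Short(audioBuffer,readBytes);
                                    int encResult = LameMp3.lameEncode(data,null,data.length,mp3Buffer);
                                    Log.i(TAG,"lame Encoding, size:"+encResult);
                                    // A negative result signals an encoder error, and fops is
                                    // null if the file could not be opened, so check both
                                    if(encResult > 0 && fops != null){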
                                        try {
                                            fops.write(mp3Buffer,0,encResult);
                                        } catch (IOException e) {
                                            e.printStackTrace();
                                        }
                                    }
                                }
                            }
                        }
                        // Recording completed
                        if(fops != null){
                            int flushResult =  LameMp3.lameFlush(mp3Buffer);
                            Log.i(TAG,"After recording, the size is:"+flushResult);
                            if(flushResult > 0){
                                try {
                                    fops.write(mp3Buffer,0,flushResult);
                                } catch (IOException e) {
                                    e.printStackTrace();
                                }
                            }
                            try {
                                fops.close();
                            } catch (IOException e) {
                                e.printStackTrace();
                            }
                        }
                    }finally {
                        Log.i(TAG,"release AudioRecorder Resources");
                        stopAudioRecorder();
                        stopMediaCodec();


                    }
                }finally {
                    Log.i(TAG,"release Lame Library resources");
                    stopLameMp3();
                }
            }
        }).start();
    }
The code shows that encoding PCM into MP3 with the lame engine goes through four steps: initializing the engine, encoding, writing the file, and releasing resources, which matches the process analyzed earlier. Note, however, that when AAC and MP3 are encoded at the same time, the PCM data is fed to MediaCodec and to the Lame engine in different forms. MediaCodec only accepts data stored in a byte[], while Lame takes data stored in a short[]. So if the captured PCM data is stored in a byte[], it must be converted to a short[], and the conversion must use the correct byte order (endianness). The code is as follows:
 
   private short[] transferByte2Short(byte[] data,int readBytes){
        // Converting byte[] to short[] halves the array length
        int shortLen = readBytes / 2;
        // Wrap the byte[] in a ByteBuffer
        ByteBuffer byteBuffer = ByteBuffer.wrap(data, 0, readBytes);
        // Read the ByteBuffer as little-endian and view it as a ShortBuffer
        // Little-endian: the low-order byte is stored at the lower memory address and the high-order byte at the higher address.
        ShortBuffer shortBuffer = byteBuffer.order(ByteOrder.LITTLE_ENDIAN).asShortBuffer();
        short[] shortData = new short[shortLen];
        shortBuffer.get(shortData, 0, shortLen);
        return shortData;
    }
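To see why the byte order matters, the following sketch converts the same four bytes with both byte orders. The class and method names are made up for illustration; the conversion itself uses the same ByteBuffer calls as transferByte2Short above. It assumes, as the code above does, that the captured 16-bit PCM is little-endian.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class PcmByteOrder {
    /** Reinterprets the first readBytes bytes of data as 16-bit samples in the given byte order. */
    public static short[] toShorts(byte[] data, int readBytes, ByteOrder order) {
        short[] out = new short[readBytes / 2];
        ByteBuffer.wrap(data, 0, readBytes).order(order).asShortBuffer().get(out);
        return out;
    }

    public static void main(String[] args) {
        // Two 16-bit samples stored low byte first: 0x1234 and -1
        byte[] pcm = {0x34, 0x12, (byte) 0xFF, (byte) 0xFF};

        short[] little = toShorts(pcm, pcm.length, ByteOrder.LITTLE_ENDIAN);
        short[] big = toShorts(pcm, pcm.length, ByteOrder.BIG_ENDIAN);

        System.out.println(Integer.toHexString(little[0])); // 1234
        System.out.println(Integer.toHexString(big[0]));    // 3412
        System.out.println(little[1]);                      // -1
    }
}
```

Reading the samples with the wrong byte order swaps the two bytes of every sample, which the encoder would hear as loud noise rather than the recorded audio.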


GitHub address: https://github.com/jiangdongguo/Lame4Mp3 (stars are welcome; the LameToMp3 NDK project is attached)

