A reflection on life
Don't cultivate yourself in the hearts of others,
and don't force others into your own heart.
Preface
I've been a bit lazy about writing lately; it's been about a month since my last article.
On Android, audio is generally captured in the Java layer with the AudioRecord class.
So why learn OpenSL ES? Beyond the performance advantages of C/C++ (Java is actually not slow), the main reason is that the Java-layer APIs require crossing the JNI boundary, which adds complexity and overhead. With OpenSL ES, everything can be handled directly in C/C++. So, to build more efficient Android audio applications, it is sometimes worth recording, capturing, and playing audio directly in the native layer, eliminating the round trips between the Java and JNI layers.
This article introduces how to capture audio with OpenSL ES in the native (JNI) layer on Android.
Introduction to OpenSL ES
Parts of this section are adapted from material found online, used here for learning purposes only and to give a fuller introduction to OpenSL ES. It will be removed on request in case of infringement.
What is OpenSL ES
This was also covered in the previous article on playing PCM data with OpenSL ES; here is a recap.
The full name of OpenSL ES is Open Sound Library for Embedded Systems, the embedded audio acceleration standard. OpenSL ES is a royalty-free, cross-platform hardware audio acceleration API carefully optimized for embedded systems. It gives native application developers on embedded mobile multimedia devices a standardized, high-performance, low-latency way to implement audio functionality, and it enables cross-platform deployment of software and hardware audio capabilities, which both reduces implementation effort and helps the advanced-audio market grow. In short, OpenSL ES is a free, cross-platform, embedded audio processing library; it is not unique to Android.
Relationship between OpenSL ES and Android
Official website address: https://source.android.com/devices/audio/latency_app.html
As we can see, the OpenSL ES implemented by Android is only a subset of OpenSL ES 1.0.1, with some Android-specific extensions. So when using the OpenSL ES API, pay special attention to what Android supports and what it does not.
Features, advantages and disadvantages of OpenSL ES
Features:
(1) A C-language interface, compatible with C++; development requires the NDK, so it integrates best into native applications.
(2) Runs in the native layer, so you must manage resource allocation and release yourself; there is no Dalvik VM garbage collection.
(3) Supports PCM capture. Guaranteed configuration: 16-bit samples, 16000 Hz sample rate, mono. (Other configurations are not guaranteed to be compatible with all platforms.)
(4) Supports PCM playback. Supported configuration: 8-bit/16-bit samples, mono/stereo, little-endian, sample rates of 8000, 11025, 12000, 16000, 22050, 24000, 32000, 44100, and 48000 Hz.
(5) Supported playback sources: audio under the res folder, audio under the assets folder, audio on the sdcard, network audio, audio binary data defined in code, and so on.
Advantages:
(1) Avoids frequently copying audio data between the native layer and the Java layer, improving efficiency.
(2) Compared with the Java API, parameters can be controlled more flexibly.
(3) Because it is C code, it can be deeply optimized, for example with NEON.
(4) Code details are harder to decompile.
Disadvantages:
(1) Not supported on devices below Android 2.3 (API 9).
(2) Not all features and functions defined by OpenSL ES are implemented.
(3) MIDI is not supported.
(4) Direct playback of DRM-protected or encrypted content is not supported.
(5) Audio encoding and decoding are not supported; if you need a codec, use the MediaCodec API or a third-party library.
(6) Audio latency is not noticeably better than the upper-layer APIs.
More introduction
You can take a look at the official Android documentation (a Chinese site is available), which also includes some demos:
https://developer.android.google.cn/ndk/guides/audio/opensl
OK, now that we know some background on OpenSL ES, let's introduce its API and the recording/capture workflow.
Add the permission
To record, whether with AudioRecord in the Java layer or with OpenSL ES in the native layer, you must add the recording permission to the AndroidManifest.xml configuration file:
```xml
<uses-permission android:name="android.permission.RECORD_AUDIO"/>
```
Reference relevant library files and header files
To use OpenSL ES on Android, you first need to link the .so provided by the Android system into your own .so. Add the link library OpenSLES in the CMakeLists.txt script. The library can be found in your NDK directory, in a path similar to:
/Users/guxiuzhong/Library/Android/sdk/ndk/21.1.6352462/platforms/android-19/arch-x86/usr/lib/libOpenSLES.so
Note that the lib prefix is dropped when linking; when the project is compiled, the build system finds this directory automatically.
With CMake, in CMakeLists.txt:

```cmake
target_link_libraries(
        # your own library target goes first
        OpenSLES
        # ... other libraries to link
)
```
With ndk-build, add the link flag in Android.mk:

```makefile
LOCAL_LDLIBS += -lOpenSLES
```
Add the header files:

```c
#include <SLES/OpenSLES.h>
#include <SLES/OpenSLES_Android.h>
```
Important concepts in OpenSL ES
Objects and Interfaces
There are two concepts you must understand when developing with OpenSL ES: the Object and the Interface, because many APIs operate on them. To make OpenSL ES easier to use, Android designed the API to feel like object-oriented Java: an Object is roughly like a Java object and an Interface like a Java interface, though they are not exactly the same. Their relationship:
(1) Each Object may expose one or more Interfaces; the specification defines a set of Interfaces for each kind of Object.
(2) Each Object provides some basic operations, such as Realize, Resume, GetState, and Destroy. To use the functionality supported by an Object, you must first fetch the corresponding Interface through its GetInterface method, then access the functionality through that Interface.
(3) Not every Interface that OpenSL ES defines for an Object is implemented on every system, so you need to check and choose when acquiring Interfaces.
In OpenSL ES, every object we obtain is an SLObjectItf:
```c
typedef const struct SLObjectItf_ * const * SLObjectItf;

struct SLObjectItf_ {
    SLresult (*Realize) (SLObjectItf self, SLboolean async);
    SLresult (*Resume) (SLObjectItf self, SLboolean async);
    SLresult (*GetState) (SLObjectItf self, SLuint32 *pState);
    SLresult (*GetInterface) (SLObjectItf self, const SLInterfaceID iid, void *pInterface);
    SLresult (*RegisterCallback) (SLObjectItf self, slObjectCallback callback, void *pContext);
    void (*AbortAsyncOperation) (SLObjectItf self);
    void (*Destroy) (SLObjectItf self);
    SLresult (*SetPriority) (SLObjectItf self, SLint32 priority, SLboolean preemptable);
    SLresult (*GetPriority) (SLObjectItf self, SLint32 *pPriority, SLboolean *pPreemptable);
    SLresult (*SetLossOfControlInterfaces) (SLObjectItf self, SLint16 numInterfaces,
                                            SLInterfaceID *pInterfaceIDs, SLboolean enabled);
};
```
Every Object you create must be initialized by calling its Realize method; when it is no longer needed, release its resources with Destroy.
GetInterface
GetInterface is the most frequently used method in OpenSL. Through it, we can get the Interface in the Object.
Because an Object may contain multiple interfaces, the GetInterface method has an SLInterfaceID parameter to specify the Interface in the Object to be obtained.
For example, through the engine object we fetch the Interface whose id is SL_IID_ENGINE; the Interface corresponding to that id is SLEngineItf:
```c
SLEngineItf engineEngine = NULL;
SLObjectItf engineObject = NULL;

// Create the engine object by calling the global function
// (the unique entry point of OpenSL ES)
SLresult result;
result = slCreateEngine(&engineObject, 1, pEngineOptions, 0, nullptr, nullptr);
assert(SL_RESULT_SUCCESS == result);

/* Realize the SL engine in synchronous mode. */
result = (*engineObject)->Realize(engineObject, SL_BOOLEAN_FALSE);
assert(SL_RESULT_SUCCESS == result);

// Get the engine interface, which is needed in order to create other objects
result = (*engineObject)->GetInterface(engineObject, SL_IID_ENGINE, &engineEngine);
assert(SL_RESULT_SUCCESS == result);
```
After each API call, check that the return value equals SL_RESULT_SUCCESS.
Interface
An Interface is a collection of methods: for example, SLRecordItf contains the recording-related methods and SLPlayItf the playback-related ones. All functionality is reached by calling methods on Interfaces.
For example, SLEngineItf is the most important Interface in OpenSL ES. Through it we create the various objects (player, recorder, output mix, and so on), and from those objects we fetch further Interfaces to implement the actual functionality.
For example:
```c
SLresult (*CreateAudioPlayer) (
    SLEngineItf self,
    SLObjectItf *pPlayer,
    SLDataSource *pAudioSrc,
    SLDataSink *pAudioSnk,
    SLuint32 numInterfaces,
    const SLInterfaceID *pInterfaceIds,
    const SLboolean *pInterfaceRequired
);

SLresult (*CreateAudioRecorder) (
    SLEngineItf self,
    SLObjectItf *pRecorder,
    SLDataSource *pAudioSrc,
    SLDataSink *pAudioSnk,
    SLuint32 numInterfaces,
    const SLInterfaceID *pInterfaceIds,
    const SLboolean *pInterfaceRequired
);

// There are many others; see the OpenSLES.h header file
```
Object lifecycle
OpenSL ES objects generally have three states: UNREALIZED, REALIZED, and SUSPENDED.
When an object is in the UNREALIZED state, the system has not allocated resources to it. After Realize is called it enters the REALIZED state, where all of its functionality and resources can be accessed normally. When the audio hardware is taken over by another process, the object enters the SUSPENDED state; calling Resume returns it to the REALIZED (usable) state. When you are done with an object, call Destroy to release its resources, which returns it to the UNREALIZED state.
Recording
The demo simply saves the captured PCM data to a file on the sdcard; the focus is on how to use OpenSL ES to capture PCM audio.
If you have understood the design of OpenSL ES and the Object/Interface usage flow above, the code is easy to follow.
Create the engine object
Call the global function slCreateEngine() to create it:
```c
SL_API SLresult SLAPIENTRY slCreateEngine(
    SLObjectItf *pEngine,                 // out parameter: receives the engine object
    SLuint32 numOptions,                  // number of configuration options (1 here)
    const SLEngineOption *pEngineOptions, // configuration options array
    SLuint32 numInterfaces,               // number of interfaces to support
    const SLInterfaceID *pInterfaceIds,   // the interfaces to support (array)
    const SLboolean *pInterfaceRequired   // whether each interface is required (array;
                                          // the last three parameters have the same length)
);
```
After creation, follow the standard three-step flow described above: call Realize to initialize the object, then fetch the engine interface from it with GetInterface. The snippet for creating the engine object:
```cpp
// Member variables
SLEngineItf engineEngine = NULL;
SLObjectItf engineObject = NULL;

void AudioRecorder::createEngine() {
    SLEngineOption pEngineOptions[] = {
            {(SLuint32) SL_ENGINEOPTION_THREADSAFE, (SLuint32) SL_BOOLEAN_TRUE}};

    // Create the engine object (the unique entry point of OpenSL ES)
    SLresult result;
    result = slCreateEngine(
            &engineObject,  // out parameter: receives the engine object
            1,              // number of configuration options
            pEngineOptions, // configuration options array
            0,              // number of interfaces to support
            nullptr,        // the interfaces to support
            nullptr         // whether each interface is required
    );
    assert(SL_RESULT_SUCCESS == result);

    /* Realize the SL engine in synchronous mode. */
    result = (*engineObject)->Realize(engineObject, SL_BOOLEAN_FALSE);
    assert(SL_RESULT_SUCCESS == result);

    // Get the engine interface, which is needed in order to create other objects
    result = (*engineObject)->GetInterface(engineObject, SL_IID_ENGINE, &engineEngine);
    assert(SL_RESULT_SUCCESS == result);
}
```
Configure the input/output of the capture device (microphone)
```cpp
// Input: the audio input device (microphone)
SLDataLocator_IODevice ioDevice = {
        SL_DATALOCATOR_IODEVICE,       // locator type; must be SL_DATALOCATOR_IODEVICE
        SL_IODEVICE_AUDIOINPUT,        // device type: audio input
        SL_DEFAULTDEVICEID_AUDIOINPUT, // device id: the default audio input
        NULL                           // device instance
};

// SLDataSource describes the audio data source
SLDataSource recSource = {
        &ioDevice, // the SLDataLocator_IODevice configured above
        NULL       // input format; not needed for capture
};

// Output locator: an Android simple buffer queue
SLDataLocator_AndroidSimpleBufferQueue recBufferQueue = {
        SL_DATALOCATOR_ANDROIDSIMPLEBUFFERQUEUE, // locator type; must be this value
        NUM_BUFFER_QUEUE                         // number of buffers
};

// Output data format: PCM
SLDataFormat_PCM pcm = {
        SL_DATAFORMAT_PCM,           // output PCM data
        2,                           // number of channels: 2 (stereo)
        SL_SAMPLINGRATE_44_1,        // sample rate: 44100 Hz
        SL_PCMSAMPLEFORMAT_FIXED_16, // bits per sample: 16
        SL_PCMSAMPLEFORMAT_FIXED_16, // container size; usually same as bits per sample
        SL_SPEAKER_FRONT_LEFT | SL_SPEAKER_FRONT_RIGHT, // stereo channel mask;
                                     // for mono use SL_SPEAKER_FRONT_CENTER
        SL_BYTEORDER_LITTLEENDIAN    // byte order of the PCM data
};

// SLDataSink describes where the audio data goes
SLDataSink dataSink = {
        &recBufferQueue, // the buffer-queue locator configured above
        &pcm             // the output data format
};
```

The SLDataSource and SLDataSink structures in OpenSL ES are used mainly when building audio player and recorder objects: SLDataSource describes the source of the audio data, and SLDataSink describes where the audio data is written.
Create the recorder: create a recording object and fetch its recording-related interfaces
Use SLEngineItf to create the recorder by calling its CreateAudioRecorder method:
```cpp
// Create the recorder object, explicitly requesting the
// SL_IID_ANDROIDSIMPLEBUFFERQUEUE interface
SLInterfaceID iids[NUM_RECORDER_EXPLICIT_INTERFACES] = {
        SL_IID_ANDROIDSIMPLEBUFFERQUEUE, SL_IID_ANDROIDCONFIGURATION};
SLboolean required[NUM_RECORDER_EXPLICIT_INTERFACES] = {
        SL_BOOLEAN_TRUE, SL_BOOLEAN_TRUE};

/* Create the audio recorder */
result = (*engineEngine)->CreateAudioRecorder(
        engineEngine,                     // engine interface
        &recorderObject,                  // out parameter: receives the recorder object
        &recSource,                       // input configuration
        &dataSink,                        // output configuration
        NUM_RECORDER_EXPLICIT_INTERFACES, // number of interfaces to support
        iids,                             // the interfaces to support
        required                          // whether each interface is required
);
assert(SL_RESULT_SUCCESS == result);

/* Realize the recorder in synchronous mode. */
result = (*recorderObject)->Realize(recorderObject, SL_BOOLEAN_FALSE);
assert(SL_RESULT_SUCCESS == result);

/* Get the buffer queue interface which was explicitly requested */
result = (*recorderObject)->GetInterface(recorderObject, SL_IID_ANDROIDSIMPLEBUFFERQUEUE,
                                         (void *) &recorderBuffQueueItf);
assert(SL_RESULT_SUCCESS == result);

/* Get the record interface */
result = (*recorderObject)->GetInterface(recorderObject, SL_IID_RECORD, &recorderRecord);
assert(SL_RESULT_SUCCESS == result);
```
As before: first create the Object, then call Realize to initialize it, then call GetInterface to fetch the Interfaces from the Object. SL_IID_RECORD is the id of the recorder's interface; the out parameter recorderRecord is the recording interface, on which we will later start and stop recording.
Set the data callback and start recording
Register a callback function with RegisterCallback to receive the captured PCM data, then set the record state to start recording.
```cpp
buffer = new uint8_t[BUFFER_SIZE]; // data buffer
bufferSize = BUFFER_SIZE;

// Register the data callback AudioRecorderCallback; the last parameter
// passes a custom context reference through to the callback
result = (*recorderBuffQueueItf)->RegisterCallback(recorderBuffQueueItf,
                                                   AudioRecorderCallback, this);
assert(SL_RESULT_SUCCESS == result);

/* Start recording */
// Set the recorder to the SL_RECORDSTATE_RECORDING state
result = (*recorderRecord)->SetRecordState(recorderRecord, SL_RECORDSTATE_RECORDING);
assert(SL_RESULT_SUCCESS == result);

// After setting the record state, you must Enqueue a buffer first,
// otherwise the capture callbacks never start
/* Enqueue buffers to map the region of memory allocated to store the recorded data */
result = (*recorderBuffQueueItf)->Enqueue(recorderBuffQueueItf, buffer, BUFFER_SIZE);
assert(SL_RESULT_SUCCESS == result);

LOGD("Starting recording tid=%ld", syscall(SYS_gettid)); // thread id
```
The first parameter of RegisterCallback is the buffer-queue interface itself, and the second is the address of a function that OpenSL ES calls automatically during recording. Note that AudioRecorderCallback does not run on the UI thread but on a worker thread. The third parameter is an arbitrary context pointer passed through to the callback; here we pass `this` so the callback can reach the member variables. The callback function:
```cpp
void AudioRecorderCallback(SLAndroidSimpleBufferQueueItf bufferQueueItf, void *context) {
    // Note: this runs on a separate capture thread
    AudioRecorder *recorderContext = (AudioRecorder *) context;
    assert(recorderContext != NULL);
    if (recorderContext->buffer != NULL) {
        fwrite(recorderContext->buffer, recorderContext->bufferSize, 1,
               recorderContext->pfile);
        LOGD("saved a frame of audio data, tid=%ld", syscall(SYS_gettid));

        SLresult result;
        SLuint32 state;
        result = (*(recorderContext->recorderRecord))->GetRecordState(
                recorderContext->recorderRecord, &state);
        assert(SL_RESULT_SUCCESS == result);
        (void) result;
        LOGD("state=%d", state);

        if (state == SL_RECORDSTATE_RECORDING) {
            // After consuming the data, Enqueue again to trigger the next callback
            result = (*bufferQueueItf)->Enqueue(bufferQueueItf, recorderContext->buffer,
                                                recorderContext->bufferSize);
            assert(SL_RESULT_SUCCESS == result);
            (void) result;
        }
    }
}
```
Note: after setting the record state, you must Enqueue once to kick off the capture callbacks; then, after processing each buffer, you must Enqueue again to keep the recording loop going.
Stop recording
To stop recording, set the record state on the SLRecordItf recording interface to SL_RECORDSTATE_STOPPED:
```cpp
void AudioRecorder::stopRecord() {
    if (recorderRecord != nullptr) {
        // Set the recorder to the SL_RECORDSTATE_STOPPED state
        SLresult result = (*recorderRecord)->SetRecordState(recorderRecord,
                                                            SL_RECORDSTATE_STOPPED);
        assert(SL_RESULT_SUCCESS == result);
        fclose(pfile);
        pfile = nullptr;
        delete[] buffer; // allocated with new[], so release with delete[]
        buffer = nullptr;
        LOGD("stopRecord done");
    }
}
```
Release the recorder and OpenSL ES resources

```cpp
// Release the OpenSL ES resources
void AudioRecorder::release() {
    // Only the OpenSL ES objects need to be destroyed;
    // the interfaces do not need to be destroyed separately
    if (recorderObject != nullptr) {
        (*recorderObject)->Destroy(recorderObject);
        recorderObject = NULL;
        recorderRecord = NULL;
        recorderBuffQueueItf = NULL;
        configItf = NULL;
    }

    // Destroy the engine object and invalidate all associated interfaces
    if (engineObject != NULL) {
        (*engineObject)->Destroy(engineObject);
        engineObject = NULL;
        engineEngine = NULL;
    }
    LOGD("release done");
}
```
Testing it
Audacity can import and play raw (uncompressed) PCM files. Of course, you can also play the file back with AudioTrack, or with the OpenSL ES player from the previous article, to verify the sound is OK.
The JNI layer code is straightforward:
```cpp
AudioRecorder *audioRecorder;

extern "C" JNIEXPORT void JNICALL
Java_com_bj_gxz_pcmplay_AudioRecorder_startRecord(JNIEnv *env, jobject thiz) {
    if (audioRecorder == nullptr) {
        audioRecorder = new AudioRecorder();
        audioRecorder->startRecord();
    }
}

extern "C" JNIEXPORT void JNICALL
Java_com_bj_gxz_pcmplay_AudioRecorder_stopRecord(JNIEnv *env, jobject thiz) {
    if (audioRecorder != nullptr) {
        audioRecorder->stopRecord();
    }
}

extern "C" JNIEXPORT void JNICALL
Java_com_bj_gxz_pcmplay_AudioRecorder_release(JNIEnv *env, jobject thiz) {
    if (audioRecorder != nullptr) {
        audioRecorder->release();
        delete audioRecorder;
        audioRecorder = nullptr;
    }
}
```
The corresponding Java layer:

```java
/**
 * Created by guxiuzhong on 2021/06/05 2:03 PM
 */
public class AudioRecorder {
    static {
        System.loadLibrary("native-lib");
    }

    public native void startRecord();

    public native void stopRecord();

    public native void release();
}
```
Source code
https://github.com/ta893115871/PCMPlay
Summary
We covered the basics of OpenSL ES: its features, advantages and disadvantages, and usage scenarios.
We also walked through the full flow of capturing audio with OpenSL ES.