Implementation of yuv420p Video Playing Interface with OpenGLES

Posted by my_r31_baby on Sun, 09 Jun 2019 23:04:24 +0200

Background

Example project: TFLive. In this project I wrote a live player modeled on ijkPlayer. To run it you need the compiled ffmpeg libraries; a copy is saved on the net disk (extraction code: vjce). The OpenGL ES playback code is in the OpenGLES folder.

See learnOpenGL to learn how to use textures.

Playing video means displaying pictures one by one, just like frame animation. After decoding a video frame we get a block of memory in a certain format, such as yuv420p, which holds the color information for one picture. The article "Detailed explanation of YUV420 data format" covers this well.

YUV and RGB are both called color spaces. My understanding is that they are just agreed-upon arrangements of color values. RGB, for example, is the red, green and blue components arranged in sequence; each component usually takes one byte, with values from 0 to 255.

In YUV420p the three components are stored as three separate layers, like YYYYUUVV: all the Y values come first, whereas RGB interleaves as RGBRGBRGB. Each component occupies its own plane. The 420 layout shares one pair of UV values among four Y values, which saves space.
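To make the layout concrete, here is a minimal sketch (the names are just illustrative, not code from the project) of how the three planes of a tightly packed width x height yuv420p buffer are located:

    #include <stddef.h>
    #include <stdint.h>

    // Illustrative sketch: locate the three planes of a tightly packed
    // width x height yuv420p buffer. Y has one byte per pixel; U and V each
    // have one byte per 2x2 block, so a whole frame is width*height*3/2 bytes.
    static void yuv420p_planes(uint8_t *buffer, int width, int height,
                               uint8_t **y, uint8_t **u, uint8_t **v) {
        size_t ySize = (size_t)width * height;
        size_t uSize = (size_t)(width / 2) * (height / 2);
        *y = buffer;                  // [Y plane] comes first
        *u = buffer + ySize;          // [U plane] follows Y
        *v = buffer + ySize + uSize;  // [V plane] follows U
    }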

To display a YUV420p image you have to convert yuv to rgba, because OpenGL only outputs rgba.

Preparations on iOS

The OpenGL logic is the same on every platform; this preparation part is iOS-specific and can be skipped if you are not on iOS.

Use a framebuffer to display:

  • Create a new UIView subclass and modify layer to CAEAGLLayer:
    +(Class)layerClass{
      return [CAEAGLLayer class];
    }
  • Build Context before you start drawing:

    -(BOOL)setupOpenGLContext{
      _renderLayer = (CAEAGLLayer *)self.layer;
      _renderLayer.opaque = YES;
      _renderLayer.contentsScale = [UIScreen mainScreen].scale;
      _renderLayer.drawableProperties = [NSDictionary dictionaryWithObjectsAndKeys:
                                         [NSNumber numberWithBool:NO], kEAGLDrawablePropertyRetainedBacking,
                                         kEAGLColorFormatRGBA8, kEAGLDrawablePropertyColorFormat,
                                         nil];
    
      _context = [[EAGLContext alloc] initWithAPI:kEAGLRenderingAPIOpenGLES3];
      //_context = [[EAGLContext alloc] initWithAPI:kEAGLRenderingAPIOpenGLES2];
      if (!_context) {
          NSLog(@"alloc EAGLContext failed!");
          return false;
      }
      EAGLContext *preContex = [EAGLContext currentContext];
      if (![EAGLContext setCurrentContext:_context]) {
          NSLog(@"set current EAGLContext failed!");
          return false;
      }
      [self setupFrameBuffer];
    
      [EAGLContext setCurrentContext:preContex];
      return true;
    }
    • opaque is set to YES so the layer is not blended with the layers below it, avoiding unnecessary performance cost.
    • contentsScale matches the main screen's scale so the view adapts to different phones.
    • kEAGLDrawablePropertyRetainedBacking, when YES, keeps the rendered contents after they have been displayed. We don't need that: once a video frame has been shown its data is useless, so this is turned off to avoid unnecessary performance cost.

With this context created and set as the current context, the OpenGL calls made while drawing take effect in this context and send their output to the right place.

  • Build the frameBuffer, which is the output:

    -(void)setupFrameBuffer{
      glGenFramebuffers(1, &_frameBuffer);
      glBindFramebuffer(GL_FRAMEBUFFER, _frameBuffer);
    
      glGenRenderbuffers(1, &_colorBuffer);
      glBindRenderbuffer(GL_RENDERBUFFER, _colorBuffer);
      [_context renderbufferStorage:GL_RENDERBUFFER fromDrawable:_renderLayer];
      glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_RENDERBUFFER, _colorBuffer);
    
      GLint width,height;
      glGetRenderbufferParameteriv(GL_RENDERBUFFER, GL_RENDERBUFFER_WIDTH, &width);
      glGetRenderbufferParameteriv(GL_RENDERBUFFER, GL_RENDERBUFFER_HEIGHT, &height);
    
      _bufferSize.width = width;
      _bufferSize.height = height;
    
      glViewport(0, 0, _bufferSize.width, _bufferSize.height);
    
      GLenum status = glCheckFramebufferStatus(GL_FRAMEBUFFER) ;
      if(status != GL_FRAMEBUFFER_COMPLETE) {
          NSLog(@"failed to make complete framebuffer object %x", status);
      }
    }
    • Build a framebuffer.
    • Build a renderbuffer that stores the color output; its memory is allocated by the context. The key call is [_context renderbufferStorage:GL_RENDERBUFFER fromDrawable:_renderLayer]; because of it, the renderbuffer, the context and the layer are linked together. According to the Apple documentation, the layer responsible for display and the renderbuffer share memory, so the layer shows whatever is presented to the renderbuffer.

OpenGL section

There are two parts: data preparation before the first drawing and each drawing cycle.

Preparatory section

The OpenGL logic is: draw a square, turn the decoded video frame data into textures applied to that square, and handle the display by processing those textures in the fragment shader.

Since what we draw never changes, the shaders and the data (VAO, etc.) are fixed and only need to be set up once, before the first draw.

    if (!_renderConfiged) {
        [self configRenderData];
    }
-(BOOL)configRenderData{
    if (_renderConfiged) {
        return true;
    }

    GLfloat vertices[] = {
        -1.0f, 1.0f, 0.0f, 0.0f, 0.0f,  //left top
        -1.0f, -1.0f, 0.0f, 0.0f, 1.0f, //left bottom
        1.0f, 1.0f, 0.0f, 1.0f, 0.0f,   //right top
        1.0f, -1.0f, 0.0f, 1.0f, 1.0f,  //right bottom
    };

//    NSString *vertexPath = [[NSBundle mainBundle] pathForResource:@"frameDisplay" ofType:@"vs"];
//    NSString *fragmentPath = [[NSBundle mainBundle] pathForResource:@"frameDisplay" ofType:@"fs"];
    //_frameProgram = new TFOPGLProgram(std::string([vertexPath UTF8String]), std::string([fragmentPath UTF8String]));
    _frameProgram = new TFOPGLProgram(TFVideoDisplay_common_vs, TFVideoDisplay_yuv420_fs);

    glGenVertexArrays(1, &VAO);
    glBindVertexArray(VAO);

    glGenBuffers(1, &VBO);
    glBindBuffer(GL_ARRAY_BUFFER, VBO);
    glBufferData(GL_ARRAY_BUFFER, sizeof(vertices), vertices, GL_STATIC_DRAW);

    glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 5*sizeof(GLfloat), 0);
    glEnableVertexAttribArray(0);

    glVertexAttribPointer(1, 2, GL_FLOAT, GL_FALSE, 5*sizeof(GLfloat), (void*)(3*sizeof(GLfloat)));
    glEnableVertexAttribArray(1);

    glBindBuffer(GL_ARRAY_BUFFER, 0);
    glBindVertexArray(0);


    //gen textures
    glGenTextures(TFMAX_TEXTURE_COUNT, textures);
    for (int i = 0; i<TFMAX_TEXTURE_COUNT; i++) {
        glBindTexture(GL_TEXTURE_2D, textures[i]);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
    }
    _renderConfiged = YES;

    return YES;
}
  • vertices holds the vertex data for the four corners of the square; each vertex has five floats, the first three being the xyz position and the last two the texture coordinates (uv). xyz is in the range [-1, 1], uv in [0, 1].
  • Loading the shaders and compiling and linking the program are all done in the TFOPGLProgram class (a rough sketch of that sequence follows this list).
  • Then generate a VAO and a VBO and bind the vertex data.
  • Finally, several textures are generated; they hold no data yet and just reserve their places.
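The TFOPGLProgram class itself is not listed in this post; the compile-and-link sequence it presumably wraps is the standard one and, as a rough sketch (error handling trimmed), looks like this:

    // Sketch of compiling and linking a GL program, roughly what a helper
    // class like TFOPGLProgram wraps (error checks abbreviated).
    static GLuint compileShader(GLenum type, const GLchar *source) {
        GLuint shader = glCreateShader(type);
        glShaderSource(shader, 1, &source, NULL);
        glCompileShader(shader);
        GLint ok = 0;
        glGetShaderiv(shader, GL_COMPILE_STATUS, &ok);
        if (!ok) { NSLog(@"shader compile failed"); }
        return shader;
    }

    static GLuint linkProgram(const GLchar *vs, const GLchar *fs) {
        GLuint vertex = compileShader(GL_VERTEX_SHADER, vs);
        GLuint fragment = compileShader(GL_FRAGMENT_SHADER, fs);
        GLuint program = glCreateProgram();
        glAttachShader(program, vertex);
        glAttachShader(program, fragment);
        glLinkProgram(program);
        // The shader objects can be deleted once the program is linked.
        glDeleteShader(vertex);
        glDeleteShader(fragment);
        return program;
    }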

Draw

The shaders first:

const GLchar *TFVideoDisplay_common_vs ="               \n\
#version 300 es                                         \n\
                                                        \n\
layout (location = 0) in highp vec3 position;           \n\
layout (location = 1) in highp vec2 inTexcoord;         \n\
                                                        \n\
out highp vec2 texcoord;                                \n\
                                                        \n\
void main()                                             \n\
{                                                       \n\
gl_Position = vec4(position, 1.0);                      \n\
texcoord = inTexcoord;                                  \n\
}                                                       \n\
";
const GLchar *TFVideoDisplay_yuv420_fs ="               \n\
#version 300 es                                         \n\
precision highp float;                                  \n\
                                                        \n\
in vec2 texcoord;                                       \n\
out vec4 FragColor;                                     \n\
uniform lowp sampler2D yPlaneTex;                       \n\
uniform lowp sampler2D uPlaneTex;                       \n\
uniform lowp sampler2D vPlaneTex;                       \n\
                                                        \n\
void main()                                             \n\
{                                                       \n\
    // (1) y - 16 (2) rgb * 1.164                       \n\
    vec3 yuv;                                           \n\
    yuv.x = texture(yPlaneTex, texcoord).r;             \n\
    yuv.y = texture(uPlaneTex, texcoord).r - 0.5f;      \n\
    yuv.z = texture(vPlaneTex, texcoord).r - 0.5f;      \n\
                                                        \n\
    mat3 trans = mat3(1, 1 ,1,                          \n\
                      0, -0.34414, 1.772,               \n\
                      1.402, -0.71414, 0                \n\
                      );                                \n\
                                                        \n\
    FragColor = vec4(trans*yuv, 1.0);                   \n\
}                                                       \n\
";
  • The vertex shader outputs gl_Position and passes the texture coordinates on to the fragment shader.

  • The fragment shader is the focus, because the yuv-to-rgb conversion happens here.

  • Because yuv420p stores the three components in separate planes, loading the whole yuv buffer as one texture would make it awkward to fetch all three components with a single texture coordinate; every fragment would need its own index calculation.
    YyYYYYYY
    YYYYYYYY
    uUUUvVVV
    yuv420p looks like this (the lowercase letters mark the samples in question). Say you want the color at coordinate (2, 1): y sits at (2, 1), u at (1, 3) and v at (5, 3). And the ratio of height to width changes the layout:
    YyYYYYYY
    YYYYYYYY
    YyYYYYYY
    YYYYYYYY
    uUUUuUUU
    vVVVvVVV
    Now u and v are not even on the same line.

So we use a separate texture for each component. What's convenient is that all three can share the same texture coordinates:

    glBindTexture(GL_TEXTURE_2D, textures[0]);   // y plane: full size
    glTexImage2D(GL_TEXTURE_2D, 0, GL_LUMINANCE, width, height, 0, GL_LUMINANCE, GL_UNSIGNED_BYTE, overlay->pixels[0]);
    glGenerateMipmap(GL_TEXTURE_2D);

    glBindTexture(GL_TEXTURE_2D, textures[1]);   // u plane: half width, half height
    glTexImage2D(GL_TEXTURE_2D, 0, GL_LUMINANCE, width/2, height/2, 0, GL_LUMINANCE, GL_UNSIGNED_BYTE, overlay->pixels[1]);
    glGenerateMipmap(GL_TEXTURE_2D);

    glBindTexture(GL_TEXTURE_2D, textures[2]);   // v plane: half width, half height
    glTexImage2D(GL_TEXTURE_2D, 0, GL_LUMINANCE, width/2, height/2, 0, GL_LUMINANCE, GL_UNSIGNED_BYTE, overlay->pixels[2]);
    glGenerateMipmap(GL_TEXTURE_2D);
  • Three textures: the y texture is the same size as the image, while the u and v textures are half the width and half the height.
  • overlay is just a struct that wraps the video frame data; pixels[0], pixels[1] and pixels[2] are the start addresses of the y, u and v planes respectively.
  • One key point is that the texture format is GL_LUMINANCE, a single color channel. Following examples found online, I used GL_RED before, and that did not work.
  • Because texture coordinates are relative, mapped into [0, 1], the same coordinate [x, y] samples a point on the u and v textures that corresponds to the point at [2x, 2y] (in pixels) on the y texture, which is exactly what yuv420 needs: four y values share one set of uv. The small sketch below makes the mapping concrete.
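As a quick concrete check (a sketch only; the names are just for illustration), the same normalized coordinate lands on matching texels in all three planes:

    // Illustrative only: where a normalized coordinate (s, t) samples each
    // plane of a width x height frame.
    static void sampleIndices(float s, float t, int width, int height) {
        int yX  = (int)(s * width);        // texel column in the y plane
        int yY  = (int)(t * height);       // texel row in the y plane
        int uvX = (int)(s * (width / 2));  // ~ yX / 2 in the u and v planes
        int uvY = (int)(t * (height / 2)); // ~ yY / 2 in the u and v planes
        // One (u, v) texel therefore covers a 2x2 block of y texels,
        // exactly the 4:1 sharing that yuv420 defines.
        (void)yX; (void)yY; (void)uvX; (void)uvY;
    }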

Finally, we use the formula below to convert yuv into rgb (note that the GLSL mat3 constructor is column-major, so the matrix in the shader above encodes exactly these coefficients).

R = Y + 1.402 (Cr-128)
G = Y - 0.34414 (Cb-128) - 0.71414 (Cr-128)
B = Y + 1.772 (Cb-128)

Another thing to note here is the difference between YUV and YCbCr:
YCbCr is an offset version of YUV, which is why 0.5 is subtracted from the chroma values (128 maps to 0.5 in the 0-1 range). Of course, the exact formula also depends on how the video was encoded, i.e. how rgb was converted to yuv when it was shot; the conversion in the shader just has to match it.
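For reference, the same conversion done on the CPU for a single pixel, with the same coefficients as the fragment shader, would look roughly like this (a sketch, not code from the project):

    #include <math.h>
    #include <stdint.h>

    // Sketch: convert one yuv pixel (0-255 bytes) to rgb using the same
    // coefficients as the shader; here 128 plays the role of the 0.5 offset.
    static void yuvToRgb(uint8_t yIn, uint8_t uIn, uint8_t vIn,
                         uint8_t *r, uint8_t *g, uint8_t *b) {
        float y = yIn;
        float u = uIn - 128.0f;
        float v = vIn - 128.0f;
        float rf = y + 1.402f * v;
        float gf = y - 0.34414f * u - 0.71414f * v;
        float bf = y + 1.772f * u;
        *r = (uint8_t)fminf(fmaxf(rf, 0.0f), 255.0f);  // clamp to 0-255
        *g = (uint8_t)fminf(fmaxf(gf, 0.0f), 255.0f);
        *b = (uint8_t)fminf(fmaxf(bf, 0.0f), 255.0f);
    }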

Draw a square

    glBindFramebuffer(GL_FRAMEBUFFER, self.frameBuffer);
    glClear(GL_DEPTH_BUFFER_BIT | GL_COLOR_BUFFER_BIT);

    _frameProgram->use();

    _frameProgram->setTexture("yPlaneTex", GL_TEXTURE_2D, textures[0], 0);
    _frameProgram->setTexture("uPlaneTex", GL_TEXTURE_2D, textures[1], 1);
    _frameProgram->setTexture("vPlaneTex", GL_TEXTURE_2D, textures[2], 2);

    glBindVertexArray(VAO);

    glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);


    glBindRenderbuffer(GL_RENDERBUFFER, self.colorBuffer);
    [self.context presentRenderbuffer:GL_RENDERBUFFER];
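setTexture belongs to TFOPGLProgram, whose implementation is not listed here; what a helper like that presumably does is the standard sequence of activating a texture unit, binding the texture, and pointing the sampler uniform at that unit, roughly:

    // Sketch of what a setTexture helper presumably does (not the project's
    // actual implementation). Assumes the program is already in use.
    static void setTexture(GLuint program, const char *uniformName,
                           GLenum target, GLuint texture, GLint unit) {
        glActiveTexture(GL_TEXTURE0 + unit);
        glBindTexture(target, texture);
        glUniform1i(glGetUniformLocation(program, uniformName), unit);
    }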

Detail Processing

  • Watch for the app switching between foreground and background, and don't render while in the background:
[[NSNotificationCenter defaultCenter] addObserver:self selector:@selector(catchAppResignActive) name:UIApplicationWillResignActiveNotification object:nil];
[[NSNotificationCenter defaultCenter] addObserver:self selector:@selector(catchAppBecomeActive) name:UIApplicationDidBecomeActiveNotification object:nil];
......
-(void)catchAppResignActive{
    _appIsUnactive = YES;
}

-(void)catchAppBecomeActive{
    _appIsUnactive = NO;
}
.......
if (self.appIsUnactive) {
    return;    // Check before drawing; if the app is inactive, skip this frame
}
  • Move the drawing to a background thread
    On iOS all of these OpenGL ES operations can run on a background thread, including the final presentRenderbuffer. The key is that context construction, data preparation (VAO, textures, etc.) and rendering stay on one and the same thread. Multithreading is possible too, but it is unnecessary for video playback; sticking to one thread avoids needless overhead and means no locks are required. A minimal sketch of this single-render-thread setup follows this list.

  • Handling frame changes of the layer

-(void)layoutSubviews{
    [super layoutSubviews];

    //If context has setuped and layer's size has changed, realloc renderBuffer.
    if (self.context && !CGSizeEqualToSize(self.layer.frame.size, self.bufferSize)) {
        _needReallocRenderBuffer = YES;
    }
}
...........
if (_needReallocRenderBuffer) {
   [self reallocRenderBuffer];
   _needReallocRenderBuffer = NO;
}
.........
-(void)reallocRenderBuffer{
    glBindRenderbuffer(GL_RENDERBUFFER, _colorBuffer);

    [_context renderbufferStorage:GL_RENDERBUFFER fromDrawable:_renderLayer];
    glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_RENDERBUFFER, _colorBuffer);
    ......
}
  • After the frame changes, reallocate the renderbuffer's storage.
  • To keep everything on the same thread, the renderbuffer is not reallocated directly in layoutSubviews, which always runs on the main thread; a flag is set instead.
  • In the rendering method, check _needReallocRenderBuffer first and reallocate the renderbuffer if needed.
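As mentioned above, here is a minimal sketch of the single-render-thread idea (the queue name is just illustrative): funnel every GL call, from context setup to presentRenderbuffer, through one serial dispatch queue, so no locks are needed.

    // Sketch: one serial queue carries all GL work, so no locks are needed.
    dispatch_queue_t renderQueue =
        dispatch_queue_create("videoRenderQueue", DISPATCH_QUEUE_SERIAL);

    dispatch_async(renderQueue, ^{
        // one-time setup: context, framebuffer, VAO, textures ...
    });

    // Then, for every decoded frame:
    dispatch_async(renderQueue, ^{
        // upload the yuv planes, draw, presentRenderbuffer ...
    });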

Finally

The emphasis is on how the yuv components are read in the fragment shader:

  1. Use three textures, one per plane.
  2. Sample them with the same texture coordinates.
  3. Create the textures with GL_LUMINANCE, and make the u and v textures half the width and height of the y texture.

Topics: Fragment iOS Mobile encoding