3. Code framework
This section gives an overview of the ffplay.c code framework. Key issues and details are discussed in the following chapters.
3.1 flow chart
3.2 main thread
The main thread implements three functions: video playback (including audio-video synchronization), subtitle playback, and SDL message processing.
After performing the necessary initialization and creating the demultiplexing thread, the main thread enters the event_loop() main loop, which handles video playback and SDL message events:
main() -->
static void event_loop(VideoState *cur_stream)
{
    SDL_Event event;
    ......
    for (;;) {
        // If the SDL event queue is empty, video frames are played inside this call;
        // otherwise an event is taken from the head of the queue, the current function
        // returns, and the event is handled by the caller below
        refresh_loop_wait_event(cur_stream, &event);
        // SDL event handling
        switch (event.type) {
        case SDL_KEYDOWN:
            switch (event.key.keysym.sym) {
            case SDLK_f:        // f key: force refresh
                ......
                break;
            case SDLK_p:        // p key
            case SDLK_SPACE:    // space bar: pause
                ......
            case SDLK_s:        // s key: step frame by frame
                ......
                break;
            ......
            }
            ......
        }
    }
}
3.2.1 video playback
The main code is in the refresh_loop_wait_event() function, as follows:
static void refresh_loop_wait_event(VideoState *is, SDL_Event *event)
{
    double remaining_time = 0.0;
    SDL_PumpEvents();
    while (!SDL_PeepEvents(event, 1, SDL_GETEVENT, SDL_FIRSTEVENT, SDL_LASTEVENT)) {
        if (!cursor_hidden && av_gettime_relative() - cursor_last_shown > CURSOR_HIDE_DELAY) {
            SDL_ShowCursor(0);
            cursor_hidden = 1;
        }
        if (remaining_time > 0.0)
            av_usleep((int64_t)(remaining_time * 1000000.0));
        remaining_time = REFRESH_RATE;
        if (is->show_mode != SHOW_MODE_NONE && (!is->paused || is->force_refresh))
            // Display the current frame immediately, or display it after a delay of remaining_time
            video_refresh(is, &remaining_time);
        SDL_PumpEvents();
    }
}
The while() loop means that as long as the SDL event queue is empty, video frames are played inside the loop; otherwise an event is taken from the head of the queue, the current function returns, and the event is handled by the caller.
The key function called in refresh_loop_wait_event() is video_refresh(). video_refresh() implements audio-video synchronization and the display of video frames; it is one of the core functions in ffplay.c and is analyzed in detail in section 4.3, "video synchronization to audio".
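As a preview, the following minimal sketch condenses the core idea behind ffplay's compute_target_delay() helper, which is used on the video_refresh() path: compare the video clock with the master clock and stretch or shrink the inter-frame delay accordingly. This is a simplified sketch, not the real function (the frame-duplication branch is omitted); the AV_SYNC_* values are copied from ffplay.c.

#include <math.h>
#include "libavutil/common.h"            // FFMAX, FFMIN

#define AV_SYNC_THRESHOLD_MIN 0.04       // constants as defined in ffplay.c
#define AV_SYNC_THRESHOLD_MAX 0.1
#define AV_NOSYNC_THRESHOLD   10.0

// Simplified sketch of ffplay's delay computation.
// last_duration: nominal duration of the previous frame, in seconds.
static double compute_delay_sketch(double last_duration, double video_clock, double master_clock)
{
    double delay = last_duration;
    double diff  = video_clock - master_clock;   // >0: video ahead, <0: video behind
    double sync_threshold = FFMAX(AV_SYNC_THRESHOLD_MIN,
                                  FFMIN(AV_SYNC_THRESHOLD_MAX, delay));

    if (!isnan(diff) && fabs(diff) < AV_NOSYNC_THRESHOLD) {
        if (diff <= -sync_threshold)             // video lags: shorten the wait
            delay = FFMAX(0, delay + diff);
        else if (diff >= sync_threshold)         // video leads: lengthen the wait
            delay = 2 * delay;
    }
    return delay;   // video_refresh() waits roughly this long before showing the next frame
}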
3.2.2 SDL message processing
This handles the various SDL messages, such as pause and forced-refresh key events. The logic is straightforward.
main() -->
static void event_loop(VideoState *cur_stream)
{
    SDL_Event event;
    ......
    for (;;) {
        // If the SDL event queue is empty, video frames are played inside this call;
        // otherwise an event is taken from the head of the queue, the current function
        // returns, and the event is handled by the caller below
        refresh_loop_wait_event(cur_stream, &event);
        // SDL event handling
        switch (event.type) {
        case SDL_KEYDOWN:
            switch (event.key.keysym.sym) {
            case SDLK_f:        // f key: force refresh
                ......
                break;
            case SDLK_p:        // p key
            case SDLK_SPACE:    // space bar: pause
                ......
                break;
            ......
            }
            ......
        }
    }
}
3.3 demultiplexing thread
The demultiplexing thread reads the media file and stores the packets it retrieves in different packet queues according to their type (audio, video, subtitle).
To save space, non-key parts of the source code below are replaced with "......". Refer to the comments for the code flow.
/* this thread gets the stream from the disk or the network */
static int read_thread(void *arg)
{
    VideoState *is = arg;
    AVFormatContext *ic = NULL;
    int st_index[AVMEDIA_TYPE_NB];
    ......

    // Interrupt callback mechanism: provides a processing hook for the underlying
    // I/O layer, e.g. to abort blocking I/O operations.
    ic->interrupt_callback.callback = decode_interrupt_cb;
    ic->interrupt_callback.opaque = is;
    if (!av_dict_get(format_opts, "scan_all_pmts", NULL, AV_DICT_MATCH_CASE)) {
        av_dict_set(&format_opts, "scan_all_pmts", "1", AV_DICT_DONT_OVERWRITE);
        scan_all_pmts_set = 1;
    }
    // 1. Build the AVFormatContext
    // 1.1 Open the media file: read the file header and store the format information
    //     in the format context
    err = avformat_open_input(&ic, is->filename, is->iformat, &format_opts);
    ......
    if (find_stream_info) {
        ......
        // 1.2 Probe stream information: read some media data, try decoding it, and
        //     fill the stream information obtained into ic->streams.
        //     ic->streams is an array of pointers of size ic->nb_streams.
        err = avformat_find_stream_info(ic, opts);
        ......
    }
    ......

    // 2. Find the streams to be decoded
    // 2.1 Save the corresponding stream_index into the st_index[] array
    if (!video_disable)
        st_index[AVMEDIA_TYPE_VIDEO] =              // video stream
            av_find_best_stream(ic, AVMEDIA_TYPE_VIDEO,
                                st_index[AVMEDIA_TYPE_VIDEO], -1, NULL, 0);
    if (!audio_disable)
        st_index[AVMEDIA_TYPE_AUDIO] =              // audio stream
            av_find_best_stream(ic, AVMEDIA_TYPE_AUDIO,
                                st_index[AVMEDIA_TYPE_AUDIO],
                                st_index[AVMEDIA_TYPE_VIDEO],
                                NULL, 0);
    if (!video_disable && !subtitle_disable)
        st_index[AVMEDIA_TYPE_SUBTITLE] =           // subtitle stream
            av_find_best_stream(ic, AVMEDIA_TYPE_SUBTITLE,
                                st_index[AVMEDIA_TYPE_SUBTITLE],
                                (st_index[AVMEDIA_TYPE_AUDIO] >= 0 ?
                                 st_index[AVMEDIA_TYPE_AUDIO] :
                                 st_index[AVMEDIA_TYPE_VIDEO]),
                                NULL, 0);

    is->show_mode = show_mode;

    // 2.2 Get the relevant parameters from the stream to be processed and set the
    //     width, height and aspect ratio of the display window
    if (st_index[AVMEDIA_TYPE_VIDEO] >= 0) {
        AVStream *st = ic->streams[st_index[AVMEDIA_TYPE_VIDEO]];
        AVCodecParameters *codecpar = st->codecpar;
        // Guess the sample aspect ratio of the frame from the stream and frame aspect
        // ratios. The frame aspect ratio is set by the decoder, while the stream aspect
        // ratio is set by the demuxer, so the two may differ. This function returns the
        // aspect ratio that should be used to display the frame: the stream aspect ratio
        // takes precedence (provided its value is reasonable), then the frame aspect
        // ratio. This way the stream aspect ratio (a container setting, easy to modify)
        // can override the frame aspect ratio.
        AVRational sar = av_guess_sample_aspect_ratio(ic, st, NULL);
        if (codecpar->width)
            // Set the size and aspect ratio of the display window
            set_default_window_size(codecpar->width, codecpar->height, sar);
    }

    // 3. Create the decoding thread for each stream
    /* open the streams */
    if (st_index[AVMEDIA_TYPE_AUDIO] >= 0) {
        // 3.1 Create the audio decoding thread
        stream_component_open(is, st_index[AVMEDIA_TYPE_AUDIO]);
    }

    ret = -1;
    if (st_index[AVMEDIA_TYPE_VIDEO] >= 0) {
        // 3.2 Create the video decoding thread
        ret = stream_component_open(is, st_index[AVMEDIA_TYPE_VIDEO]);
    }
    if (is->show_mode == SHOW_MODE_NONE)
        is->show_mode = ret >= 0 ? SHOW_MODE_VIDEO : SHOW_MODE_RDFT;

    if (st_index[AVMEDIA_TYPE_SUBTITLE] >= 0) {
        // 3.3 Create the subtitle decoding thread
        stream_component_open(is, st_index[AVMEDIA_TYPE_SUBTITLE]);
    }
    ......

    // 4. Demultiplexing
    for (;;) {
        // stop
        ......
        // pause/resume
        ......
        // seek operation
        ......

        // 4.1 Read a packet from the input file
        ret = av_read_frame(ic, pkt);
        if (ret < 0) {
            if ((ret == AVERROR_EOF || avio_feof(ic->pb)) && !is->eof) {
                // When the input file has been fully read, send a NULL packet to each
                // packet queue to flush the decoder; otherwise the frames cached inside
                // the decoder could not be retrieved.
                if (is->video_stream >= 0)
                    packet_queue_put_nullpacket(&is->videoq, is->video_stream);
                if (is->audio_stream >= 0)
                    packet_queue_put_nullpacket(&is->audioq, is->audio_stream);
                if (is->subtitle_stream >= 0)
                    packet_queue_put_nullpacket(&is->subtitleq, is->subtitle_stream);
                is->eof = 1;
            }
            if (ic->pb && ic->pb->error)    // on error, exit the current thread
                break;
            SDL_LockMutex(wait_mutex);
            SDL_CondWaitTimeout(is->continue_read_thread, wait_mutex, 10);
            SDL_UnlockMutex(wait_mutex);
            continue;
        } else {
            is->eof = 0;
        }

        // 4.2 Check whether the current packet is within the playback range: if so,
        //     queue it, otherwise discard it
        /* check if packet is in play range specified by user, then queue, otherwise discard */
        stream_start_time = ic->streams[pkt->stream_index]->start_time; // pts of the first displayed frame
        pkt_ts = pkt->pts == AV_NOPTS_VALUE ? pkt->dts : pkt->pts;
        // The long expression after "||" simplifies to:
        //   [pkt_ts] - [stream_start_time] - [start_time] <= [duration]
        //   [current frame pts] - [first frame pts] - [pts of the first frame
        //   (seek start point) of the current playback sequence] <= [duration]
        pkt_in_play_range = duration == AV_NOPTS_VALUE ||
                (pkt_ts - (stream_start_time != AV_NOPTS_VALUE ? stream_start_time : 0)) *
                av_q2d(ic->streams[pkt->stream_index]->time_base) -
                (double)(start_time != AV_NOPTS_VALUE ? start_time : 0) / 1000000
                <= ((double)duration / 1000000);
        // 4.3 Put the current packet into the packet queue matching its type
        //     (audio, video, subtitle)
        if (pkt->stream_index == is->audio_stream && pkt_in_play_range) {
            packet_queue_put(&is->audioq, pkt);
        } else if (pkt->stream_index == is->video_stream && pkt_in_play_range
                   && !(is->video_st->disposition & AV_DISPOSITION_ATTACHED_PIC)) {
            packet_queue_put(&is->videoq, pkt);
        } else if (pkt->stream_index == is->subtitle_stream && pkt_in_play_range) {
            packet_queue_put(&is->subtitleq, pkt);
        } else {
            av_packet_unref(pkt);
        }
    }

    ret = 0;
fail:
    ......

    return 0;
}
The demultiplexing thread implements the following functions:
[1]. Create the audio, video and subtitle decoding threads
[2]. Read packets from the input file and put them into different packet queues according to their type (audio, video, subtitle); a minimal sketch of such a queue follows this list
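The packet queues are classic producer/consumer structures. The sketch below shows the pattern using SDL primitives, as ffplay itself does; the names here (PktQueue, pkt_queue_put(), pkt_queue_get()) are illustrative, and ffplay's real PacketQueue additionally tracks byte size, duration and a "serial" number used for seeking.

#include <SDL.h>
#include "libavcodec/avcodec.h"

typedef struct PktNode {
    AVPacket *pkt;
    struct PktNode *next;
} PktNode;

typedef struct PktQueue {
    PktNode *first, *last;
    int nb_packets;
    SDL_mutex *mutex;
    SDL_cond *cond;
} PktQueue;

static int pkt_queue_put(PktQueue *q, AVPacket *pkt)    // called by the demux thread
{
    PktNode *node = av_malloc(sizeof(*node));
    if (!node)
        return -1;
    node->pkt = av_packet_alloc();
    if (!node->pkt) {
        av_free(node);
        return -1;
    }
    av_packet_move_ref(node->pkt, pkt);                 // take ownership of the packet data
    node->next = NULL;

    SDL_LockMutex(q->mutex);
    if (!q->last)
        q->first = node;
    else
        q->last->next = node;
    q->last = node;
    q->nb_packets++;
    SDL_CondSignal(q->cond);                            // wake up a waiting decoder thread
    SDL_UnlockMutex(q->mutex);
    return 0;
}

static void pkt_queue_get(PktQueue *q, AVPacket *pkt)   // called by a decoder thread, blocking
{
    SDL_LockMutex(q->mutex);
    while (!q->first)
        SDL_CondWait(q->cond, q->mutex);                // sleep until a packet arrives
    PktNode *node = q->first;
    q->first = node->next;
    if (!q->first)
        q->last = NULL;
    q->nb_packets--;
    av_packet_move_ref(pkt, node->pkt);
    av_packet_free(&node->pkt);
    av_free(node);
    SDL_UnlockMutex(q->mutex);
}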
3.4 video decoding thread
The video decoding thread takes packets from the video packet queue, decodes them, and stores the resulting frames in the video frame queue.
3.4.1 video_thread()
The video decoding thread puts the decoded frames into the frame queue. To save space, the filter-related code has been removed from the listing below.
// Video decoding thread: take data from the video packet queue, decode it and
// put the frames into the video frame queue
static int video_thread(void *arg)
{
    VideoState *is = arg;
    AVFrame *frame = av_frame_alloc();
    double pts;
    double duration;
    int ret;
    AVRational tb = is->video_st->time_base;
    AVRational frame_rate = av_guess_frame_rate(is->ic, is->video_st, NULL);

    if (!frame) {
        return AVERROR(ENOMEM);
    }

    for (;;) {
        ret = get_video_frame(is, frame);
        if (ret < 0)
            goto the_end;
        if (!ret)
            continue;

        // Playback duration of the current frame
        duration = (frame_rate.num && frame_rate.den ? av_q2d((AVRational){frame_rate.den, frame_rate.num}) : 0);
        // Display timestamp of the current frame
        pts = (frame->pts == AV_NOPTS_VALUE) ? NAN : frame->pts * av_q2d(tb);
        // Push the current frame into the frame queue
        ret = queue_picture(is, frame, pts, duration, frame->pkt_pos, is->viddec.pkt_serial);
        av_frame_unref(frame);

        if (ret < 0)
            goto the_end;
    }
the_end:
    av_frame_free(&frame);
    return 0;
}
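To make the pts and duration computations concrete, here is the arithmetic with illustrative numbers (not taken from a real stream):

// Illustrative numbers only (not from a real stream):
AVRational frame_rate = {25, 1};     // a 25 fps stream
AVRational tb         = {1, 90000};  // a typical 90 kHz video time base
// duration = 1/25 = 0.04 s per frame:
double duration = av_q2d((AVRational){frame_rate.den, frame_rate.num});
// a frame carrying frame->pts = 180000 is displayed 180000/90000 = 2.0 s into the stream:
double pts = 180000 * av_q2d(tb);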
3.4.2 get_video_frame()
Take a packet from the packet queue, decode it to get a frame, and decide whether to drop out-of-sync video frames according to the framedrop mechanism. See the comments in the source code:
static int get_video_frame(VideoState *is, AVFrame *frame)
{
    int got_picture;

    if ((got_picture = decoder_decode_frame(&is->viddec, frame, NULL)) < 0)
        return -1;

    if (got_picture) {
        double dpts = NAN;

        if (frame->pts != AV_NOPTS_VALUE)
            dpts = av_q2d(is->video_st->time_base) * frame->pts;

        frame->sample_aspect_ratio = av_guess_sample_aspect_ratio(is->ic, is->video_st, frame);

        // The ffplay documentation describes the "-framedrop" option as follows:
        //   Drop video frames if video is out of sync. Enabled by default if the master
        //   clock is not set to video. Use this option to enable frame dropping for all
        //   master clock sources, use -noframedrop to disable it.
        // The "-framedrop" option controls whether out-of-sync video frames are dropped;
        // it changes the value of the tri-state variable framedrop.
        // There are three synchronization modes: A. sync to video, B. sync to audio,
        // C. sync to the external clock.
        // 1) If neither "-framedrop" nor "-noframedrop" is given on the command line,
        //    framedrop keeps its default value -1: out-of-sync video frames are dropped
        //    unless the synchronization mode is "sync to video".
        // 2) With "-framedrop", framedrop is 1: out-of-sync video frames are dropped
        //    regardless of the synchronization mode.
        // 3) With "-noframedrop", framedrop is 0: out-of-sync video frames are never
        //    dropped, regardless of the synchronization mode.
        if (framedrop > 0 || (framedrop && get_master_sync_type(is) != AV_SYNC_VIDEO_MASTER)) {
            if (frame->pts != AV_NOPTS_VALUE) {
                double diff = dpts - get_master_clock(is);
                if (!isnan(diff) && fabs(diff) < AV_NOSYNC_THRESHOLD &&
                    diff - is->frame_last_filter_delay < 0 &&
                    is->viddec.pkt_serial == is->vidclk.serial &&
                    is->videoq.nb_packets) {
                    is->frame_drops_early++;
                    av_frame_unref(frame);      // the frame is out of sync: drop it
                    got_picture = 0;
                }
            }
        }
    }

    return got_picture;
}
ffplay has two kinds of frame-drop handling. One happens here, before the decoded frame has been stored in the frame queue, and is counted by is->frame_drops_early++; the other happens when a frame is read from the frame queue for display, and is counted by is->frame_drops_late++.
The variable is->frame_last_filter_delay involved in the framedrop check here is related to filter operation. Filters are disabled by default in ffplay, so filter-related operations are not considered in this article.
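The tri-state framedrop condition used in get_video_frame() can be restated as a hypothetical helper. framedrop_enabled() is not part of ffplay; it only unfolds the expression framedrop > 0 || (framedrop && get_master_sync_type(is) != AV_SYNC_VIDEO_MASTER):

// Hypothetical helper restating the framedrop condition
// (framedrop is -1 by default, 1 with -framedrop, 0 with -noframedrop).
static int framedrop_enabled(int framedrop, int master_sync_type)
{
    if (framedrop > 0)          // -framedrop: always drop out-of-sync frames
        return 1;
    if (framedrop < 0)          // default: drop unless the master clock is the video clock
        return master_sync_type != AV_SYNC_VIDEO_MASTER;
    return 0;                   // -noframedrop: never drop
}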
3.4.3 decoder_decode_frame()
This is a core function: it decodes both video frames and audio frames. In the video decoding thread, the actual decoding of video frames takes place in this function. Refer to section 3.2 for the analysis; a stripped-down sketch of the underlying pattern follows.
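Conceptually, decoder_decode_frame() is built around FFmpeg's send/receive decoding API. The sketch below shows only that skeleton; the real function also handles packet serials, flushing after a seek, and subtitle decoding. decode_one_frame_sketch() and the PktQueue type (from the queue sketch in section 3.3) are illustrative names.

#include "libavcodec/avcodec.h"

// Stripped-down sketch of the send/receive decode loop.
// Returns 1 when a frame was decoded, 0 at end of stream, <0 on error.
static int decode_one_frame_sketch(AVCodecContext *avctx, PktQueue *q, AVFrame *frame)
{
    for (;;) {
        int ret = avcodec_receive_frame(avctx, frame);
        if (ret >= 0)
            return 1;                       // got a decoded frame
        if (ret == AVERROR_EOF)
            return 0;                       // decoder fully flushed
        if (ret != AVERROR(EAGAIN))
            return ret;                     // a real decoding error

        // The decoder needs more input: fetch the next packet from the packet queue.
        // A null packet (data == NULL, size == 0) acts as a flush packet here.
        AVPacket pkt;
        pkt_queue_get(q, &pkt);             // blocking get (see the queue sketch in 3.3)
        ret = avcodec_send_packet(avctx, &pkt);
        av_packet_unref(&pkt);
        if (ret < 0)
            return ret;
    }
}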
3.5 audio decoding thread
The audio decoding thread takes packets from the audio packet queue, decodes them, and stores the resulting frames in the audio frame queue.
3.5.1 opening the audio device
The audio device is actually opened in the demultiplexing thread: it first opens the audio device (registering the audio callback function that the SDL audio playback thread will invoke), then creates the audio decoding thread. The call chain is as follows:
main() -->
stream_open() -->
read_thread() -->
stream_component_open() -->
    audio_open(is, channel_layout, nb_channels, sample_rate, &is->audio_tgt);
    decoder_start(&is->auddec, audio_thread, is);
The audio_open() function is passed the desired audio parameters. After the audio device has been opened, the actual audio parameters are stored in the output parameter is->audio_tgt, which the audio playback thread uses later.
The audio format parameters are closely related to resampling. The detailed implementation of audio_open() is described in section 5 below.
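The "desired vs. actual" negotiation can be sketched with SDL directly, as below. This is a minimal sketch, not ffplay's full audio_open(); the wanted_spec values and the open_audio_sketch() name are illustrative assumptions.

#include <SDL.h>

// Minimal sketch: request a format, let SDL adjust it, and report what was obtained.
static SDL_AudioDeviceID open_audio_sketch(int freq, Uint8 channels,
                                           SDL_AudioSpec *obtained,
                                           SDL_AudioCallback cb, void *opaque)
{
    SDL_AudioSpec wanted_spec;
    SDL_zero(wanted_spec);
    wanted_spec.freq     = freq;            // desired sample rate
    wanted_spec.format   = AUDIO_S16SYS;    // signed 16-bit samples, native byte order
    wanted_spec.channels = channels;
    wanted_spec.samples  = 1024;            // callback granularity, in sample frames
    wanted_spec.callback = cb;              // e.g. sdl_audio_callback
    wanted_spec.userdata = opaque;          // passed back as the callback's opaque argument

    // SDL may adjust freq/channels; the actual values come back in *obtained.
    // These obtained values play the role of is->audio_tgt: they are what
    // resampling must target.
    return SDL_OpenAudioDevice(NULL, 0, &wanted_spec, obtained,
                               SDL_AUDIO_ALLOW_FREQUENCY_CHANGE |
                               SDL_AUDIO_ALLOW_CHANNELS_CHANGE);
}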
3.5.2 audio_thread()
Take data from the audio packet queue, decode it, and put the frames into the audio frame queue:
// Audio decoding thread: take data from the audio packet queue, decode it and
// put the frames into the audio frame queue
static int audio_thread(void *arg)
{
    VideoState *is = arg;
    AVFrame *frame = av_frame_alloc();
    Frame *af;
    int got_frame = 0;
    AVRational tb;
    int ret = 0;

    if (!frame)
        return AVERROR(ENOMEM);

    do {
        if ((got_frame = decoder_decode_frame(&is->auddec, frame, NULL)) < 0)
            goto the_end;

        if (got_frame) {
            tb = (AVRational){1, frame->sample_rate};

            if (!(af = frame_queue_peek_writable(&is->sampq)))
                goto the_end;

            af->pts = (frame->pts == AV_NOPTS_VALUE) ? NAN : frame->pts * av_q2d(tb);
            af->pos = frame->pkt_pos;
            af->serial = is->auddec.pkt_serial;
            // Playback duration of the current frame:
            // (per-channel) sample count / sample rate
            af->duration = av_q2d((AVRational){frame->nb_samples, frame->sample_rate});

            // Move the frame data into af->frame; af->frame points to the tail
            // of the audio frame queue
            av_frame_move_ref(af->frame, frame);
            // Update the frame queue size and write index
            frame_queue_push(&is->sampq);
        }
    } while (ret >= 0 || ret == AVERROR(EAGAIN) || ret == AVERROR_EOF);
the_end:
    av_frame_free(&frame);
    return ret;
}
3.5.3 decoder_decode_frame()
This function decodes both audio and video frames. Refer to section 3.2 for its analysis.
3.6 audio playback thread
The audio playback thread is a thread created internally by SDL; it invokes the callback function provided by the user.
The callback function is specified when calling SDL_OpenAudio(). Pausing and resuming the callback is controlled with SDL_PauseAudio().
3.6.1 sdl_audio_callback()
The audio callback function is as follows:
// Audio playback callback function: reads the audio frame queue and fills the SDL
// audio buffer. SDL calls this function when it needs more data. It does not run in
// the user's main thread, so shared data must be protected.
// \param[in]  opaque parameter given by the user when registering the callback
// \param[out] stream address of the audio data buffer; the decoded audio data is written here
// \param[out] len    size of the audio data buffer, in bytes
// After the callback returns, the buffer pointed to by stream becomes invalid.
// For two channels, the sample order is LRLRLR...
/* prepare a new audio buffer */
static void sdl_audio_callback(void *opaque, Uint8 *stream, int len)
{
    VideoState *is = opaque;
    int audio_size, len1;

    audio_callback_time = av_gettime_relative();

    // The input parameter len equals is->audio_hw_buf_size, the SDL audio buffer
    // size requested in audio_open()
    while (len > 0) {
        if (is->audio_buf_index >= is->audio_buf_size) {
            // 1. Take a frame from the audio frame queue and convert it to the format
            //    supported by the audio device. The return value is the size of the
            //    resampled audio frame.
            audio_size = audio_decode_frame(is);
            if (audio_size < 0) {
                /* if error, just output silence */
                is->audio_buf = NULL;
                is->audio_buf_size = SDL_AUDIO_MIN_BUFFER_SIZE / is->audio_tgt.frame_size * is->audio_tgt.frame_size;
            } else {
                if (is->show_mode != SHOW_MODE_VIDEO)
                    update_sample_display(is, (int16_t *)is->audio_buf, audio_size);
                is->audio_buf_size = audio_size;
            }
            is->audio_buf_index = 0;
        }
        // is->audio_buf_index exists because one frame of audio data may exceed the
        // size of the SDL audio buffer, so one frame may have to be copied in several
        // passes. is->audio_buf_index is the position in the resampled frame up to
        // which data has already been copied into the SDL audio buffer; len1 is the
        // amount of data copied in this pass.
        len1 = is->audio_buf_size - is->audio_buf_index;
        if (len1 > len)
            len1 = len;
        // 2. Copy the converted audio data into the audio buffer stream. Actually
        //    playing it is the job of the audio device driver.
        if (!is->muted && is->audio_buf && is->audio_volume == SDL_MIX_MAXVOLUME)
            memcpy(stream, (uint8_t *)is->audio_buf + is->audio_buf_index, len1);
        else {
            memset(stream, 0, len1);
            if (!is->muted && is->audio_buf)
                SDL_MixAudioFormat(stream, (uint8_t *)is->audio_buf + is->audio_buf_index, AUDIO_S16SYS, len1, is->audio_volume);
        }

        len -= len1;
        stream += len1;
        is->audio_buf_index += len1;
    }
    // is->audio_write_buf_size is the amount of data in this frame not yet copied
    // into the SDL audio buffer
    is->audio_write_buf_size = is->audio_buf_size - is->audio_buf_index;
    /* Let's assume the audio driver that is used by SDL has two periods. */
    // 3. Update the audio clock. This is done each time after data has been copied
    //    into the sound card buffer. audio_decode_frame() updated is->audio_clock per
    //    audio frame, so the second argument here subtracts the time represented by
    //    the data that has not yet been played.
    if (!isnan(is->audio_clock)) {
        set_clock_at(&is->audclk,
                     is->audio_clock - (double)(2 * is->audio_hw_buf_size + is->audio_write_buf_size) / is->audio_tgt.bytes_per_sec,
                     is->audio_clock_serial,
                     audio_callback_time / 1000000.0);
        // Sync the external clock to the audio clock
        sync_clock_to_slave(&is->extclk, &is->audclk);
    }
}
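To make the clock adjustment concrete, here is the arithmetic with illustrative numbers (not taken from a real run). Assuming 44100 Hz stereo S16 output, an 8192-byte SDL hardware buffer and 4096 bytes not yet copied:

bytes_per_sec  = 44100 Hz * 2 channels * 2 bytes = 176400 bytes/s
unplayed data  = 2 * audio_hw_buf_size + audio_write_buf_size
               = 2 * 8192 + 4096 = 20480 bytes
unplayed time  = 20480 / 176400 ≈ 0.116 s

The audio clock is therefore set about 0.116 s behind is->audio_clock, because that much already-decoded data is still queued ahead of the speaker (counting the two driver periods the comment assumes).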
3.6.2 audio_decode_frame()
audio_decode_frame() mainly performs audio resampling. It takes a frame from the audio frame queue; that frame is in the audio format of the input file, which the audio device does not necessarily support, so the frame must be converted to a format the audio device supports.
The implementation of audio_decode_frame() is described in section 5 later; a minimal resampling sketch follows.
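As a preview, the sketch below converts one decoded frame to a fixed S16 device format with libswresample, the library ffplay uses. It is a simplified sketch under stated assumptions (the classic int64 channel-layout API, no drift compensation); resample_sketch() is an illustrative name, and ffplay's real code additionally adjusts the output sample count for audio clock compensation.

#include "libavutil/frame.h"
#include "libswresample/swresample.h"

// Minimal sketch: convert one decoded frame to S16 at the device sample rate.
// Error paths and drift compensation are omitted.
static int resample_sketch(SwrContext **swr, AVFrame *in,
                           int64_t out_ch_layout, int out_rate,
                           uint8_t *out_buf, int out_buf_samples)
{
    if (!*swr) {
        *swr = swr_alloc_set_opts(NULL,
                                  out_ch_layout, AV_SAMPLE_FMT_S16, out_rate,   // device side
                                  in->channel_layout,                           // frame side
                                  (enum AVSampleFormat)in->format,
                                  in->sample_rate,
                                  0, NULL);
        if (!*swr || swr_init(*swr) < 0)
            return -1;
    }
    // Returns the number of (per-channel) samples written to out_buf
    return swr_convert(*swr, &out_buf, out_buf_samples,
                       (const uint8_t **)in->extended_data, in->nb_samples);
}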
3.7 subtitle decoding thread
Implementation details are omitted here; they may be supplemented later after a closer study of subtitles.