[FFmpeg in Practice] The Video Decoding Workflow

Original link: https://blog.csdn.net/weekend_y45/article/details/125168344

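I. FFmpeg structures used in the decoding workflow

1. The AVFormatContext structure

This structure describes the composition and basic information of a media file or media stream. It is used from start to finish, and many API calls take it as an argument. It is also the structure FFmpeg uses for demuxing container formats (flv, avi, mp4).

Its main fields, described mainly from the decoding point of view: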

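struct AVInputFormat *iformat; // Input container format. Demuxing only; set by avformat_open_input().
struct AVOutputFormat *oformat; // Output container format. Muxing only; set by the caller before avformat_write_header().
AVIOContext *pb; // I/O context.
// Demuxing: either set by the user before avformat_open_input() (the user must then close it manually) or set by avformat_open_input().
// Muxing: set by the user before avformat_write_header(). The caller is responsible for closing/freeing the I/O context.

unsigned int nb_streams; // Number of elements in AVFormatContext.streams.
AVStream **streams; // List of all streams in the file.
char filename[1024]; // Input or output file name (newer FFmpeg versions replace this field with char *url).

int64_t start_time; // Position of the first frame, in AV_TIME_BASE units.
int64_t duration; // Duration of the stream, in AV_TIME_BASE units.
int64_t bit_rate; // Total stream bit rate in bit/s; 0 if not available.
int64_t probesize;
// Maximum amount of data read from the input to determine the container format.
// Demuxing only; set by the caller before avformat_open_input().
AVDictionary *metadata; // Metadata.
AVCodec *video_codec; // Video codec.
AVCodec *audio_codec; // Audio codec.
AVCodec *subtitle_codec; // Subtitle codec.
AVCodec *data_codec; // Data codec.
int (*io_open)(struct AVFormatContext *s, AVIOContext **pb, const char *url, int flags, AVDictionary **options);
// Callback that opens an I/O stream.
void (*io_close)(struct AVFormatContext *s, AVIOContext *pb);
// Callback that closes a stream opened with AVFormatContext.io_open().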
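
To make the units concrete: start_time and duration are expressed in AV_TIME_BASE units, which FFmpeg defines as 1,000,000 per second. The sketch below shows one way to turn such a value into a readable HH:MM:SS string. It avoids FFmpeg headers so that it compiles on its own; format_duration and SKETCH_TIME_BASE are hypothetical names introduced here, and the code assumes the duration is known (FFmpeg reports an unknown duration as AV_NOPTS_VALUE, which this helper rejects).

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Same value as FFmpeg's AV_TIME_BASE (units per second); redefined
 * here so the sketch compiles without FFmpeg headers. */
#define SKETCH_TIME_BASE 1000000

/* Hypothetical helper: format a duration given in AV_TIME_BASE units
 * as "HH:MM:SS". Returns the value reported by snprintf(). */
int format_duration(int64_t duration, char *out, size_t out_size)
{
    /* A negative value (such as AV_NOPTS_VALUE) is not a usable duration. */
    assert(duration >= 0 && out != NULL && out_size >= 9);

    int64_t total_seconds = duration / SKETCH_TIME_BASE;
    int64_t hours   = total_seconds / 3600;
    int64_t minutes = (total_seconds % 3600) / 60;
    int64_t seconds = total_seconds % 60;

    return snprintf(out, out_size,
                    "%02" PRId64 ":%02" PRId64 ":%02" PRId64,
                    hours, minutes, seconds);
}
```

With a real context this could be called as format_duration(fmt_ctx->duration, text, sizeof text) once the input has been opened and probed.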

You can either allocate the context with avformat_alloc_context() first, or call avformat_open_input() directly.

// Method 1: allocate the context explicitly, then open the input
AVFormatContext *fmt_ctx = NULL;
std::string filename = "test.avi";
fmt_ctx = avformat_alloc_context();
if (avformat_open_input(&fmt_ctx, filename.c_str(), NULL, NULL) == 0)
    avformat_close_input(&fmt_ctx);

// Method 2: pass a NULL context and let avformat_open_input() allocate it
AVFormatContext *fmt_ctx = NULL;
std::string filename = "test.avi";
int ret = avformat_open_input(&fmt_ctx, filename.c_str(), NULL, NULL);
if (ret == 0)
    avformat_close_input(&fmt_ctx);

Method 2 is recommended: when the fmt_ctx passed to avformat_open_input() is NULL, the function calls avformat_alloc_context() internally. Correspondingly, avformat_close_input() calls avformat_free_context() internally.

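2. AVCodec

FFmpeg describes each decoder and encoder with an AVCodec structure, which holds the codec's configuration information.

For decoding it can be used as follows.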

  // Look up the H.264 decoder by codec ID
  AVCodec *Vcodec = NULL;
  Vcodec = avcodec_find_decoder(AV_CODEC_ID_H264);

  // Or look up a specific decoder directly by its name
  Vcodec = avcodec_find_decoder_by_name("h264_mediacodec");

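3. AVCodecContext

This structure stores the codec context and holds the many parameters that encoding and decoding need. For decoding, these parameters should be initialized from the stream with avcodec_parameters_to_context(); without that step, decoding fails for video in some container formats. Many of its fields are used only when encoding and are not needed for decoding.

Some of the main members: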

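enum AVMediaType codec_type: the type of codec (video, audio, ...)

struct AVCodec *codec: the codec in use (H.264, MPEG-2, ...)

int64_t bit_rate: average bit rate (declared as int in older releases)

uint8_t *extradata; int extradata_size: extra codec-specific data (for an H.264 decoder, for example, it holds the SPS, PPS, etc.)

AVRational time_base: the time base used to convert a PTS into real time (in seconds)

int width, height: width and height, for video

int refs: number of reference frames for motion estimation (H.264 can use multiple reference frames; codecs such as MPEG-2 generally cannot)

int sample_rate: sample rate (audio)

int channels: number of channels (audio)

enum AVSampleFormat sample_fmt: sample format

int profile: profile (H.264 defines profiles; other coding standards generally do as well)

int level: level (similar in spirit to profile)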

Usage:

AVCodec *Vcodec = NULL;
Vcodec = avcodec_find_decoder(AV_CODEC_ID_H264);
AVCodecContext *AvContext = NULL;
AvContext = avcodec_alloc_context3(Vcodec);
// fmt_ctx is an opened AVFormatContext; mVideoStreamIdx is the index of its video stream
avcodec_parameters_to_context(AvContext,
                              fmt_ctx->streams[mVideoStreamIdx]->codecpar);

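4. AVStream

This structure describes one media stream. Most of its fields can be filled in by avformat_open_input() from the file header; anything still missing has to be obtained by calling avformat_find_stream_info() (named av_find_stream_info() in old FFmpeg versions).

avformat_find_stream_info() reads a portion of the audio and video data to obtain information about the file, such as the coded width and height and the duration. For files without header information (such as MPEG files) calling it is mandatory. Be aware that the call can introduce considerable latency.

Main members: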

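index/id: index is the stream's index. It is generated automatically, and the stream can be looked up in AVFormatContext::streams by this index. id is the stream's identifier and depends on the container format; for MPEG-TS, for example, id is the PID.

time_base: the stream's time base, a rational number (AVRational). The pts and dts of the stream's data are expressed in units of this time base. av_rescale/av_rescale_q are normally used to convert between time bases.

start_time: the start time of the stream, in units of the stream's time base; usually the pts of the first frame in the stream.

duration: the total duration of the stream, in units of the stream's time base.

need_parsing: controls how this stream is parsed.

nb_frames: number of frames in the stream.

avg_frame_rate: the average frame rate.

codec: pointer to the AVCodecContext for this stream, created when avformat_open_input() is called. This field is deprecated and was removed in FFmpeg 5.0; use codecpar together with avcodec_parameters_to_context() instead.

parser: pointer to the AVCodecParserContext for this stream, created when avformat_find_stream_info() is called.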
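
The time_base arithmetic can be illustrated with a small standalone sketch. Rational is a stand-in for AVRational, and ts_to_seconds and rescale_ts are hypothetical helpers written for this example; in real code av_q2d() and av_rescale_q() do this job, and av_rescale_q() additionally rounds to nearest and protects against overflow, which this simplified version does not.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for FFmpeg's AVRational (numerator/denominator pair). */
typedef struct {
    int num;
    int den;
} Rational;

/* Convert a timestamp expressed in time base `tb` to seconds. */
double ts_to_seconds(int64_t ts, Rational tb)
{
    assert(tb.den != 0);
    return (double)ts * tb.num / tb.den;
}

/* Convert a timestamp from time base `src` to time base `dst`.
 * Simplified: plain integer division (truncates toward zero) and
 * no overflow protection. */
int64_t rescale_ts(int64_t ts, Rational src, Rational dst)
{
    assert(src.den != 0 && dst.num != 0);
    return ts * src.num * dst.den / ((int64_t)src.den * dst.num);
}
```

For example, with the 1/90000 time base commonly used by MPEG-TS, a pts of 180000 corresponds to 2 seconds, and rescaling 90000 ticks to a 1/1000 (millisecond) time base gives 1000.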

5. AVIOContext

The structure FFmpeg uses to manage input and output data.

Main members:

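unsigned char *buffer: start of the cache

int buffer_size: size of the cache (32768 by default)

unsigned char *buf_ptr: the current read position within the cache

unsigned char *buf_end: end of the cached data

void *opaque: points to a URLContext structure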

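When decoding, buffer holds the data FFmpeg has read in. When a video file is opened, for example, the data is first read from disk into buffer and then handed on to the decoder.

The URLContext structure contains a URLProtocol. Each protocol (rtp, rtmp, file, udp, etc.) has a corresponding URLProtocol.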
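
How buffer, buf_ptr and buf_end cooperate can be shown with a toy model. The sketch below is not FFmpeg's implementation: ToyIOContext, MemSource, mem_read and toy_io_read are hypothetical names, the data source is an in-memory array rather than a URLProtocol, and end of input is signalled by a return value of 0 instead of an FFmpeg error code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model of a buffered input context; the field names mirror the
 * AVIOContext members listed above, but this is not FFmpeg's struct. */
typedef struct {
    unsigned char *buffer;   /* start of the cache */
    int buffer_size;         /* cache capacity in bytes */
    unsigned char *buf_ptr;  /* next byte to hand to the caller */
    unsigned char *buf_end;  /* one past the last valid cached byte */
    void *opaque;            /* user data passed to read_packet */
    int (*read_packet)(void *opaque, unsigned char *buf, int buf_size);
} ToyIOContext;

/* Hypothetical in-memory "file" used as the data source. */
typedef struct {
    const unsigned char *data;
    size_t size;
    size_t pos;
} MemSource;

/* Source callback: copy up to buf_size bytes; 0 means end of input. */
int mem_read(void *opaque, unsigned char *buf, int buf_size)
{
    MemSource *src = (MemSource *)opaque;
    size_t left = src->size - src->pos;
    size_t n = left < (size_t)buf_size ? left : (size_t)buf_size;
    memcpy(buf, src->data + src->pos, n);
    src->pos += n;
    return (int)n;
}

/* Read up to `size` bytes, refilling the cache from the source
 * whenever it runs dry. Returns the number of bytes copied. */
int toy_io_read(ToyIOContext *s, unsigned char *dst, int size)
{
    int copied = 0;
    assert(s != NULL && dst != NULL && size >= 0);
    while (copied < size) {
        if (s->buf_ptr == s->buf_end) {   /* cache empty: refill it */
            int n = s->read_packet(s->opaque, s->buffer, s->buffer_size);
            if (n <= 0)
                break;                    /* source exhausted */
            s->buf_ptr = s->buffer;
            s->buf_end = s->buffer + n;
        }
        int avail = (int)(s->buf_end - s->buf_ptr);
        int chunk = avail < size - copied ? avail : size - copied;
        memcpy(dst + copied, s->buf_ptr, (size_t)chunk);
        s->buf_ptr += chunk;
        copied += chunk;
    }
    return copied;
}
```

The point of the design is that the consumer can ask for any number of bytes while the underlying source is only touched in cache-sized chunks.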

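6. AVPacket

AVPacket is one of the most important structures in FFmpeg. It holds data after demuxing and before decoding (or, when encoding, the output of the encoder), that is, data that is still compressed, together with side information such as the presentation timestamp (pts), the duration, and the index of the stream the data belongs to.

For video, one AVPacket usually contains one compressed frame; for audio, it may contain several compressed frames.

Important members: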

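uint8_t *data: the compressed data.
// For H.264, for example, the data of one AVPacket usually corresponds to one coded frame, which is made up of one or more NAL units.
Note: "corresponds to" does not mean byte-for-byte identical. There can be small differences between the packet data and the raw elementary stream; MP4-style containers, for instance, store H.264 NAL units with length prefixes rather than Annex B start codes. When the container already stores the stream in its raw form, the data of each AVPacket can often be written straight to a file to obtain the video or audio elementary stream.


int size: size of data
int64_t pts: presentation timestamp
int64_t dts: decoding timestamp
int stream_index: identifies the video/audio stream this AVPacket belongs to.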
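
One concrete case of packet data differing from the raw stream is H.264 read from an MP4-style container, where each NAL unit is preceded by its length instead of a start code. The sketch below rewrites such a packet in place, assuming 4-byte big-endian length prefixes (the size is actually configurable in the container, so this is an assumption). length_prefixed_to_annexb is a hypothetical helper; in FFmpeg this conversion is normally done by the h264_mp4toannexb bitstream filter, which also inserts the SPS/PPS from extradata, something this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Rewrite 4-byte big-endian NAL length prefixes into Annex B start
 * codes (00 00 00 01), in place. Both are 4 bytes long, so the
 * payload does not move.
 * Returns the number of NAL units converted, or -1 if the data is
 * malformed (units before the bad one have already been rewritten). */
int length_prefixed_to_annexb(uint8_t *data, size_t size)
{
    static const uint8_t start_code[4] = {0, 0, 0, 1};
    size_t pos = 0;
    int nal_count = 0;

    assert(data != NULL || size == 0);
    while (pos < size) {
        if (size - pos < 4)
            return -1;                      /* truncated length field */
        uint32_t nal_size = ((uint32_t)data[pos]     << 24) |
                            ((uint32_t)data[pos + 1] << 16) |
                            ((uint32_t)data[pos + 2] << 8)  |
                             (uint32_t)data[pos + 3];
        if (nal_size > size - pos - 4)
            return -1;                      /* NAL runs past the packet */
        memcpy(data + pos, start_code, 4);  /* overwrite length prefix */
        pos += 4 + (size_t)nal_size;
        nal_count++;
    }
    return nal_count;
}
```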

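The packet API is documented in the FFmpeg headers (libavcodec/packet.h in recent releases). Commonly used functions:

av_packet_ref, av_packet_unref

av_new_packet, av_packet_alloc, av_init_packet, av_packet_unref, av_packet_free (the old interface is av_free_packet, which av_packet_unref replaced; av_packet_free releases a packet obtained from av_packet_alloc; av_init_packet is deprecated in recent releases)

av_packet_clone: creates a new packet that references the same data as the source packet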

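7. AVFrame

AVFrame is generally used to store raw data (uncompressed YUV, RGB, and so on), along with related information; during decoding, for example, it has also been used to carry data such as the macroblock type table, the QP table, and motion vectors.

An AVFrame must be allocated with av_frame_alloc() and freed with av_frame_free(). Note that av_frame_alloc() only creates the structure itself; the buffers that hold the data are set up separately. When decoding, the decoder provides them. Otherwise they can be allocated with av_frame_get_buffer(), or an existing buffer can be attached with av_image_fill_arrays(), which sets up the pointers but does not allocate memory.

Commonly used fields: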

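uint8_t *data[AV_NUM_DATA_POINTERS]: the decoded raw data (YUV or RGB for video, PCM for audio)

int linesize[AV_NUM_DATA_POINTERS]: size in bytes of one "row" of data. Note: this is not necessarily equal to the image width; because of alignment padding it is often larger.

int width, height: width and height of the video frame (1920x1080, 1280x720, ...)

int nb_samples: number of audio samples (per channel) contained in this frame

int format: format of the decoded raw data (YUV420, YUV422, RGB24, ...)

int key_frame: whether this is a key frame

enum AVPictureType pict_type: frame type (I, B, P, ...)

AVRational sample_aspect_ratio: aspect ratio of a single sample (pixel), for example 1:1; combined with width and height it yields the display aspect ratio (16:9, 4:3, ...)

int64_t pts: presentation timestamp

int coded_picture_number: frame number in coding order

int display_picture_number: frame number in display order

int interlaced_frame: whether the frame is interlaced

uint8_t motion_subsample_log2: log2 of the number of motion vector samples in a macroblock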
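
Because linesize can be larger than the visible row, a plane cannot in general be written out with a single memcpy. The following sketch copies one plane row by row and drops the padding. copy_plane_packed is a hypothetical helper written for this example, and it assumes a positive linesize; FFmpeg's own av_image_copy_plane() and av_image_copy_to_buffer() cover the same ground in real code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy one image plane row by row, dropping the padding bytes at
 * the end of each source row.
 *   src_linesize : bytes from the start of one source row to the next
 *   row_bytes    : visible bytes per row (width for an 8-bit plane)
 *   rows         : number of rows in the plane */
void copy_plane_packed(uint8_t *dst, const uint8_t *src,
                       int src_linesize, int row_bytes, int rows)
{
    assert(dst != NULL && src != NULL);
    assert(row_bytes >= 0 && rows >= 0 && src_linesize >= row_bytes);

    for (int y = 0; y < rows; y++) {
        memcpy(dst + (size_t)y * (size_t)row_bytes,
               src + (size_t)y * (size_t)src_linesize,
               (size_t)row_bytes);
    }
}
```

For an 8-bit YUV420P frame, plane 0 would be copied with row_bytes = width and rows = height, and planes 1 and 2 with (width + 1) / 2 and (height + 1) / 2, each using the matching frame->data[i] and frame->linesize[i].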

II. API call flow for decoding

1. API call flowchart:

This mainly covers the new decoding interface:

[Figure: flowchart of the decoding API calls]

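2. Decoding to a YUV file: the API calls step by step

(1) Register all components

    // Register all components (only needed before FFmpeg 4.0; deprecated in 4.0 and removed in 5.0)
    av_register_all();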

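(2) Open the video file and obtain its context

Before decoding we need access to what is inside the input, so this step opens the input and reads its contents. avFormatContext is the context that represents those contents.

avformat_open_input() opens the source; inputPath is the input address, which can be a video file or a network video stream. avformat_find_stream_info() then probes the input for its streams.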

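    // LOGE is the application's own logging macro (Android style)
    AVFormatContext *avFormatContext = avformat_alloc_context();    // allocate the context
    // Open the input and read its header (demuxing)
    if (avformat_open_input(&avFormatContext, inputPath, NULL, NULL) < 0) {
        LOGE("Failed to open the video");
        return;
    }
    // Read some data to fill in missing stream information
    if (avformat_find_stream_info(avFormatContext, NULL) < 0) {
        LOGE("Failed to read stream information");
        avformat_close_input(&avFormatContext);
        return;
    }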

(3) Find the video stream

The input is now open, but a media file can contain audio, video and subtitle streams, so we have to pick out the video stream among them.

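    // Locate the video stream and get its codec parameters
    AVCodecParameters *origin_par = NULL;
    int mVideoStreamIdx = -1;
    mVideoStreamIdx = av_find_best_stream(avFormatContext, AVMEDIA_TYPE_VIDEO, -1, -1, NULL, 0);
    if (mVideoStreamIdx < 0) {
        av_log(NULL, AV_LOG_ERROR, "Can't find video stream in input file\n");
        return;
    }
    // Codec parameters of the selected stream; the decoder setup in the next step reads them
    origin_par = avFormatContext->streams[mVideoStreamIdx]->codecpar;
    LOGE("Found the video stream");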

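(4) Find and open the decoder

Decoding requires a decoder, and the decoder has to be opened. Initialize the decoder context with avcodec_parameters_to_context(); otherwise MPEG-4 video in an AVI container decodes without problems, but MPEG-4 video in an MP4 container fails. The code is as follows: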

    // Find the decoder {start
    AVCodec *mVcodec = NULL;
    AVCodecContext *mAvContext = NULL;
    mVcodec = avcodec_find_decoder(origin_par->codec_id);
    if (!mVcodec) {
        return;
    }
    mAvContext = avcodec_alloc_context3(mVcodec);
    if (!mAvContext) {
        return;
    }

    // Without copying the stream parameters into the context, MPEG-4 video in MP4 fails to decode
    int ret = avcodec_parameters_to_context(mAvContext, origin_par);
    if (ret < 0) {
        av_log(NULL, AV_LOG_ERROR, "Error initializing the decoder context.\n");
        avcodec_free_context(&mAvContext);
        return;
    }

    // Open the decoder
    if (avcodec_open2(mAvContext, mVcodec, NULL) < 0) {
        LOGE("Failed to open the decoder");
        avcodec_free_context(&mAvContext);
        return;
    }
    LOGE("Decoder opened");
    // Find the decoder end}

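(5) Allocate the AVPacket and AVFrame

Allocate an AVPacket and an AVFrame. The AVPacket holds the data before decoding plus side information such as the presentation timestamp (pts), the decoding timestamp (dts), the duration, and the index of the stream it belongs to. The AVFrame holds the data after decoding.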

    // Allocate the AVPacket (av_packet_alloc also initializes its fields)
    AVPacket *packet = av_packet_alloc();
    // Allocate the AVFrame; it will receive the decoded raw frame
    AVFrame *frame = av_frame_alloc();
    if (!packet || !frame) {
        return;
    }

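(6) Allocate the buffer for the decoded YUV data

The decoded data is saved to a .yuv file in the YUV layout. First allocate a buffer that will hold each decoded frame arranged in that layout: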

    uint8_t *byte_buffer = NULL;

    // Size of one frame in the decoder's pixel format; alignment 1 matches the av_image_copy_to_buffer() call that fills the buffer
    int byte_buffer_size = av_image_get_buffer_size(mAvContext->pix_fmt, mAvContext->width, mAvContext->height, 1);
    LOGE("width = %d , height = %d ", mAvContext->width, mAvContext->height);
    byte_buffer = (uint8_t*)av_malloc(byte_buffer_size);
    if (!byte_buffer) {
        av_log(NULL, AV_LOG_ERROR, "Can't allocate buffer\n");
        return;
    }
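
To see where the buffer size comes from, the calculation can be done by hand for the common 8-bit YUV420P case: one full-resolution luma plane plus two chroma planes at half resolution in each direction, with every row rounded up to the requested alignment. The sketch below is such a hand calculation under those assumptions; align_up and yuv420p_buffer_size are hypothetical names, and the function is only meant to illustrate what av_image_get_buffer_size() is computing for this one pixel format, not to replace it.

```c
#include <assert.h>

/* Round `value` up to the next multiple of `align` (align >= 1). */
int align_up(int value, int align)
{
    return (value + align - 1) / align * align;
}

/* Bytes needed for one 8-bit YUV420P image whose row sizes are
 * rounded up to `align` bytes. */
int yuv420p_buffer_size(int width, int height, int align)
{
    assert(width > 0 && height > 0 && align >= 1);

    int luma_stride   = align_up(width, align);           /* Y rows   */
    int chroma_stride = align_up((width + 1) / 2, align); /* U/V rows */
    int chroma_rows   = (height + 1) / 2;

    return luma_stride * height + 2 * chroma_stride * chroma_rows;
}
```

For 1920x1080 with alignment 1 this gives 1920 * 1080 * 3 / 2 = 3110400 bytes. A 100x50 image needs 7500 bytes with alignment 1 but 9600 bytes with alignment 32, which is why the alignment used for sizing should match the one used when the frame is copied into the buffer.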

(7) Start decoding

Now decoding can begin. The core of the decoding code is shown below:

    // Send the packet to the decoder (this excerpt runs inside the packet-reading loop of step 8)
    int result = avcodec_send_packet(mAvContext, packet);
    av_packet_unref(packet);
    if (result < 0) {
        av_log(NULL, AV_LOG_ERROR, "Error submitting a packet for decoding\n");
        continue;
    }

    // Receive the decoded frames; one packet can produce zero or more frames
    while (result >= 0) {
        result = avcodec_receive_frame(mAvContext, frame);
        if (result == AVERROR_EOF)
             break;
        else if (result == AVERROR(EAGAIN)) {
             // The decoder needs more input before it can output another frame
             result = 0;
             break;
        } else if (result < 0) {
             av_log(NULL, AV_LOG_ERROR, "Error decoding frame\n");
             av_frame_unref(frame);
             break;
        }
        av_frame_unref(frame);
    }
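
The send/receive pattern is easier to reason about with a toy model. The sketch below is not FFmpeg code: ToyDecoder and the toy_* functions are hypothetical, and TOY_EAGAIN / TOY_EOF merely stand in for AVERROR(EAGAIN) / AVERROR_EOF. The toy decoder holds back TOY_DELAY frames, imitating a real decoder that buffers frames for reordering. It shows why the receive call sits in a loop, and why frames that are still buffered only come out after a NULL packet has been sent to start draining.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for 0, AVERROR(EAGAIN) and AVERROR_EOF. */
enum { TOY_OK = 0, TOY_EAGAIN = -1, TOY_EOF = -2 };

/* Frames the toy decoder holds back before it starts to output. */
#define TOY_DELAY 2

typedef struct {
    int queued;    /* packets accepted but not yet returned as frames */
    int draining;  /* set once a NULL packet has been sent */
} ToyDecoder;

/* pkt == NULL starts draining, mirroring avcodec_send_packet(ctx, NULL). */
int toy_send_packet(ToyDecoder *d, const int *pkt)
{
    assert(d != NULL);
    if (d->draining)
        return TOY_EOF;          /* no more input accepted */
    if (pkt == NULL)
        d->draining = 1;
    else
        d->queued++;
    return TOY_OK;
}

int toy_receive_frame(ToyDecoder *d)
{
    assert(d != NULL);
    int held_back = d->draining ? 0 : TOY_DELAY;
    if (d->queued > held_back) {
        d->queued--;
        return TOY_OK;           /* one frame delivered */
    }
    return d->draining ? TOY_EOF : TOY_EAGAIN;
}

/* Feed `packet_count` packets, then flush. Returns the total number
 * of frames and reports how many arrived before the flush. */
int toy_decode_all(int packet_count, int *frames_before_flush)
{
    ToyDecoder dec = {0, 0};
    int frames = 0;

    for (int i = 0; i < packet_count; i++) {
        toy_send_packet(&dec, &i);
        while (toy_receive_frame(&dec) == TOY_OK)
            frames++;
    }
    if (frames_before_flush != NULL)
        *frames_before_flush = frames;

    toy_send_packet(&dec, NULL);  /* enter draining mode */
    while (toy_receive_frame(&dec) == TOY_OK)
        frames++;
    return frames;
}
```

In this model, five packets yield only three frames during normal operation; the remaining two appear after the flush. A real decode loop that stops as soon as reading fails, without sending a NULL packet, can lose trailing frames in the same way.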

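(8) Decode and save as a YUV file

YUV is the raw, uncompressed video data produced by decoding. We save it to a .yuv file so that the decoded data can be checked for correctness. The complete code that decodes, copies, and writes the file is as follows: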

    // fp_YUV is assumed to be a FILE* already opened for binary writing ("wb")
    int eof = 0;
    while (!eof)
    {
        int ret = av_read_frame(avFormatContext, packet);
        if (ret < 0) {
            // End of file or read error: log it, then flush the decoder below
            char buf[AV_ERROR_MAX_STRING_SIZE];
            av_strerror(ret, buf, sizeof(buf));
            LOGE("--%s--\n", buf);
            eof = 1;
        } else if (packet->stream_index != mVideoStreamIdx) {
            // Not a video packet: skip it
            av_packet_unref(packet);
            continue;
        }

        // Send the packet to the decoder; a NULL packet asks it to flush its buffered frames
        int result = avcodec_send_packet(mAvContext, eof ? NULL : packet);
        av_packet_unref(packet);
        if (result < 0) {
            av_log(NULL, AV_LOG_ERROR, "Error submitting a packet for decoding\n");
            continue;
        }

        // Receive the decoded frames
        while (result >= 0) {
            result = avcodec_receive_frame(mAvContext, frame);
            if (result == AVERROR_EOF)
                break;
            else if (result == AVERROR(EAGAIN)) {
                // The decoder needs more input
                result = 0;
                break;
            } else if (result < 0) {
                av_log(NULL, AV_LOG_ERROR, "Error decoding frame\n");
                av_frame_unref(frame);
                break;
            }

            // Copy the frame's planes into one tightly packed buffer (alignment 1)
            int number_of_written_bytes = av_image_copy_to_buffer(byte_buffer, byte_buffer_size,
                                                                  (const uint8_t* const *)frame->data, (const int*) frame->linesize,
                                                                  mAvContext->pix_fmt, mAvContext->width, mAvContext->height, 1);
            if (number_of_written_bytes < 0) {
                av_log(NULL, AV_LOG_ERROR, "Can't copy image to buffer\n");
                av_frame_unref(frame);
                continue;
            }

            // Write the raw video data to the file
            fwrite(byte_buffer, number_of_written_bytes, 1, fp_YUV);
            fflush(fp_YUV);

            av_frame_unref(frame);
        }
    }

(9) Clean up and release resources

Remember to release the resources once decoding is finished.

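    // Release resources
    fclose(fp_YUV);
    av_free(byte_buffer);
    av_packet_free(&packet);
    av_frame_free(&frame);
    avcodec_free_context(&mAvContext);       // closes the codec and frees the context
    avformat_close_input(&avFormatContext);  // pairs with avformat_open_input()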
