FFmpeg Series, Part 4: Adding a Subtitle Stream

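The previous three steps (opening the camera, encoding to H.264, and muxing into MP4) at least had official examples to lean on. Subtitles are where the head-scratching starts: there is no official example, and even once you find that the API is avcodec_encode_subtitle, the material available online amounts to a few scattered fragments.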

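First, a step back to subtitles themselves. There are three common forms:
1. "Hard-coded" subtitles, also called burned-in subtitles: the subtitle text is encoded into every video frame. Once generated it is fixed in the picture, so its format and position cannot be adjusted, but the advantage is that no player support is needed.
2. External subtitles: the familiar srt, ssa and similar files, opened in the player together with the video file. They can easily be switched on or off, and their position and font format can be adjusted.
3. Embedded (soft) subtitles: in effect and in technique they are much the same as external subtitles, except that they are muxed into one container together with the video and audio, which makes them more convenient to use than external subtitles.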

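Next, the subtitle formats. The most common is SRT (SubRip), the simplest subtitle format. SSA (Sub Station Alpha) is a more advanced format developed to get around SRT's very limited features, and its file extension is .ssa. ASS (Advanced SubStation Alpha) is in turn a more advanced version of SSA, written in the SSA v4+ script format.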
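One concrete difference between the two formats is how they write timestamps: SRT uses HH:MM:SS,mmm (milliseconds after a comma), while ASS uses H:MM:SS.cc (centiseconds after a dot). The following is a minimal sketch of converting one into the other; the function name srtToAssTimestamp is made up for this illustration and is not part of FFmpeg.

```cpp
#include <cstdio>
#include <string>

// Convert an SRT timestamp ("HH:MM:SS,mmm") into an ASS timestamp ("H:MM:SS.cc").
// Returns an empty string if the input does not parse.
// Hypothetical helper for illustration only; not an FFmpeg function.
std::string srtToAssTimestamp(const std::string &srt)
{
    int h = 0, m = 0, s = 0, ms = 0;
    if (std::sscanf(srt.c_str(), "%d:%d:%d,%d", &h, &m, &s, &ms) != 4)
        return "";

    char buf[64];
    // ASS keeps only centiseconds, so the last millisecond digit is dropped.
    std::snprintf(buf, sizeof(buf), "%d:%02d:%02d.%02d", h, m, s, ms / 10);
    return buf;
}
```

For example, srtToAssTimestamp("00:00:10,500") yields "0:00:10.50".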

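This article is about producing the third kind with FFmpeg, which roughly amounts to adding an ASS file to the container. First, here is one of the few FFmpeg subtitle-handling examples that can be found online (http://pastebin.com/cUxCs33a). Note that I modified it slightly: the original wrote SRT output, while this version writes ASS.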

void saveSubtitle(AVFormatContext *context, Stream stream)
{
    stringstream outfile;
    outfile << "/tmp/subtitle_" << stream.index << ".ass";
    string filename = outfile.str();

    AVStream *avstream = context->streams[stream.index];
    AVCodec *codec = avcodec_find_decoder(avstream->codec->codec_id);
    checkResult(codec != NULL, "Error finding decoder");

    int result = avcodec_open2(avstream->codec, codec, NULL);
    checkResult(result == 0, "Error opening codec");
    cerr << "found codec: " << codec->name << ", open result= " << result << endl;

    AVOutputFormat *outFormat = av_guess_format(NULL, filename.c_str(), NULL);
    checkResult(outFormat != NULL, "Error finding format");
    cerr << "Found output format: " << outFormat->name << " (" << outFormat->long_name << ")" << endl;

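    AVFormatContext *outFormatContext = NULL;
    result = avformat_alloc_output_context2(&outFormatContext, NULL, NULL, filename.c_str());
    checkResult(result >= 0 && outFormatContext != NULL, "Error allocating output context");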
    AVCodec *encoder = avcodec_find_encoder(outFormat->subtitle_codec);
    checkResult(encoder != NULL, "Error finding encoder");
    cerr << "Found encoder: " << encoder->name << endl;

    AVStream *outStream = avformat_new_stream(outFormatContext, encoder);
    checkResult(outStream != NULL, "Error allocating out stream");

    AVCodecContext *c = outStream->codec;
    result = avcodec_get_context_defaults3(c, encoder);
    checkResult(result == 0, "error on get context default");

    cerr << "outstream codec: " << outStream->codec << endl;
    cerr << "Opened stream " << outStream->id << ", codec=" << outStream->codec->codec_id << endl;

    result = avio_open(&outFormatContext->pb, filename.c_str(), AVIO_FLAG_WRITE);
    checkResult(result == 0, "Error opening out file");
    cerr << "out file opened correctly" << endl;

    result = avformat_write_header(outFormatContext, NULL);
    checkResult(result == 0, "Error writing header");
    cerr << "header wrote correctly" << endl;

    result = 0;
    
    AVPacket pkt;
    av_init_packet(&pkt);
    pkt.data = NULL;
    pkt.size = 0;

    // Allocate the output buffer once instead of once per packet.
    const int bufferSize = 1024 * 1024;
    uint8_t *buffer = new uint8_t[bufferSize];

    cerr << "srt codec id: " << AV_CODEC_ID_SUBRIP << endl;
    while (av_read_frame(context, &pkt) >= 0)
    {
        if (pkt.stream_index == stream.index)
        {
            int gotSubtitle = 0;
            AVSubtitle subtitle;
            result = avcodec_decode_subtitle2(avstream->codec, &subtitle, &gotSubtitle, &pkt);
            if (result >= 0 && gotSubtitle)
            {
                memset(buffer, 0, bufferSize);
                result = avcodec_encode_subtitle(outStream->codec, buffer, bufferSize, &subtitle);
                cerr << "Encode subtitle result: " << result << endl;
                if (result >= 0)
                    cerr << "Encoded subtitle: " << buffer << endl;
                avsubtitle_free(&subtitle);  // release the decoded rects
            }
        }
        // Release every packet, subtitle or not (av_packet_unref in newer FFmpeg releases).
        av_free_packet(&pkt);
    }
    delete[] buffer;
}

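With this code in hand, which reads a subtitle stream from a container and outputs an ASS file, the next job is to add a subtitle stream to a container. The first thing to find out is which subtitle encoder to use, that is, what outFormat->subtitle_codec is in the code above. The way to find out is to generate a video with an embedded ASS subtitle yourself (using the FFmpeg command line; https://trac.ffmpeg.org/wiki/HowToBurnSubtitlesIntoVideo is a starting point, although that page mainly covers burned-in subtitles) and then run the code above against it.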

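Doing so shows that the subtitle codec ID is AV_CODEC_ID_MOV_TEXT. Strictly speaking this is the MP4 timed-text codec rather than ASS itself, but its encoder is fed ASS-formatted text, so I will keep calling it the ASS encoder here. Building on the previous article, let's start writing the code that adds a subtitle stream to the container: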

AVFormatContext *outFormatContext = NULL;
avformat_alloc_output_context2( &outFormatContext, NULL, NULL, filename.c_str() );

AVCodec *encoder = avcodec_find_encoder( AV_CODEC_ID_MOV_TEXT );
AVStream *outStream = avformat_new_stream( outFormatContext, encoder );

outStream->time_base = AVRational{1, 25};  // the time base must be set as well (C++ brace initialization instead of a C compound literal)

AVCodecContext *c = outStream->codec;
avcodec_get_context_defaults3( c, encoder );

c->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;

c->codec_type = AVMEDIA_TYPE_SUBTITLE;
c->codec_id = AV_CODEC_ID_MOV_TEXT;

if (avcodec_open2(c, encoder, NULL) < 0)
    cerr << "can not open encoder"<< endl;

The above is only sample code. Why do I check for success only on the final avcodec_open2 call and nowhere else? Because that call is guaranteed to fail...

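Once the ASS encoder failed to open, I was completely stuck: nothing online explains how this encoder is meant to be used, so I had no idea what had gone wrong. After a long bout of head-scratching it occurred to me that the first piece of code does run, so why not debug that and see what else FFmpeg does along the way?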

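Because on Windows I was using libraries that someone else had built, I compiled FFmpeg myself on Ubuntu and started debugging with GDB. Building FFmpeg on Linux, by the way, is remarkably easy...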

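I will skip the debugging itself. What it eventually revealed is that when the ASS encoder is used, the AVCodecContext must have a subtitle_header set (along with subtitle_header_size, of course). The relevant source is in ass.c (https://www.ffmpeg.org/doxygen/2.8/ass_8c_source.html):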

int ff_ass_subtitle_header(AVCodecContext *avctx,
                           const char *font, int font_size,
                           int color, int back_color,
                           int bold, int italic, int underline,
                           int border_style, int alignment)
{
    avctx->subtitle_header = av_asprintf(
             "[Script Info]\r\n"
             "; Script generated by FFmpeg/Lavc%s\r\n"
             "ScriptType: v4.00+\r\n"
             "PlayResX: %d\r\n"
             "PlayResY: %d\r\n"
             "\r\n"
             "[V4+ Styles]\r\n"

             /* ASSv4 header */
             "Format: Name, "
             "Fontname, Fontsize, "
             "PrimaryColour, SecondaryColour, OutlineColour, BackColour, "
             "Bold, Italic, Underline, StrikeOut, "
             "ScaleX, ScaleY, "
             "Spacing, Angle, "
             "BorderStyle, Outline, Shadow, "
             "Alignment, MarginL, MarginR, MarginV, "
             "Encoding\r\n"

             "Style: "
             "Default,"             /* Name */
             "%s,%d,"               /* Font{name,size} */
             "&H%x,&H%x,&H%x,&H%x," /* {Primary,Secondary,Outline,Back}Colour */
             "%d,%d,%d,0,"          /* Bold, Italic, Underline, StrikeOut */
             "100,100,"             /* Scale{X,Y} */
             "0,0,"                 /* Spacing, Angle */
             "%d,1,0,"              /* BorderStyle, Outline, Shadow */
             "%d,10,10,10,"         /* Alignment, Margin[LRV] */
             "0\r\n"                /* Encoding */

             "\r\n"
             "[Events]\r\n"
             "Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text\r\n",
             !(avctx->flags & AV_CODEC_FLAG_BITEXACT) ? AV_STRINGIFY(LIBAVCODEC_VERSION) : "",
             ASS_DEFAULT_PLAYRESX, ASS_DEFAULT_PLAYRESY,
             font, font_size, color, color, back_color, back_color,
             -bold, -italic, -underline, border_style, alignment);

    if (!avctx->subtitle_header)
        return AVERROR(ENOMEM);
    avctx->subtitle_header_size = strlen(avctx->subtitle_header);
    return 0;
}

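The format of this header is fairly easy to read. The one thing to watch is the font: if you declare Arial, for example, remember to place arial.ttf alongside the application so that the font can be found and loaded.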
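Since ff_ass_subtitle_header is an FFmpeg-internal function (note the ff_ prefix), application code has to assemble an equivalent string itself. The following is a minimal sketch of doing that with std::string. The AssStyle struct, the buildAssHeader name and all default values (including the 384x288 play resolution) are assumptions made for this illustration, and the "Script generated by" comment line is left out.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper type for this sketch; it is not an FFmpeg struct.
// The default values are illustrative assumptions.
struct AssStyle {
    std::string font = "Arial";
    int fontSize = 16;
    unsigned color = 0xffffff;      // used for Primary and Secondary colour
    unsigned backColor = 0x000000;  // used for Outline and Back colour
    bool bold = false;
    bool italic = false;
    bool underline = false;
    int borderStyle = 1;
    int alignment = 2;
    int playResX = 384;
    int playResY = 288;
};

// Build an ASS header with the same layout as the listing above.
std::string buildAssHeader(const AssStyle &s)
{
    char colours[64];
    std::snprintf(colours, sizeof(colours), "&H%x,&H%x,&H%x,&H%x",
                  s.color, s.color, s.backColor, s.backColor);
    // ASS writes "true" as -1, which is why the original passes -bold etc.
    auto flag = [](bool on) { return std::string(on ? "-1" : "0"); };

    std::string h;
    h += "[Script Info]\r\n";
    h += "ScriptType: v4.00+\r\n";
    h += "PlayResX: " + std::to_string(s.playResX) + "\r\n";
    h += "PlayResY: " + std::to_string(s.playResY) + "\r\n";
    h += "\r\n[V4+ Styles]\r\n";
    h += "Format: Name, Fontname, Fontsize, "
         "PrimaryColour, SecondaryColour, OutlineColour, BackColour, "
         "Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, "
         "BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, "
         "Encoding\r\n";
    h += "Style: Default," + s.font + "," + std::to_string(s.fontSize) + "," + colours + ",";
    h += flag(s.bold) + "," + flag(s.italic) + "," + flag(s.underline) + ",0,";
    h += "100,100,0,0,";                              // ScaleX, ScaleY, Spacing, Angle
    h += std::to_string(s.borderStyle) + ",1,0,";     // BorderStyle, Outline, Shadow
    h += std::to_string(s.alignment) + ",10,10,10,0\r\n";
    h += "\r\n[Events]\r\n";
    h += "Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text\r\n";
    return h;
}
```

The resulting string would then be copied into a buffer assigned to the AVCodecContext's subtitle_header, with subtitle_header_size set to its length, before avcodec_open2 is called.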

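Copy the logic of ff_ass_subtitle_header into your own code and set the header on the AVCodecContext before calling avcodec_open2; the encoder then opens successfully. Only the last step remains: create a subtitle (AVSubtitle), pass it to avcodec_encode_subtitle, and write the resulting packet into the container. Note that the text of an ASS subtitle has this format: "Dialogue: 0,0:00:00.00,0:00:10.00,Default,,0,0,0,,Subtitle for the first ten seconds". Sample code: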

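AVSubtitleRect** subRects = new AVSubtitleRect*[1];
subRects[0] = new AVSubtitleRect();   // value-initialize so that unused fields are zero
subRects[0]->type = AVSubtitleType::SUBTITLE_ASS;
/*
Copy the UTF-8 subtitle text in here. Note that the text format is
"Dialogue: 0,0:00:00.00,0:00:10.00,Default,,0,0,0,,Subtitle for the first ten seconds"
subRects[0]->ass = ...
*/

AVSubtitle *sub = new AVSubtitle();   // value-initialize so that unused fields are zero
sub->format = 1;                      // non-zero: text subtitle (0 means graphics)
sub->num_rects = 1;
sub->rects = subRects;
/*
Set the pts, the display start time and the display end time
sub->pts = ...
sub->start_display_time = ...
sub->end_display_time = ...
*/

// buffer and bufferSize are an output buffer allocated as in the first listing
int encodeSize = avcodec_encode_subtitle(c, buffer, bufferSize, sub);
if ( 0 < encodeSize )
{
    AVPacket packet;
    av_init_packet(&packet);
    packet.data = buffer;
    packet.size = encodeSize;
    /*
    Write the subtitle packet
    packet.stream_index = ...
    av_interleaved_write_frame(...
    */
}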
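The listing above leaves the Dialogue text and the timing fields elided. The following is a minimal sketch of how those values could be computed, assuming the conventions documented for AVSubtitle: pts is in AV_TIME_BASE units (microseconds), and start_display_time and end_display_time are in milliseconds relative to pts. All names here (assTimestamp, makeAssDialogue, SubtitleTiming, makeTiming, usToStreamTicks, Rational) are invented for this illustration, and Rational is only a stand-in for AVRational so that the sketch compiles without FFmpeg.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Format a millisecond offset as an ASS timestamp, H:MM:SS.cc.
std::string assTimestamp(int64_t ms)
{
    if (ms < 0) ms = 0;
    const long long cs = ms / 10;  // ASS has centisecond precision
    char buf[64];
    std::snprintf(buf, sizeof(buf), "%lld:%02lld:%02lld.%02lld",
                  cs / 360000, cs / 6000 % 60, cs / 100 % 60, cs % 100);
    return buf;
}

// Build one Dialogue event on layer 0 with the Default style and zero margins.
std::string makeAssDialogue(int64_t startMs, int64_t endMs, const std::string &text)
{
    return "Dialogue: 0," + assTimestamp(startMs) + "," + assTimestamp(endMs) +
           ",Default,,0,0,0,," + text;
}

// Values intended for AVSubtitle::pts, start_display_time and end_display_time.
struct SubtitleTiming {
    int64_t ptsUs;     // microseconds (AV_TIME_BASE units)
    uint32_t startMs;  // milliseconds relative to ptsUs
    uint32_t endMs;    // milliseconds relative to ptsUs
};

SubtitleTiming makeTiming(int64_t startMs, int64_t endMs)
{
    SubtitleTiming t;
    t.ptsUs = startMs * 1000;
    t.startMs = 0;  // shown as soon as the pts is reached
    t.endMs = static_cast<uint32_t>(endMs > startMs ? endMs - startMs : 0);
    return t;
}

struct Rational { int num; int den; };  // stand-in for AVRational

// Convert microseconds into ticks of a stream time base (truncating).
// In real code av_rescale_q would do this with rounding and overflow care.
int64_t usToStreamTicks(int64_t us, Rational tb)
{
    return us * tb.den / (1000000LL * tb.num);
}
```

For a subtitle shown from 5 s to 15 s, makeTiming(5000, 15000) gives a pts of 5000000 with a display window of 0 to 10000 ms, and usToStreamTicks(5000000, {1, 25}) gives a packet pts of 125 in the 1/25 time base used earlier.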

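Looking back, FFmpeg really is a very powerful audio and video processing library, and it has nearly every feature you could want. Almost all of the usage material you can find, however, covers its command-line tools, and developing against the FFmpeg libraries is full of obstacles. When no documentation turns up, you have to read the source code, debug it, and study how the command-line tools implement the same thing. I hope these notes can serve as a reference for anyone who needs the help.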