ffmpeg architecture (middle part)

Published: 2023/11/28

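Learn FFmpeg libav the Hard Way

Don't you wonder sometimes about sound and vision?

FFmpeg is very useful as a command-line tool for essential tasks on media files, so how can we use it in our own programs?

FFmpeg is composed of several libraries that can be integrated into our own programs. Usually, when you install FFmpeg, all of these libraries are installed automatically. I'll refer to this set of libraries as FFmpeg libav.

This title is a homage to Zed Shaw's series "Learn X the Hard Way", particularly his book "Learn C the Hard Way".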

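Chapter 0 - The infamous hello world

This hello world won't actually show the message "hello world" in the terminal 👅 Instead, we're going to print information about the video, such as its format (container), duration, resolution, and audio channels. At the end, we'll decode some frames and save them as image files.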

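FFmpeg libav architecture

Before we start to code, let's learn how the FFmpeg libav architecture works and how its components communicate with each other.

Here is the process of decoding a video:

First, you load your media file into a component called AVFormatContext (the video container is also known as the format). It doesn't actually load the whole file: it often reads only the header.

Once the minimal header of the container is loaded, we can access its streams (think of them as rudimentary audio and video data). Each stream is available in a component called AVStream.

Stream is a fancy name for a continuous flow of data.

Suppose our video has two streams: audio encoded with the AAC CODEC and video encoded with the H264 (AVC) CODEC. From each stream we can extract pieces (slices) of data called packets, which are loaded into components named AVPacket.

The data inside the packets is still coded (compressed). To decode the packets, we need to pass them to a specific AVCodec.

The AVCodec decodes them into an AVFrame, and this component finally gives us the uncompressed frame. Note that audio and video streams use the same terminology and process.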

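Requirements

Some people had trouble compiling or running the examples, so we'll use Docker as our development/runner environment. We'll also use the Big Buck Bunny video; if you don't have it locally, run the command make fetch_small_bunny_video.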

Chapter 0 - code walkthrough

TLDR; show me the code and execution.

$ make run_hello

We'll skip some details, but don't worry: the source code is available on GitHub.

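We allocate memory for the AVFormatContext component, which will hold information about the format (container).

    AVFormatContext *pFormatContext = avformat_alloc_context();

Now we open the file, read its header, and fill the AVFormatContext with minimal information about the format (note that the codecs are usually not opened at this point). The function for this is avformat_open_input. It expects an AVFormatContext, a filename, and two optional arguments: the AVInputFormat (if you pass NULL, FFmpeg guesses the format) and an AVDictionary (the options for the demuxer).

    avformat_open_input(&pFormatContext, filename, NULL, NULL);

We can print the format name and the media duration:

    // duration is an int64_t expressed in microseconds (AV_TIME_BASE units)
    printf("Format %s, duration %lld us", pFormatContext->iformat->long_name, (long long)pFormatContext->duration);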
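The duration printed above is a raw count of microseconds, which is hard to read. As a minimal sketch, assuming only that the value is expressed in microseconds (libav's AV_TIME_BASE is 1000000), the helper below turns it into HH:MM:SS.mmm. The names TIME_BASE_US and format_duration_us are hypothetical and not part of libav.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for libav's AV_TIME_BASE: duration units per second. */
#define TIME_BASE_US INT64_C(1000000)

/* Formats a non-negative duration given in microseconds as HH:MM:SS.mmm.
 * Returns what snprintf returns: the number of characters needed. */
int format_duration_us(int64_t duration_us, char *out, size_t out_size)
{
    assert(duration_us >= 0);
    int64_t total_s = duration_us / TIME_BASE_US;          /* whole seconds */
    int64_t ms = (duration_us % TIME_BASE_US) / 1000;      /* leftover milliseconds */
    int64_t h = total_s / 3600;
    int64_t m = (total_s / 60) % 60;
    int64_t s = total_s % 60;
    return snprintf(out, out_size,
                    "%02" PRId64 ":%02" PRId64 ":%02" PRId64 ".%03" PRId64,
                    h, m, s, ms);
}
```

With a real AVFormatContext, one would pass pFormatContext->duration as the first argument.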

To access the streams, we need to read data from the media. The function avformat_find_stream_info does that. Afterwards, pFormatContext->nb_streams holds the number of streams and pFormatContext->streams[i] gives us the i-th stream (an AVStream).

    avformat_find_stream_info(pFormatContext, NULL);

Now we loop through all the streams.

    for (int i = 0; i < pFormatContext->nb_streams; i++) {
        // ...
    }

For each stream, we keep the AVCodecParameters, which describes the properties of the codec used by stream i.

    AVCodecParameters *pLocalCodecParameters = pFormatContext->streams[i]->codecpar;

With the codec properties, we can look up the proper CODEC by calling avcodec_find_decoder, which finds the registered decoder for the codec id and returns an AVCodec, the component that knows how to enCOde and DECode the stream.

    // recent FFmpeg releases return a const AVCodec * here
    const AVCodec *pLocalCodec = avcodec_find_decoder(pLocalCodecParameters->codec_id);

Now we can print information about the codecs.

    // specific to video and audio
    if (pLocalCodecParameters->codec_type == AVMEDIA_TYPE_VIDEO) {
        printf("Video Codec: resolution %d x %d", pLocalCodecParameters->width, pLocalCodecParameters->height);
    } else if (pLocalCodecParameters->codec_type == AVMEDIA_TYPE_AUDIO) {
        // newer FFmpeg releases expose the channel count as ch_layout.nb_channels
        printf("Audio Codec: %d channels, sample rate %d", pLocalCodecParameters->channels, pLocalCodecParameters->sample_rate);
    }
    // general
    printf("\tCodec %s ID %d bit_rate %lld", pLocalCodec->long_name, pLocalCodec->id, (long long)pLocalCodecParameters->bit_rate);

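With the codec, we can allocate memory for the AVCodecContext, which will hold the context for our decode/encode process. We then need to fill this codec context with the CODEC parameters; we do that with avcodec_parameters_to_context.

Once the codec context is filled, we need to open the codec. We call the function avcodec_open2, and then we can use it.

    // pCodec and pCodecParameters are the codec and parameters kept from the stream we want to decode
    AVCodecContext *pCodecContext = avcodec_alloc_context3(pCodec);
    avcodec_parameters_to_context(pCodecContext, pCodecParameters);
    avcodec_open2(pCodecContext, pCodec, NULL);

Now we're going to read the packets from the stream and decode them into frames, but first we need to allocate memory for both components, the AVPacket and the AVFrame.

    AVPacket *pPacket = av_packet_alloc();
    AVFrame *pFrame = av_frame_alloc();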

Let's read packets from the streams with the function av_read_frame for as long as it has packets.

    while (av_read_frame(pFormatContext, pPacket) >= 0) {
        // ...
    }

Let's send the raw data packet (compressed frame) to the decoder, through the codec context, using the function avcodec_send_packet.

    avcodec_send_packet(pCodecContext, pPacket);

Then let's receive the raw data frame (uncompressed frame) from the decoder, through the same codec context, using the function avcodec_receive_frame.

    avcodec_receive_frame(pCodecContext, pFrame);

We can print the frame number, the PTS, the DTS, the frame type, and so on.

    // pts and pkt_dts are int64_t, so they are printed with %lld
    printf(
        "Frame %c (%d) pts %lld dts %lld key_frame %d [coded_picture_number %d, display_picture_number %d]",
        av_get_picture_type_char(pFrame->pict_type),
        pCodecContext->frame_number,
        (long long)pFrame->pts,
        (long long)pFrame->pkt_dts,
        pFrame->key_frame,
        pFrame->coded_picture_number,
        pFrame->display_picture_number
    );

Finally, we can save our decoded frame as a simple grayscale image. The process is very simple: pFrame->data is indexed by plane (Y, Cb and Cr), and we just pick index 0 (Y) to save our gray image.

    save_gray_frame(pFrame->data[0], pFrame->linesize[0], pFrame->width, pFrame->height, frame_filename);

    static void save_gray_frame(unsigned char *buf, int wrap, int xsize, int ysize, char *filename)
    {
        FILE *f;
        int i;
        // open in binary mode: P5 is a binary format
        f = fopen(filename, "wb");
        if (!f)
            return;
        // writing the minimal required header for a pgm file format
        // portable graymap format -> https://en.wikipedia.org/wiki/Netpbm_format#PGM_example
        fprintf(f, "P5\n%d %d\n%d\n", xsize, ysize, 255);

        // writing line by line; wrap (linesize) may be larger than xsize
        for (i = 0; i < ysize; i++)
            fwrite(buf + i * wrap, 1, xsize, f);
        fclose(f);
    }
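The frame is written row by row because the distance between rows in memory (linesize, called wrap above) can be larger than the visible width. A minimal sketch of that idea, without libav: the hypothetical helpers pack_gray_plane and write_pgm below drop the row padding and then emit a binary PGM. The names and the split into two functions are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Copies a width x height gray plane whose rows start `stride` bytes apart
 * into a tightly packed buffer, dropping any padding at the end of each row. */
void pack_gray_plane(const unsigned char *src, int stride,
                     int width, int height, unsigned char *dst)
{
    assert(stride >= width && width >= 0 && height >= 0);
    for (int y = 0; y < height; y++)
        memcpy(dst + (size_t)y * (size_t)width,
               src + (size_t)y * (size_t)stride,
               (size_t)width);
}

/* Writes a binary PGM (P5) image to an already opened stream.
 * Returns 0 on success and -1 on a write error. */
int write_pgm(FILE *f, const unsigned char *packed, int width, int height)
{
    if (fprintf(f, "P5\n%d %d\n255\n", width, height) < 0)
        return -1;
    size_t n = (size_t)width * (size_t)height;
    return fwrite(packed, 1, n, f) == n ? 0 : -1;
}
```

Writing the rows directly, as save_gray_frame does, avoids the extra copy; the packed buffer is only useful when the pixels are needed contiguously for something else.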

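Chapter 1 - syncing audio and video

Be the player - a young JS developer writing a new MSE video player.

Before we move on to a transcoding example, let's talk about timing, or how a video player knows the right time to play a frame.

In the last example, we saved some frames as image files.

When designing a video player, we need to play each frame at a given pace; otherwise the video is hard to watch because it plays too fast or too slow.

Therefore we need some logic to play each frame smoothly. For that, each frame has a presentation timestamp (PTS), an increasing number expressed in a timebase. The timebase is a rational number (its denominator is known as the timescale) chosen so that the timescale is divisible by the frame rate (fps).

This is easier to understand with some examples, so let's simulate some scenarios.

For fps=60/1 and timebase=1/60000, each PTS increases by timescale / fps = 1000, so the PTS real time of each frame could be (supposing it starts at 0):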

frame=0, PTS = 0, PTS_TIME = 0
frame=1, PTS = 1000, PTS_TIME = PTS * timebase = 0.016
frame=2, PTS = 2000, PTS_TIME = PTS * timebase = 0.033

For almost the same scenario, but with a timebase equal to 1/60:

frame=0, PTS = 0, PTS_TIME = 0
frame=1, PTS = 1, PTS_TIME = PTS * timebase = 0.016
frame=2, PTS = 2, PTS_TIME = PTS * timebase = 0.033
frame=3, PTS = 3, PTS_TIME = PTS * timebase = 0.050

For fps=25/1 and timebase=1/75, each PTS increases by timescale / fps = 3, and the PTS time could be:

frame=0, PTS = 0, PTS_TIME = 0
frame=1, PTS = 3, PTS_TIME = PTS * timebase = 0.04
frame=2, PTS = 6, PTS_TIME = PTS * timebase = 0.08
frame=3, PTS = 9, PTS_TIME = PTS * timebase = 0.12
…
frame=24, PTS = 72, PTS_TIME = PTS * timebase = 0.96
…
frame=4064, PTS = 12192, PTS_TIME = PTS * timebase = 162.56
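The arithmetic in the scenarios above can be sketched in a few lines of C. This is an illustration only: Rational is a simplified stand-in for libav's AVRational, and pts_to_seconds and pts_step are hypothetical helper names. The step is computed as 1 / (fps * timebase), which equals timescale / fps when the timebase numerator is 1.

```c
#include <assert.h>
#include <stdint.h>

/* A rational number num/den; simplified stand-in for libav's AVRational. */
typedef struct Rational {
    int num;
    int den;
} Rational;

/* Seconds represented by a PTS expressed in `time_base` units:
 * PTS_TIME = PTS * timebase. */
double pts_to_seconds(int64_t pts, Rational time_base)
{
    assert(time_base.den != 0);
    return (double)pts * time_base.num / time_base.den;
}

/* PTS increment between consecutive frames at a constant frame rate.
 * Integer division: exact only when the timescale is divisible by fps. */
int64_t pts_step(Rational fps, Rational time_base)
{
    assert(fps.num > 0 && fps.den > 0);
    assert(time_base.num > 0 && time_base.den > 0);
    return ((int64_t)time_base.den * fps.den) /
           ((int64_t)time_base.num * fps.num);
}
```

For example, pts_step with fps 25/1 and timebase 1/75 gives 3, and pts_to_seconds(12192, 1/75) gives 162.56, matching the last scenario.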

Now, with pts_time, we can find a way to render frames in sync with the audio pts_time or with a system clock. FFmpeg libav provides this information through its API:

fps = AVStream->avg_frame_rate
tbr = AVStream->r_frame_rate
tbn = AVStream->time_base

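Just out of curiosity, the frames we saved were sent in DTS order (frames: 1, 6, 4, 2, 3, 5) but are played in PTS order (frames: 1, 2, 3, 4, 5, 6). Also, notice how small B-frames are, in bytes, compared with P-frames or I-frames.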

LOG: AVStream->r_frame_rate 60/1
LOG: AVStream->time_base 1/60000
…
LOG: Frame 1 (type=I, size=153797 bytes) pts 6000 key_frame 1 [DTS 0]
LOG: Frame 2 (type=B, size=8117 bytes) pts 7000 key_frame 0 [DTS 3]
LOG: Frame 3 (type=B, size=8226 bytes) pts 8000 key_frame 0 [DTS 4]
LOG: Frame 4 (type=B, size=17699 bytes) pts 9000 key_frame 0 [DTS 2]
LOG: Frame 5 (type=B, size=6253 bytes) pts 10000 key_frame 0 [DTS 5]
LOG: Frame 6 (type=P, size=34992 bytes) pts 11000 key_frame 0 [DTS 1]
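To make the two orderings concrete, here is a small sketch that sorts the six logged frames by DTS (the order in which they are decoded) and by PTS (the order in which they are shown). FrameInfo and the two sort helpers are hypothetical names for illustration; a real player reorders frames inside the decoder rather than with qsort.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* One entry per logged frame: its number in the log, its type, PTS and DTS. */
typedef struct FrameInfo {
    int number;
    char type;
    int64_t pts;
    int64_t dts;
} FrameInfo;

static int compare_by_dts(const void *a, const void *b)
{
    const FrameInfo *x = a, *y = b;
    return (x->dts > y->dts) - (x->dts < y->dts);
}

static int compare_by_pts(const void *a, const void *b)
{
    const FrameInfo *x = a, *y = b;
    return (x->pts > y->pts) - (x->pts < y->pts);
}

/* Order in which the decoder must receive the frames. */
void sort_decode_order(FrameInfo *frames, size_t count)
{
    assert(frames != NULL || count == 0);
    qsort(frames, count, sizeof *frames, compare_by_dts);
}

/* Order in which the player must display the frames. */
void sort_presentation_order(FrameInfo *frames, size_t count)
{
    assert(frames != NULL || count == 0);
    qsort(frames, count, sizeof *frames, compare_by_pts);
}
```

Fed with the PTS and DTS values from the log, the decode order comes out as frames 1, 6, 4, 2, 3, 5: the P-frame is decoded early because the B-frames that precede it on screen depend on it.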

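Chapter 2 - remuxing

Remuxing is the act of changing from one format (container) to another. For instance, we can change an MPEG-4 video into an MPEG-TS one without much pain using FFmpeg:

ffmpeg -i input.mp4 -c copy output.ts

It demuxes the mp4 but doesn't decode or encode it (-c copy), and in the end it muxes it into an mpegts file. If you don't provide the format with -f, ffmpeg tries to guess it from the file extension.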

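The general usage of FFmpeg or libav follows a pattern/architecture or workflow:

protocol layer - it accepts an input (a file, for instance, but it could also be an rtmp or HTTP input)
format layer - it demuxes the content, revealing mostly metadata and its streams
codec layer - it decodes the compressed stream data (optional)
pixel layer - it can also apply some filters to the raw frames, like resizing (optional)
and then it follows the reverse path
codec layer - it encodes (or re-encodes, or even transcodes) the raw frames (optional)
format layer - it muxes (or remuxes) the raw streams (the compressed data)
protocol layer - finally, the muxed data is sent to an output (another file or maybe a remote network server)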

This graph is strongly inspired by the work of Leixiaohua and Slhck.

Now let's code an example using libav that has the same effect as ffmpeg -i input.mp4 -c copy output.ts.

We're going to read from an input (input_format_context) and write it to an output (output_format_context).

    AVFormatContext *input_format_context = NULL;
    AVFormatContext *output_format_context = NULL;

We start with the usual memory allocation and opening of the input format. For this specific case, we open an input file and allocate memory for an output file.

    if ((ret = avformat_open_input(&input_format_context, in_filename, NULL, NULL)) < 0) {
        fprintf(stderr, "Could not open input file '%s'", in_filename);
        goto end;
    }
    if ((ret = avformat_find_stream_info(input_format_context, NULL)) < 0) {
        fprintf(stderr, "Failed to retrieve input stream information");
        goto end;
    }

    avformat_alloc_output_context2(&output_format_context, NULL, NULL, out_filename);
    if (!output_format_context) {
        fprintf(stderr, "Could not create output context\n");
        ret = AVERROR_UNKNOWN;
        goto end;
    }

We're going to remux only the video, audio and subtitle streams, so we record which streams we'll use in an array of indexes.

    number_of_streams = input_format_context->nb_streams;
    // recent FFmpeg releases deprecate av_mallocz_array in favor of av_calloc
    streams_list = av_mallocz_array(number_of_streams, sizeof(*streams_list));

Right after allocating the required memory, we loop through all the streams, and for each one we create a new output stream in our output format context using the avformat_new_stream function. Note that we mark every stream that isn't video, audio or subtitle with -1 so we can skip it later.

    for (i = 0; i < input_format_context->nb_streams; i++) {
        AVStream *out_stream;
        AVStream *in_stream = input_format_context->streams[i];
        AVCodecParameters *in_codecpar = in_stream->codecpar;
        if (in_codecpar->codec_type != AVMEDIA_TYPE_AUDIO &&
            in_codecpar->codec_type != AVMEDIA_TYPE_VIDEO &&
            in_codecpar->codec_type != AVMEDIA_TYPE_SUBTITLE) {
            streams_list[i] = -1;
            continue;
        }
        streams_list[i] = stream_index++;
        out_stream = avformat_new_stream(output_format_context, NULL);
        if (!out_stream) {
            fprintf(stderr, "Failed allocating output stream\n");
            ret = AVERROR_UNKNOWN;
            goto end;
        }
        ret = avcodec_parameters_copy(out_stream->codecpar, in_codecpar);
        if (ret < 0) {
            fprintf(stderr, "Failed to copy codec parameters\n");
            goto end;
        }
    }
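The index bookkeeping in that loop is easy to get wrong, so here is the mapping on its own, without libav. This is a sketch under assumptions: MediaKind is a hypothetical stand-in for libav's media type enum, and build_stream_map is an illustrative name. Kept streams get consecutive output indexes; everything else is marked -1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the media types a container can carry. */
typedef enum MediaKind {
    KIND_VIDEO,
    KIND_AUDIO,
    KIND_SUBTITLE,
    KIND_DATA,
    KIND_ATTACHMENT
} MediaKind;

/* Fills map[i] with the output stream index for input stream i,
 * or -1 when the stream is skipped. Returns the number of output streams. */
int build_stream_map(const MediaKind *kinds, size_t count, int *map)
{
    assert(count == 0 || (kinds != NULL && map != NULL));
    int next_output_index = 0;
    for (size_t i = 0; i < count; i++) {
        int keep = kinds[i] == KIND_VIDEO ||
                   kinds[i] == KIND_AUDIO ||
                   kinds[i] == KIND_SUBTITLE;
        map[i] = keep ? next_output_index++ : -1;
    }
    return next_output_index;
}
```

An input with video, data, audio and subtitle streams maps to 0, -1, 1, 2, which is why the packet loop later has to rewrite each packet's stream index before writing it.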

Now we can create the output file.

    if (!(output_format_context->oformat->flags & AVFMT_NOFILE)) {
        ret = avio_open(&output_format_context->pb, out_filename, AVIO_FLAG_WRITE);
        if (ret < 0) {
            fprintf(stderr, "Could not open output file '%s'", out_filename);
            goto end;
        }
    }

    ret = avformat_write_header(output_format_context, NULL);
    if (ret < 0) {
        fprintf(stderr, "Error occurred when opening output file\n");
        goto end;
    }

After that, we can copy the streams, packet by packet, from our input to our output streams. We loop while there are packets (av_read_frame); for each packet we need to recalculate the PTS and DTS, and finally write it (av_interleaved_write_frame) to our output format context.

    while (1) {
        AVStream *in_stream, *out_stream;
        ret = av_read_frame(input_format_context, &packet);
        if (ret < 0)
            break;
        in_stream = input_format_context->streams[packet.stream_index];
        // skip packets from streams we decided not to remux
        if (packet.stream_index >= number_of_streams || streams_list[packet.stream_index] < 0) {
            av_packet_unref(&packet);
            continue;
        }
        packet.stream_index = streams_list[packet.stream_index];
        out_stream = output_format_context->streams[packet.stream_index];
        /* copy packet */
        packet.pts = av_rescale_q_rnd(packet.pts, in_stream->time_base, out_stream->time_base, AV_ROUND_NEAR_INF | AV_ROUND_PASS_MINMAX);
        packet.dts = av_rescale_q_rnd(packet.dts, in_stream->time_base, out_stream->time_base, AV_ROUND_NEAR_INF | AV_ROUND_PASS_MINMAX);
        packet.duration = av_rescale_q(packet.duration, in_stream->time_base, out_stream->time_base);
        // https://ffmpeg.org/doxygen/trunk/structAVPacket.html#ab5793d8195cf4789dfb3913b7a693903
        packet.pos = -1;

        // https://ffmpeg.org/doxygen/trunk/group__lavf__encoding.html#ga37352ed2c63493c38219d935e71db6c1
        ret = av_interleaved_write_frame(output_format_context, &packet);
        if (ret < 0) {
            fprintf(stderr, "Error muxing packet\n");
            break;
        }
        av_packet_unref(&packet);
    }
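The PTS and DTS have to be recalculated because the input and output streams usually count time in different units. The sketch below shows the idea behind that rescaling with plain integer math; it is not libav's implementation. Rational and rescale_ts are hypothetical names, rounding is to nearest with ties away from zero, and there is no overflow protection or special handling of "no timestamp" sentinel values (the case AV_ROUND_PASS_MINMAX covers in libav).

```c
#include <assert.h>
#include <stdint.h>

/* A rational number num/den; simplified stand-in for libav's AVRational. */
typedef struct Rational {
    int num;
    int den;
} Rational;

/* Converts a timestamp counted in `src` units into `dst` units:
 * result = ts * src / dst, rounded to nearest, ties away from zero.
 * Sketch only: large timestamps can overflow the intermediate product. */
int64_t rescale_ts(int64_t ts, Rational src, Rational dst)
{
    assert(src.num > 0 && src.den > 0 && dst.num > 0 && dst.den > 0);
    int64_t mul = (int64_t)src.num * dst.den;  /* numerator of src / dst */
    int64_t div = (int64_t)src.den * dst.num;  /* denominator of src / dst */
    int64_t scaled = ts * mul;
    int64_t half = div / 2;
    if (scaled >= 0)
        return (scaled + half) / div;
    return -((-scaled + half) / div);
}
```

By this arithmetic, a PTS of 6000 in a 1/60000 timebase becomes 9000 in a 1/90000 timebase; both represent 0.1 seconds.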

Finally, we need to write the stream trailer to the output media file with the av_write_trailer function.

    av_write_trailer(output_format_context);

Now we're ready to test it. The first test is a format (video container) conversion from an MP4 to an MPEG-TS video file. We're basically reproducing the command line ffmpeg -i input.mp4 -c copy output.ts with libav.

make run_remuxing_ts

It works! We can check the result with ffprobe:

    ffprobe -i remuxed_small_bunny_1080p_60fps.ts

    Input #0, mpegts, from 'remuxed_small_bunny_1080p_60fps.ts':
      Duration: 00:00:10.03, start: 0.000000, bitrate: 2751 kb/s
      Program 1
        Metadata:
          service_name    : Service01
          service_provider: FFmpeg
        Stream #0:0[0x100]: Video: h264 (High) ([27][0][0][0] / 0x001B), yuv420p(progressive), 1920x1080 [SAR 1:1 DAR 16:9], 60 fps, 60 tbr, 90k tbn, 120 tbc
        Stream #0:1[0x101]: Audio: ac3 ([129][0][0][0] / 0x0081), 48000 Hz, 5.1(side), fltp, 320 kb/s

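To sum up what we did here, we can revisit our initial idea of how libav works, noting that we skipped the codec part.

Before we end this chapter, I'd like to show an important part of the remuxing process: you can pass options to the muxer. Let's say we want to deliver MPEG-DASH; for that we need to use fragmented mp4 (sometimes referred to as fmp4) instead of MPEG-TS or plain MPEG-4.

With the command line, we can do that easily.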

ffmpeg -i non_fragmented.mp4 -movflags frag_keyframe+empty_moov+default_base_moof fragmented.mp4

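The libav version is almost as easy as the command line: we just need to pass the options when writing the output header, right before copying the packets.

    AVDictionary *opts = NULL;
    av_dict_set(&opts, "movflags", "frag_keyframe+empty_moov+default_base_moof", 0);
    ret = avformat_write_header(output_format_context, &opts);

Now we can generate this fragmented mp4 file:

make run_remuxing_fragmented_mp4

But don't just take my word for it. You can use the amazing site/tool gpac/mp4box.js or the site http://mp4parser.com/ to see the differences. First, load the "common" mp4.

As you can see, it has a single mdat atom/box, which is where the video and audio frames are. Now load the fragmented mp4 to see how it spreads out the mdat boxes.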
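The same check can be approximated in code. In the MP4 (ISO base media) layout, each top-level box starts with a 4-byte big-endian size followed by a 4-byte type, so counting mdat boxes only requires walking those headers. The function below is a minimal sketch operating on a buffer already loaded in memory; count_top_level_boxes and read_be are hypothetical names, and nested boxes are deliberately not inspected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reads a big-endian unsigned integer of `bytes` bytes (at most 8). */
static uint64_t read_be(const unsigned char *p, int bytes)
{
    uint64_t value = 0;
    for (int i = 0; i < bytes; i++)
        value = (value << 8) | p[i];
    return value;
}

/* Counts top-level boxes with the given four-character type.
 * Returns -1 when a box header is truncated or a size is invalid. */
int count_top_level_boxes(const unsigned char *data, size_t len, const char *type)
{
    assert(data != NULL || len == 0);
    size_t pos = 0;
    int count = 0;
    while (pos < len) {
        size_t remaining = len - pos;
        if (remaining < 8)
            return -1;
        uint64_t size = read_be(data + pos, 4);
        size_t header = 8;
        if (size == 1) {            /* a 64-bit size follows the type */
            if (remaining < 16)
                return -1;
            size = read_be(data + pos + 8, 8);
            header = 16;
        } else if (size == 0) {     /* box extends to the end of the data */
            size = remaining;
        }
        if (size < header || size > remaining)
            return -1;
        if (memcmp(data + pos + 4, type, 4) == 0)
            count++;
        pos += (size_t)size;
    }
    return count;
}
```

Called with the type "mdat" on the bytes of each file, this should report one box for the "common" mp4 described above and several for the fragmented one.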
