What Exactly Is WebCodecs, the API Browsers Now Support?

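Introduction

Modern web technologies provide rich ways to work with video: the Media Stream API, Media Recording API, Media Source API, and WebRTC API together form a rich toolset for recording, transferring, and playing video streams.

WebCodecs is supported in Chrome 94 and later.

When solving certain advanced tasks, however, the APIs listed above do not let web developers work with the individual components of a video stream, such as frames and unmuxed chunks of encoded video or audio. To get low-level access to these basic components, developers have been using WebAssembly to bring video and audio codecs into the browser. But given that modern browsers already ship with a variety of codecs, repackaging them as WebAssembly seems like a waste of human and computer resources.

The WebCodecs API removes this inefficiency by giving developers a way to use the media components that are already present in the browser. Specifically:

Video and audio decoders
Video and audio encoders
Raw video frames
Image decoders

The WebCodecs API is useful for web applications that need full control over how media content is processed, such as video editors, video conferencing, and video streaming.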

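1. Video processing workflow

Frames are the centerpiece of video processing, so in WebCodecs most classes either consume or produce frames. Video encoders convert frames into encoded chunks, and video decoders do the opposite. You can check whether the browser supports WebCodecs as follows: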

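if (window.isSecureContext) {
  // The page is in a secure context, so features that require one
  // (service workers here, and WebCodecs below) can be used.
  navigator.serviceWorker.register("/offline-worker.js").then(()=> {
  });
}
if ('VideoEncoder' in window) {
  // The WebCodecs API is supported.
}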

Keep in mind that WebCodecs is only available in secure contexts, so the detection will fail if self.isSecureContext is false.
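
The two checks can be folded into one helper. The following is a minimal sketch rather than an official API: detectWebCodecs is a hypothetical name, and the scope parameter is an assumption added so that the same function works in a window, in a worker, or against a stand-in object.

```javascript
// Reports which WebCodecs entry points a global scope exposes.
// `scope` defaults to globalThis, which is `window` on a page and `self` in a worker.
function detectWebCodecs(scope = globalThis) {
  // WebCodecs is only exposed in secure contexts.
  const secure = scope.isSecureContext === true;
  const has = (name) => secure && typeof scope[name] === "function";
  return {
    secure,
    videoEncoder: has("VideoEncoder"),
    videoDecoder: has("VideoDecoder"),
    audioEncoder: has("AudioEncoder"),
    audioDecoder: has("AudioDecoder"),
    imageDecoder: has("ImageDecoder"),
  };
}
```

A page could call detectWebCodecs() once at startup and fall back to another code path when videoEncoder or videoDecoder is false.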

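VideoFrame plays nicely with other web APIs: it is a CanvasImageSource and has a constructor that accepts a CanvasImageSource. It can therefore be used in functions such as drawImage() and texImage2D(). It can also be constructed from canvases, bitmaps, video elements, and other video frames. The VideoFrame constructor has the following forms: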

new VideoFrame(image)
new VideoFrame(image, options)
new VideoFrame(data, options)
// The second argument is an options object.

Here is a typical VideoFrame example:

const pixelSize=4;
const init={timestamp: 0, codedWidth: 320, codedHeight: 200, format: 'RGBA'};
let data=new Uint8Array(init.codedWidth * init.codedHeight * pixelSize);
for (let x=0; x < init.codedWidth; x++) {
  for (let y=0; y < init.codedHeight; y++) {
    let offset=(y * init.codedWidth + x) * pixelSize;
    data[offset]=0x7F;      // Red
    data[offset + 1]=0xFF;  // Green
    data[offset + 2]=0xD4;  // Blue
    data[offset + 3]=0x0FF; // Alpha
  }
}
let frame=new VideoFrame(data, init);
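
The buffer layout used above is row-major with 4 bytes per pixel, so pixel (x, y) starts at byte (y * width + x) * 4. As a small sketch of that arithmetic, the helpers below build the same kind of buffer without touching VideoFrame; rgbaOffset and solidRgbaBuffer are hypothetical names introduced only for illustration.

```javascript
const BYTES_PER_PIXEL = 4; // R, G, B, A

// Byte offset of pixel (x, y) in a tightly packed, row-major RGBA buffer.
function rgbaOffset(x, y, width) {
  return (y * width + x) * BYTES_PER_PIXEL;
}

// Builds a width x height RGBA buffer filled with a single color.
function solidRgbaBuffer(width, height, [r, g, b, a]) {
  const data = new Uint8Array(width * height * BYTES_PER_PIXEL);
  for (let offset = 0; offset < data.length; offset += BYTES_PER_PIXEL) {
    data[offset] = r;
    data[offset + 1] = g;
    data[offset + 2] = b;
    data[offset + 3] = a;
  }
  return data;
}
```

In a browser, the result of solidRgbaBuffer(320, 200, [0x7f, 0xff, 0xd4, 0xff]) could be passed as the data argument of the VideoFrame constructor together with the init object shown above.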

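The WebCodecs API works in tandem with the classes from the Insertable Streams API, which connect WebCodecs to media stream tracks.

The MediaStreamTrack interface represents a single media track within a stream; typically these are audio or video tracks, but other track types may exist as well.

MediaStreamTrackProcessor breaks a media track into individual frames.
MediaStreamTrackGenerator creates a media track from a stream of frames.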

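1.1. WebCodecs and Web Workers

By design, the WebCodecs API does all the heavy lifting asynchronously and off the main thread. But since frame and chunk callbacks can often be called multiple times per second, they might clutter the main thread and make the website less responsive. It is therefore preferable to move the handling of individual frames and encoded chunks into a Web Worker.

To help with that, ReadableStream provides a convenient way to automatically transfer all frames coming from a media track to the worker. For example, MediaStreamTrackProcessor can be used to obtain a ReadableStream for a media stream track coming from the web camera. The stream is then transferred to a Web Worker, where frames are read one by one and queued into a VideoEncoder.

The ReadableStream interface of the Streams API represents a readable stream of data. The Fetch API, for example, offers a concrete instance of ReadableStream through the body property of a Response object.

With HTMLCanvasElement.transferControlToOffscreen, even rendering can be done off the main thread. If none of the high-level tools fit, VideoFrame itself is transferable and can be moved between workers.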
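
The hand-off described above might look roughly like this. It is a sketch under stated assumptions: sendTrackToWorker and consumeFrames are hypothetical helper names, the message shape { type: "frames", readable } is made up for illustration, and the processor constructor is passed as a parameter only so the logic can be exercised outside a browser.

```javascript
// Main thread: wrap the track in a processor and transfer its stream to a worker.
// The readable is listed in the transfer list, so it is moved rather than copied.
function sendTrackToWorker(track, worker, ProcessorCtor = globalThis.MediaStreamTrackProcessor) {
  const processor = new ProcessorCtor({ track });
  const readable = processor.readable;
  worker.postMessage({ type: "frames", readable }, [readable]);
  return readable;
}

// Worker side: read frames one by one, hand each to `onFrame`,
// and always close the frame afterwards so its memory is released.
async function consumeFrames(readable, onFrame) {
  const reader = readable.getReader();
  let count = 0;
  for (;;) {
    const { done, value: frame } = await reader.read();
    if (done) break;
    try {
      await onFrame(frame);
      count++;
    } finally {
      frame.close();
    }
  }
  return count;
}
```

In the worker, the message handler would call consumeFrames(event.data.readable, (frame) => encoder.encode(frame)), where encoder is a VideoEncoder created inside the worker.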

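2. Encoding

It all starts with a VideoFrame. There are three ways to construct video frames.

From an image source such as a canvas, an image bitmap, or a video element

const canvas=document.createElement("canvas");
// Draw something on the canvas...
const frameFromCanvas=new VideoFrame(canvas, { timestamp: 0 });

Use MediaStreamTrackProcessor to pull frames from a MediaStreamTrack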

const stream=await navigator.mediaDevices.getUserMedia({
  audio: true,
  video: {
    width: { min: 1024, ideal: 1280, max: 1920 },
    height: { min: 576, ideal: 720, max: 1080 }
  }
});
// Constraint reference: https://developer.mozilla.org/en-US/docs/Web/API/MediaDevices/getUserMedia
// The stream also carries an audio track, so select the video track explicitly.
const track=stream.getVideoTracks()[0];
const trackProcessor=new MediaStreamTrackProcessor({ track });
const reader=trackProcessor.readable.getReader();
while (true) {
  // Read the next frame from the track.
  const result=await reader.read();
  if (result.done) break;
  const frameFromCamera=result.value;
  // Use the frame here, then release it.
  frameFromCamera.close();
}

Create a frame from its binary pixel representation in a BufferSource

const pixelSize=4;
const init={
  timestamp: 0,
  codedWidth: 320,
  codedHeight: 200,
  format: "RGBA",
};
const data=new Uint8Array(init.codedWidth * init.codedHeight * pixelSize);
// Allocate the pixel buffer: 4 bytes (RGBA) per pixel
for (let x=0; x < init.codedWidth; x++) {
  for (let y=0; y < init.codedHeight; y++) {
    const offset=(y * init.codedWidth + x) * pixelSize;
    data[offset]=0x7f;      // Red
    data[offset + 1]=0xff;  // Green
    data[offset + 2]=0xd4;  // Blue
    data[offset + 3]=0x0ff; // Alpha
  }
}
const frame=new VideoFrame(data, init);
// Construct the VideoFrame from the raw pixels

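Whichever way they are created, frames can be encoded into EncodedVideoChunk objects with a VideoEncoder. Before encoding, VideoEncoder needs to be given two JavaScript objects:

An init dictionary with two functions for handling encoded chunks and errors. These functions are developer-defined and cannot be changed after they are passed to the VideoEncoder constructor.
An encoder configuration object, which contains parameters for the output video stream. You can change these parameters later by calling configure().

The configure() method throws NotSupportedError if the configuration is not supported by the browser. You are encouraged to call the static method VideoEncoder.isConfigSupported() with the configuration to check beforehand whether it is supported, and wait for its promise.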

const init={
  // Called with each encoded chunk
  output: handleChunk,
  // Called when the encoder hits an error
  error: (e)=> {
    console.log(e.message);
  },
};
const config={
  codec: "vp8",
  width: 640,
  height: 480,
  bitrate: 2_000_000, // 2 Mbps
  framerate: 30,
};
const { supported }=await VideoEncoder.isConfigSupported(config);
// Only create the encoder if this configuration is supported
if (supported) {
  const encoder=new VideoEncoder(init);
  encoder.configure(config);
} else {
  // Try another config.
}

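Once the encoder is set up, it is ready to accept frames via the encode() method. Both configure() and encode() return immediately without waiting for the actual work to complete. This allows several frames to be queued for encoding at the same time, while encodeQueueSize shows how many requests are waiting in the queue for previous encodes to finish.

Errors are reported either by immediately throwing an exception, when the arguments or the order of method calls violate the API contract, or by calling the error() callback for problems encountered in the codec implementation. If encoding completes successfully, the output() callback is called with a new encoded chunk as an argument.

Another important detail is that frames need to be told when they are no longer needed, by calling close().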

let frameCounter=0;
const track=stream.getVideoTracks()[0];
const trackProcessor=new MediaStreamTrackProcessor({ track });
const reader=trackProcessor.readable.getReader();
while (true) {
  const result=await reader.read();
  if (result.done) break;
  const frame=result.value;
  if (encoder.encodeQueueSize > 2) {
    // Too many frames in flight: the encoder is overwhelmed, so drop this frame.
    frame.close();
  } else {
    frameCounter++;
    // Request a key frame every 150 frames.
    const keyFrame=frameCounter % 150==0;
    encoder.encode(frame, { keyFrame });
    frame.close();
  }
}

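Finally, it is time to finish the encoding code by writing a function that handles the chunks of encoded video as they come out of the encoder. Usually this function would send the data chunks over the network or mux them into a media container for storage.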

function handleChunk(chunk, metadata) {
  if (metadata.decoderConfig) {
    // Decoder needs to be configured (or reconfigured) with new parameters
    // when metadata has a new decoderConfig.
    // Usually it happens in the beginning or when the encoder has a new
    // codec specific binary configuration. (VideoDecoderConfig.description).
    fetch("/upload_extra_data", {
      method: "POST",
      headers: { "Content-Type": "application/octet-stream" },
      body: metadata.decoderConfig.description,
    });
  }
  // Copy the encoded bytes out of the chunk
  const chunkData=new Uint8Array(chunk.byteLength);
  chunk.copyTo(chunkData);
  fetch(`/upload_chunk?timestamp=${chunk.timestamp}&type=${chunk.type}`, {
    method: "POST",
    headers: { "Content-Type": "application/octet-stream" },
    body: chunkData,
  });
}

If at some point you need to make sure that all pending encoding requests have completed, you can call flush() and wait for its promise.

await encoder.flush();

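3. Decoding

Setting up a VideoDecoder is similar to what was done for the VideoEncoder: two functions are passed when the decoder is created, and codec parameters are given to configure().

The set of codec parameters varies from codec to codec. For example, the H.264 codec might need a binary blob of AVCC, unless it is encoded in the so-called Annex B format (encoderConfig.avc={ format: "annexb" }).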

const init={
  output: handleFrame,
  error: (e)=> {
    console.log(e.message);
  },
};

const config={
  codec: "vp8",
  codedWidth: 640,
  codedHeight: 480,
};
const { supported }=await VideoDecoder.isConfigSupported(config);
if (supported) {
  // Create the decoder
  const decoder=new VideoDecoder(init);
  // Configure the decoder
  decoder.configure(config);
} else {
  // Try another config.
}

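Once the decoder is initialized, you can start feeding it EncodedVideoChunk objects. To create a chunk, you need:

A BufferSource of encoded video data
The chunk's start timestamp in microseconds (the media time of the first encoded frame in the chunk)
The chunk's type, one of:
key if the chunk can be decoded independently of previous chunks
delta if the chunk can only be decoded after one or more previous chunks have been decoded

Also, any chunks emitted by the encoder are ready for the decoder as is. Everything said above about error reporting and the asynchronous nature of the encoder's methods is equally true for decoders.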

const responses=await downloadVideoChunksFromServer(timestamp);
for (let i=0; i < responses.length; i++) {
  const chunk=new EncodedVideoChunk({
    timestamp: responses[i].timestamp,
    type: responses[i].key ? "key" : "delta",
    data: new Uint8Array(responses[i].body),
  });
  decoder.decode(chunk);
}
await decoder.flush();
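
Since chunk timestamps are expressed in microseconds, it can help to derive them from a frame index when the source has a constant frame rate. The sketch below assumes exactly that; frameTimestampUs and chunkInit are hypothetical helpers, and the returned object only mirrors the type, timestamp, duration, and data members accepted by the EncodedVideoChunk constructor.

```javascript
// Media time of frame `index` at a constant frame rate, in whole microseconds.
function frameTimestampUs(index, fps) {
  return Math.round((index * 1_000_000) / fps);
}

// Builds an init object suitable for `new EncodedVideoChunk(init)`.
// `bytes` is a BufferSource holding the encoded data for one frame.
function chunkInit(index, fps, isKey, bytes) {
  return {
    type: isKey ? "key" : "delta",
    timestamp: frameTimestampUs(index, fps),
    duration: Math.round(1_000_000 / fps),
    data: bytes,
  };
}
```

In a browser, decoder.decode(new EncodedVideoChunk(chunkInit(i, 30, i === 0, bytes))) would then feed frame i of a 30 fps stream to the decoder.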

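Now it is time to show how a freshly decoded frame can be displayed on the page. It is best to make sure that the decoder output callback (handleFrame()) returns quickly. In the example below, it only adds a frame to the queue of frames ready for rendering. Rendering happens separately and consists of two steps:

Waiting for the right time to show the frame
Drawing the frame on the canvas

Once a frame is no longer needed, call close() to release the underlying memory before the garbage collector gets to it; this reduces the average amount of memory used by the web application.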

const canvas=document.getElementById("canvas");
const ctx=canvas.getContext("2d");
let pendingFrames=[];
let underflow=true;
let baseTime=0;

function handleFrame(frame) {
  pendingFrames.push(frame);
  if (underflow) setTimeout(renderFrame, 0);
}

function calculateTimeUntilNextFrame(timestamp) {
  if (baseTime==0) baseTime=performance.now();
  let mediaTime=performance.now() - baseTime;
  return Math.max(0, timestamp / 1000 - mediaTime);
}

async function renderFrame() {
  underflow=pendingFrames.length==0;
  if (underflow) return;
  const frame=pendingFrames.shift();
  // Based on the frame's timestamp calculate how much of real time waiting
  // is needed before showing the next frame.
  const timeUntilNextFrame=calculateTimeUntilNextFrame(frame.timestamp);
  await new Promise((r)=> {
    setTimeout(r, timeUntilNextFrame);
  });
  ctx.drawImage(frame, 0, 0);
  frame.close();
  // Immediately schedule rendering of the next frame
  setTimeout(renderFrame, 0);
}
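
The timing step is easier to reason about, and to test, when the clock is passed in explicitly. The following is a sketch of the same calculation as calculateTimeUntilNextFrame(), with the assumption that the caller records the base time itself; delayUntilFrameMs is a hypothetical name.

```javascript
// How long to wait, in milliseconds, before showing a frame.
//   timestampUs: the frame's media timestamp in microseconds
//   baseTimeMs:  wall-clock time at which playback started
//   nowMs:       current wall-clock time
// Frames that are already late get a delay of 0 and are shown immediately.
function delayUntilFrameMs(timestampUs, baseTimeMs, nowMs) {
  const mediaTimeMs = nowMs - baseTimeMs;
  return Math.max(0, timestampUs / 1000 - mediaTimeMs);
}
```

With playback started at 1000 ms and the clock now at 1010 ms, a frame stamped 33333 µs is due in about 23.3 ms, while the same frame seen at 1100 ms is late and gets a delay of 0.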

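Requirements

Decode an encoded video stream into raw video data. The encoded stream can come from a network stream or a file, and once decoded it can be rendered to the screen.

How it works

As we know, encoded data is only meant for transmission and cannot be rendered to the screen directly. Here FFmpeg is used to parse the encoded video stream in a file and decode the compressed video data (H.264/H.265) into raw video data in a specified format (YUV, RGB) so that it can be rendered to the screen. Note: this example focuses on decoding and relies on an FFmpeg build, a video parsing module, and a rendering module; links to all of these are given in the reading prerequisites.

Overall architecture and simplified flow

FFmpeg parse flow

Create the format context: avformat_alloc_context
Open the file stream: avformat_open_input
Find the stream information: avformat_find_stream_info
Get the index of the audio or video stream: formatContext->streams[i]->codecpar->codec_type==(isVideoStream ? AVMEDIA_TYPE_VIDEO : AVMEDIA_TYPE_AUDIO)
Get the audio or video stream: m_formatContext->streams[m_audioStreamIndex]
Read audio and video packets: av_read_frame
Get the extra data: av_bitstream_filter_filter

FFmpeg decode flow

Determine the hardware device type: enum AVHWDeviceType av_hwdevice_find_type_by_name(const char *name)
Find the best video stream and its decoder: int av_find_best_stream(AVFormatContext *ic, enum AVMediaType type, int wanted_stream_nb, int related_stream, AVCodec **decoder_ret, int flags);
Allocate the decoder context: AVCodecContext *avcodec_alloc_context3(const AVCodec *codec)
Fill the decoder context: int avcodec_parameters_to_context(AVCodecContext *codec, const AVCodecParameters *par);
Open a device of the specified type: int av_hwdevice_ctx_create(AVBufferRef **device_ctx, enum AVHWDeviceType type, const char *device, AVDictionary *opts, int flags)
Initialize the decoder context: int avcodec_open2(AVCodecContext *avctx, const AVCodec *codec, AVDictionary **options)
Allocate the video frame: AVFrame *av_frame_alloc(void)
Find the first I-frame and start decoding from it: packet.flags==1
Send the parsed compressed data to the decoder: int avcodec_send_packet(AVCodecContext *avctx, const AVPacket *avpkt)
Receive the decoded data: int avcodec_receive_frame(AVCodecContext *avctx, AVFrame *frame)
Construct the timestamp
Store the decoded data in a CVPixelBufferRef and convert it to a CMSampleBufferRef; decoding is then complete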

Quick start

Initialize the preview

- (void)viewDidLoad {
    [super viewDidLoad];
    [self setupUI];
}

- (void)setupUI {
    self.previewView=[[XDXPreviewView alloc] initWithFrame:self.view.frame];
    [self.view addSubview:self.previewView];
    [self.view bringSubviewToFront:self.startBtn];
}

Parse and decode the video data in the file

- (void)startDecodeByFFmpegWithIsH265Data:(BOOL)isH265 {
    NSString *path=[[NSBundle mainBundle] pathForResource:isH265 ? @"testh265" : @"testh264" ofType:@"MOV"];
    XDXAVParseHandler *parseHandler=[[XDXAVParseHandler alloc] initWithPath:path];
    XDXFFmpegVideoDecoder *decoder=[[XDXFFmpegVideoDecoder alloc] initWithFormatContext:[parseHandler getFormatContext] videoStreamIndex:[parseHandler getVideoStreamIndex]];
    decoder.delegate=self;
    [parseHandler startParseGetAVPackeWithCompletionHandler:^(BOOL isVideoFrame, BOOL isFinish, AVPacket packet) {
        if (isFinish) {
            [decoder stopDecoder];
            return;
        }
        
        if (isVideoFrame) {
            [decoder startDecodeVideoDataWithAVPacket:packet];
        }
    }];
}

Render the decoded data to the screen

-(void)getDecodeVideoDataByFFmpeg:(CMSampleBufferRef)sampleBuffer {
    CVPixelBufferRef pix=CMSampleBufferGetImageBuffer(sampleBuffer);
    [self.previewView displayPixelBuffer:pix];
}

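Implementation details

1. Initialize the instance

Because the video source in this example is a file, and the format context is initialized in the parse module, it only needs to be passed into the decoder here.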

- (instancetype)initWithFormatContext:(AVFormatContext *)formatContext videoStreamIndex:(int)videoStreamIndex {
    if (self=[super init]) {
        m_formatContext=formatContext;
        m_videoStreamIndex=videoStreamIndex;
        
        m_isFindIDR=NO;
        m_base_time=0;
        
        [self initDecoder];
    }
    return self;
}

2. Initialize the decoder

- (void)initDecoder {
    // Get the video stream
    AVStream *videoStream=m_formatContext->streams[m_videoStreamIndex];
    // Create the decoder context
    m_videoCodecContext=[self createVideoEncderWithFormatContext:m_formatContext
                                                            stream:videoStream
                                                  videoStreamIndex:m_videoStreamIndex];
    if (!m_videoCodecContext) {
        log4cplus_error(kModuleName, "%s: create video codec failed",__func__);
        return;
    }
    
    // Allocate the video frame
    m_videoFrame=av_frame_alloc();
    if (!m_videoFrame) {
        log4cplus_error(kModuleName, "%s: alloc video frame failed",__func__);
        avcodec_close(m_videoCodecContext);
    }
}

2.1. Create the decoder context

- (AVCodecContext *)createVideoEncderWithFormatContext:(AVFormatContext *)formatContext stream:(AVStream *)stream videoStreamIndex:(int)videoStreamIndex {
    AVCodecContext *codecContext=NULL;
    AVCodec *codec=NULL;
    
    // Name of the hardware device type; here Apple's VideoToolbox hardware decoder
    const char *codecName=av_hwdevice_get_type_name(AV_HWDEVICE_TYPE_VIDEOTOOLBOX);
    // Convert the name back to the corresponding enum value
    enum AVHWDeviceType type=av_hwdevice_find_type_by_name(codecName);
    if (type !=AV_HWDEVICE_TYPE_VIDEOTOOLBOX) {
        log4cplus_error(kModuleName, "%s: Not find hardware codec.",__func__);
        return NULL;
    }
    
    // Find the best video stream and the decoder for it
    int ret=av_find_best_stream(formatContext, AVMEDIA_TYPE_VIDEO, -1, -1, &codec, 0);
    if (ret < 0) {
        log4cplus_error(kModuleName, "av_find_best_stream failure");
        return NULL;
    }
    
    // Allocate the decoder context
    codecContext=avcodec_alloc_context3(codec);
    if (!codecContext){
        log4cplus_error(kModuleName, "avcodec_alloc_context3 failure");
        return NULL;
    }
    
    // Copy the stream's codec parameters into the decoder context
    ret=avcodec_parameters_to_context(codecContext, formatContext->streams[videoStreamIndex]->codecpar);
    if (ret < 0){
        log4cplus_error(kModuleName, "avcodec_parameters_to_context failure");
        avcodec_free_context(&codecContext); // avoid leaking the context on failure
        return NULL;
    }
    
    // Create the hardware device context
    ret=InitHardwareDecoder(codecContext, type);
    if (ret < 0){
        log4cplus_error(kModuleName, "hw_decoder_init failure");
        avcodec_free_context(&codecContext);
        return NULL;
    }
    
    // Open the decoder
    ret=avcodec_open2(codecContext, codec, NULL);
    if (ret < 0) {
        log4cplus_error(kModuleName, "avcodec_open2 failure");
        avcodec_free_context(&codecContext);
        return NULL;
    }
    
    return codecContext;
}

#pragma mark - C Function
AVBufferRef *hw_device_ctx=NULL;
static int InitHardwareDecoder(AVCodecContext *ctx, const enum AVHWDeviceType type) {
    int err=av_hwdevice_ctx_create(&hw_device_ctx, type, NULL, NULL, 0);
    if (err < 0) {
        log4cplus_error("XDXParseParse", "Failed to create specified HW device.\n");
        return err;
    }
    ctx->hw_device_ctx=av_buffer_ref(hw_device_ctx);
    return err;
}

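av_find_best_stream: finds the best stream in the file.

ic: the media file
type: stream type (video, audio, subtitles, ...)
wanted_stream_nb: user-requested stream number, or -1 for automatic selection
related_stream: try to find a stream related to this one, or -1 if none
decoder_ret: if non-NULL, returns the decoder for the selected stream
flags: reserved

avcodec_parameters_to_context: fills the decoder context based on the values from the supplied codec parameters. Any allocated fields in the decoder context that have a corresponding field in par are freed and replaced with duplicates of the corresponding field in par. Fields in the decoder context that have no counterpart in par are not touched.

av_hwdevice_ctx_create: opens a device of the specified type and creates an AVHWDeviceContext for it.
avcodec_open2: initializes the AVCodecContext to use the given AVCodec. The context must have been allocated with avcodec_alloc_context3() before this function is used.

int av_find_best_stream(AVFormatContext *ic,
                        enum AVMediaType type,
                        int wanted_stream_nb,
                        int related_stream,
                        AVCodec **decoder_ret,
                        int flags);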

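2.2. Create the video frame (AVFrame)

An AVFrame is the container for raw decoded audio or video data. It is typically allocated once and then reused multiple times to hold different data (for example, a single AVFrame that holds the frames received from a decoder). In that case, av_frame_unref() frees any references held by the frame and resets it to its original clean state before it is reused.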

    // Get video frame
    m_videoFrame=av_frame_alloc();
    if (!m_videoFrame) {
        log4cplus_error(kModuleName, "%s: alloc video frame failed",__func__);
        avcodec_close(m_videoCodecContext);
    }

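3. Start decoding

First find the first I-frame in the encoded stream, then call avcodec_send_packet to send the compressed data to the decoder. Finally, call avcodec_receive_frame in a loop to receive the decoded video data. Construct the timestamp, take the decoded data as a CVPixelBufferRef, and convert it to a CMSampleBufferRef.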

- (void)startDecodeVideoDataWithAVPacket:(AVPacket)packet {
    if (packet.flags==1 && m_isFindIDR==NO) {
        m_isFindIDR=YES;
        m_base_time=m_videoFrame->pts;
    }
    
    if (m_isFindIDR==YES) {
        [self startDecodeVideoDataWithAVPacket:packet
                             videoCodecContext:m_videoCodecContext
                                    videoFrame:m_videoFrame
                                      baseTime:m_base_time
                              videoStreamIndex:m_videoStreamIndex];
    }
}

- (void)startDecodeVideoDataWithAVPacket:(AVPacket)packet videoCodecContext:(AVCodecContext *)videoCodecContext videoFrame:(AVFrame *)videoFrame baseTime:(int64_t)baseTime videoStreamIndex:(int)videoStreamIndex {
    Float64 current_timestamp=[self getCurrentTimestamp];
    AVStream *videoStream=m_formatContext->streams[videoStreamIndex];
    int fps=DecodeGetAVStreamFPSTimeBase(videoStream);
    
    
    avcodec_send_packet(videoCodecContext, &packet);
    while (0==avcodec_receive_frame(videoCodecContext, videoFrame))
    {
        CVPixelBufferRef pixelBuffer=(CVPixelBufferRef)videoFrame->data[3];
        CMTime presentationTimeStamp=kCMTimeInvalid;
        int64_t originPTS=videoFrame->pts;
        int64_t newPTS=originPTS - baseTime;
        presentationTimeStamp=CMTimeMakeWithSeconds(current_timestamp + newPTS * av_q2d(videoStream->time_base) , fps);
        CMSampleBufferRef sampleBufferRef=[self convertCVImageBufferRefToCMSampleBufferRef:(CVPixelBufferRef)pixelBuffer
                                                                   withPresentationTimeStamp:presentationTimeStamp];
        
        if (sampleBufferRef) {
            if ([self.delegate respondsToSelector:@selector(getDecodeVideoDataByFFmpeg:)]) {
                [self.delegate getDecodeVideoDataByFFmpeg:sampleBufferRef];
            }
            
            CFRelease(sampleBufferRef);
        }
    }
}
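
The timestamp arithmetic inside the loop above can be checked in isolation. The following is a small sketch, written in JavaScript to match the earlier examples rather than the Objective-C of this section: rebasePtsSeconds is a hypothetical helper, the time base is modeled as a { num, den } pair in the way av_q2d() treats an AVRational, and the 1/600 time base in the usage note is only an illustrative value.

```javascript
// Presentation time in seconds for a decoded frame.
//   pts, basePts: timestamps in stream time-base units
//   timeBase:     { num, den }, seconds per time-base unit as a rational
//   startSeconds: wall-clock time at which decoding started
function rebasePtsSeconds(pts, basePts, timeBase, startSeconds) {
  const secondsPerTick = timeBase.num / timeBase.den; // what av_q2d() computes
  return startSeconds + (pts - basePts) * secondsPerTick;
}
```

For example, with a time base of 1/600, a base PTS of 600, and decoding started at 10 seconds, a frame with PTS 1200 is presented at 11 seconds.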

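avcodec_send_packet: sends compressed video frame data to the decoder. Return values:

AVERROR(EAGAIN): input is not accepted in the current state; the user must read output with avcodec_receive_frame() (once all output has been read, the packet should be resent, and the call will not fail with EAGAIN).
AVERROR_EOF: the decoder has been flushed, and no new packets can be sent to it.
AVERROR(EINVAL): the decoder is not opened.
AVERROR(ENOMEM): failed to add the packet to the internal queue.

avcodec_receive_frame: gets decoded data from the decoder. Return values:

AVERROR(EAGAIN): output is not available in this state; the user must try to send new input.
AVERROR_EOF: the decoder has been fully flushed, and there will be no more output frames.
AVERROR(EINVAL): the decoder is not opened.
Other negative values: decoding errors.

4. Stop decoding

Release the related resources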

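Because of a company project, I looked into the C# WinForms version of Tencent's real-time audio/video demo.

Versions for other platforms exist too, but those are not my main responsibility.

After two days of exploration, I worked through a basic overview of its code and how it works. I have only just started with Tencent's audio/video live streaming, and I have not touched C# WinForms for five years.

So the content below may contain some inaccuracies and is for reference only.

Download link for Tencent's demo:

Core classes

The central file of the whole project is TXLiteAVVideoViews. Translated, the name roughly means Tencent's lightweight audio/video views.

This file contains three very important classes, which I consider the core of the audio/video handling at the C# level.

TXLiteAVVideoView inherits from Panel and renders video frames onto the control.
TXLiteAVVideoViewManager implements ITRTCVideoRenderCallback and acts as the bridge: it passes the frame data delivered by the lower layer on to TXLiteAVVideoView.
FrameBufferInfo is the entity that wraps the frame data.

TXLiteAVVideoView

Below is a simplified model with the non-core code removed.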

public class TXLiteAVVideoView: Panel {

	// Frame cache; think of it as the picture shown while the video is paused
	private volatile FrameBufferInfo _mArgbFrame=new FrameBufferInfo();

    public bool AppendVideoFrame(byte[] data, int width, int height, TRTCVideoPixelFormat videoFormat, TRTCVideoRotation rotation) {
		//...
	}
    
	protected override void OnPaint(PaintEventArgs pe) {
		//....
	}
}

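The class inherits from Panel and overrides OnPaint, so its purpose is presumably to draw from the frame data.
_mArgbFrame holds one frame of data at a given moment so that OnPaint can draw it. Where does that frame come from? See the next point.
AppendVideoFrame is called by TXLiteAVVideoViewManager, which passes in byte[] data, the raw, not yet processed frame data.

To summarize: the class receives frame data from TXLiteAVVideoViewManager through AppendVideoFrame, stores it in the field _mArgbFrame, and then calls the refresh method, which triggers the overridden OnPaint to draw.

TXLiteAVVideoViewManager

Again, the simplified code: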

class TXLiteAVVideoViewManager: ITRTCVideoRenderCallback {
    
	private volatile Dictionary<string,TXLiteAVVideoView> _mMapViews;

	public void onRenderVideoFrame(string userId, TRTCVideoStreamType streamType, TRTCVideoFrame frame) {
        //...
    }
}

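The class implements the onRenderVideoFrame method of the ITRTCVideoRenderCallback interface. From the signature, my guess is that the data pulled from the server carries the remote user's id together with the corresponding frame and stream type.
The lower layer probably keeps pulling data and keeps calling this implementation, which passes each frame to the matching TXLiteAVVideoView for rendering.
The _mMapViews field stores the views, with userId-streamType as the key and the TXLiteAVVideoView as the value.

Let's take a brief look at the implementation of onRenderVideoFrame: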

public void onRenderVideoFrame(string userId, TRTCVideoStreamType streamType, TRTCVideoFrame frame) {
	//....
	TXLiteAVVideoView view=null;
	lock(_mMapViews) {
		// TryGetValue leaves view as null when no view is registered for this key
		_mMapViews.TryGetValue(GetKey(userId, streamType), out view);
	}
	// Call AppendVideoFrame to render the frame
	view?.AppendVideoFrame(frame.data, (int) frame.width, (int) frame.height, frame.videoFormat, frame.rotation);
}

In essence, it builds the key with GetKey(userId, streamType), looks up the corresponding view in the Dictionary, and then calls AppendVideoFrame on it.
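
The lookup-and-dispatch pattern can be modeled in a few lines. This is only a sketch, written in JavaScript to match the earlier examples rather than C#: ViewRegistry, viewKey, and appendVideoFrame are hypothetical names, and the key format userId-streamType is assumed from the description above, since the body of GetKey is not shown in the excerpt.

```javascript
// Builds the lookup key for one user's stream (assumed format).
function viewKey(userId, streamType) {
  return `${userId}-${streamType}`;
}

class ViewRegistry {
  constructor() {
    this.views = new Map();
  }

  add(userId, streamType, view) {
    this.views.set(viewKey(userId, streamType), view);
  }

  remove(userId, streamType) {
    return this.views.delete(viewKey(userId, streamType));
  }

  // Forwards a frame to the matching view.
  // Returns false when no view is registered, so late frames are simply dropped.
  dispatch(userId, streamType, frame) {
    const view = this.views.get(viewKey(userId, streamType));
    if (!view) return false;
    view.appendVideoFrame(frame);
    return true;
  }
}
```

Returning false for an unknown key mirrors the null-conditional call in the C# version: frames that arrive for a user whose view has been removed are ignored instead of raising an error.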

FrameBufferInfo

The class is implemented as follows:

class FrameBufferInfo
{
        public byte[] data { get; set; }

        public int width { get; set; }

        public int height { get; set; }

        public bool newFrame { get; set; }

        /**
         * Rotation: the rotation to apply to the frame
         */
        public TRTCVideoRotation rotation { get; set; }
}

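It describes how the frame should be handled.

Conclusion

This is the first analysis of Tencent's real-time audio/video communication; depending on how things go, I may write more if there is something worthwhile to cover.

I hope it helps.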
