Conditional Variational Autoencoders for Hierarchical B-frame Coding

Zong Lin Gao*, Cheng Wei Chen, Yi Chen Yao, Cheng Yuan Ho, Wen Hsiao Peng

*Corresponding author for this work

Research output: Conference contribution (peer-reviewed)

Abstract

In response to the Grand Challenge on Neural Network-based Video Coding at ISCAS 2024, this paper proposes a learned hierarchical B-frame coding scheme. Most learned video codecs concentrate on P-frame coding for RGB content, while B-frame coding for YUV420 content remains largely under-explored. Some early works explore Conditional Augmented Normalizing Flows (CANF) for B-frame coding. However, they suffer from high computational complexity because they stack multiple variational autoencoders (VAEs) and use separate Y and UV codecs. This work aims to develop a lightweight VAE-based B-frame codec in a conditional coding framework. It features (1) multi-scale feature extraction for conditional motion and inter-frame coding, (2) frame-type adaptive coding for better bit allocation, and (3) a lightweight conditional VAE backbone that encodes YUV420 content by a simple conversion into YUV444 content for joint Y and UV coding. Experimental results confirm its superior compression performance to the CANF-based B-frame codec from last year's challenge, at much lower complexity.
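
Feature (3) depends on converting YUV420 input to YUV444 so that luma and chroma can pass through one codec. The abstract does not say how the conversion is done. The sketch below assumes nearest-neighbour replication of each chroma sample into a 2x2 block; the function name `yuv420_to_yuv444` and the list-of-lists plane layout are illustrative choices, not the authors' implementation.

```python
def yuv420_to_yuv444(y, u, v):
    """Upsample 4:2:0 chroma planes to full resolution (assumed method:
    nearest-neighbour replication; the paper may use something else).

    y    : H x W luma plane as a list of rows
    u, v : (H/2) x (W/2) chroma planes as lists of rows
    Returns (y, u444, v444), all H x W.
    """
    height, width = len(y), len(y[0])
    if height % 2 or width % 2:
        raise ValueError("luma dimensions must be even for 4:2:0 input")
    for plane in (u, v):
        if len(plane) != height // 2 or len(plane[0]) != width // 2:
            raise ValueError("chroma planes must be half the luma size")

    def upsample(plane):
        # Each chroma sample covers a 2x2 block of luma positions.
        return [[plane[r // 2][c // 2] for c in range(width)]
                for r in range(height)]

    return y, upsample(u), upsample(v)


# Tiny 2x4 example frame.
y = [[1, 2, 3, 4],
     [5, 6, 7, 8]]
u = [[10, 20]]
v = [[30, 40]]
y444, u444, v444 = yuv420_to_yuv444(y, u, v)
```

After the conversion all three planes share one resolution, so they can be stacked as a single three-channel input for joint Y and UV coding.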

Original language: English
Title of host publication: ISCAS 2024 - IEEE International Symposium on Circuits and Systems
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350330991
Publication status: Published - 2024
Event: 2024 IEEE International Symposium on Circuits and Systems, ISCAS 2024 - Singapore, Singapore
Duration: 19 May 2024 to 22 May 2024

Publication series

Name: Proceedings - IEEE International Symposium on Circuits and Systems
ISSN (Print): 0271-4310

Conference

Conference: 2024 IEEE International Symposium on Circuits and Systems, ISCAS 2024
Country/Territory: Singapore
City: Singapore
Period: 19/05/24 to 22/05/24
