Conditional Variational Autoencoders for Hierarchical B-frame Coding

Zong Lin Gao*, Cheng Wei Chen, Yi Chen Yao, Cheng Yuan Ho, Wen Hsiao Peng

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

In response to the Grand Challenge on Neural Network-based Video Coding at ISCAS 2024, this paper proposes a learned hierarchical B-frame coding scheme. Most learned video codecs concentrate on P-frame coding for the RGB content, while B-frame coding for the YUV420 content remains largely under-explored. Some early works explore Conditional Augmented Normalizing Flows (CANF) for B-frame coding. However, they suffer from high computational complexity because of stacking multiple variational autoencoders (VAE) and using separate Y and UV codecs. This work aims to develop a lightweight VAE-based B-frame codec in a conditional coding framework. It features (1) extracting multi-scale features for conditional motion and inter-frame coding, (2) performing frame-type adaptive coding for better bit allocation, and (3) a lightweight conditional VAE backbone that encodes YUV420 content by a simple conversion into YUV444 content for joint Y and UV coding. Experimental results confirms its superior compression performance to the CANF-based B-frame codec from the last year's challenge while having much reduced complexity.

Original languageEnglish
Title of host publicationISCAS 2024 - IEEE International Symposium on Circuits and Systems
PublisherInstitute of Electrical and Electronics Engineers Inc.
ISBN (Electronic)9798350330991
DOIs
StatePublished - 2024
Event2024 IEEE International Symposium on Circuits and Systems, ISCAS 2024 - Singapore, Singapore
Duration: 19 May 202422 May 2024

Publication series

NameProceedings - IEEE International Symposium on Circuits and Systems
ISSN (Print)0271-4310

Conference

Conference2024 IEEE International Symposium on Circuits and Systems, ISCAS 2024
Country/TerritorySingapore
CitySingapore
Period19/05/2422/05/24

Keywords

  • B-frame coding
  • Learned video coding
  • YUV420

Fingerprint

Dive into the research topics of 'Conditional Variational Autoencoders for Hierarchical B-frame Coding'. Together they form a unique fingerprint.

Cite this