TY - GEN
T1 - On the Rate-Distortion-Complexity Trade-Offs of Neural Video Coding
AU - Chen, Yi Hsin
AU - Ho, Kuan Wei
AU - Benjak, Martin
AU - Ostermann, Jorn
AU - Peng, Wen Hsiao
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
AB - This paper examines the rate-distortion-complexity trade-offs of modern neural video coding. In recent years, much research effort has gone into exploring the full potential of neural video coding. Conditional autoencoders have emerged as the mainstream approach to efficient neural video coding. Their central theme is to leverage both spatial and temporal information for better conditional coding. However, a recent study indicates that conditional coding may suffer from information bottlenecks, potentially performing worse than traditional residual coding. To address this issue, recent conditional coding methods incorporate a large number of high-resolution features as the condition signal, which considerably increases the number of multiply-accumulate operations, the memory footprint, and the model size. Taking DCVC as the common code base, we investigate how the newly proposed conditional residual coding, an emerging school of thought, and its variants may strike a better balance among rate, distortion, and complexity.
KW - Learned video compression
KW - conditional coding
KW - conditional residual coding
UR - http://www.scopus.com/inward/record.url?scp=85211325094&partnerID=8YFLogxK
U2 - 10.1109/MMSP61759.2024.10743250
DO - 10.1109/MMSP61759.2024.10743250
M3 - Conference contribution
AN - SCOPUS:85211325094
T3 - 2024 IEEE 26th International Workshop on Multimedia Signal Processing, MMSP 2024
BT - 2024 IEEE 26th International Workshop on Multimedia Signal Processing, MMSP 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 26th IEEE International Workshop on Multimedia Signal Processing, MMSP 2024
Y2 - 2 October 2024 through 4 October 2024
ER -