Refining Valence-Arousal Estimation with Dual-Stream Label Density Smoothing

Hongxia Xie*, I. Hsuan Li, Ling Lo, Hong Han Shuai, Wen Huang Cheng

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citations

Abstract

Emotion recognition through facial expressions remains a long-standing research pursuit, yet the challenges persist, particularly in dynamic real-world scenarios. In-the-wild datasets are hampered by limited emotion annotations due to resource constraints, hindering multi-task methodology advancements. Recent years have witnessed a surge of approaches addressing the valence-arousal problem. However, data imbalance, especially in valence-arousal annotation, persists. This work proposes a novel two-stream valence-arousal estimation network, inspired by MIMAMO Net, leveraging spatial and temporal learning to enhance emotion recognition. Label Density Smoothing (LDS) is introduced to counter skewed distributions. Experimental results showcase the approach's efficacy, achieving a Concordance Correlation Coefficient (CCC) of 0.591 for valence and 0.617 for arousal on the Aff-Wild2 validation set. This work contributes to the advancement of valence-arousal modeling in facial expression recognition.
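The abstract names two quantities without defining them: the Concordance Correlation Coefficient used for evaluation and the label smoothing used against skewed annotations. The sketch below is illustrative only and is not taken from the paper. `ccc` follows the standard definition, 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²), with population moments. `lds_weights` is an assumption about how such smoothing is commonly done (histogram the labels, smooth the histogram with a Gaussian kernel, weight each sample by the inverse of the smoothed density); the function name, the bin count, the [-1, 1] label range, and the kernel parameters are all hypothetical choices, not the authors' settings.

```python
import math
from typing import List, Sequence


def ccc(pred: Sequence[float], target: Sequence[float]) -> float:
    """Concordance Correlation Coefficient with population (biased) moments."""
    n = len(pred)
    if n == 0 or n != len(target):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    var_p = sum((p - mean_p) ** 2 for p in pred) / n
    var_t = sum((t - mean_t) ** 2 for t in target) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target)) / n
    denom = var_p + var_t + (mean_p - mean_t) ** 2
    # CCC is undefined when both inputs are the same constant;
    # returning 0.0 here is a convention chosen for this sketch.
    return 2.0 * cov / denom if denom > 0 else 0.0


def lds_weights(
    labels: Sequence[float],
    n_bins: int = 20,      # hypothetical bin count
    lo: float = -1.0,      # assumed label range lower bound
    hi: float = 1.0,       # assumed label range upper bound
    sigma: float = 2.0,    # hypothetical kernel std, in bins
    radius: int = 5,       # hypothetical kernel half-width, in bins
) -> List[float]:
    """Per-sample weights from the inverse of a kernel-smoothed label density."""
    if not labels:
        raise ValueError("labels must be non-empty")
    width = (hi - lo) / n_bins

    # 1. Empirical label histogram (out-of-range labels are clamped).
    idx = [min(n_bins - 1, max(0, int((y - lo) / width))) for y in labels]
    counts = [0.0] * n_bins
    for i in idx:
        counts[i] += 1.0

    # 2. Normalised Gaussian kernel over neighbouring bins.
    kernel = [math.exp(-(k * k) / (2.0 * sigma * sigma))
              for k in range(-radius, radius + 1)]
    total = sum(kernel)
    kernel = [k / total for k in kernel]

    # 3. Smooth the histogram (bins outside the range count as zero).
    smooth = []
    for b in range(n_bins):
        acc = 0.0
        for j, w in enumerate(kernel):
            src = b + j - radius
            if 0 <= src < n_bins:
                acc += w * counts[src]
        smooth.append(acc)

    # 4. Inverse-density weights, rescaled so that their mean is 1.
    raw = [1.0 / smooth[i] for i in idx]
    mean_raw = sum(raw) / len(raw)
    return [r / mean_raw for r in raw]
```

Under these assumptions, the weights would multiply a per-sample regression loss so that rare valence or arousal values contribute more, while `ccc` would be computed separately for valence and for arousal over a validation set.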

Original language: English
Title of host publication: 2024 IEEE International Conference on Consumer Electronics, ICCE 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350324136
DOIs
State: Published - 2024
Event: 2024 IEEE International Conference on Consumer Electronics, ICCE 2024 - Las Vegas, United States
Duration: 6 Jan 2024 → 8 Jan 2024

Publication series

Name: Digest of Technical Papers - IEEE International Conference on Consumer Electronics
ISSN (Print): 0747-668X
ISSN (Electronic): 2159-1423

Conference

Conference: 2024 IEEE International Conference on Consumer Electronics, ICCE 2024
Country/Territory: United States
City: Las Vegas
Period: 6/01/24 → 8/01/24
