Developing an Interview Recording System with Speaker Recognition and Emotion Classification

Wei Yi Hsieh, Hsun Ching Tsai, Lu An Chen, Tsì Uí İk*

*Corresponding author for this work

Research output: Conference contribution (peer-reviewed)

Abstract

Dialogue records play a crucial role in meetings and counseling, and the speaker’s information is also necessary. Furthermore, if the speaker’s emotions can be captured, the record can more faithfully reflect the speaker’s reactions at that time. This study aims to develop an interview recording system that integrates speaker recognition and speech emotion classification, providing speech transcription, speaker information, and sentence-level emotion recognition. The prototype system consists of three subsystems: the Meeting Recording System serves as the primary user interface for meeting recording and data statistics; the Data Labeling System is used to correct meeting records and as a tool for data collection; the Voiceprint Management System provides functions for speaker registration and voiceprint management. To train the multi-emotion classification model, we re-labeled 11 hours of audio from the NNIME corpus. After performance evaluation, the F1-score of multi-label emotion classification can reach 0.5115, and speaker recognition accuracy can reach 96.39%, while the text records are generated using Microsoft Speech-To-Text API.

Original language: English
Title of host publication: APNOMS 2023 - 24th Asia-Pacific Network Operations and Management Symposium
Subtitle of host publication: Intelligent Management for Enabling the Digital Transformation
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 267-270
Number of pages: 4
ISBN (Electronic): 9788995004395
Publication status: Published - 2023
Event: 24th Asia-Pacific Network Operations and Management Symposium, APNOMS 2023 - Sejong, South Korea
Duration: 6 Sep 2023 to 8 Sep 2023

Publication series

Name: APNOMS 2023 - 24th Asia-Pacific Network Operations and Management Symposium: Intelligent Management for Enabling the Digital Transformation

Conference

Conference: 24th Asia-Pacific Network Operations and Management Symposium, APNOMS 2023
Country/Territory: South Korea
City: Sejong
Period: 6/09/23 to 8/09/23
