Group-and-Conquer for Multi-Speaker Single-Channel Speech Separation

Ya Fan Yen, Hong Han Shuai

Research output: Conference contribution › Peer-reviewed

Abstract

We propose to reduce the difficulty of multi-speaker single-channel speech separation by decomposing the separation process in a group-and-conquer manner. Specifically, in the first stage, we propose a prediction model that estimates the optimal number of groups from the input mixed signal. We also train a group separation model that splits a mixed signal into multiple groups according to that number. By jointly training a vocal network with the triplet cosine loss and a group separation network, the proposed group separation model better learns the latent features of each group. Given the predicted number of groups, the model can therefore automatically separate the input audio signal into several groups. In the second stage, for any group containing more than one speaker, a separation model focuses on fine-grained information to better separate the speech within that group. Experimental results show that our approach outperforms state-of-the-art models by at least 8.68% in SI-SNRi.
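The paper itself provides no code; as a rough illustration of the triplet cosine loss mentioned in the abstract, the following is a minimal sketch. The hinge form, the margin value of 0.5, and the plain-list embeddings are assumptions for illustration, not details taken from the paper.

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def triplet_cosine_loss(anchor, positive, negative, margin=0.5):
    # Hinge-style triplet loss in cosine-similarity space: the loss is
    # zero once the anchor-positive similarity exceeds the
    # anchor-negative similarity by at least `margin` (assumed value).
    return max(0.0, margin - cosine(anchor, positive) + cosine(anchor, negative))

# Example: embeddings from the same group should score near zero loss.
same_group = triplet_cosine_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
mixed_up   = triplet_cosine_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0])
```

In this sketch, `same_group` is 0.0 (the anchor already matches the positive and is orthogonal to the negative), while `mixed_up` incurs a positive penalty, which is the pressure that pushes same-group embeddings together in the latent space.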

Original language: English
Title of host publication: 2024 33rd Wireless and Optical Communications Conference, WOCC 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 165-169
Number of pages: 5
ISBN (electronic): 9798331539658
DOIs
Publication status: Published - 2024
Event: 33rd Wireless and Optical Communications Conference, WOCC 2024 - Hsinchu, Taiwan
Duration: 25 Oct 2024 - 26 Oct 2024

Publication series

Name: 2024 33rd Wireless and Optical Communications Conference, WOCC 2024

