Text-to-Speech with Model Compression on Edge Devices

Wai Wan Koc, Yung Ting Chang, Jian Yu Yu, Tsi Ui Ik

Research output: Conference contribution (peer-reviewed)

Abstract

Voice services have become common in daily life, including traffic navigation, voice assistants, and audiobooks. However, given the cost and variability involved, it is difficult to rely entirely on real voice recordings across different scenarios, so in practice speech synthesis technology is usually used to mimic human voices. Meanwhile, as computing hardware has advanced, the computing power of edge devices has gradually improved, which enables inference with lightweight deep-learning networks. Many deep learning technologies have already been ported to edge devices to create applications such as face recognition, speech recognition, and photo retouching. Therefore, if a speech synthesis network is ported to edge devices, then with the advent of fifth-generation mobile communication (5G) it could provide a basis for more innovative voice services. In this research, the speech synthesis network Tacotron2 [1] + CBHG [2] is ported to an edge device, with the aim of optimizing the model's inference time and number of parameters. The model optimization is based on deep learning network compression techniques, namely quantization, structured pruning, and low-rank matrix approximation, to allow the speech synthesis network to work effectively on edge devices. We also overcome the difference in library support between TensorFlow 1.5 and TensorFlow Lite. After compression, the inference speed of the Tacotron2 speech synthesis network on the edge device is increased by 1.91 times, while the model size is reduced by 86%.
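The three compression techniques named in the abstract can be illustrated on toy data. The following is a minimal sketch in plain Python, not the authors' implementation: the function names (`quantize_int8`, `dequantize`, `prune_rows`, `low_rank_param_count`), the symmetric per-tensor int8 scheme, the L2-norm pruning criterion, and the example sizes are all assumptions chosen for illustration, and nothing here touches Tacotron2 or TensorFlow Lite.

```python
import math


def quantize_int8(weights):
    """Symmetric per-tensor quantization: floats -> integers in [-127, 127] plus one scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map the stored integers back to approximate float weights."""
    return [v * scale for v in q]


def prune_rows(matrix, keep_ratio):
    """Structured pruning: drop whole rows (output units) with the smallest L2 norm."""
    n_keep = max(1, int(len(matrix) * keep_ratio))
    norms = [math.sqrt(sum(x * x for x in row)) for row in matrix]
    ranked = sorted(range(len(matrix)), key=lambda i: norms[i], reverse=True)
    kept = sorted(ranked[:n_keep])  # preserve the original row order
    return [matrix[i] for i in kept], kept


def low_rank_param_count(m, n, r):
    """Parameters when an m x n matrix is replaced by factors of shape m x r and r x n."""
    return r * (m + n)


# Quantization: each weight is stored as one small integer instead of a 32-bit float.
weights = [0.5, -1.27, 0.02, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
print("quantized:", q, "scale:", scale)

# Structured pruning: keep the half of the rows with the largest norm.
matrix = [[0.1, 0.0], [2.0, -1.0], [0.0, 0.05], [1.5, 1.5]]
pruned, kept = prune_rows(matrix, keep_ratio=0.5)
print("kept rows:", kept)

# Low-rank approximation: a hypothetical 1024 x 1024 layer at rank 64
# needs 131072 parameters instead of 1048576.
print("low-rank params:", low_rank_param_count(1024, 1024, 64))
```

In this sketch, quantization shrinks storage per weight, structured pruning removes entire rows so the remaining matrix stays dense, and the low-rank count shows why factorization helps only when the rank is well below the matrix dimensions. How these techniques are combined in the paper, and which layers they are applied to, is not stated in the abstract.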

Original language: English
Title of host publication: 2021 22nd Asia-Pacific Network Operations and Management Symposium, APNOMS 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 114-119
Number of pages: 6
ISBN (Electronic): 9784885523328
Publication status: Published - 8 Sep 2021
Event: 22nd Asia-Pacific Network Operations and Management Symposium, APNOMS 2021 - Virtual, Online, Taiwan
Duration: 8 Sep 2021 to 10 Sep 2021

Publication series

Name: 2021 22nd Asia-Pacific Network Operations and Management Symposium, APNOMS 2021

Conference

Conference: 22nd Asia-Pacific Network Operations and Management Symposium, APNOMS 2021
Country/Territory: Taiwan
City: Virtual, Online
Period: 8/09/21 to 10/09/21
