TY - JOUR
T1 - A 14 μJ/Decision Keyword-Spotting Accelerator With In-SRAM Computing and On-Chip Learning for Customization
AU - Chiang, Yu Hsiang
AU - Chang, Tian Sheuan
AU - Jou, Shyh Jye
N1 - Publisher Copyright:
© 1993-2012 IEEE.
PY - 2022/9/1
Y1 - 2022/9/1
N2 - Keyword spotting (KWS) has gained popularity as a natural way to interact with consumer devices in recent years. However, because of its always-on nature and the variety of speech, it necessitates a low-power design as well as user customization. This article describes a low-power, energy-efficient KWS accelerator with static random access memory (SRAM)-based in-memory computing (IMC) and on-chip learning for user customization. However, IMC is constrained by macro size, limited precision, and nonideal effects. To address these issues, this article proposes bias compensation and fine-tuning using an IMC-aware model design. Furthermore, because learning on low-precision edge devices yields zero error and gradient values due to quantization, this article proposes error scaling and small-gradient accumulation to achieve the same accuracy as ideal model training. Simulation results show that the accuracy loss from 51.08% can be recovered to 89.76% with compensation and fine-tuning, and further improved to 96.71% with user customization. The chip implementation successfully runs the model at only 14 μJ per decision. Compared with state-of-the-art works, the presented design achieves higher energy efficiency with additional on-chip model customization capability for higher accuracy.
AB - Keyword spotting (KWS) has gained popularity as a natural way to interact with consumer devices in recent years. However, because of its always-on nature and the variety of speech, it necessitates a low-power design as well as user customization. This article describes a low-power, energy-efficient KWS accelerator with static random access memory (SRAM)-based in-memory computing (IMC) and on-chip learning for user customization. However, IMC is constrained by macro size, limited precision, and nonideal effects. To address these issues, this article proposes bias compensation and fine-tuning using an IMC-aware model design. Furthermore, because learning on low-precision edge devices yields zero error and gradient values due to quantization, this article proposes error scaling and small-gradient accumulation to achieve the same accuracy as ideal model training. Simulation results show that the accuracy loss from 51.08% can be recovered to 89.76% with compensation and fine-tuning, and further improved to 96.71% with user customization. The chip implementation successfully runs the model at only 14 μJ per decision. Compared with state-of-the-art works, the presented design achieves higher energy efficiency with additional on-chip model customization capability for higher accuracy.
KW - Model personalization
KW - on-chip training
KW - quantized training
UR - http://www.scopus.com/inward/record.url?scp=85132514611&partnerID=8YFLogxK
U2 - 10.1109/TVLSI.2022.3172685
DO - 10.1109/TVLSI.2022.3172685
M3 - Article
AN - SCOPUS:85132514611
SN - 1063-8210
VL - 30
SP - 1184
EP - 1192
JO - IEEE Transactions on Very Large Scale Integration (VLSI) Systems
JF - IEEE Transactions on Very Large Scale Integration (VLSI) Systems
IS - 9
ER -