Performance comparison of wake-up-word detection on mobile devices using various convolutional neural networks

Kim, Sanghong; Lee, Bowon

doi:10.7776/ASK.2020.39.5.454

Abstract

Artificial intelligence assistants that provide speech recognition operate through cloud-based voice recognition with high accuracy. In cloud-based speech recognition, Wake-Up-Word (WUW) detection plays an important role in activating devices on standby. In this paper, we compare the performance of Convolutional Neural Network (CNN)-based WUW detection models for mobile devices on Google's speech commands dataset, using spectrogram and mel-frequency cepstral coefficient features as inputs. The models compared in this paper are a multi-layer perceptron, a general convolutional neural network, VGG16, VGG19, ResNet50, ResNet101, ResNet152, and MobileNet. We also propose a network that reduces the model size to 1/25 while maintaining the performance of MobileNet.
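
The two input features named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' pipeline: the 25 ms window, 10 ms hop, 512-point FFT, 40 mel filters, and 13 coefficients are common defaults assumed here for illustration, and the function names are hypothetical.

```python
import numpy as np

def spectrogram(signal, frame_len=400, hop=160, n_fft=512):
    """Power spectrogram with shape (n_frames, n_fft // 2 + 1)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    # rfft zero-pads each 400-sample frame to n_fft points.
    return np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2

def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sample_rate).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):      # rising edge
            fbank[m - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):     # falling edge
            fbank[m - 1, k] = (right - k) / max(right - center, 1)
    return fbank

def mfcc(signal, sample_rate=16000, n_mels=40, n_mfcc=13, n_fft=512):
    """MFCCs with shape (n_frames, n_mfcc): log mel energies, then a DCT-II."""
    power = spectrogram(signal, n_fft=n_fft)
    mel_energy = power @ mel_filterbank(n_mels, n_fft, sample_rate).T
    log_mel = np.log(mel_energy + 1e-10)   # small offset avoids log(0)
    n = np.arange(n_mels)
    basis = np.cos(np.pi / n_mels * (n[None, :] + 0.5)
                   * np.arange(n_mfcc)[:, None])
    return log_mel @ basis.T

# Synthetic one-second clip at 16 kHz, standing in for a dataset utterance.
clip = np.random.default_rng(0).standard_normal(16000)
spec_features = spectrogram(clip)   # shape (98, 257)
mfcc_features = mfcc(clip)          # shape (98, 13)
```

Under these assumed settings, the spectrogram input is roughly twenty times larger than the MFCC input for the same clip, which is one reason the choice of feature matters for on-device models.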

Keywords

Performance comparison; Wake-up-word detection; Convolutional neural network; Artificial Intelligence (AI) assistant

Title: Performance comparison of wake-up-word detection on mobile devices using various convolutional neural networks

Authors: Kim, Sanghong; Lee, Bowon

DOI: 10.7776/ASK.2020.39.5.454

Publication year: 2020

Type: Article

Journal: The Journal of the Acoustical Society of Korea (한국음향학회지)

Volume: 39

Issue: 5

Pages: 454-460