Improvement of Speech/Music Classification for 3GPP EVS Based on LSTM

Kang, Sang-Ick; Lee, Sangmin

doi:10.3390/sym10110605

Citations (Web of Science): 7

Citations (Scopus): 6

Abstract

Competition in smartphone-related speech recognition technology is intensifying with the spread of Internet of Things (IoT) devices. For robust speech recognition, it is necessary to detect speech signals in various acoustic environments. Speech/music classification, which allows signal processing to be optimized according to the classification result, has been widely adopted as an essential part of various electronics applications, such as multi-rate audio codecs, automatic speech recognition, and multimedia document indexing. In this paper, we propose a new technique that uses long short-term memory (LSTM) to improve the robustness of the speech/music classifier of the enhanced voice service (EVS) codec, which has been adopted as a voice-over-LTE (VoLTE) speech codec. For effective speech/music classification, the feature vectors used with the LSTM are chosen from the features of the EVS. To cope with the diversity of music data, a large-scale dataset is used for training. Experiments show that LSTM-based speech/music classification provides better results than the conventional EVS speech/music classification algorithm across various conditions and types of speech/music data, especially at lower signal-to-noise ratios (SNRs).
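To make the idea of an LSTM-based frame classifier concrete, the following is a minimal sketch of a single LSTM layer that reads a sequence of per-frame feature vectors and outputs a music probability. It is not the architecture from the paper: the feature dimension, hidden size, random untrained weights, and the helper names `init_params`, `lstm_step`, and `music_probability` are all assumptions made for illustration, and the input frames merely stand in for features that would come from the EVS encoder.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def lstm_step(x, h, c, params):
    """One standard LSTM time step.

    params maps each gate name ("i", "f", "o", "g") to (W, b), where W acts
    on the concatenation of the input x and the previous hidden state h.
    """
    z = x + h  # list concatenation: [x ; h]
    gates = {}
    for name in ("i", "f", "o", "g"):
        W, b = params[name]
        pre = [p + bias for p, bias in zip(matvec(W, z), b)]
        act = math.tanh if name == "g" else sigmoid
        gates[name] = [act(p) for p in pre]
    # Cell state: forget part of the old state, add the gated candidate.
    c_new = [f * cp + i * g
             for f, cp, i, g in zip(gates["f"], c, gates["i"], gates["g"])]
    # Hidden state: output gate applied to the squashed cell state.
    h_new = [o * math.tanh(cn) for o, cn in zip(gates["o"], c_new)]
    return h_new, c_new


def init_params(n_in, n_hidden, seed=0):
    """Hypothetical random (untrained) weights, for illustration only."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                for _ in range(rows)]

    params = {name: (mat(n_hidden, n_in + n_hidden), [0.0] * n_hidden)
              for name in ("i", "f", "o", "g")}
    params["out"] = ([rng.uniform(-0.1, 0.1) for _ in range(n_hidden)], 0.0)
    return params


def music_probability(frames, params, n_hidden):
    """Run the LSTM over all frames, then apply a logistic output unit."""
    h = [0.0] * n_hidden
    c = [0.0] * n_hidden
    for x in frames:
        h, c = lstm_step(x, h, c, params)
    w, b = params["out"]
    return sigmoid(sum(wi * hi for wi, hi in zip(w, h)) + b)


if __name__ == "__main__":
    # Five frames of three made-up feature values each.
    frames = [[0.1, 0.2, 0.3]] * 5
    p = music_probability(frames, init_params(3, 4), n_hidden=4)
    print("music" if p >= 0.5 else "speech", round(p, 3))
```

With untrained weights the output is meaningless as a decision; in practice the weights would be learned from labeled speech and music data, and the 0.5 threshold used here is likewise only an assumption.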

Keywords

speech/music classification; Enhanced Voice Service; long short-term memory; big data

Title: Improvement of Speech/Music Classification for 3GPP EVS Based on LSTM

Authors: Kang, Sang-Ick; Lee, Sangmin

DOI: 10.3390/sym10110605

Publication date: 2018-11

Type: Article

Journal: Symmetry

Volume: 10

Issue: 11