Deep Recurrent Neural Network and Psychoacoustic Modeling for Speech Enhancement

Abstract

Monaural speech enhancement is an important topic in signal processing because it serves many real-world applications. In this paper, we propose a monaural speech enhancement method that combines a deep recurrent neural network (DRNN), which separates clean speech from noisy speech, with time-frequency psychoacoustic modeling for speech enhancement. To separate clean speech, two DRNN schemes with three hidden layers are used in this study. One, called a stacked RNN, has temporal connections at every hidden layer; the other has a temporal connection at one specific layer only. The results of the two networks are compared with each other and with the results obtained without psychoacoustic masking.
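
The difference between the two DRNN schemes can be illustrated with a small forward-pass sketch. This is not the authors' implementation: the abstract gives no layer sizes, activations, or training details, so the ReLU activation, the linear output layer, the weight names (W, U, b, V), the helper names init_params and drnn_forward, and the choice of the middle layer as the single recurrent layer are all assumptions made for illustration. The psychoacoustic masking stage is not covered here.

```python
import numpy as np


def init_params(n_in, n_hidden, n_out, n_layers=3, seed=0):
    """Random weights for a DRNN with n_layers hidden layers (hypothetical sizes)."""
    rng = np.random.default_rng(seed)
    sizes = [n_in] + [n_hidden] * n_layers
    return {
        # W[l]: feed-forward weights from layer l-1 (or the input) to layer l
        "W": [0.1 * rng.standard_normal((sizes[l + 1], sizes[l])) for l in range(n_layers)],
        # U[l]: temporal weights from layer l at frame t-1 to layer l at frame t
        "U": [0.1 * rng.standard_normal((n_hidden, n_hidden)) for _ in range(n_layers)],
        "b": [np.zeros(n_hidden) for _ in range(n_layers)],
        # V: linear output layer (an assumption; the abstract does not specify it)
        "V": 0.1 * rng.standard_normal((n_out, n_hidden)),
    }


def drnn_forward(x, params, recurrent_layers):
    """Run the network over x of shape (T, n_in).

    recurrent_layers is the set of hidden-layer indices that receive a
    temporal connection from their own state at the previous frame.
    """
    n_layers = len(params["W"])
    n_hidden = params["b"][0].shape[0]
    h_prev = [np.zeros(n_hidden) for _ in range(n_layers)]
    outputs = []
    for x_t in x:
        layer_in = x_t
        h_now = []
        for l in range(n_layers):
            pre = params["W"][l] @ layer_in + params["b"][l]
            if l in recurrent_layers:
                pre = pre + params["U"][l] @ h_prev[l]
            layer_in = np.maximum(pre, 0.0)  # ReLU, assumed
            h_now.append(layer_in)
        h_prev = h_now
        outputs.append(params["V"] @ layer_in)
    return np.stack(outputs)


params = init_params(n_in=8, n_hidden=16, n_out=8)
# Stand-in for magnitude spectra of noisy speech: 5 frames, 8 frequency bins.
x = np.abs(np.random.default_rng(1).standard_normal((5, 8)))

y_stacked = drnn_forward(x, params, recurrent_layers={0, 1, 2})  # stacked RNN
y_single = drnn_forward(x, params, recurrent_layers={1})         # one recurrent layer
```

With the same weights, the two schemes agree on the first frame, where all previous hidden states are zero, and diverge afterwards, because the stacked RNN carries state forward at every hidden layer while the second scheme does so at only one.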

Title
Deep Recurrent Neural Network and Psychoacoustic Modeling for Speech Enhancement
Author
SANGMIN LEE
Conference
The 2016 International Conference on Artificial Intelligence (ICAI'16)
Venue
Las Vegas, USA
Conference dates
2016-07-24 ~ 2016-07-28