Deep Recurrent Neural Network and Psychoacoustic Modeling for Speech Enhancement

SANGMIN LEE

Abstract

Monaural speech enhancement is an important topic in signal processing because it is used in many real-world applications. In this paper, we propose a monaural speech enhancement method that combines a deep recurrent neural network (DRNN), which separates clean speech from noisy speech, with time-frequency psychoacoustic modeling. To separate the clean speech, this study uses two DRNN schemes, each with three hidden layers. The first, called a stacked RNN, has temporal connections at every hidden layer; the second has a temporal connection at only one specific layer. The results of the two networks are compared with each other and with the results obtained without psychoacoustic masking.
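
The two recurrence schemes named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the activation function, the layer sizes, which layer carries the single temporal connection, or the training procedure, so the ReLU activation, the dimensions, the choice of the middle layer, and the names `init_params` and `drnn_forward` are all assumptions made for illustration. The psychoacoustic masking stage is not shown because the abstract does not describe it.

```python
import numpy as np

def init_params(input_dim, hidden_dim, n_layers=3, seed=0):
    """Random weights for a 3-hidden-layer network (sizes are assumptions)."""
    rng = np.random.default_rng(seed)
    params, d_in = [], input_dim
    for _ in range(n_layers):
        params.append({
            "W": rng.normal(0.0, 0.1, (hidden_dim, d_in)),        # input -> hidden
            "U": rng.normal(0.0, 0.1, (hidden_dim, hidden_dim)),  # hidden(t-1) -> hidden(t)
            "b": np.zeros(hidden_dim),
        })
        d_in = hidden_dim
    return params

def drnn_forward(x, params, recurrent_layers):
    """Forward pass over one sequence x of shape (T, input_dim).

    recurrent_layers is the set of layer indices that have a temporal
    connection; all other layers behave as plain feed-forward layers.
    Returns the top hidden layer's activations, shape (T, hidden_dim).
    """
    prev_h = [np.zeros_like(p["b"]) for p in params]  # hidden states at t-1
    outputs = []
    for x_t in x:
        inp = x_t
        for l, p in enumerate(params):
            pre = p["W"] @ inp + p["b"]
            if l in recurrent_layers:
                pre = pre + p["U"] @ prev_h[l]  # temporal connection
            h = np.maximum(pre, 0.0)            # ReLU (assumed activation)
            prev_h[l] = h
            inp = h
        outputs.append(inp)
    return np.stack(outputs)

params = init_params(input_dim=8, hidden_dim=16)
x = np.random.default_rng(1).normal(size=(5, 8))  # 5 frames of 8 features

# Scheme 1: stacked RNN, temporal connections at every hidden layer.
stacked = drnn_forward(x, params, recurrent_layers={0, 1, 2})
# Scheme 2: temporal connection at one layer only (layer 1 chosen arbitrarily).
single = drnn_forward(x, params, recurrent_layers={1})
```

Both calls share the same weights, so the only difference between the two outputs comes from which layers see their own previous hidden state. At the first frame the previous states are all zero, which means the two schemes produce identical activations there and can only diverge from the second frame onward.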

Title: Deep Recurrent Neural Network and Psychoacoustic Modeling for Speech Enhancement

Author: SANGMIN LEE

Conference: The 2016 International Conference on Artificial Intelligence (ICAI'16)

Location: Las Vegas, USA

Conference dates: 2016-07-24 to 2016-07-28