Wav2Vec 2.0-Based Speech Recognition for Parkinson's Disease Patients

SANGMIN LEE

Abstract

This paper proposes a speech recognition algorithm for the speech of Parkinson's disease patients. Because the speech of Parkinson's disease patients differs significantly from that of healthy speakers, a recognition algorithm suited to these patients is needed. The amount of digitized speech data from Parkinson's disease patients, however, is generally too small to develop a speech recognition algorithm, in contrast to the data available from healthy speakers. The proposed algorithm consists of two steps: pre-training and fine-tuning. The proposed Wav2Vec 2.0-based algorithm is first pre-trained on a very large amount of speech data from healthy speakers and then fine-tuned on a small amount of speech data from Parkinson's disease patients. In the experiment, the proposed algorithm achieved a character error rate of 35.80% on speech from Parkinson's disease patients. Although this rate is still rather high, it is an improvement over conventional algorithms developed using only speech data from healthy speakers. The character error rate is expected to decrease further with a more sophisticated fine-tuning process in the future.
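
The abstract reports its result as a character error rate (CER) but does not say how it was computed. A minimal sketch of one common definition follows, assuming that CER is the character-level Levenshtein distance summed over all utterances and divided by the total number of reference characters. The function names `edit_distance` and `character_error_rate` are hypothetical, and whether spaces or punctuation were counted in the paper is not known; this sketch counts every character in the strings it is given.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference character
                curr[j - 1] + 1,     # insertion of a hypothesis character
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def character_error_rate(references: Sequence[str],
                         hypotheses: Sequence[str]) -> float:
    """Corpus-level CER: total edits divided by total reference characters.

    Assumption: every character, including spaces, counts toward the total.
    """
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(edit_distance(r, h)
                      for r, h in zip(references, hypotheses))
    return total_edits / total_chars
```

Under this definition, a CER of 35.80% would correspond to roughly 36 character edits per 100 reference characters. Multiply the returned fraction by 100 to express it as a percentage.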

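Title: Wav2Vec 2.0-Based Speech Recognition for Parkinson's Disease Patients

Author: SANGMIN LEE

Conference: 2022 Summer Conference of the Institute of Electronics and Information Engineers (대한전자공학회)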