발화 속도와 휴지 구간 길이를 사용한 방언 분류

나종환; 이보원

doi:10.13064/KSSS.2023.15.2.043

Dialect classification based on the speed and the pause of speech utterances

Abstract

In this paper, we propose an approach to dialect classification based on the speed and pauses of speech utterances as well as the age and gender of the speakers. Dialect classification is an important technique in speech analysis; for example, an accurate dialect classification model can potentially improve the performance of speaker or speech recognition. In previous studies, deep learning on Mel-Frequency Cepstral Coefficient (MFCC) features has been the dominant approach. We focus on the acoustic differences between regions and classify dialects using features derived from those differences. Specifically, we extract underexplored additional features, namely the speed and pauses of speech utterances, along with metadata including the age and gender of the speakers. Experimental results show that the proposed approach achieves higher accuracy than the method using only MFCC features, especially with the speech rate feature. Incorporating all the proposed features improved accuracy from 91.02%, for the previous method using only MFCC features, to 97.02%.
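
The abstract does not say how the speed and pause features are measured, so the following is only a minimal sketch of one common way such features could be computed, not the authors' method. It assumes that a pause is a run of low-energy frames lasting at least a minimum duration, and that the syllable count comes from a transcript. The function names, the energy threshold, the frame length, and the minimum pause length are all hypothetical choices made for illustration.

```python
import math


def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    n_frames = len(samples) // frame_len
    return [
        sum(x * x for x in samples[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def pause_segments(energies, threshold, min_frames):
    """Return (start, end) frame pairs for low-energy runs of at least
    min_frames frames; end is exclusive."""
    segments, start = [], None
    # A sentinel above any threshold closes a run that reaches the end.
    for i, energy in enumerate(energies + [float("inf")]):
        if energy < threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_frames:
                segments.append((start, i))
            start = None
    return segments


def rate_and_pause_features(samples, sample_rate, n_syllables,
                            frame_ms=20, threshold=1e-4, min_pause_ms=200):
    """Assumed feature set: syllables per second plus pause statistics."""
    frame_len = int(sample_rate * frame_ms / 1000)
    energies = frame_energies(samples, frame_len)
    min_frames = max(1, int(min_pause_ms / frame_ms))
    frame_s = frame_len / sample_rate
    pause_durs = [(end - start) * frame_s
                  for start, end in pause_segments(energies, threshold, min_frames)]
    total_s = len(energies) * frame_s
    total_pause = sum(pause_durs)
    speaking_s = total_s - total_pause
    return {
        # Syllables per second over the whole utterance, pauses included.
        "speech_rate": n_syllables / total_s if total_s > 0 else 0.0,
        # Syllables per second over the non-pause portion only.
        "articulation_rate": n_syllables / speaking_s if speaking_s > 0 else 0.0,
        "pause_count": len(pause_durs),
        "mean_pause_s": total_pause / len(pause_durs) if pause_durs else 0.0,
        "pause_ratio": total_pause / total_s if total_s > 0 else 0.0,
    }


if __name__ == "__main__":
    # Synthetic utterance: 1 s of tone, 0.5 s of silence, 1 s of tone at 1 kHz.
    tone = [0.5 * math.sin(2 * math.pi * 50 * t / 1000) for t in range(1000)]
    signal = tone + [0.0] * 500 + tone
    print(rate_and_pause_features(signal, sample_rate=1000, n_syllables=10))
```

In a setup like the one the abstract describes, values of this kind would be combined with the MFCC features and the age and gender metadata before classification. How the paper actually combines them is not stated in this record.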

Keywords

dialect classification; feature extraction; low resource conditions

Published: 2023-06

Type: Y

Journal: 말소리와 음성과학 (Phonetics and Speech Sciences)

Volume: 15

Issue: 2

Pages: 43-51