Cohort-Sensitive Labeling: An Effective Approach for Enhancing ASR Performance
- Na, Jonghwan;
- Hasegawa-Johnson, Mark;
- Lee, Bowon
Web of Science citations: 0
Scopus citations: 0
Abstract
This paper proposes cohort-sensitive labeling (CSL) for automatic speech recognition (ASR). CSL is a method that distinguishes data labels based on cohorts, allowing models to learn cohort-specific information. For evaluation, we applied CSL using gender information in the training data of the LibriSpeech dataset. Experimental results demonstrate that the CSL-based approach outperforms methods without CSL, given sufficient training data. Specifically, our method achieved an average word error rate reduction (WERR) of 1.81% on the LibriSpeech test-clean set and 5.76% on the test-other set when more than 100 hours of data were used for training. Moreover, on the TIMIT and Common Voice test sets, it achieved WERR of up to 11.52% and 2.91%, respectively, demonstrating its robustness and generalizability to unseen data. Additionally, the proposed method reached up to 97.21% accuracy in classifying the gender cohort, suggesting that ASR models trained with CSL effectively leverage the cohort information.
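The two ideas in the abstract, cohort-specific labels and WERR, can be illustrated with a minimal Python sketch. The abstract does not describe the actual labeling scheme, so the per-word suffix used here (`word_F`, `word_M`) is only one possible reading and an assumption, as are the function names `cohort_label`, `strip_cohort`, and `relative_werr`. The sketch also assumes that WERR is a relative reduction with respect to a baseline WER, which the abstract does not state explicitly.

```python
from typing import List, Tuple


def cohort_label(transcript: str, cohort: str) -> str:
    """Tag every word with a cohort marker (hypothetical scheme, not the paper's)."""
    return " ".join(f"{word}_{cohort}" for word in transcript.split())


def strip_cohort(labeled: str) -> Tuple[str, List[str]]:
    """Split cohort-tagged tokens back into plain words and per-word cohort tags.

    Assumes every token carries a tag of the form word_COHORT.
    """
    words, cohorts = [], []
    for token in labeled.split():
        word, _, cohort = token.rpartition("_")
        words.append(word)
        cohorts.append(cohort)
    return " ".join(words), cohorts


def relative_werr(baseline_wer: float, new_wer: float) -> float:
    """Relative word error rate reduction in percent (assumed definition)."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Illustrative values only; these are not results from the paper.
labeled = cohort_label("hello world", "F")
plain, tags = strip_cohort(labeled)
print(labeled, "|", plain, "|", tags)
print(relative_werr(10.0, 9.0))
```

Under this reading, a model trained on tagged transcripts emits the cohort together with each word, so removing the tags yields the ordinary transcript for WER scoring, while the tags themselves give a cohort prediction that could be scored for classification accuracy.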
Keywords
- Title
- Cohort-Sensitive Labeling: An Effective Approach for Enhancing ASR Performance
- Authors
- Na, Jonghwan; Hasegawa-Johnson, Mark; Lee, Bowon
- Issue Date
- 2025
- Type
- Proceedings Paper