Towards Scalable and Robust Multilingual ASR for Indian Languages with MixLoRA-Whisper
- Park, Yeseul
- Lee, Bowon
Abstract
India exhibits extensive linguistic diversity, with many regional languages and dialects, yet current multilingual automatic speech recognition (ASR) models provide limited support for them, especially for low-income and rural populations who rely on spoken communication. We apply MixLoRA, a parameter-efficient fine-tuning method originally proposed for large language models, to Whisper to improve ASR performance. MixLoRA employs multiple LoRA experts and dynamically selects the most relevant experts for each token, enabling better modeling of linguistic variation. Fine-tuning at most 25.03% of the parameters on the RESPIN dataset, which covers eight Indian languages with 33 dialects, the model achieves a 4.98% character error rate (CER) on read speech, a 7.09% relative CER reduction over the baseline. Performance improved in all eight languages on read speech and in five of them on spontaneous speech. These results demonstrate that MixLoRA can effectively enhance ASR for low-resource, dialect-rich languages. © 2025 IEEE.
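The per-token expert selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the layer sizes, the number of experts, the top-2 routing, the softmax over the selected experts only, and all function names (`mix_lora_forward`, `top_k_gates`, and so on) are assumptions chosen for illustration. The sketch adds a gated sum of low-rank (LoRA) updates to the output of a frozen weight matrix.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix, used here only to have concrete numbers."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def top_k_gates(logits, k):
    """Keep the k largest router logits and softmax-normalize them."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    peak = max(logits[i] for i in top)
    exps = {i: math.exp(logits[i] - peak) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


def mix_lora_forward(x, frozen_w, router_w, experts, k=2):
    """Frozen projection plus a gated sum of the selected LoRA experts.

    Each expert is a (down, up) pair: down maps d_model -> rank,
    up maps rank -> d_model. Only these pairs and the router would be trained.
    """
    out = matvec(frozen_w, x)                    # frozen pretrained path
    gates = top_k_gates(matvec(router_w, x), k)  # per-token expert choice
    for idx, gate in gates.items():
        down, up = experts[idx]
        delta = matvec(up, matvec(down, x))      # low-rank update for this expert
        out = [o + gate * d for o, d in zip(out, delta)]
    return out, gates


# Hypothetical sizes; real Whisper layers are far larger.
rng = random.Random(0)
d_model, rank, n_experts = 8, 2, 4
frozen_w = rand_matrix(d_model, d_model, rng)
router_w = rand_matrix(n_experts, d_model, rng)
# Random "up" matrices are for illustration; LoRA usually initializes them to zero.
experts = [
    (rand_matrix(rank, d_model, rng), rand_matrix(d_model, rank, rng))
    for _ in range(n_experts)
]
token = [rng.uniform(-1, 1) for _ in range(d_model)]
output, gates = mix_lora_forward(token, frozen_w, router_w, experts, k=2)
print(sorted(gates))  # indices of the experts chosen for this token
```

Because the router scores each token separately, different tokens (and, by extension, different languages or dialects) can activate different experts while the frozen weights stay shared.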
- Title: Towards Scalable and Robust Multilingual ASR for Indian Languages with MixLoRA-Whisper
- Authors: Park, Yeseul; Lee, Bowon
- Publication date: 2025
- Type: Conference paper
- Venue: ASRU 2025 - 2025 IEEE Automatic Speech Recognition and Understanding Workshop