Comparative Analysis of Automatic Speech Recognition Fine-Tuning Strategies for Speech From Cochlear Implant Users

Yoon, Seojin; Kim, Hyunji; Kim, Kyusung; Lee, Sangmin

doi:10.1109/LSP.2025.3640524

Abstract

Although automatic speech recognition technology has become widespread, it still performs poorly on speech from cochlear implant (CI) users. This limitation is a barrier to CI users' access to digital technology. To address this issue, a comparative study of fine-tuning strategies was conducted to adapt Whisper, a general-purpose speech recognition model, to CI users' speech. Specifically, the performance of full fine-tuning, selective fine-tuning, adapters, and LoRA was evaluated on a dataset of speech from Korean CI users. The experimental results showed that all of the fine-tuning approaches improved recognition performance over the baseline Whisper model. Notably, the LoRA-encoder approach, which trained only 2.15% of the total parameters, achieved the best performance with a character error rate of 11.57%, making it both the most accurate and the most parameter-efficient strategy. Furthermore, strategies that fine-tuned only the encoder consistently outperformed those that adjusted the decoder, confirming that the encoder plays a crucial role in modeling the unique acoustic characteristics of CI users' speech.
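
For readers unfamiliar with the metric, the character error rate reported above is conventionally defined as the character-level edit distance between the reference transcript and the recognizer output, divided by the reference length. The following is a minimal sketch of that conventional definition, not the authors' evaluation code. The function names and the `ignore_spaces` option are assumptions made for illustration; the abstract does not state how the paper normalizes text before scoring.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum substitutions, insertions, deletions."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_char in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, hyp_char in enumerate(hyp, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,         # deletion of ref_char
                curr[j - 1] + 1,     # insertion of hyp_char
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str, ignore_spaces: bool = True) -> float:
    """CER = edit distance / number of reference characters.

    ignore_spaces is a hypothetical normalization choice, not taken from the paper.
    """
    if ignore_spaces:
        ref = "".join(ref.split())
        hyp = "".join(hyp.split())
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(ref, hyp) / len(ref)
```

Under this definition, a hypothesis that gets one of five reference characters wrong scores 0.2, and the value can exceed 1.0 when the hypothesis contains many inserted characters. For Korean text, each precomposed Hangul syllable counts as one character here; scoring at the jamo level would give different numbers.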

Keywords

Automatic speech recognition; cochlear implant; parameter-efficient learning; Whisper

Title: Comparative Analysis of Automatic Speech Recognition Fine-Tuning Strategies for Speech From Cochlear Implant Users

Authors: Yoon, Seojin; Kim, Hyunji; Kim, Kyusung; Lee, Sangmin

DOI: 10.1109/LSP.2025.3640524

Publication date: 2026

Type: Article

Journal: IEEE Signal Processing Letters

Volume: 33

Pages: 236-240