Advanced optimization strategies for combining acoustic features and speech recognition error rates in multi-stage classification of Parkinson's disease severity

Abstract

Recent research has made significant progress in definitively identifying individuals with Parkinson's disease (PD) using speech analysis techniques. However, these studies have often treated the early and advanced stages of PD as equivalent, overlooking the distinct speech impairments and symptoms that vary significantly across stages. This research aims to enhance diagnostic accuracy by using advanced optimization strategies to combine speech recognition results (character error rates) with the acoustic features of vowels. The dysphonia features of three sustained Korean vowels, /ㅏ/ (a), /ㅣ/ (i), and /ㅜ/ (u), were examined for their diversity and strong correlations. Four established machine-learning classifiers (Random Forest, Support Vector Machine, k-Nearest Neighbors, and Multi-Layer Perceptron) were employed for consistent and reliable analysis. By fine-tuning the Whisper model specifically for PD speech recognition and optimizing it for each severity level of PD, we significantly improved the discernibility between PD severity levels. This enhancement, when combined with vowel data, allowed for a more precise classification, improving detection accuracy by 5.87% for a 3-level severity classification on the PD "ON"-state dataset and by 7.8% for the same 3-level classification on the PD "OFF"-state dataset. This comprehensive approach not only evaluates the effectiveness of different feature extraction methods but also minimizes the variance across the final classification models, thus detecting varying severity levels of PD more effectively.
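
To make the central quantity concrete, the following is a minimal Python sketch of how a character error rate (CER) is conventionally computed, as the character-level edit distance between a reference transcript and the recognizer output divided by the reference length, and of one simple way to attach it to a vector of acoustic features. The function names (`levenshtein`, `character_error_rate`, `combine_features`) are hypothetical, and plain concatenation is an assumption made for illustration; the abstract does not specify how the authors fuse the two feature sets.

```python
from __future__ import annotations


def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of character insertions, deletions, and substitutions."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference characters into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """CER = edit distance / number of reference characters."""
    if not ref:
        raise ValueError("reference transcript must be non-empty")
    return levenshtein(ref, hyp) / len(ref)


def combine_features(acoustic: list[float], cer: float) -> list[float]:
    """Assumed fusion step: append the CER to the acoustic feature vector."""
    return [*acoustic, cer]
```

A combined vector built this way could then be passed to any of the four classifiers named in the abstract; whether the authors weight, normalize, or select features before classification is not stated here.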

Keywords

Voice biomarkers; Automatic speech recognition; Multistage Parkinson's disease; Machine learning classifiers; Diagnosis
Title
Advanced optimization strategies for combining acoustic features and speech recognition error rates in multi-stage classification of Parkinson's disease severity
Authors
Mondol, S. I. M. M. Raton; Kim, Ryul; Lee, Sangmin
DOI
10.1007/s13534-025-00465-9
Publication date
2025-05
Type
Article
Journal
Biomedical Engineering Letters (BMEL), vol. 15, no. 3
Pages
497-511