Advanced optimization strategies for combining acoustic features and speech recognition error rates in multi-stage classification of Parkinson's disease severity

Mondol, S. I. M. M. Raton; Kim, Ryul; Lee, Sangmin

doi:10.1007/s13534-025-00465-9

Citations (Web of Science): 1

Citations (Scopus): 2

Abstract

Recent research has made significant progress in identifying individuals with Parkinson's disease (PD) using speech analysis techniques. However, these studies have often treated the early and advanced stages of PD as equivalent, overlooking the distinct speech impairments and symptoms that can vary significantly across stages. This research aims to improve diagnostic accuracy by using advanced optimization strategies to combine speech recognition results (character error rates) with the acoustic features of vowels. The dysphonia features of three sustained Korean vowels, /ㅏ/ (a), /ㅣ/ (i), and /ㅜ/ (u), were examined for their diversity and strong correlations. Four established machine-learning classifiers (Random Forest, Support Vector Machine, k-Nearest Neighbors, and Multi-Layer Perceptron) were employed for consistent and reliable analysis. By fine-tuning the Whisper model specifically for PD speech recognition and optimizing it for each severity level of PD, we significantly improved the discernibility between PD severity levels. Combined with the vowel data, this enhancement allowed a more precise classification, improving detection accuracy by 5.87% for a 3-level severity classification on the PD "ON"-state dataset and by 7.8% for a 3-level severity classification on the PD "OFF"-state dataset. This comprehensive approach not only evaluates the effectiveness of different feature extraction methods but also minimizes the variance across the final classification models, thus detecting varying severity levels of PD more effectively.
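To make the central quantity concrete, the following minimal sketch shows how a character error rate (CER) can be computed from a reference transcript and a recognizer's output, and how it could be appended to a vector of acoustic features before classification. It is an illustration only, not the authors' pipeline: the function names (`levenshtein`, `character_error_rate`, `combine_features`) and the idea of simply appending the CER as one extra feature are assumptions made here, and the abstract does not specify how the features are actually fused or optimized.

```python
from __future__ import annotations


def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of character insertions, deletions, and substitutions
    needed to turn `ref` into `hyp` (standard dynamic-programming edit distance)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty prefix of ref
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning ref[:i] into "" takes i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """CER = edit distance divided by the reference length."""
    if not ref:
        raise ValueError("reference transcript must be non-empty")
    return levenshtein(ref, hyp) / len(ref)


def combine_features(acoustic: list[float], ref: str, hyp: str) -> list[float]:
    """Hypothetical fusion step: append the CER to the acoustic feature vector."""
    return acoustic + [character_error_rate(ref, hyp)]
```

A vector built this way could then be passed to any of the four classifiers named in the abstract. Whether the published method concatenates features in this manner is not stated in this record.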

Keywords

Voice biomarkers; Automatic speech recognition; Multistage Parkinson's disease; Machine learning classifiers; DIAGNOSIS

Title: Advanced optimization strategies for combining acoustic features and speech recognition error rates in multi-stage classification of Parkinson's disease severity

Authors: Mondol, S. I. M. M. Raton; Kim, Ryul; Lee, Sangmin

DOI: 10.1007/s13534-025-00465-9

Publication date: 2025-05

Type: Article

Journal: Biomedical Engineering Letters (BMEL)

Volume: 15

Issue: 3

Pages: 497-511