Recent studies have demonstrated that analysis of laboratory-quality voice
recordings can be used to accurately differentiate people diagnosed with
Parkinson's disease (PD) from healthy controls (HC). These findings could help
facilitate the development of remote screening and monitoring tools for PD. In
this study, we analyzed 2759 telephone-quality voice recordings from 1483 PD
participants and 15321 recordings from 8300 HC participants. To account for variations in
phonetic backgrounds, we acquired data from seven countries. We developed a
statistical framework for analyzing voice, whereby we computed 307 dysphonia
measures that quantify different properties of voice impairment, such as
breathiness, roughness, monopitch, hoarse voice quality, and exaggerated vocal
tremor. We used feature selection algorithms to identify robust parsimonious
feature subsets, which were used in combination with a Random Forests (RF)
classifier to accurately distinguish PD from HC. The best 10-fold
cross-validation performance was obtained using Gram-Schmidt Orthogonalization
(GSO) and RF, leading to mean sensitivity of 64.90% (standard deviation, SD
2.90%) and mean specificity of 67.96% (SD 2.90%). This large-scale study is a
step towards developing a reliable, cost-effective,
and practical clinical decision support tool for screening the population at
large for PD using telephone-quality voice.

Comment: 43 pages, 5 figures, 6 tables
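
The reported figures are per-fold sensitivity and specificity, summarized as a mean and SD over the 10 cross-validation folds. The sketch below shows one way such a summary could be computed. It is illustrative only: the function names, the label coding (1 = PD, 0 = HC), the toy data, and the use of the sample SD are assumptions, not details taken from the study.

```python
from statistics import mean, stdev


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Assumed coding (hypothetical): 1 = PD, 0 = HC.
    Each fold must contain at least one PD and one HC sample,
    otherwise a denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # PD correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # PD missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # HC correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # HC wrongly flagged
    return tp / (tp + fn), tn / (tn + fp)


def summarize_folds(folds):
    """Summarize per-fold metrics as (mean, sample SD).

    folds: list of (y_true, y_pred) pairs, one per cross-validation fold.
    Using the sample SD (n - 1 denominator) is an assumption.
    """
    per_fold = [sensitivity_specificity(t, p) for t, p in folds]
    sens = [s for s, _ in per_fold]
    spec = [c for _, c in per_fold]
    return {
        "sensitivity": (mean(sens), stdev(sens)),
        "specificity": (mean(spec), stdev(spec)),
    }


# Toy example with two folds of made-up labels and predictions.
toy_folds = [
    ([1, 1, 0, 0], [1, 0, 0, 0]),
    ([1, 1, 0, 0], [1, 1, 0, 1]),
]
summary = summarize_folds(toy_folds)
```

In a real 10-fold setup, each element of `folds` would hold the held-out labels and the classifier's predictions for one fold.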