Memory disorders are a central factor in the decline of functioning and daily
activities in elderly individuals. Confirming the illness, starting medication
to slow its progression, and beginning occupational therapy aimed at
maintaining and rehabilitating cognitive abilities all require a medical
diagnosis. The early identification of symptoms of memory disorders,
especially the decline in cognitive abilities, plays a significant role in
ensuring the well-being of populations. Features of speech production are known
to be associated with a speaker's cognitive ability and changes in it. The lack
of standardized speech tests in clinical settings has led to a growing emphasis
on developing automatic machine learning techniques for analyzing naturally
spoken language. Non-lexical, acoustic properties of spoken language have
proven useful when fast, cost-effective, and scalable solutions are needed for
diagnosing a disease. This work presents a feature selection approach that
automatically selects the features essential for diagnosis from relative speech
pauses and from the Geneva Minimalistic Acoustic Parameter Set, which is
intended for automatic paralinguistic and clinical speech analysis. These
features are refined into word histogram features, on which machine learning
classifiers are trained to distinguish control subjects from dementia patients
in the DementiaBank Pitt audio database.
The results show that an average classification accuracy of 75% is achievable
with only twenty-five features, both on the separate ADReSS 2020 competition
test data and in leave-one-subject-out cross-validation over the entire
competition data. These results rank at the top among international studies in
which the same dataset and only acoustic features have been used to diagnose
patients.
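
The leave-one-subject-out evaluation mentioned above can be sketched as
follows. This is a minimal illustration under stated assumptions, not the
pipeline used in this work: the nearest-centroid classifier, all function
names, and the two-dimensional toy data are hypothetical stand-ins, since the
actual classifiers and feature values are not given here.

```python
from collections import defaultdict


def loso_splits(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject."""
    by_subject = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        by_subject[sid].append(idx)
    for sid in sorted(by_subject):
        test_idx = by_subject[sid]
        train_idx = [i for i in range(len(subject_ids)) if subject_ids[i] != sid]
        yield sid, train_idx, test_idx


def fit_centroids(X, y):
    """Hypothetical stand-in classifier: compute one mean vector per class."""
    sums, counts = {}, defaultdict(int)
    for row, label in zip(X, y):
        if label not in sums:
            sums[label] = [0.0] * len(row)
        sums[label] = [s + v for s, v in zip(sums[label], row)]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in vec] for lab, vec in sums.items()}


def predict(centroids, row):
    """Return the label of the centroid closest to row (squared Euclidean)."""
    def sqdist(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda lab: sqdist(centroids[lab]))


def loso_accuracy(X, y, subject_ids):
    """Fraction of recordings classified correctly when their speaker is held out."""
    correct = 0
    for _, train_idx, test_idx in loso_splits(subject_ids):
        centroids = fit_centroids([X[i] for i in train_idx],
                                  [y[i] for i in train_idx])
        correct += sum(predict(centroids, X[i]) == y[i] for i in test_idx)
    return correct / len(X)


# Toy data, made up for illustration: six recordings from four speakers.
X = [[0.1, 0.2], [0.2, 0.1], [0.0, 0.3], [0.9, 0.8], [0.8, 1.0], [1.0, 0.9]]
y = ["control", "control", "control", "dementia", "dementia", "dementia"]
subjects = ["s1", "s1", "s2", "s3", "s4", "s4"]

accuracy = loso_accuracy(X, y, subjects)
print(f"LOSO accuracy on toy data: {accuracy:.2f}")
```

Splitting by subject rather than by recording keeps every recording of the
held-out speaker out of the training set, so the score cannot be inflated by
the classifier recognizing a speaker it has already seen.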