
    Identification of Cognitive Decline from Spoken Language through Feature Selection and the Bag of Acoustic Words Model

    Memory disorders are a central factor in the decline of functioning and daily activities in elderly individuals. Confirmation of the illness, initiation of medication to slow its progression, and commencement of occupational therapy aimed at maintaining and rehabilitating cognitive abilities all require a medical diagnosis. The early identification of symptoms of memory disorders, especially the decline in cognitive abilities, therefore plays a significant role in ensuring the well-being of populations. Features related to speech production are known to reflect the speaker's cognitive ability and changes in it. The lack of standardized speech tests in clinical settings has led to a growing emphasis on developing automatic machine learning techniques for analyzing naturally spoken language. Non-lexical, acoustic properties of spoken language have proven useful when fast, cost-effective, and scalable solutions are needed for the rapid diagnosis of a disease. This work presents a feature selection approach that automatically selects the features essential for diagnosis from the Geneva minimalistic acoustic parameter set and relative speech pauses, both intended for automatic paralinguistic and clinical speech analysis. These features are refined into word histogram features, on which machine learning classifiers are trained to separate control subjects from dementia patients in the DementiaBank Pitt audio database. The results show that a 75% average classification accuracy is achievable with only twenty-five features, both on the separate ADReSS 2020 competition test data and under leave-one-subject-out cross-validation of the entire competition data. These results rank at the top of international research in which the same dataset and only acoustic features have been used to diagnose patients.
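The bag-of-acoustic-words pipeline described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: random vectors stand in for eGeMAPS frame-level features, and a plain k-means codebook turns each recording's frames into a normalized word histogram that a downstream classifier could consume. The names `build_codebook` and `bow_histogram` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def build_codebook(frames, k=8, iters=20):
    # plain k-means over frame-level acoustic feature vectors
    centers = frames[rng.choice(len(frames), k, replace=False)].copy()
    for _ in range(iters):
        d = np.linalg.norm(frames[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = frames[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers

def bow_histogram(frames, centers):
    # assign each frame to its nearest "acoustic word", normalize the counts
    d = np.linalg.norm(frames[:, None, :] - centers[None, :, :], axis=2)
    counts = np.bincount(d.argmin(axis=1), minlength=len(centers))
    return counts / counts.sum()

# two synthetic "recordings" with different acoustic characteristics
frames_a = rng.normal(0.0, 1.0, size=(300, 4))
frames_b = rng.normal(2.0, 1.0, size=(300, 4))
codebook = build_codebook(np.vstack([frames_a, frames_b]))
hist_a = bow_histogram(frames_a, codebook)
hist_b = bow_histogram(frames_b, codebook)
```

The two histograms summarize each recording as a fixed-length vector regardless of its duration, which is what makes the representation suitable for standard classifiers.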

    Minimal Learning Machine: Theoretical Results and Clustering-Based Reference Point Selection

    The Minimal Learning Machine (MLM) is a nonlinear supervised approach based on learning a linear mapping between distance matrices computed in the input and output data spaces, where distances are calculated using a subset of points called reference points. Its simple formulation has attracted several recent works on extensions and applications. In this paper, we aim to address some open questions related to the MLM. First, we detail theoretical aspects that assure the interpolation and universal approximation capabilities of the MLM, which were previously only empirically verified. Second, we identify the task of selecting reference points as having major importance for the MLM's generalization capability. Several clustering-based methods for reference point selection in regression scenarios are then proposed and analyzed. Based on an extensive empirical evaluation, we conclude that the evaluated methods are both scalable and useful. Specifically, for a small number of reference points, the clustering-based methods outperformed the standard random selection of the original MLM formulation. (Comment: 29 pages, accepted to JML)
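The core MLM construction, a linear mapping between input-space and output-space distance matrices, can be sketched in a toy 1-D regression setting. This is an assumption-laden illustration, not the paper's code: reference points are a random subset of the training data, and the output estimation step replaces a proper multilateration solver with a simple grid search over candidate outputs.

```python
import numpy as np

rng = np.random.default_rng(1)

# toy regression data: y = sin(x)
X = np.linspace(0, 3, 40).reshape(-1, 1)
y = np.sin(X).ravel()

# reference points: a random subset of the training data
ref = rng.choice(len(X), 8, replace=False)
Rx, Ry = X[ref], y[ref]

# distance matrices in input and output space (inputs are 1-D here)
Dx = np.abs(X - Rx.T)                     # (40, 8)
Dy = np.abs(y[:, None] - Ry[None, :])     # (40, 8)

# linear mapping between the distance matrices, via least squares
B, *_ = np.linalg.lstsq(Dx, Dy, rcond=None)

def predict(x):
    dx = np.abs(x - Rx.ravel())           # distances to reference inputs
    dy_hat = dx @ B                       # estimated output-space distances
    # multilateration simplified to a 1-D grid search over candidates
    cand = np.linspace(-1.5, 1.5, 601)
    cost = ((np.abs(cand[:, None] - Ry[None, :]) - dy_hat) ** 2).sum(axis=1)
    return cand[cost.argmin()]
```

The grid search stands in for the multilateration problem mentioned later in this listing; in more than one output dimension a nonlinear least-squares solver would be used instead.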

    Functional extreme learning machine

    Introduction: Extreme learning machine (ELM) is a training algorithm for single hidden layer feedforward neural networks (SLFN) that converges much faster than traditional methods and yields promising performance. However, the ELM also has some shortcomings, such as structure selection, overfitting, and low generalization performance. Methods: In this article, a new functional neuron (FN) model is proposed. Taking functional neurons as the basic unit and using the theory of functional equation solving to guide the modeling process, a new functional extreme learning machine (FELM) model theory is proposed. Results: The FELM implements learning by adjusting the coefficients of the basis functions in its neurons. At the same time, a simple, iteration-free, high-precision fast parameter learning algorithm is proposed. Discussion: Standard datasets from UCI and StatLib are selected for regression problems; compared with the ELM, the support vector machine (SVM), and other algorithms, the experimental results show that the FELM achieves better performance.

    Extreme Minimal Learning Machine

    Extreme Learning Machine (ELM) and Minimal Learning Machine (MLM) are nonlinear and scalable machine learning techniques with a randomly generated basis. Both techniques share a step in which a matrix of weights for the linear combination of the basis is recovered. In the MLM, the kernel in this step corresponds to distance calculations between the training data and a set of reference points, whereas in the ELM a transformation with a sigmoidal activation function is most commonly used. The MLM then needs an additional interpolation step to estimate the actual distance-regression based output. A natural combination of these two techniques is proposed here: to use the distance-based kernel characteristic of the MLM within the ELM. The experimental results show the promising potential of the proposed technique.

    Model selection for Extreme Minimal Learning Machine using sampling

    A combination of the Extreme Learning Machine (ELM) and the Minimal Learning Machine (MLM), using a distance-based basis from the MLM in the ridge regression like learning framework of the ELM, was proposed in [8]. In further experiments with the technique [9], it was concluded that in multilabel classification one can obtain a good validation error level without overlearning simply by using the whole training data for constructing the basis. Here, we consider possibilities for reducing the complexity of the resulting machine learning model, referred to as the Extreme Minimal Learning Machine (EMLM), by using a bidirectional sampling strategy: to sample both the feature space and the space of observations in order to identify a simpler EMLM without sacrificing its generalization performance.
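Assuming the EMLM basis is the matrix of Euclidean distances to a set of reference points (as in the combination described above), the bidirectional sampling idea can be sketched as follows; the sampled index sets, toy data, and the ridge parameter `lam` are illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)

# toy data: 200 observations, 10 features
X = rng.normal(size=(200, 10))
y = X[:, 0] + 0.5 * X[:, 1] - X[:, 2]

def dist_basis(A, R):
    # Euclidean distances from each row of A to each reference point in R
    return np.linalg.norm(A[:, None, :] - R[None, :, :], axis=2)

# bidirectional sampling: subsample the feature space and the observation space
feat_idx = rng.choice(10, 5, replace=False)    # sampled feature subset
ref_idx = rng.choice(200, 40, replace=False)   # sampled reference observations
Xs = X[:, feat_idx]
R = Xs[ref_idx]

H = dist_basis(Xs, R)                          # reduced (200, 40) basis
lam = 1e-3                                     # ridge regularization
W = np.linalg.solve(H.T @ H + lam * np.eye(len(R)), H.T @ y)
pred = H @ W
```

The model size drops from a 200x200 full-data basis to 200x40, and prediction only needs distances in the 5 sampled feature dimensions, which is the complexity reduction the abstract refers to.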

    Extreme minimal learning machine : Ridge regression with distance-based basis

    The extreme learning machine (ELM) and the minimal learning machine (MLM) are nonlinear and scalable machine learning techniques with a randomly generated basis. Both techniques start with a step in which a matrix of weights for the linear combination of the basis is recovered. In the MLM, the feature mapping in this step corresponds to distance calculations between the training data and a set of reference points, whereas in the ELM, a transformation using a radial or sigmoidal activation function is commonly used. Computation of the model output, for prediction or classification purposes, is straightforward with the ELM after the first step. In the original MLM, one needs to solve an additional multilateration problem to estimate the distance-regression based output. A natural combination of these two techniques is proposed and evaluated here: to use the distance-based basis characteristic of the MLM in the learning framework of the regularized ELM. In other words, we conduct ridge regression using a distance-based basis. The experimental results characterize the basic features of the proposed technique and, surprisingly, indicate that overlearning with the distance-based basis is in practice avoided in classification problems. This makes model selection for the proposed method trivial, at the expense of computational costs.
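Ridge regression with a distance-based basis, using the full training data as reference points as the abstract describes, can be sketched in a few lines; the toy target function and the regularization value are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

X = rng.uniform(-1, 1, size=(100, 2))
y = np.sin(3 * X[:, 0]) * np.cos(2 * X[:, 1])

def dist_basis(A, R):
    # distance-based basis: Euclidean distances to the reference points
    return np.linalg.norm(A[:, None, :] - R[None, :, :], axis=2)

R = X.copy()            # full training data as the basis
H = dist_basis(X, R)    # (100, 100) feature matrix
lam = 1e-6              # ridge (Tikhonov) regularization
W = np.linalg.solve(H.T @ H + lam * np.eye(len(R)), H.T @ y)

y_hat = dist_basis(X, R) @ W
mse = float(np.mean((y_hat - y) ** 2))
```

Unlike the original MLM, no multilateration step is needed: the linear combination of the distance basis is the output directly, which is what makes prediction as straightforward as in the ELM.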

    Problem Transformation Methods with Distance-Based Learning for Multi-Target Regression

    Multi-target regression is a special subset of supervised machine learning problems. Problem transformation methods are used in the field to improve the performance of basic methods. The purpose of this article is to test the use of the recently popularized distance-based methods, the minimal learning machine (MLM) and the extreme minimal learning machine (EMLM), in problem transformation. The main advantage of the full-data variants of these methods is the lack of any meta-parameter. The experimental results for the MLM and EMLM show promising potential, emphasizing the utility of problem transformation especially with the EMLM.
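A single-target problem transformation with an EMLM-style base learner can be sketched as follows; treating each output column as an independent regression problem is one standard transformation, and the toy targets and ridge parameter here are illustrative assumptions, not the article's setup.

```python
import numpy as np

rng = np.random.default_rng(4)

X = rng.uniform(-1, 1, size=(120, 3))
# two targets stacked column-wise
Y = np.column_stack([X[:, 0] + X[:, 1], np.sin(X[:, 2])])

def dist_basis(A, R):
    # Euclidean distances to the reference points (full training data)
    return np.linalg.norm(A[:, None, :] - R[None, :, :], axis=2)

R = X.copy()
H = dist_basis(X, R)
lam = 1e-6

# single-target transformation: one independent model per output column
models = [np.linalg.solve(H.T @ H + lam * np.eye(len(R)), H.T @ Y[:, t])
          for t in range(Y.shape[1])]
Y_hat = np.column_stack([H @ w for w in models])
mse = float(np.mean((Y_hat - Y) ** 2))
```

Because the full-data EMLM has no meta-parameter to tune per target, the transformation adds no model selection burden as the number of targets grows.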

    Advancing nanomaterials design using novel machine learning methods

    The rise of machine learning (ML) has revolutionized the use of data. Researchers continue to develop new ways to apply ML and to find new targets to apply it on. One of these areas of application is nanoscience, a constantly expanding field with applications in almost every part of life, such as medicine, materials design, and consumer products. Experimental research in nanoscience is expensive, which can be mitigated with computational methods; computational research is, however, itself resource-intensive and time-consuming due to the level of accuracy the research requires. Machine learning promises to alleviate that strain. This work and the included articles focus on a family of distance-based machine learning methods, the Minimal Learning Machine (MLM) and the Extreme Minimal Learning Machine (EMLM), in the context of computational nanoscience, specifically in the context of monolayer protected clusters (MPC). The distance-based ML methods are studied as surrogate models, in feature selection, and in knowledge discovery. Benchmark, generated, and molecular dynamics-based datasets were used in the included articles. The performance of the MLM was studied by using it as a surrogate, comparing it to other methods, and inspecting the effect of the equation solver on its behavior. The EMLM was used as the ML model in feature selection and knowledge discovery. A set of scaling-focused benchmark datasets was developed based on simulation data for the Au38(SCH3)24 MPC, and a set of synthetic benchmark and development datasets was created for performance testing and method development of feature selection algorithms. Two Mean Absolute Sensitivity (MAS) based feature selection algorithms were developed: the distance-based one-shot wrapper and its extension, the Feature Importance Detector. An umbrella review was conducted to contextualize the one-shot wrapper within the feature selection literature. The results demonstrate the viability of distance-based ML methods in the context of computational nanoscience. Keywords: Machine Learning, Distance-Based Regression, Nanoscience, MLM, EMLM, Hybrid Nanoparticles, Feature Selection, Knowledge Discovery
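One plausible reading of a Mean Absolute Sensitivity based one-shot feature ranking can be sketched as follows. This is an assumption-heavy illustration, not the dissertation's algorithm: MAS is approximated here as the mean absolute central finite-difference sensitivity of an EMLM-style surrogate, features are ranked in a single pass, and the helper names and all numeric choices are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)

X = rng.uniform(-1, 1, size=(200, 3))
y = 2 * X[:, 0] - X[:, 2]          # feature 1 is pure noise

def dist_basis(A, R):
    # EMLM-style basis: Euclidean distances to the reference points
    return np.linalg.norm(A[:, None, :] - R[None, :, :], axis=2)

R = X.copy()
H = dist_basis(X, R)
W = np.linalg.solve(H.T @ H + 1e-4 * np.eye(len(R)), H.T @ y)

def model(A):
    return dist_basis(A, R) @ W

def mean_abs_sensitivity(X, h=1e-3):
    # central finite differences estimate mean |df/dx_j| over the data
    mas = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        Xp, Xm = X.copy(), X.copy()
        Xp[:, j] += h
        Xm[:, j] -= h
        mas[j] = np.mean(np.abs(model(Xp) - model(Xm)) / (2 * h))
    return mas

mas = mean_abs_sensitivity(X)
selected = set(np.argsort(mas)[::-1][:2])   # one shot: keep the top-ranked features
```

In this toy setting the surrogate's sensitivity to the uninformative feature should be small, so ranking by MAS recovers the informative features in a single pass, which is the appeal of a one-shot wrapper over iterative subset search.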