Enhancing Confusion Entropy (CEN) for Binary and Multiclass Classification
Different performance measures are used to assess the behaviour of classifiers in Machine Learning and to compare them. Many measures have been defined in the literature; among them is a measure inspired by Shannon's entropy, named the Confusion Entropy (CEN). In this work we introduce a new measure, MCEN, obtained by modifying CEN to avoid its unwanted behaviour in the binary case, which disqualifies it as a suitable performance measure in classification. We compare MCEN with CEN and other performance measures, presenting analytical results in some particularly interesting cases, as well as some heuristic computational experimentation. This work was supported by Ministerio de Economía y Competitividad, Gobierno de España, MTM2015-67802-P to R.D. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
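The binary-case pathology is easiest to see with a concrete computation. Below is a minimal NumPy sketch of CEN following the commonly cited definition of Wei et al. (2010); the function name and code organisation are illustrative, not taken from this paper.

```python
import numpy as np

def confusion_entropy(C):
    """Confusion Entropy (CEN) of an N-class confusion matrix C,
    following the definition of Wei et al. (2010).

    C[i, j] = number of class-i instances predicted as class j.
    Logarithms use base 2(N-1); for N = 2 this is plain log2.
    """
    C = np.asarray(C, dtype=float)
    N = C.shape[0]
    total = C.sum()
    base = 2 * (N - 1)

    def log_b(x):
        # Convention: 0 * log(0) = 0
        return np.log(x) / np.log(base) if x > 0 else 0.0

    cen = 0.0
    for j in range(N):
        # Mass associated with class j: row j plus column j
        # (the diagonal entry is counted twice, as in the definition).
        denom = C[j, :].sum() + C[:, j].sum()
        if denom == 0:
            continue
        P_j = denom / (2 * total)      # weight of class j
        cen_j = 0.0
        for k in range(N):
            if k == j:
                continue
            p_jk = C[j, k] / denom     # class j misclassified as k
            p_kj = C[k, j] / denom     # class k misclassified as j
            cen_j -= p_jk * log_b(p_jk) + p_kj * log_b(p_kj)
        cen += P_j * cen_j
    return cen

print(confusion_entropy([[5, 0], [0, 5]]))  # perfect classifier: CEN = 0.0
print(confusion_entropy([[0, 5], [5, 0]]))  # everything wrong:  CEN = 1.0
```

A perfect classifier yields CEN = 0, as expected; the binary anomalies the paper corrects arise for intermediate confusion matrices, where CEN can behave non-monotonically.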
Modelling prognostic trajectories in Alzheimer’s disease
Progression to dementia due to Alzheimer’s Disease (AD) is a long and protracted process that involves multiple pathways of disease pathophysiology. Predicting these dynamic changes has major implications for timely and effective clinical management in AD. There are two reasons why we currently lack appropriate tools to make such predictions. First, a key feature of AD is the interactive nature of the relationships between biomarkers, such as the accumulation of β-amyloid (a peptide that builds plaques between nerve cells), tau (a protein found in the axons of nerve cells), and widespread neurodegeneration. Current models fail to capture these relationships because they are unable to successfully reduce the high dimensionality of biomarkers while exploiting informative multivariate relationships. Second, current models focus on predicting, in a simply binary manner, whether or not an individual will develop dementia due to AD, without informing clinicians about the predicted disease trajectory. This can result in inefficient treatment plans and hinder appropriate stratification for clinical trials. In this thesis, we overcome these challenges by using applied machine learning to build predictive models of patient disease trajectories in the earliest stages of AD. Specifically, to exploit the multi-dimensionality of biomarker data, we used a novel feature-generation methodology, Partial Least Squares regression with recursive feature elimination (PLSr-RFE). This hybrid method combines feature selection and feature construction to capture co-morbidities in cognition and pathophysiology, resulting in an index of Alzheimer’s disease atrophy from structural MRI. We validated our choice of biomarker and the efficacy of our methodology by showing that the learnt pattern of grey matter atrophy is highly predictive of tau accumulation in an independent sample.
Next, to go beyond predicting binary outcomes, we used a novel trajectory modelling approach (Generalised Metric Learning Vector Quantization – Scalar projection) that mines multimodal data from large AD research cohorts. Using this approach, we derived individualised prognostic scores of cognitive decline due to AD, revealing interactive cognitive and biological factors that improve prediction accuracy. Next, we extended our machine learning framework to classify and stage early AD individuals based on future pathological tau accumulation. Our results show that the characteristic spreading pattern of tau in early AD can be predicted by baseline biomarkers, particularly when stratifying groups using multimodal data. Further, we showed that our prognostic index predicts individualised rates of future tau accumulation with high accuracy and regional specificity in an independent sample of cognitively unimpaired individuals. Overall, our work used machine learning to combine continuous information from AD biomarkers to predict pathophysiological changes at different stages of the AD cascade. The approaches presented in this thesis provide an excellent framework to support personalised clinical interventions and guide effective drug discovery trials.
Plasma p217+tau versus NAV4694 amyloid and MK6240 tau PET across the Alzheimer's continuum
Introduction
We evaluated a new Simoa plasma assay for phosphorylated tau (P-tau) at aa217 enhanced by additional p-tau sites (p217+tau).
Methods
Plasma p217+tau levels were compared to 18F-NAV4694 amyloid beta (Aβ) positron emission tomography (PET) and 18F-MK6240 tau PET in 174 cognitively impaired (CI) and 223 cognitively unimpaired (CU) participants.
Results
Compared to Aβ− CU, plasma p217+tau levels increased 2-fold in Aβ+ CU and 3.5-fold in Aβ+ CI. In Aβ− participants, p217+tau levels did not differ significantly between CU and CI. P217+tau correlated with Aβ centiloids (ρ = .67; CI, ρ = .64; CU, ρ = .45) and with mesial temporal tau SUVR (ρ = .63; CI, ρ = .69; CU, ρ = .34). The area under the curve (AUC) for Alzheimer's disease (AD) dementia versus Aβ− CU was 0.94; for AD dementia versus other dementias, 0.93; for Aβ+ versus Aβ− PET, 0.89; and for tau+ versus tau− PET, 0.89.
Discussion
Plasma p217+tau levels become elevated early in the AD continuum and correlate well with Aβ and tau PET.
Testing modified confusion entropy as split criterion for decision trees
In 2010, a new performance measure for evaluating the results of data classification algorithms was introduced: Confusion Entropy (CEN). This measure achieves greater discrimination than Accuracy by focusing on how both correctly and incorrectly classified instances are distributed across the different classes, but it does not behave correctly in binary classification. Recently, an enhancement, the Modified Confusion Entropy (MCEN), was proposed to correct its behaviour in those cases.
In this work, we propose a new algorithm, MCENTree, which builds a decision tree model using MCEN as the splitting criterion instead of CEN, the criterion used by the CENTree algorithm in the literature.
We compare a classic J48, CENTree, and the new MCENTree in terms of the Accuracy, CEN, and MCEN performance measures, and we analyse how the undesired behaviour of CEN affects the algorithms' results and how MCEN behaves well: while MCENTree yields correct results within the statistical range [0,1], CENTree sometimes yields non-monotonic and out-of-range results in binary classification.
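The role an entropy-style measure plays as a split criterion can be illustrated generically: a tree learner scores every candidate split with an impurity function and keeps the best one. The sketch below uses ordinary Shannon entropy as a pluggable stand-in impurity; substituting a CEN- or MCEN-based score is the kind of change CENTree and MCENTree make, but this is a didactic sketch, not either algorithm.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy of a label array; stand-in for CEN/MCEN-style criteria."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def best_split(X, y, impurity=entropy):
    """Exhaustively search (feature, threshold) pairs, minimising the
    size-weighted impurity of the two children; the impurity is pluggable."""
    n, d = X.shape
    best = (None, None, np.inf)
    for f in range(d):
        for t in np.unique(X[:, f])[:-1]:
            left, right = y[X[:, f] <= t], y[X[:, f] > t]
            score = (len(left) * impurity(left) + len(right) * impurity(right)) / n
            if score < best[2]:
                best = (f, t, score)
    return best

# Tiny toy dataset: feature 0 separates the classes perfectly at <= 2.
X = np.array([[1.0, 5.0], [2.0, 4.0], [8.0, 3.0], [9.0, 2.0]])
y = np.array([0, 0, 1, 1])
feature, threshold, score = best_split(X, y)
print(feature, threshold, score)  # 0 2.0 0.0 (a pure split)
```

Passing a different `impurity` callable changes which splits look attractive, which is precisely why a misbehaving binary criterion (CEN) can distort the induced tree.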
Augmenting Translation Lexica by Learning Generalised Translation Patterns
Bilingual lexicons improve the quality of parallel corpora alignment, of newly extracted translation pairs, of Machine Translation, and of cross-language information retrieval, among other applications. In this regard, the first problem addressed in this thesis pertains to the classification of translations automatically extracted from parallel corpora (collections of sentence pairs that are translations of each other). The second problem concerns machine learning of bilingual morphology, with applications both to the first problem and to the generation of Out-Of-Vocabulary translations.
With respect to the problem of translation classification, two separate classifiers are trained, one handling multi-word and one handling word-to-word translations, using previously extracted translation pairs manually classified as correct or incorrect. Several features are useful for distinguishing adequate multi-word candidates from inadequate ones: the lack or presence of parallelism, spurious terms at translation ends (such as determiners and coordinating conjunctions), orthographic similarity between translations, and the occurrence and co-occurrence frequencies of the translation pairs. Morphological coverage, reflecting stem and suffix agreement, is explored as a key feature for classifying word-to-word translations. Given that the evaluation of extracted translation equivalents depends heavily on the human evaluator, an automated filter separating appropriate from inappropriate translation pairs prior to human evaluation greatly reduces this work, saving time and progressively improving alignment and extraction quality. It can also be applied to filtering the translation tables used for training machine translation engines, and to detecting bad translation choices made by such engines, enabling significant productivity gains in the post-editing of machine translations.
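One of the features above, orthographic similarity between translations, can be made concrete as a normalised edit distance. This is a generic sketch, not the thesis implementation; the example word pairs are illustrative.

```python
def edit_distance(a, b):
    """Classic Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def orthographic_similarity(a, b):
    """1.0 for identical strings, 0.0 for maximally different ones."""
    if not a and not b:
        return 1.0
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))

# Cognate-like pairs score high; unrelated pairs score low.
print(orthographic_similarity("information", "informação"))
print(orthographic_similarity("dog", "cão"))  # 0.0
```

Such a score is one column in the feature vector a translation-pair classifier consumes, alongside frequency and parallelism features.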
An important attribute of a translation lexicon is the coverage it provides. Learning suffixes and suffixation operations from the lexicon or corpus of a language is an extensively researched approach to tackling out-of-vocabulary terms. However, beyond mere words or word forms, translations and their variants are a powerful source of information for automatic structural analysis; this source is explored from the perspective of improving word-to-word translation coverage and constitutes the second part of this thesis. In this context, as a phase prior to suggesting out-of-vocabulary bilingual lexicon entries, an approach is proposed to automatically induce segmentation and learn bilingual morph-like units by identifying and pairing word stems and suffixes, using a bilingual corpus of translations automatically extracted from aligned parallel corpora and then manually validated or automatically classified. A minimally supervised technique is proposed to enable bilingual morphology learning for language pairs whose bilingual lexicons are highly defective with respect to word-to-word translations representing inflectional diversity. Apart from the above-mentioned applications to the classification of machine-extracted translations and the generation of Out-Of-Vocabulary translations, learned bilingual morph-units may also have a great impact on establishing correspondences between sub-word constituents in word-to-multi-word and multi-word-to-multi-word translations, and in compression, full-text indexing, and retrieval applications.
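The stem-and-suffix pairing idea can be illustrated with a tiny sketch: take the word forms on each side of validated word-to-word translation pairs for one lemma, split each side at its longest common prefix, and pair the resulting stems and suffixes bilingually. The data and the splitting heuristic here are illustrative only, far simpler than the method proposed in the thesis.

```python
from os.path import commonprefix

# Toy validated translation pairs for one source lemma (illustrative data,
# not from the thesis): English "work" forms and Portuguese "trabalhar" forms.
family = [
    ("worked", "trabalhou"),
    ("working", "trabalhando"),
    ("works", "trabalha"),
]

# Crude segmentation: the longest common prefix on each side is the stem.
src_stem = commonprefix([s for s, _ in family])   # "work"
tgt_stem = commonprefix([t for _, t in family])   # "trabalh"

# Bilingual stem pair plus paired suffixation operations.
stem_pair = (src_stem, tgt_stem)
suffix_pairs = [(s[len(src_stem):], t[len(tgt_stem):]) for s, t in family]

print(stem_pair)     # ('work', 'trabalh')
print(suffix_pairs)  # [('ed', 'ou'), ('ing', 'ando'), ('s', 'a')]
```

Once such morph-units are learned, an out-of-vocabulary form can be proposed by combining a known stem pair with a known suffix pair, which is the coverage gain the thesis targets.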
Cognitive and behavioral context of pain facilitation : Nocebo conditioning and uncontrollability-induced sensitization
Nocebo effects and uncontrollability are important psychological factors in pain facilitation and play a major role in the context of acute and chronic pain. However, the precise mechanisms by which both phenomena lead to increased pain remain understudied. The general aim of the three studies contained in this thesis was to shed light on the mechanisms of conditioning-induced nocebo effects and on neuronal processes during uncontrollability-induced pain increase. For this purpose, experimental designs were employed that assessed pain perception and its epiphenomena on multiple response channels (subjective verbal report, behavioral response, autonomic response, neuronal activity).
In the first study, a conditioning procedure was developed without additional verbal suggestions or employment of cues that are prone to induce expectations of pain relief or worsening. The results indicated that conditioning can induce a subjective nocebo effect, even when subjects are contingency unaware (implicit conditioning). The decay of this conditioned response over time was observable in subjective as well as behavioral measures. Neither state nor trait anxiety or measures of anxiety specifically related to pain showed a correlation with this nocebo effect in the subjectively non-painful range.
The second study adapted the conditioning procedure in order to induce nocebo hyperalgesia. Further, its impact on autonomic measures was explored, and relations between the nocebo response and personality traits were investigated. Nocebo hyperalgesia, as indicated by the subjective measure, was successfully induced in part of the sample, independent of contingency awareness. Successfully conditioned subjects, compared to unsuccessfully conditioned subjects, were habitually less anxious, received higher stimulus intensities despite comparable subjective sensation, and showed increased heart rate and decreased HRV parameters. Motivational style and suggestibility were not related to the nocebo response.
Study three investigated neural correlates of uncontrollability-induced pain increase. During controllable pain trials, subjects showed temporal summation, but adapted during controllable warm trials, as indicated by the behavioral measure. During the uncontrollable pain condition, subjective intensity ratings increased over the course of the individual trials, even though subjects received the identical nociceptive input that they had regulated to feel constant in the controllable condition. The additional pain increase in the pain trials, induced by uncontrollability, was mirrored in increased activation of pain-processing brain regions, such as thalamus, insula, SII, and ACC. Importantly, activity in perigenual ACC and PAG drove the uncontrollability-induced pain increase. These results suggest that the loss of control leads to activation of a pro-nociceptive circuitry, also assumed to play a role in placebo and nocebo effects, that involves the pain-modulatory regions PAG and pACC.
In summary, these studies demonstrated a) the powerful impact of psychological factors, such as learning and uncontrollability, on pain perception, and b) the benefit of a multidimensional assessment of pain perception and its correlates. These results improve our understanding of pain facilitatory processes and have important implications for therapeutic interventions in pain conditions. They can further promote research in other fields, for example concerning the role of classical conditioning and neural processes in chronic pain.
Analysis of Parkinson's Disease Gait using Computational Intelligence
Millions of individuals throughout the world are living with Parkinson’s disease (PD), a neurodegenerative condition whose symptoms are difficult to differentiate from those of other disorders. Freezing of gait (FOG) is one of the signs of Parkinson’s disease that has been used as a main diagnostic factor. Bradykinesia, tremors, depression, hallucinations, cognitive impairment, and falls are all common symptoms of PD. This research uses a dataset that captures data on individuals with PD who suffer from freezing of gait. The dataset includes data for medication in both the “On” and “Off” stages (denoting whether patients have taken their medication or not). It comprises four separate experiments, referred to as Voluntary Stop, Timed Up and Go (TUG), Simple Motor Task, and Dual Motor and Cognitive Task. Each of these tests was carried out over three separate attempts (trials) to verify that the measurements are both reliable and accurate. The dataset was used for four significant challenges. The first challenge is to differentiate between people with Parkinson’s disease and healthy volunteers, and the second is to evaluate the effectiveness of medication on the patients. The third is to detect episodes of FOG in each individual, and the last is to predict a FOG episode at the time of occurrence. For the last task, the author proposed a new framework to make real-time predictions for detecting FOG, and the results demonstrated the effectiveness of the approach. It is worth mentioning that techniques from many classifiers have been combined in order to reduce the likelihood of being biased toward a single approach. Among the classifiers investigated, Multilayer Perceptron, K-Nearest Neighbors, Random Forest, and Decision Tree produced the best results when applied to the first three tasks, with an accuracy of more than 90%.
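The multi-classifier setup described above can be sketched with scikit-learn. This is a generic illustration on synthetic data, not the thesis pipeline: the dataset, hyperparameters, and train/test split are placeholders standing in for the gait recordings.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Synthetic stand-in for gait features (e.g. PD patient vs. healthy control).
X, y = make_classification(n_samples=400, n_features=20, n_informative=8,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# The same classifier family as in the study; comparing several reduces the
# risk of conclusions biased toward a single approach.
models = {
    "MLP": MLPClassifier(max_iter=1000, random_state=0),
    "kNN": KNeighborsClassifier(n_neighbors=5),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    scores[name] = accuracy_score(y_te, model.predict(X_te))
    print(f"{name}: {scores[name]:.3f}")
```

On the real gait data, the same loop would be run per task (patient vs. control, medication On/Off, FOG detection) with subject-level splits rather than a random row split.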
Comparative analysis of high-sensitivity cardiac troponin I and T for their association with coronary computed tomography-assessed calcium scoring represented by the Agatston score
Background: This study evaluates the association between high-sensitivity cardiac troponin I (hs-cTnI) and T (hs-cTnT) and coronary artery calcification (CAC) detected by coronary computed tomography (CCT) and quantified with the Agatston score in patients with suspected coronary artery disease (CAD).
Methods: Patients undergoing CCT during routine clinical care were enrolled prospectively. CCT was indicated for patients with a low to intermediate pretest probability for CAD. Within 24 h of CCT examination, peripheral blood samples were taken to measure cardiac biomarkers hs-cTnI and hs-cTnT.
Results: A total of 76 patients were enrolled, including 38% without detectable CAC, 36% with an Agatston score from 1 to 100, 17% from 101 to 400, and 9% with values ≥ 400. hs-cTnI increased alongside the Agatston score and was able to differentiate between the Agatston score groups. Both hs-cTn assays discriminated values greater than 100 (hs-cTnI: AUC = 0.663, p = 0.032; hs-cTnT: AUC = 0.650, p = 0.048). In univariate and multivariate logistic regression models, hs-cTnT and hs-cTnI were significantly associated with increased Agatston scores. Patients with hs-cTnT ≥ 0.02 µg/l and hs-cTnI ≥ 5.5 ng/l were more likely to show values ≥ 400 (hs-cTnT: OR = 13.4, 95% CI 1.545–116.233, p = 0.019; hs-cTnI: OR = 8.8, 95% CI 1.183–65.475, p = 0.034).
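The odds ratios reported above are derived from logistic regression coefficients: for a binary predictor, OR = exp(β). A minimal sketch on synthetic data (illustrative only; the prevalences and cut-off behaviour are invented, not the study data):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Synthetic stand-in: a binary "troponin above cut-off" predictor and a
# binary "Agatston >= 400" outcome that is more likely when the predictor is 1.
n = 2000
troponin_high = rng.integers(0, 2, size=n)
p = np.where(troponin_high == 1, 0.4, 0.1)   # true OR = (0.4/0.6)/(0.1/0.9) = 6
agatston_400 = rng.random(n) < p

# Effectively unpenalised fit, so exp(coef) approximates the MLE odds ratio.
model = LogisticRegression(C=1e6).fit(troponin_high.reshape(-1, 1), agatston_400)
odds_ratio = np.exp(model.coef_[0, 0])
print(f"odds ratio: {odds_ratio:.2f}")  # close to 6
```

In the study, the multivariable models additionally adjust for covariates, so the reported ORs are conditional rather than crude ones like this sketch.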
Conclusion: The present study shows that the Agatston score was significantly correlated with high-sensitivity cardiac troponins, both in univariable and multivariable linear regression models. Hs-cTnI is able to discriminate between different Agatston values. The present results might reveal potential cut-off values for high-sensitivity cardiac troponins with regard to different Agatston values.
Trial registration Cardiovascular Imaging and Biomarker Analyses (CIBER), NCT03074253 https://clinicaltrials.gov/ct2/show/record/NCT0307425