1,833 research outputs found
On the automated extraction of regression knowledge from databases
The advent of inexpensive, powerful computing systems, together with the increasing amount of available data, constitutes one of the greatest challenges for next-century information science. Since it is apparent that much future analysis will be done automatically, a good deal of attention has been paid recently to implementing ideas and adapting systems originally developed in machine learning and other areas of computer science. This interest seems to stem from both the suspicion that traditional techniques are not well suited to large-scale automation and the success of new algorithmic concepts in difficult optimization problems. In this paper, I discuss a number of issues concerning the automated extraction of regression knowledge from databases. Regression knowledge here means quantitative knowledge about the relationship between a vector of predictors or independent variables (x) and a scalar response or dependent variable (y). A number of difficulties found in some well-known tools are pointed out, and a flexible framework avoiding many such difficulties is described and advocated. Basic features of a new tool pursuing this direction are reviewed.
Multimodal MRI-based Imputation of the Aβ+ in Early Mild Cognitive Impairment.
Objective: To identify brain atrophy patterns from structural MRI and cerebral blood flow (CBF) patterns from arterial spin labeling perfusion MRI that best predict the Aβ burden, measured as composite 18F-AV45-PET uptake, in individuals with early mild cognitive impairment (MCI). Furthermore, to assess the relative importance of imaging modalities in the classification of Aβ+/Aβ- early MCI. Methods: Sixty-seven ADNI-GO/2 participants with early MCI were included. Voxel-wise anatomical shape variation measures were computed by estimating the initial diffeomorphic mapping momenta from an unbiased control template. CBF measures normalized to average motor cortex CBF were mapped onto the template space. Using partial least squares regression, we identified the structural and CBF signatures of Aβ after accounting for normal confounding effects of age, sex, and education. Results: 18F-AV45-positive early-MCI individuals could be identified with 83% classification accuracy, 87% positive predictive value, and 84% negative predictive value by multidisciplinary classifiers combining demographic data, ApoE ε4 genotype, and a multimodal MRI-based Aβ score. Interpretation: Multimodal MRI can be used to predict the amyloid status of early-MCI individuals. MRI is a very attractive candidate for the identification of inexpensive and non-invasive surrogate biomarkers of Aβ deposition. Our approach is expected to have value for identifying individuals likely to be Aβ+ in circumstances where cost or logistical problems prevent Aβ detection using cerebrospinal fluid analysis or Aβ-PET. It can also be used in clinical settings and clinical trials, aiding subject recruitment and evaluation of treatment efficacy. Imputation of the Aβ-positivity status could also complement Aβ-PET by identifying individuals who would benefit the most from this assessment.
Machine Learning in Falls Prediction; A cognition-based predictor of falls for the acute neurological in-patient population
Background Information: Falls are associated with high direct and indirect
costs, and significant morbidity and mortality for patients. Pathological falls
are usually a result of a compromised motor system, and/or cognition. Very
little research has been conducted on predicting falls based on this premise.
Aims: To demonstrate that cognitive and motor tests can be used to create a
robust predictive tool for falls.
Methods: Three tests of attention and executive function (Stroop, Trail
Making, and Semantic Fluency), a measure of physical function (Walk-12), a
series of questions (concerning recent falls, surgery and physical function)
and demographic information were collected from a cohort of 323 patients at a
tertiary neurological center. The principal outcome was a fall during the
in-patient stay (n = 54). Data-driven, predictive modelling was employed to
identify the statistical modelling strategies which are most accurate in
predicting falls, and which yield the most parsimonious models of clinical
relevance.
Results: The Trail test was identified as the best predictor of falls.
Moreover, adding any other variables to the results of the Trail test
did not improve the prediction (Wilcoxon signed-rank p < .001). The best
statistical strategy for predicting falls was the random forest (Wilcoxon
signed-rank p < .001), based solely on results of the Trail test. Tuning the
model yields the following optimized values: 68% (± 7.7) sensitivity, 90%
(± 2.3) specificity, with a positive predictive value of 60%, when the
relevant data are available.
Conclusion: Predictive modelling has identified a simple yet powerful machine
learning prediction strategy based on a single clinical test, the Trail test.
Predictive evaluation shows this strategy to be robust, suggesting predictive
modelling and machine learning as the standard for future predictive tools.
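The relationship between the reported sensitivity, specificity, and positive predictive value can be checked with a short calculation. The sketch below is only a rough consistency check: it assumes the reported sensitivity (68%) and specificity (90%) apply to the whole cohort of 323 patients with 54 fallers, whereas the abstract reports resampled averages. The function name `ppv` is hypothetical.

```python
# Rough consistency check of the reported positive predictive value (PPV).
# Assumption: the reported sensitivity and specificity are applied to the
# whole cohort (323 patients, 54 fallers). The abstract reports resampled
# averages, so this is only an approximation.

def ppv(sensitivity, specificity, n_positive, n_negative):
    """PPV = true positives / (true positives + false positives)."""
    true_pos = sensitivity * n_positive
    false_pos = (1.0 - specificity) * n_negative
    return true_pos / (true_pos + false_pos)

n_total = 323
n_fallers = 54
n_non_fallers = n_total - n_fallers  # 269 patients without a fall

estimate = ppv(0.68, 0.90, n_fallers, n_non_fallers)
print(f"Approximate PPV: {estimate:.2f}")  # prints "Approximate PPV: 0.58"
```

Under these assumptions the estimate is about 58%, close to the 60% stated in the abstract.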
Objective automatic assessment of rehabilitative speech treatment in Parkinson's disease
Vocal performance degradation is a common symptom for the vast majority of Parkinson's disease (PD) subjects, who typically follow personalized one-to-one periodic rehabilitation meetings with speech experts over a long-term period. Recently, a novel computer program called Lee Silverman voice treatment (LSVT) Companion was developed to allow PD subjects to independently progress through a rehabilitative treatment session. This study is part of the assessment of the LSVT Companion, aiming to investigate the potential of using sustained vowel phonations to objectively and automatically replicate the speech experts' assessments of PD subjects' voices as “acceptable” (a clinician would allow persisting during in-person rehabilitation treatment) or “unacceptable” (a clinician would not allow persisting during in-person rehabilitation treatment). We characterize each of the 156 sustained vowel /a/ phonations with 309 dysphonia measures, select a parsimonious subset using a robust feature selection algorithm, and automatically distinguish the two cohorts (acceptable versus unacceptable) with about 90% overall accuracy. Moreover, we illustrate the potential of the proposed methodology as a probabilistic decision support tool for speech experts assessing a phonation as “acceptable” or “unacceptable.” We envisage the findings of this study being a first step towards improving the effectiveness of an automated rehabilitative speech assessment tool.
Radiomics strategies for risk assessment of tumour failure in head-and-neck cancer
Quantitative extraction of high-dimensional mineable data from medical images
is a process known as radiomics. Radiomics is foreseen as an essential
prognostic tool for cancer risk assessment and the quantification of
intratumoural heterogeneity. In this work, 1615 radiomic features (quantifying
tumour image intensity, shape, texture) extracted from pre-treatment FDG-PET
and CT images of 300 patients from four different cohorts were analyzed for the
risk assessment of locoregional recurrences (LR) and distant metastases (DM) in
head-and-neck cancer. Prediction models combining radiomic and clinical
variables were constructed via random forests and imbalance-adjustment
strategies using two of the four cohorts. Independent validation of the
prediction and prognostic performance of the models was carried out on the
other two cohorts (LR: AUC = 0.69 and CI = 0.67; DM: AUC = 0.86 and CI = 0.88).
Furthermore, the results obtained via Kaplan-Meier analysis demonstrated the
potential of radiomics for assessing the risk of specific tumour outcomes using
multiple stratification groups. This could have important clinical impact,
notably by allowing for a better personalization of chemo-radiation treatments
for head-and-neck cancer patients from different risk groups.
Comment: (1) Paper: 33 pages, 4 figures, 1 table; (2) SUPP info: 41 pages, 7
figures, 8 tables
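The two validation figures quoted in the abstract above can be illustrated on toy data. The sketch below assumes that "CI" denotes the concordance index, which the abstract does not spell out. All values are invented for illustration, and the function names `auc` and `concordance_index` are hypothetical, not taken from the paper.

```python
# Minimal pairwise implementations of AUC and a concordance index.
# Assumption: "CI" in the abstract means concordance index.
# The toy data below are invented and are not results from the study.

def auc(scores, labels):
    """Probability that a random positive case outranks a random negative one."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))

def concordance_index(risk, time, event):
    """Fraction of comparable pairs in which the higher-risk patient fails first.

    A pair (i, j) is comparable when patient i had the event and did so
    strictly before patient j's event or censoring time.
    """
    concordant = 0.0
    comparable = 0
    for i in range(len(risk)):
        for j in range(len(risk)):
            if event[i] == 1 and time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable

# Toy cohort: predicted risk, follow-up time in months, event indicator.
risk = [0.9, 0.4, 0.3, 0.2]
time = [5, 20, 8, 30]
event = [1, 0, 1, 0]

print(auc(risk, event))                      # 0.75
print(concordance_index(risk, time, event))  # 0.8
```

AUC ignores when an outcome occurs, while the concordance index also uses the time ordering and handles censored follow-up, which is one reason a study might report both.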
Assessment of multi-temporal, multi-sensor radar and ancillary spatial data for grasslands monitoring in Ireland using machine learning approaches
Accurate inventories of grasslands are important for studies of carbon dynamics, biodiversity conservation and agricultural management. For regions with persistent cloud cover, the use of multi-temporal synthetic aperture radar (SAR) data provides an attractive solution for generating up-to-date inventories of grasslands. This is even more appealing considering the data that will be available from upcoming missions such as Sentinel-1 and ALOS-2. In this study, the performance of three machine learning algorithms, namely Random Forests (RF), Support Vector Machines (SVM) and the relatively underused Extremely Randomised Trees (ERT), is evaluated for discriminating between grassland types over two large heterogeneous areas of Ireland using multi-temporal, multi-sensor radar and ancillary spatial datasets. A detailed accuracy assessment shows the efficacy of the three algorithms in classifying different types of grasslands. Overall accuracies ≥ 88.7% (with a kappa coefficient of 0.87) were achieved for the single-frequency classifications and maximum accuracies of 97.9% (kappa coefficient of 0.98) for the combined-frequency classifications. For most datasets, the ERT classifier outperforms SVM and RF.
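Overall accuracy and the kappa coefficient, the two measures quoted in the abstract above, are both derived from a confusion matrix. The sketch below shows the standard Cohen's kappa calculation on an invented three-class matrix. The counts are hypothetical and are not the study's data.

```python
# Overall accuracy and Cohen's kappa from a confusion matrix.
# Rows are reference classes, columns are predicted classes.
# The matrix below is invented for illustration only.

def overall_accuracy(cm):
    """Share of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total

def cohen_kappa(cm):
    """Agreement corrected for the agreement expected by chance."""
    total = sum(sum(row) for row in cm)
    p_observed = overall_accuracy(cm)
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(cm[i][j] for i in range(len(cm))) for j in range(len(cm))]
    p_expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (p_observed - p_expected) / (1.0 - p_expected)

# Hypothetical confusion matrix for three grassland classes.
confusion = [
    [48, 2, 0],
    [3, 45, 2],
    [1, 4, 45],
]

print(f"{overall_accuracy(confusion):.2f}")  # 0.92
print(f"{cohen_kappa(confusion):.2f}")       # 0.88
```

Kappa discounts agreement that would occur by chance, so it never exceeds the overall accuracy computed from the same matrix.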
Sentiment Analysis for Words and Fiction Characters From The Perspective of Computational (Neuro-)Poetics
Two computational studies provide different sentiment analyses for text segments (e.g., ‘fearful’ passages) and figures (e.g., ‘Voldemort’) from the Harry Potter books (Rowling, 1997–2007) based on a novel, simple tool called SentiArt. The tool uses vector space models together with theory-guided, empirically validated label lists to compute the valence of each word in a text by locating its position in a 2D emotion potential space spanned by the > 2 million words of the vector space model. After testing the tool’s accuracy with empirical data from a neurocognitive study, it was applied to compute emotional figure profiles and personality figure profiles (inspired by the so-called ‘big five’ personality theory) for main characters from the book series. The results of comparative analyses using different machine-learning classifiers (e.g., AdaBoost, Neural Net) show that SentiArt performs very well in predicting the emotion potential of text passages. It also produces plausible predictions regarding the emotional and personality profiles of fiction characters, which are correctly identified on the basis of eight character features, and it achieves a good cross-validation accuracy in classifying 100 figures into ‘good’ vs. ‘bad’ ones. The results are discussed with regard to potential applications of SentiArt in digital literary, applied reading and neurocognitive poetics studies, such as the quantification of the hybrid hero potential of figures.
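The general idea of scoring a word against label lists in a vector space can be sketched as follows. This is not SentiArt's actual procedure: the scoring rule (mean cosine similarity to positive labels minus mean cosine similarity to negative labels), the three-dimensional vectors, and the label lists are all assumptions made for illustration.

```python
# Toy illustration of label-list valence scoring in a vector space.
# Assumptions: the scoring rule, the tiny 3-dimensional vectors, and the
# label lists are invented here; a real model has hundreds of dimensions
# and the tool's actual label lists are not given in the abstract.
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical word vectors.
VECTORS = {
    "joy": [0.9, 0.1, 0.0],
    "love": [0.8, 0.2, 0.1],
    "fear": [0.1, 0.9, 0.0],
    "hate": [0.0, 0.8, 0.2],
    "friend": [0.7, 0.2, 0.3],
    "curse": [0.1, 0.7, 0.4],
}

POSITIVE_LABELS = ["joy", "love"]   # hypothetical label list
NEGATIVE_LABELS = ["fear", "hate"]  # hypothetical label list

def valence(word):
    """Mean similarity to positive labels minus mean similarity to negative ones."""
    vec = VECTORS[word]
    pos = sum(cosine(vec, VECTORS[w]) for w in POSITIVE_LABELS) / len(POSITIVE_LABELS)
    neg = sum(cosine(vec, VECTORS[w]) for w in NEGATIVE_LABELS) / len(NEGATIVE_LABELS)
    return pos - neg

for word in ("friend", "curse"):
    print(word, round(valence(word), 2))
```

In this toy setup "friend" receives a positive score and "curse" a negative one. Averaging such word scores over a passage would give a passage-level estimate.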