1,833 research outputs found

    On the automated extraction of regression knowledge from databases

    Get PDF
    The advent of inexpensive, powerful computing systems, together with the increasing amount of available data, conforms one of the greatest challenges for next-century information science. Since it is apparent that much future analysis will be done automatically, a good deal of attention has been paid recently to the implementation of ideas and/or the adaptation of systems originally developed in machine learning and other computer science areas. This interest seems to stem from both the suspicion that traditional techniques are not well-suited for large-scale automation and the success of new algorithmic concepts in difficult optimization problems. In this paper, I discuss a number of issues concerning the automated extraction of regression knowledge from databases. By regression knowledge is meant quantitative knowledge about the relationship between a vector of predictors or independent variables (x) and a scalar response or dependent variable (y). A number of difficulties found in some well-known tools are pointed out, and a flexible framework avoiding many such difficulties is described and advocated. Basic features of a new tool pursuing this direction are reviewed

    Machine Learning in Falls Prediction; A cognition-based predictor of falls for the acute neurological in-patient population

    Get PDF
    Background Information: Falls are associated with high direct and indirect costs, and significant morbidity and mortality for patients. Pathological falls are usually a result of a compromised motor system, and/or cognition. Very little research has been conducted on predicting falls based on this premise. Aims: To demonstrate that cognitive and motor tests can be used to create a robust predictive tool for falls. Methods: Three tests of attention and executive function (Stroop, Trail Making, and Semantic Fluency), a measure of physical function (Walk-12), a series of questions (concerning recent falls, surgery and physical function) and demographic information were collected from a cohort of 323 patients at a tertiary neurological center. The principal outcome was a fall during the in-patient stay (n = 54). Data-driven, predictive modelling was employed to identify the statistical modelling strategies which are most accurate in predicting falls, and which yield the most parsimonious models of clinical relevance. Results: The Trail test was identified as the best predictor of falls. Moreover, addition of any others variables, to the results of the Trail test did not improve the prediction (Wilcoxon signed-rank p < .001). The best statistical strategy for predicting falls was the random forest (Wilcoxon signed-rank p < .001), based solely on results of the Trail test. Tuning of the model results in the following optimized values: 68% (+- 7.7) sensitivity, 90% (+- 2.3) specificity, with a positive predictive value of 60%, when the relevant data is available. Conclusion: Predictive modelling has identified a simple yet powerful machine learning prediction strategy based on a single clinical test, the Trail test. Predictive evaluation shows this strategy to be robust, suggesting predictive modelling and machine learning as the standard for future predictive tools

    Objective automatic assessment of rehabilitative speech treatment in Parkinson's disease

    Get PDF
    Vocal performance degradation is a common symptom for the vast majority of Parkinson's disease (PD) subjects, who typically follow personalized one-to-one periodic rehabilitation meetings with speech experts over a long-term period. Recently, a novel computer program called Lee Silverman voice treatment (LSVT) Companion was developed to allow PD subjects to independently progress through a rehabilitative treatment session. This study is part of the assessment of the LSVT Companion, aiming to investigate the potential of using sustained vowel phonations towards objectively and automatically replicating the speech experts' assessments of PD subjects' voices as “acceptable” (a clinician would allow persisting during in-person rehabilitation treatment) or “unacceptable” (a clinician would not allow persisting during in-person rehabilitation treatment). We characterize each of the 156 sustained vowel /a/ phonations with 309 dysphonia measures, select a parsimonious subset using a robust feature selection algorithm, and automatically distinguish the two cohorts (acceptable versus unacceptable) with about 90% overall accuracy. Moreover, we illustrate the potential of the proposed methodology as a probabilistic decision support tool to speech experts to assess a phonation as “acceptable” or “unacceptable.” We envisage the findings of this study being a first step towards improving the effectiveness of an automated rehabilitative speech assessment tool

    Radiomics strategies for risk assessment of tumour failure in head-and-neck cancer

    Full text link
    Quantitative extraction of high-dimensional mineable data from medical images is a process known as radiomics. Radiomics is foreseen as an essential prognostic tool for cancer risk assessment and the quantification of intratumoural heterogeneity. In this work, 1615 radiomic features (quantifying tumour image intensity, shape, texture) extracted from pre-treatment FDG-PET and CT images of 300 patients from four different cohorts were analyzed for the risk assessment of locoregional recurrences (LR) and distant metastases (DM) in head-and-neck cancer. Prediction models combining radiomic and clinical variables were constructed via random forests and imbalance-adjustment strategies using two of the four cohorts. Independent validation of the prediction and prognostic performance of the models was carried out on the other two cohorts (LR: AUC = 0.69 and CI = 0.67; DM: AUC = 0.86 and CI = 0.88). Furthermore, the results obtained via Kaplan-Meier analysis demonstrated the potential of radiomics for assessing the risk of specific tumour outcomes using multiple stratification groups. This could have important clinical impact, notably by allowing for a better personalization of chemo-radiation treatments for head-and-neck cancer patients from different risk groups.Comment: (1) Paper: 33 pages, 4 figures, 1 table; (2) SUPP info: 41 pages, 7 figures, 8 table

    Assessment of multi-temporal, multi-sensor radar and ancillary spatial data for grasslands monitoring in Ireland using machine learning approaches

    Get PDF
    Accurate inventories of grasslands are important for studies of carbon dynamics, biodiversity conservation and agricultural management. For regions with persistent cloud cover the use of multi-temporal synthetic aperture radar (SAR) data provides an attractive solution for generating up-to-date inventories of grasslands. This is even more appealing considering the data that will be available from upcoming missions such as Sentinel-1 and ALOS-2. In this study, the performance of three machine learning algorithms; Random Forests (RF), Support Vector Machines (SVM) and the relatively underused Extremely Randomised Trees (ERT) is evaluated for discriminating between grassland types over two large heterogeneous areas of Ireland using multi-temporal, multi-sensor radar and ancillary spatial datasets. A detailed accuracy assessment shows the efficacy of the three algorithms to classify different types of grasslands. Overall accuracies ≥ 88.7% (with kappa coefficient of 0.87) were achieved for the single frequency classifications and maximum accuracies of 97.9% (kappa coefficient of 0.98) for the combined frequency classifications. For most datasets, the ERT classifier outperforms SVM and RF

    Sentiment Analysis for Words and Fiction Characters From The Perspective of Computational (Neuro-)Poetics

    Get PDF
    Two computational studies provide different sentiment analyses for text segments (e.g., ‘fearful’ passages) and figures (e.g., ‘Voldemort’) from the Harry Potter books (Rowling, 1997 - 2007) based on a novel simple tool called SentiArt. The tool uses vector space models together with theory-guided, empirically validated label lists to compute the valence of each word in a text by locating its position in a 2d emotion potential space spanned by the > 2 million words of the vector space model. After testing the tool’s accuracy with empirical data from a neurocognitive study, it was applied to compute emotional figure profiles and personality figure profiles (inspired by the so-called ‚big five’ personality theory) for main characters from the book series. The results of comparative analyses using different machine-learning classifiers (e.g., AdaBoost, Neural Net) show that SentiArt performs very well in predicting the emotion potential of text passages. It also produces plausible predictions regarding the emotional and personality profile of fiction characters which are correctly identified on the basis of eight character features, and it achieves a good cross-validation accuracy in classifying 100 figures into ‘good’ vs. ‘bad’ ones. The results are discussed with regard to potential applications of SentiArt in digital literary, applied reading and neurocognitive poetics studies such as the quantification of the hybrid hero potential of figures
    corecore