    FEATURE SELECTION FOR THE CLASSIFICATION OF LONGITUDINAL HUMAN AGEING DATA

    We address the feature selection task in the special context of longitudinal data, where variables are repeatedly measured across different time points. When analysing longitudinal data, a standard feature selection method would typically ignore the temporal nature of the features and treat each feature value at a given time point as a separate feature. That is, a standard algorithm would ignore the important difference between values of the same feature (measuring the same property of an instance) across different time points and values of fundamentally different features (measuring different properties of an instance) at the same time point. This thesis presents two main contributions. The first one is the creation of the longitudinal datasets used in the experiments, including the construction of features capturing longitudinal information for predicting age-related diseases. The datasets were created from data in the English Longitudinal Study of Ageing (ELSA) database. The second contribution consists of proposing four new variants of the Correlation-based Feature Selection (CFS) method for selecting features to be used as input by a classification algorithm. These CFS variants take into account (in different ways) the temporal redundancy associated with variations in the value of a feature across different time points. The results are summarised from two main perspectives. Firstly, in terms of predictive accuracy, one of the proposed CFS variants (called Exh-CFS-Gr, for exhaustive search-based CFS per group of temporally redundant features) obtained a statistically significantly better predictive performance than the performance obtained by standard CFS and the baseline approach of no feature selection when using Naïve Bayes as the classification algorithm. However, there was no statistically significant difference between the predictive accuracies obtained by J48, a decision tree induction algorithm, for all different variants of CFS (including standard CFS). Secondly, regarding the feature subsets selected by different variants of CFS, the number of features selected by Exh-CFS-Gr was substantially greater than that of all three other CFS variants for all datasets. This helps explain why this feature selection method obtained the best results in the experiments with Naïve Bayes; i.e., it seems that the other CFS variants selected relatively too few features for Naïve Bayes. Additionally, the features originally observed in the ELSA database were, in general, selected more often (by all variants of CFS) than the constructed features capturing longitudinal information.
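
    To illustrate why temporal redundancy matters for CFS, the sketch below computes the standard CFS merit heuristic for a feature subset. It is not the thesis's Exh-CFS-Gr variant; the function name `cfs_merit` and its list-based inputs are assumptions made for illustration, and the correlations are assumed to be non-negative (for example, symmetrical uncertainty values).

```python
from math import sqrt


def cfs_merit(feature_class_corrs, feature_feature_corrs):
    """Standard CFS merit of a feature subset (illustrative sketch).

    feature_class_corrs: one feature-class correlation per selected feature.
    feature_feature_corrs: one correlation per pair of selected features.
    Both are assumed to hold non-negative values.
    """
    k = len(feature_class_corrs)
    if k == 0:
        return 0.0
    mean_cf = sum(feature_class_corrs) / k
    # A single feature has no pairs, so its redundancy term is zero.
    if feature_feature_corrs:
        mean_ff = sum(feature_feature_corrs) / len(feature_feature_corrs)
    else:
        mean_ff = 0.0
    # Relevance in the numerator, redundancy penalty in the denominator.
    return k * mean_cf / sqrt(k + k * (k - 1) * mean_ff)


# Hypothetical values: the same feature measured at two time points
# (highly inter-correlated) versus two unrelated features.
same_feature_two_waves = cfs_merit([0.5, 0.5], [0.9])
two_unrelated_features = cfs_merit([0.5, 0.5], [0.0])
print(same_feature_two_waves < two_unrelated_features)
```

    With equal relevance, the pair of strongly inter-correlated measurements scores lower than the pair of unrelated features, which is the kind of redundancy that repeated measurements of one feature tend to introduce.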

    Feature Selection for the Classification of Longitudinal Human Ageing Data

    We propose a new variant of the Correlation-based Feature Selection (CFS) method for coping with longitudinal data where variables are repeatedly measured across different time points. The proposed CFS variant is evaluated on ten datasets created using data from the English Longitudinal Study of Ageing (ELSA), with different age-related diseases used as the class variables to be predicted. The results show that, overall, the proposed CFS variant leads to better predictive performance than the standard CFS and the baseline approach of no feature selection, when using Naïve Bayes and J48 decision tree induction as classification algorithms (although the difference in performance is very small in the results for J48). We also report the most relevant features selected by J48 across the datasets.

    A data-driven missing value imputation approach for longitudinal datasets

    Longitudinal datasets of human ageing studies usually have a high volume of missing data, and one way to handle missing values in a dataset is to replace them with estimations. However, there are many methods to estimate missing values, and no single method is the best for all datasets. In this article, we propose a data-driven missing value imputation approach that performs a feature-wise selection of the best imputation method, using known information in the dataset to rank the five methods we selected, based on their estimation error rates. We evaluated the proposed approach in two sets of experiments: a classifier-independent scenario, where we compared the applicabilities and error rates of each imputation method; and a classifier-dependent scenario, where we compared the predictive accuracy of Random Forest classifiers generated with datasets prepared using each imputation method and a baseline approach of doing no imputation (letting the classification algorithm handle the missing values internally). Based on our results from both sets of experiments, we concluded that the proposed data-driven missing value imputation approach generally resulted in models with more accurate estimations for missing data and better performing classifiers, in longitudinal datasets of human ageing. We also observed that imputation methods devised specifically for longitudinal data had very accurate estimations. This reinforces the idea that using the temporal information intrinsic to longitudinal data is a worthwhile endeavour for machine learning applications, and that this can be achieved through the proposed data-driven approach.
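
    The feature-wise selection idea can be sketched as follows. This is a minimal illustration, not the article's procedure: the two candidate methods (mean and median), the leave-one-out error estimate, and the names `best_imputer` and `impute` are all assumptions, and the abstract does not name the five methods actually ranked.

```python
from statistics import mean, median

# Hypothetical candidate imputation methods; the article ranks five
# methods that the abstract does not name.
CANDIDATES = {"mean": mean, "median": median}


def best_imputer(column):
    """Rank candidates for one feature by leave-one-out absolute error.

    Assumes the column has at least two known (non-None) values.
    """
    known = [v for v in column if v is not None]
    errors = {}
    for name, estimate in CANDIDATES.items():
        # Hide each known value in turn and estimate it from the rest.
        abs_errs = [
            abs(estimate(known[:i] + known[i + 1:]) - v)
            for i, v in enumerate(known)
        ]
        errors[name] = mean(abs_errs)
    return min(errors, key=errors.get), errors


def impute(column):
    """Fill missing values using the method that ranked best for this feature."""
    name, _ = best_imputer(column)
    known = [v for v in column if v is not None]
    fill = CANDIDATES[name](known)
    return [fill if v is None else v for v in column]


# A skewed feature: the median estimates hidden values better than the mean.
skewed = [1, 1, 1, 1, 100, None]
print(best_imputer(skewed)[0])
print(impute(skewed))
```

    Because the choice is made per feature, a skewed feature can be filled with one method while a roughly symmetric feature in the same dataset is filled with another.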

    Automatic segmentation of MR brain images with a convolutional neural network

    Automatic segmentation in MR brain images is important for quantitative analysis in large-scale studies with images acquired at all ages. This paper presents a method for the automatic segmentation of MR brain images into a number of tissue classes using a convolutional neural network. To ensure that the method obtains accurate segmentation details as well as spatial consistency, the network uses multiple patch sizes and multiple convolution kernel sizes to acquire multi-scale information about each voxel. The method is not dependent on explicit features, but learns to recognise the information that is important for the classification based on training data. The method requires a single anatomical MR image only. The segmentation method is applied to five different data sets: coronal T2-weighted images of preterm infants acquired at 30 weeks postmenstrual age (PMA) and 40 weeks PMA, axial T2-weighted images of preterm infants acquired at 40 weeks PMA, axial T1-weighted images of ageing adults acquired at an average age of 70 years, and T1-weighted images of young adults acquired at an average age of 23 years. The method obtained the following average Dice coefficients over all segmented tissue classes for each data set, respectively: 0.87, 0.82, 0.84, 0.86 and 0.91. The results demonstrate that the method obtains accurate segmentations in all five sets, and hence demonstrate its robustness to differences in age and acquisition protocol.
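
    For readers unfamiliar with the metric, the sketch below shows how a Dice coefficient per tissue class, and its average over classes, can be computed from two label maps. The flattened-list representation, the toy labels, and the function name `dice_per_class` are assumptions for illustration; the paper's own evaluation code is not shown in the abstract.

```python
def dice_per_class(pred_labels, ref_labels, classes):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for each class label.

    pred_labels, ref_labels: flattened label maps of equal length
    (one label per voxel), an assumed representation.
    """
    scores = {}
    for c in classes:
        pred = {i for i, lab in enumerate(pred_labels) if lab == c}
        ref = {i for i, lab in enumerate(ref_labels) if lab == c}
        denom = len(pred) + len(ref)
        # Convention chosen here: a class absent from both maps scores 1.0.
        scores[c] = 2 * len(pred & ref) / denom if denom else 1.0
    return scores


# Toy example with three tissue classes over six voxels.
predicted = [0, 0, 1, 1, 2, 2]
reference = [0, 0, 1, 2, 2, 2]
scores = dice_per_class(predicted, reference, classes=[0, 1, 2])
average_dice = sum(scores.values()) / len(scores)
print(scores, average_dice)
```

    A value of 1.0 means the predicted and reference regions for a class coincide exactly, and 0.0 means they do not overlap at all.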

    Age Sensitivity of Face Recognition Algorithms

    This paper investigates the performance degradation of facial recognition systems due to the influence of age. A comparative analysis of verification performance is conducted for four subspace projection techniques combined with four different distance metrics. The experimental results based on a subset of the MORPH-II database show that the choice of subspace projection technique and associated distance metric can have a significant impact on the performance of the face recognition system for particular age groups.

    Developing RNA diagnostics for studying healthy human ageing

    Developing strategies to cope with the increase in the ageing population and age-related chronic diseases is one of society's biggest challenges. The characteristics of the ageing process show significant inter-individual variation. Building genomic signatures that could account for variation in health outcomes with age may facilitate early prognosis of individual age-correlated diseases (e.g. cancer, coronary artery diseases and dementia) and help in developing better targeted treatments provided years in advance of acquiring disabling symptoms for these diseases. The aim of this thesis was to explore methods for diagnosing molecular features of human ageing. In particular, we utilise multi-platform transcriptomics, independent clinical data and classification methods to evaluate which human tissues demonstrate a reproducible molecular signature for age and which clinical phenotypes correlate with these new RNA biomarkers. [Continues.]

    Identifying Trippers and Non-Trippers Based on Knee Kinematics During Obstacle-Free Walking

    Trips are a major cause of falls. Sagittal-plane kinematics affect clearance between the foot and obstacles; however, it is unclear which kinematic measures during obstacle-free walking are associated with avoiding a trip when encountering an obstacle. The purpose of this study was to determine kinematic factors during obstacle-free walking that are related to obstacle avoidance ability. It was expected that successful obstacle avoidance would be associated with greater peak flexion/dorsiflexion and range of motion (ROM), and differences in timing of peak flexion/dorsiflexion during swing of obstacle-free walking for the hip, knee and ankle. Three-dimensional kinematics were recorded as 35 participants (young adults age 18–45 (N = 10), older adults age 65+ without a history of falls (N = 10), older adults age 65+ who had fallen in the last six months (N = 10), and individuals who had experienced a stroke more than six months earlier (N = 5)) walked on a treadmill under obstacle-free walking conditions, with kinematic features calculated for each stride. A separate obstacle avoidance task identified trippers (multiple obstacle contacts) and non-trippers. Linear discriminant analysis with sequential feature selection classified trippers and non-trippers based on kinematics during obstacle-free walking. Differences in classification performance and selected features (knee ROM and timing of peak knee flexion during swing) were evaluated between trippers and non-trippers. Non-trippers had greater knee ROM (P = .001). There was no significant difference in classification performance (P = .193). Individuals with reduced knee ROM during obstacle-free walking may have greater difficulty avoiding obstacles.