2,189 research outputs found

    Cluster validity in clustering methods


    Resampling Methods for Unsupervised Learning from Sample Data


    An overview of clustering methods with guidelines for application in mental health research

    Cluster analyses have been widely used in mental health research to decompose inter-individual heterogeneity by identifying more homogeneous subgroups of individuals. However, despite advances in new algorithms and increasing popularity, there is little guidance on model choice, analytical framework and reporting requirements. In this paper, we aimed to address this gap by introducing the philosophy, design, advantages/disadvantages and implementation of major algorithms that are particularly relevant in mental health research. Extensions of basic models, such as kernel methods, deep learning, semi-supervised clustering, and clustering ensembles are subsequently introduced. How to choose algorithms to address common issues as well as methods for pre-clustering data processing, clustering evaluation and validation are then discussed. Importantly, we also provide general guidance on clustering workflow and reporting requirements. To facilitate the implementation of different algorithms, we provide information on R functions and librarie

    Identification of patient classes in low back pain data using crisp and fuzzy clustering methods

    We performed a cluster analysis of the low back pain dataset in the framework of the IFCS-2017 data challenge. Because the original data contained missing values, the first part of our analysis concerned the imputation of missing values using the Fully Conditional Specification model. The Local Outlier Factor method was then used to detect and eliminate the outliers. After the data normalization, we removed highly correlated variables from the transformed dataset and carried out k-means clustering of the remaining variables based on their correlations, i.e., the variables with the highest mutual correlations were assigned to the same cluster. Once the variables were assigned to different clusters, one representative per cluster, i.e., the variable with the highest contribution score at the first principal component, was selected. Among the 13 selected variables, there are representatives of each of the 6 variable domains (contextual factor, participation, pain, psychological, activity and physical impairment), specified as important in the paper by Nielsen et al. (2016). Different clustering methods, including DAPC, k-means and k-medoids, were then carried out to cluster the reduced low back pain data. Consensus solutions, both crisp and fuzzy, were calculated using the GV3 method. The obtained crisp consensus clustering, including 5 classes, was described in detail and compared to the meta-data annotation
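One step of the pipeline described above, normalizing the data and then removing highly correlated variables, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the 0.9 threshold, the greedy keep-first rule, the helper names and the toy variable names are all hypothetical, and the abstract does not say how the correlation cutoff was chosen.

```python
from math import sqrt


def zscore(column):
    """Standardize one variable to zero mean and unit variance."""
    n = len(column)
    mean = sum(column) / n
    std = sqrt(sum((x - mean) ** 2 for x in column) / n)
    return [(x - mean) / std for x in column]


def pearson(a, b):
    """Pearson correlation of two equally long, already z-scored columns."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


def drop_correlated(columns, threshold=0.9):
    """Greedily keep a variable only if its |r| with every kept variable
    is below the threshold (the threshold value is an assumption)."""
    scaled = {name: zscore(values) for name, values in columns.items()}
    kept = []
    for name in scaled:
        if all(abs(pearson(scaled[name], scaled[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Toy data with hypothetical variable names, not taken from the dataset.
data = {
    "pain_intensity": [1.0, 2.0, 3.0, 4.0, 5.0],
    "pain_duplicate": [2.1, 4.0, 6.2, 8.1, 9.9],  # roughly 2 * pain_intensity
    "activity_score": [3.0, 1.0, 4.0, 1.0, 5.0],
}
kept = drop_correlated(data)
print(kept)
```

The greedy rule depends on variable order, so a real analysis would need a more principled choice of which member of a correlated pair to keep.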

    A novel framework to elucidate core classes in a dataset

    In this paper we present an original framework to extract representative groups from a dataset, and we validate it over a novel case study. The framework specifies the application of different clustering algorithms, then several statistical and visualisation techniques are used to characterise the results, and core classes are defined by consensus clustering. Classes may be verified using supervised classification algorithms to obtain a set of rules which may be useful for new data points in the future. This framework is validated over a novel set of histone markers for breast cancer patients. From a technical perspective, the resultant classes are well separated and characterised by low, medium and high levels of biological markers. Clinically, the groups appear to distinguish patients with poor overall survival from those with low grading score and better survival. Overall, this framework offers a promising methodology for elucidating core consensus groups from data
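The idea of defining core classes by consensus across several clustering algorithms can be sketched in a few lines. This is one simple reading of "consensus", assumed here for illustration only: points form a core group when every partition places them in the same cluster. The function name, the `min_size` parameter and the toy labels are hypothetical, and the paper's actual consensus rule may differ.

```python
from collections import defaultdict


def unanimous_groups(partitions, min_size=2):
    """Return groups of point indices that share a cluster in every partition.

    Each partition is a list of cluster labels, one per point. Label ids do
    not need to match across partitions, because points are grouped by their
    whole tuple of labels rather than by any single label.
    """
    groups = defaultdict(list)
    for point, signature in enumerate(zip(*partitions)):
        groups[signature].append(point)
    return [members for members in groups.values() if len(members) >= min_size]


# Toy labels from three hypothetical clustering runs over six points.
partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 0],
    [2, 2, 2, 0, 0, 1],
]
core = unanimous_groups(partitions)
print(core)
```

Points 2 and 5 are left out because the three runs disagree about them; in this sketch such points simply remain unassigned to any core group.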

    Rails Quality Data Modelling via Machine Learning-Based Paradigms


    Multiple Cue Based Vehicle Detection and Tracking for Road Safety

    With the rise in accident-related road fatalities, researchers around the world are looking for solutions, including integrating intelligence into vehicles. One crucial aspect of this is the robust detection and tracking of other vehicles in the vicinity. In this paper, we propose a probabilistic way of incorporating several visual cues into vehicle detection, together with a particle-filter-based tracking strategy. The visual cues used are lane markings, symmetry, entropy and shadows. Combining the visual cues gave more robust results than any cue used individually. Defining a region of interest lowers the computational requirements and improves robustness. Experimental results of the algorithm in Sydney urban areas are presente