15,039 research outputs found

    Unsupervised Feature Selection with Adaptive Structure Learning

    The problem of feature selection has attracted considerable interest in the past decade. Traditional unsupervised methods select the features that most faithfully preserve the intrinsic structures of the data, where those structures are estimated using all the input features. However, the estimated structures are unreliable when redundant and noisy features have not been removed. We therefore face a dilemma: one needs the true structures of the data to identify the informative features, and one needs the informative features to accurately estimate the true structures of the data. To address this, we propose a unified learning framework that performs structure learning and feature selection simultaneously. The structures are adaptively learned from the results of feature selection, and the informative features are reselected to preserve the refined structures of the data. By leveraging the interactions between these two essential tasks, we are able to capture accurate structures and select more informative features. Experimental results on many benchmark data sets demonstrate that the proposed method outperforms many state-of-the-art unsupervised feature selection methods.

    Machine learning for crystal identification and discovery

    As computers get faster, researchers -- not hardware or algorithms -- become the bottleneck in scientific discovery. Computational study of colloidal self-assembly is one area that is keenly affected: even after computers generate massive amounts of raw data, performing an exhaustive search to determine what (if any) ordered structures occur in a large parameter space of many simulations can be excruciating. We demonstrate how machine learning can be applied to discover interesting areas of parameter space in colloidal self-assembly. We create numerical fingerprints -- inspired by bond orientational order diagrams -- of structures found in self-assembly studies and use these descriptors both to find interesting regions in a phase diagram and to identify characteristic local environments in simulations in an automated manner for simple and complex crystal structures. These methods allow analysis to keep up with the data generation ability of modern high-throughput computing environments. Comment: Fixed typo, added missing acknowledgment, added supplementary information

    Identification of body fat tissues in MRI data

    In recent years, non-invasive diagnostic techniques have been widely used in medical investigations. Among the various imaging modalities available, Magnetic Resonance Imaging (MRI) is very attractive because it produces multi-slice images in which the contrast between body tissues such as muscle, ligaments and fat is well defined. The aim of this paper is to describe the implementation of an unsupervised image analysis algorithm that identifies body fat tissues in a sequence of MR images encoded in DICOM format. The algorithm consists of three main steps. The first step pre-processes the MR images to reduce the level of noise. The second step extracts the image areas representing fat tissues using an unsupervised clustering algorithm. Finally, image refinements are applied to reclassify the pixels adjacent to the initial fat estimate and to eliminate outliers. The experimental data indicate that the proposed implementation returns accurate results and is robust to noise and to greyscale inhomogeneity.
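The three steps described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract names neither the filter, the clustering algorithm, nor the refinement rule, so the 3x3 median filter, the two-cluster 1-D k-means, and the majority-vote refinement below are all assumed stand-ins. The sketch also assumes that fat is the brighter cluster and that a slice is already available as a 2-D list of intensities (DICOM loading is omitted). All function names are hypothetical.

```python
from statistics import median


def neighbours(img, r, c):
    """Yield the values in the 3x3 window around (r, c), clipped at borders."""
    rows, cols = len(img), len(img[0])
    for i in range(max(0, r - 1), min(rows, r + 2)):
        for j in range(max(0, c - 1), min(cols, c + 2)):
            yield img[i][j]


def median_filter(img):
    """Step 1 (assumed choice): suppress noise with a 3x3 median filter."""
    return [[median(neighbours(img, r, c)) for c in range(len(img[0]))]
            for r in range(len(img))]


def two_means(values, iterations=20):
    """Step 2 (assumed choice): 1-D k-means with k=2 on pixel intensities.

    Returns the (low, high) cluster centres.
    """
    low, high = min(values), max(values)
    for _ in range(iterations):
        threshold = (low + high) / 2
        low_group = [v for v in values if v <= threshold]
        high_group = [v for v in values if v > threshold]
        if not low_group or not high_group:
            break  # degenerate image: only one intensity level
        low = sum(low_group) / len(low_group)
        high = sum(high_group) / len(high_group)
    return low, high


def refine(mask):
    """Step 3 (assumed rule): majority vote over each 3x3 window.

    Pixels next to a fat region are reclassified with their neighbourhood,
    and isolated fat pixels (outliers) are removed.
    """
    out = []
    for r in range(len(mask)):
        row = []
        for c in range(len(mask[0])):
            window = list(neighbours(mask, r, c))
            row.append(sum(window) * 2 > len(window))
        out.append(row)
    return out


def segment_fat(img):
    """Run the three steps; True marks pixels assigned to the bright cluster."""
    smooth = median_filter(img)
    low, high = two_means([v for row in smooth for v in row])
    threshold = (low + high) / 2
    mask = [[v > threshold for v in row] for row in smooth]
    return refine(mask)
```

Calling `segment_fat(slice_pixels)` on one slice returns a boolean mask of the same shape. A real pipeline would likely need a filter and a cluster count tuned to the scanner and sequence, which this sketch does not attempt.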

    Practical Attacks Against Graph-based Clustering

    Graph modeling allows numerous security problems to be tackled in a general way; however, little work has been done to understand the ability of graph-based models to withstand adversarial attacks. We design and evaluate two novel graph attacks against a state-of-the-art network-level, graph-based detection system. Our work highlights areas in adversarial machine learning that have not yet been addressed, specifically graph-based clustering techniques and a global feature space in which defenders, to be practical, must account for realistic attackers without perfect knowledge. Even though less informed attackers can evade graph clustering at low cost, we show that some practical defenses are possible. Comment: ACM CCS 201