
    Automated design of robust discriminant analysis classifier for foot pressure lesions using kinematic data

    In recent years, the use of motion tracking systems to acquire functional biomechanical gait data has received increasing interest because of the richness and accuracy of the measured kinematic information. However, costs frequently restrict the number of subjects employed, which makes the dimensionality of the collected data far higher than the number of available samples. This paper applies discriminant analysis algorithms to the classification of patients with different types of foot lesions, in order to establish an association between foot motion and lesion formation. With primary attention to small-sample-size situations, we compare different types of Bayesian classifiers and evaluate their performance with various dimensionality reduction techniques for feature extraction, as well as search methods for the selection of raw kinematic variables. Finally, we propose a novel integrated method that fine-tunes the classifier parameters and selects the most relevant kinematic variables simultaneously. Performance comparisons use robust resampling techniques such as Bootstrap 632+ and k-fold cross-validation. Results from experiments with lesion subjects suffering from pathological plantar hyperkeratosis show that the proposed method can lead to $\sim$96% correct classification rates with less than 10% of the original features.
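The k-fold cross-validation mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the paper's pipeline: the nearest-mean classifier is a toy stand-in for the Bayesian classifiers the authors compare, the two-feature data are synthetic, and every function name is hypothetical.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Split range(n_samples) into k shuffled folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def nearest_mean_predict(train_x, train_y, x):
    """Toy stand-in classifier: pick the class whose mean vector is closest to x."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        mean = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, mean))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def cross_val_accuracy(xs, ys, k=5, seed=0):
    """Hold out each fold once, train on the rest, and pool the hits."""
    correct = 0
    for fold in k_fold_indices(len(xs), k, seed):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        correct += sum(
            nearest_mean_predict(train_x, train_y, xs[i]) == ys[i] for i in fold
        )
    return correct / len(xs)

# Synthetic, well-separated two-class data (an assumption for illustration only).
rng = random.Random(1)
xs = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(20)]
xs += [[rng.gauss(5, 1), rng.gauss(5, 1)] for _ in range(20)]
ys = [0] * 20 + [1] * 20

folds = k_fold_indices(len(xs), 5)
accuracy = cross_val_accuracy(xs, ys, k=5)
print(f"5-fold accuracy: {accuracy:.2f}")
```

Because every sample is held out exactly once, the pooled accuracy uses all of the data for testing without ever testing on a training sample, which is why such resampling is attractive when subjects are scarce.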

    Mining the UKIDSS GPS: star formation and embedded clusters

    Data mining techniques must be developed and applied to analyse the large public databases containing hundreds to thousands of millions of entries. The aim of this study is to develop methods for locating previously unknown stellar clusters in the UKIDSS Galactic Plane Survey catalogue data. The cluster candidates are searched for computationally in pre-filtered catalogue data using a method that fits a mixture model of Gaussian densities and background noise using the Expectation Maximization algorithm. The catalogue data contain a significant number of false sources clustered around bright stars. A large fraction of these artefacts were automatically filtered out before or during the cluster search. The UKIDSS data reduction pipeline tends to classify marginally resolved stellar pairs and objects seen against variable surface brightness as extended objects (or "galaxies" in the archive parlance). 10% or 66 x 10^6 of the sources in the UKIDSS GPS catalogue brighter than 17 magnitudes in the K band are classified as "galaxies". Young embedded clusters create variable NIR surface brightness because the gas/dust clouds in which they were formed scatter the light from the cluster members. Such clusters therefore appear as clusters of "galaxies" in the catalogue and can be found using only a subset of the catalogue data. The detected "galaxy clusters" were finally screened visually to eliminate the remaining false detections due to data artefacts. Besides the embedded clusters, the search also located sites of non-clustered embedded star formation. The search covered an area of 1302 square degrees, and 137 previously unknown cluster candidates and 30 previously unknown sites of star formation were found.
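A mixture of a Gaussian density and a flat background, fitted with Expectation Maximization, can be sketched in one dimension as below. This is a simplified illustration and not the authors' implementation: it assumes a single Gaussian component, a uniform background over a known interval, synthetic positions, and a histogram-peak starting guess; all names are hypothetical.

```python
import math
import random

def fit_cluster_plus_background(points, lo, hi, n_bins=20, n_iter=100):
    """EM fit of weight * Normal(mu, sigma) + (1 - weight) * Uniform(lo, hi)."""
    n = len(points)
    # Starting guess (an assumption of this sketch): centre of the fullest bin.
    bin_width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in points:
        counts[min(int((x - lo) / bin_width), n_bins - 1)] += 1
    mu = lo + (counts.index(max(counts)) + 0.5) * bin_width
    sigma = bin_width
    weight = 0.5
    background = 1.0 / (hi - lo)  # constant density of the noise component

    for _ in range(n_iter):
        # E-step: probability that each point belongs to the Gaussian component.
        resp = []
        for x in points:
            g = weight * math.exp(-0.5 * ((x - mu) / sigma) ** 2)
            g /= sigma * math.sqrt(2 * math.pi)
            b = (1 - weight) * background
            resp.append(g / (g + b))
        # M-step: re-estimate the parameters from the weighted points.
        total = sum(resp)
        weight = total / n
        mu = sum(r * x for r, x in zip(resp, points)) / total
        var = sum(r * (x - mu) ** 2 for r, x in zip(resp, points)) / total
        sigma = max(math.sqrt(var), 1e-6)  # floor keeps the density finite
    return mu, sigma, weight

# Synthetic positions: 200 background sources plus a 100-member cluster at 30.
rng = random.Random(42)
points = [rng.uniform(0, 100) for _ in range(200)]
points += [rng.gauss(30, 1) for _ in range(100)]

mu, sigma, weight = fit_cluster_plus_background(points, 0.0, 100.0)
print(f"centre={mu:.2f} width={sigma:.2f} cluster fraction={weight:.2f}")
```

The explicit background term is what lets the Gaussian stay narrow: sources far from the cluster are absorbed by the uniform component instead of inflating the fitted width.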

    Search for unusual objects in the WISE Survey

    Automatic source detection and classification tools based on machine learning (ML) algorithms are growing in popularity due to their efficiency when dealing with large amounts of data simultaneously and their ability to work in multidimensional parameter spaces. In this work, we present a new, automated method of outlier selection based on a support vector machine (SVM) algorithm called one-class SVM (OCSVM), which uses the training data as one class to construct a model of 'normality' in order to recognize novel points. We test the performance of the OCSVM algorithm on \textit{Wide-field Infrared Survey Explorer (WISE)} data, trained on Sloan Digital Sky Survey (SDSS) sources. Among others, we find $\sim 40,000$ sources with abnormal patterns which can be associated with obscured and unobscured active galactic nuclei (AGN) source candidates. We present a preliminary estimate of the clustering properties of these objects and find that the unobscured AGN candidates are preferentially found in less massive dark matter haloes ($M_{DMH}\sim10^{12.4}$) than the obscured candidates ($M_{DMH}\sim 10^{13.2}$). This result contradicts the unification theory of AGN sources and indicates that the obscured and unobscured phases of AGN activity take place in different evolutionary paths defined by different environments. Comment: 4 figures, 6 pages