
    Missing Value Imputation With Unsupervised Backpropagation

    Many data mining and data analysis techniques operate on dense matrices or complete tables of data. Real-world datasets, however, often contain unknown values. Even many classification algorithms that are designed to operate with missing values still exhibit deteriorated accuracy. One approach to handling missing values is to fill in (impute) the missing values. In this paper, we present a technique for unsupervised learning called Unsupervised Backpropagation (UBP), which trains a multi-layer perceptron to fit the manifold sampled by a set of observed point-vectors. We evaluate UBP on the task of imputing missing values in datasets, and show that UBP is able to predict missing values with significantly lower sum-squared error than other collaborative filtering and imputation techniques. We also demonstrate with 24 datasets and 9 supervised learning algorithms that classification accuracy is usually higher when randomly withheld values are imputed using UBP rather than with other methods.
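
The evaluation protocol described in this abstract (withhold values at random, impute them, and score the result by sum-squared error) can be sketched as follows. This is a minimal illustration and not the paper's code: the imputer shown is a plain column-mean baseline, not UBP, and the function names `withhold_values`, `impute_column_means`, and `sum_squared_error` are hypothetical.

```python
import numpy as np

def withhold_values(data, fraction, rng):
    """Return a copy of `data` with a random fraction of entries set to NaN,
    together with the boolean mask of withheld positions."""
    mask = rng.random(data.shape) < fraction
    observed = np.asarray(data, dtype=float).copy()
    observed[mask] = np.nan
    return observed, mask

def impute_column_means(observed):
    """Baseline imputer (NOT UBP): replace each NaN with the mean of the
    observed values in its column. Assumes every column has at least one
    observed value."""
    col_means = np.nanmean(observed, axis=0)
    filled = observed.copy()
    rows, cols = np.where(np.isnan(filled))
    filled[rows, cols] = col_means[cols]
    return filled

def sum_squared_error(truth, imputed, mask):
    """Sum of squared differences over the withheld entries only."""
    diff = (np.asarray(truth, dtype=float) - imputed)[mask]
    return float(np.sum(diff ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    truth = rng.normal(size=(50, 4))            # synthetic data, for illustration only
    observed, mask = withhold_values(truth, 0.2, rng)
    filled = impute_column_means(observed)
    print("SSE of the mean-imputation baseline:", sum_squared_error(truth, filled, mask))
```

Any other imputer could be swapped in for `impute_column_means` and compared on the same mask, which is the kind of comparison the abstract reports.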

    Bacteria classification using Cyranose 320 electronic nose

    Background: An electronic nose (e-nose), the Cyrano Sciences' Cyranose 320, comprising an array of thirty-two polymer carbon black composite sensors, has been used to identify six species of bacteria responsible for eye infections when present at a range of concentrations in saline solutions. Readings were taken from the headspace of the samples by manually introducing the portable e-nose system into a sterile glass containing a fixed volume of bacteria in suspension. The gathered data were a very complex mixture of different chemical compounds. Method: Linear Principal Component Analysis (PCA) was able to separate four of the six classes of bacteria; the other two classes were not clearly evident from the PCA analysis, and PCA gave 74% classification accuracy. An innovative data clustering approach was investigated for these bacteria data by combining the 3-dimensional scatter plot, Fuzzy C Means (FCM) and a Self Organizing Map (SOM) network. Using these three data clustering methods together, a better 'classification' of the six eye bacteria classes was obtained. Then three supervised classifiers, namely Multi Layer Perceptron (MLP), Probabilistic Neural Network (PNN) and Radial Basis Function (RBF) network, were used to classify the six bacteria classes. Results: A [6 × 1] SOM network gave 96% accuracy for bacteria classification, which was the best accuracy. A comparative evaluation of the classifiers was conducted for this application. The best results suggest that we are able to predict six classes of bacteria with up to 98% accuracy with the application of the RBF network. Conclusion: This type of bacteria data analysis and feature extraction is very difficult, but we can conclude that this combined use of three nonlinear methods can solve the feature extraction problem with very complex data and enhance the performance of the Cyranose 320.

    Patient Specific Congestive Heart Failure Detection From Raw ECG signal

    In this study, in order to diagnose congestive heart failure (CHF) patients, non-linear second-order difference plots (SODPs) are used, obtained from raw ECG records sampled at 256 Hz and windowed with different durations. All of the data rows are labelled with the patient they belong to, so that classification is much more realistic. The SODPs are divided into quadrant regions of different radii, and the number of points falling in each region is counted in order to extract feature vectors. Fisher's linear discriminant, Naive Bayes, radial basis function, and artificial neural network are used as classifiers. The results are assessed with two validation methods: general k-fold cross-validation and patient-based cross-validation. As a result, it is shown that, using a neural network classifier with features obtained from the SODP, the constructed system could distinguish normal and CHF patients with a 100% accuracy rate. Keywords: Congestive heart failure, ECG, Second-Order Difference Plot, classification, patient-based cross-validation
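
The SODP feature extraction described in this abstract can be sketched roughly as below. A second-order difference plot places each sample at x(n) = s(n+1) - s(n), y(n) = s(n+2) - s(n+1); the features are then counts of points per quadrant within concentric radius bands. This is a sketch under assumptions, not the authors' implementation: the function names, the quadrant numbering, and the example radii are all hypothetical, since the abstract does not give them.

```python
import numpy as np

def sodp_points(signal):
    """Second-order difference plot coordinates:
    x(n) = s(n+1) - s(n), y(n) = s(n+2) - s(n+1)."""
    d = np.diff(np.asarray(signal, dtype=float))
    return d[:-1], d[1:]

def sodp_region_counts(signal, radii):
    """Count SODP points per (radius band, quadrant).

    `radii` must be sorted ascending (hypothetical values; the abstract does
    not state them). Band i holds points with radii[i-1] < r <= radii[i];
    points beyond the largest radius are ignored. Quadrants are numbered
    0: x>=0,y>=0  1: x<0,y>=0  2: x<0,y<0  3: x>=0,y<0.
    Returns a flat feature vector of length 4 * len(radii).
    """
    x, y = sodp_points(signal)
    r = np.hypot(x, y)
    quadrant = np.where(x >= 0,
                        np.where(y >= 0, 0, 3),
                        np.where(y >= 0, 1, 2))
    band = np.searchsorted(radii, r, side="left")
    inside = band < len(radii)
    counts = np.zeros((len(radii), 4), dtype=int)
    np.add.at(counts, (band[inside], quadrant[inside]), 1)
    return counts.ravel()

if __name__ == "__main__":
    window = [0, 1, 3, 2, 0]                    # toy window, not real ECG data
    print(sodp_region_counts(window, radii=[1.0, 3.0]))
```

Each ECG window would yield one such feature vector; for patient-based cross-validation, all windows from one patient would need to stay in the same fold.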