832 research outputs found

    Robust Framework to Combine Diverse Classifiers Assigning Distributed Confidence to Individual Classifiers at Class Level

    We present a classification framework that combines multiple heterogeneous classifiers in the presence of class label noise. An extension of m-Mediods based modeling is presented that generates models of the various classes while identifying and filtering noisy training data. This noise-free data is then used to learn models for other classifiers such as GMM and SVM. A weight learning method is then introduced to learn weights on each class for the different classifiers in order to construct an ensemble. For this purpose, we apply a genetic algorithm to search for an optimal weight vector with which the classifier ensemble is expected to give the best accuracy. The proposed approach is evaluated on a variety of real-life datasets. It is also compared with existing standard ensemble techniques such as AdaBoost, Bagging, and Random Subspace Methods. Experimental results show the superiority of the proposed ensemble method over its competitors, especially in the presence of class label noise and imbalanced classes.
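    The class-level weighting described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the uniform crossover, the Gaussian mutation, and all parameter values are assumptions chosen for brevity. Each classifier gets one weight per class, and a small genetic algorithm searches for the weight matrix that maximizes accuracy on labelled data.

```python
import random

def ensemble_predict(probas, weights):
    """probas[c][k]: score classifier c gives class k; weights[c][k]: trust in c for class k."""
    n_classes = len(probas[0])
    scores = [sum(w[k] * p[k] for p, w in zip(probas, weights)) for k in range(n_classes)]
    return max(range(n_classes), key=scores.__getitem__)

def accuracy(weights, samples, labels):
    hits = sum(ensemble_predict(p, weights) == y for p, y in zip(samples, labels))
    return hits / len(labels)

def genetic_search(samples, labels, n_clf, n_classes,
                   pop_size=20, generations=30, seed=0):
    """Hypothetical GA: elitist selection, uniform crossover, one Gaussian mutation per child."""
    rng = random.Random(seed)

    def fitness(ind):
        return accuracy(ind, samples, labels)

    # Start from uniform weights plus random individuals.
    pop = [[[1.0] * n_classes for _ in range(n_clf)]]
    while len(pop) < pop_size:
        pop.append([[rng.random() for _ in range(n_classes)] for _ in range(n_clf)])

    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]          # the best half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each weight comes from one of the two parents.
            child = [[rng.choice(pair) for pair in zip(ra, rb)] for ra, rb in zip(a, b)]
            # Mutate one randomly chosen weight, clipped to [0, 1].
            c, k = rng.randrange(n_clf), rng.randrange(n_classes)
            child[c][k] = min(1.0, max(0.0, child[c][k] + rng.gauss(0.0, 0.2)))
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)
```

    Because the best half of each generation is carried over unchanged, the returned weights are never worse on the training data than the uniform weights the search starts from.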

    Combining novelty detectors to improve accelerometer-based fall detection

    Research on body-worn sensors has shown how they can be used for the detection of falls in the elderly, which is a relevant health problem. However, most systems are trained with simulated falls, which differ from those of the target population. In this paper, we tackle the problem of fall detection using a combination of novelty detectors. A novelty detector can be trained only with activities of daily life (ADL), which are true movements recorded in real life. In addition, novelty detectors allow the system to be adapted to new users by recording new movements and retraining the system. The combination of several detectors and features enhances performance. The proposed approach has been compared with a traditional supervised algorithm, a support vector machine, which is trained with both falls and ADL. The combination of novelty detectors shows better performance in a typical cross-validation test and in an experiment that mimics the effect of personalizing the classifiers. The results indicate that it is possible to build a reliable fall detector based only on ADL.
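    As a rough illustration of combining novelty detectors trained only on ADL, the sketch below uses a nearest-neighbour detector per feature and averages the normalized scores. The detector type, the two features (`peak`, `spread`), the 0.95 quantile, and the averaging rule are all assumptions made for this example; the paper's actual detectors and features may differ.

```python
import math

class NearestNeighbourDetector:
    """Scores a window by its distance to the closest ADL training window.

    The threshold is a quantile of leave-one-out nearest-neighbour distances
    within the ADL data, so no fall recordings are needed for training.
    Requires at least two training windows.
    """

    def __init__(self, feature_fn, quantile=0.95):
        self.feature_fn = feature_fn
        self.quantile = quantile

    def fit(self, adl_windows):
        self.train = [self.feature_fn(w) for w in adl_windows]
        loo = sorted(
            min(math.dist(a, b) for j, b in enumerate(self.train) if j != i)
            for i, a in enumerate(self.train)
        )
        idx = min(len(loo) - 1, int(self.quantile * len(loo)))
        self.threshold = max(loo[idx], 1e-12)  # guard against a zero threshold
        return self

    def score(self, window):
        # A score above 1 means the window is farther from ADL than most ADL windows are from each other.
        x = self.feature_fn(window)
        return min(math.dist(x, t) for t in self.train) / self.threshold

def peak(window):
    """Hypothetical feature: peak acceleration magnitude in the window."""
    return (max(window),)

def spread(window):
    """Hypothetical feature: range of acceleration magnitude in the window."""
    return (max(window) - min(window),)

def is_novel(detectors, window):
    """Combine detectors by averaging their normalized scores."""
    scores = [d.score(window) for d in detectors]
    return sum(scores) / len(scores) > 1.0
```

    Adapting to a new user, as the abstract describes, would amount to calling `fit` again with that user's recorded ADL windows.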

    Automated reliability assessment for spectroscopic redshift measurements

    We present a new approach to automate the spectroscopic redshift reliability assessment based on machine learning (ML) and characteristics of the redshift probability density function (PDF). We propose to rephrase the spectroscopic redshift estimation into a Bayesian framework, in order to incorporate all sources of information and uncertainties related to the redshift estimation process, and produce a redshift posterior PDF that will be the starting-point for ML algorithms to provide an automated assessment of a redshift reliability. As a use case, public data from the VIMOS VLT Deep Survey is exploited to present and test this new methodology. We first tried to reproduce the existing reliability flags using supervised classification to describe different types of redshift PDFs, but due to the subjective definition of these flags, soon opted for a new homogeneous partitioning of the data into distinct clusters via unsupervised classification. After assessing the accuracy of the new clusters via resubstitution and test predictions, unlabelled data from preliminary mock simulations for the Euclid space mission are projected into this mapping to predict their redshift reliability labels. Comment: Submitted on 02 June 2017 (v1). Revised on 08 September 2017 (v2). Latest version 28 September 2017 (this version, v3).


    Since the advent of Industry 4.0, significant research has been conducted to apply machine learning to the vast array of Internet of Things (IoT) data produced by industrial machines. One such topic is predictive maintenance. Unlike some other machine learning domains such as NLP and computer vision, predictive maintenance is a relatively new area of focus. Most of the published work demonstrates the effectiveness of supervised classification for predictive maintenance. Some of the challenges highlighted in the literature are the cost and difficulty of obtaining labelled samples for training. Novelty detection is a branch of machine learning that, after being trained on normal operations, detects whether new data comes from the same process or is different, eliminating the requirement to label data points. This thesis applies novelty detection to both a public data set and one that was specifically collected to demonstrate its application to predictive maintenance. The Local Outlier Factor showed better performance than a One-Class SVM on the public data. It was then applied to data from a 3-D printer and was able to detect faults it had not been trained on, showing a slight lift over a random classifier.

    Random Subspace Learning on Outlier Detection and Classification with Minimum Covariance Determinant Estimator

    The questions raised by high-dimensional data are interesting and challenging. Our study targets a particular type of data in this setting, namely “large p, small n”. Since the dimensionality is massively larger than the number of observations, any estimate of the covariance and its inverse is badly affected. The definition of high dimension in statistics has changed over the decades. Modern datasets with thousands of dimensions demand deeper understanding, but this is hindered by the curse of dimensionality. We review and explore ways to cope with the curse, and extend previous studies to pave a new way for robust estimation, which we then apply to outlier detection and classification. We explore random subspace learning and extend other classification and outlier detection algorithms to fit its framework. Our proposed methods can handle both high-dimension low-sample size and traditional low-dimensional high-sample size datasets. Essentially, we avoid the computational bottleneck of techniques like Minimum Covariance Determinant (MCD) by computing the needed determinants and associated measures in much lower dimensional subspaces. Both the theoretical and the computational development of our approach reveal that it is computationally more efficient than the regularized methods in the high-dimensional low-sample size setting, and that it often competes favorably with existing methods as far as the percentage of correct outlier detections is concerned.
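    The idea of working in low-dimensional subspaces can be sketched as follows. This is a simplified illustration, not the authors' method: it uses the classical sample mean and covariance in place of the MCD estimator, fixes the subspace dimension at 2 so the covariance can be inverted in closed form, and averages squared Mahalanobis distances over random coordinate pairs. All names and parameter values are assumptions.

```python
import random
from statistics import mean

def fit_2d(data):
    """Sample mean and covariance entries of a list of 2-D points."""
    n = len(data)
    mx, my = mean(r[0] for r in data), mean(r[1] for r in data)
    sxx = sum((r[0] - mx) ** 2 for r in data) / (n - 1)
    syy = sum((r[1] - my) ** 2 for r in data) / (n - 1)
    sxy = sum((r[0] - mx) * (r[1] - my) for r in data) / (n - 1)
    return mx, my, sxx, syy, sxy

def mahalanobis_sq_2d(point, params):
    """Squared Mahalanobis distance using the closed-form inverse of a 2x2 covariance."""
    mx, my, sxx, syy, sxy = params
    det = sxx * syy - sxy ** 2
    if det <= 1e-12:          # degenerate subspace: contributes nothing
        return 0.0
    dx, dy = point[0] - mx, point[1] - my
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det

def subspace_outlier_scores(X, n_subspaces=50, seed=0):
    """Average squared Mahalanobis distance of each row over random 2-D coordinate subspaces."""
    rng = random.Random(seed)
    p = len(X[0])
    scores = [0.0] * len(X)
    for _ in range(n_subspaces):
        i, j = rng.sample(range(p), 2)            # pick two distinct coordinates
        proj = [(row[i], row[j]) for row in X]
        params = fit_2d(proj)
        for idx, pt in enumerate(proj):
            scores[idx] += mahalanobis_sq_2d(pt, params)
    return [s / n_subspaces for s in scores]
```

    With p much larger than n the full p-by-p covariance matrix is singular, whereas each 2-by-2 subspace covariance is generally invertible as long as n is at least 3, which is why the per-subspace distances remain well defined.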

    IoT Dataset Validation Using Machine Learning Techniques for Traffic Anomaly Detection

    This article belongs to the Special Issue Sensor Network Technologies and Applications with Wireless Sensor Devices. [Abstract] With advancements in engineering and science, the application of smart systems is increasing, generating faster growth of IoT network traffic. The limited power and computing resources of IoT devices also raise concerns about security vulnerabilities. Machine learning-based techniques have recently gained credibility through successful application to the detection of network anomalies, including in IoT networks. However, machine learning techniques cannot work without representative data. Given the scarcity of IoT datasets, the DAD emerged as an instrument for understanding the behavior of dedicated IoT-MQTT networks. This paper aims to validate the DAD dataset by applying Logistic Regression, Naive Bayes, Random Forest, AdaBoost, and Support Vector Machine to detect traffic anomalies in IoT. To obtain the best results, techniques for handling unbalanced data, feature selection, and grid search for hyperparameter optimization have been used. The experimental results show that the proposed dataset can achieve a high detection rate in all the experiments, providing the best mean accuracy of 0.99 for the tree-based models, with a low false-positive rate, ensuring effective anomaly detection. This project was funded by the Accreditation, Structuring, and Improvement of Consolidated Research Units and Singular Centers (ED431G/01), funded by Vocational Training of the Xunta de Galicia endowed with EU FEDER funds and Spanish Ministry of Science and Innovation, via the project PID2019-111388GB-I00. Xunta de Galicia; ED431G/0