2,648 research outputs found

    Online learning and detection of faces with low human supervision

    Get PDF
    The final publication is available at link.springer.comWe present an efficient,online,and interactive approach for computing a classifier, called Wild Lady Ferns (WiLFs), for face learning and detection using small human supervision. More precisely, on the one hand, WiLFs combine online boosting and extremely randomized trees (Random Ferns) to compute progressively an efficient and discriminative classifier. On the other hand, WiLFs use an interactive human-machine approach that combines two complementary learning strategies to reduce considerably the degree of human supervision during learning. While the first strategy corresponds to query-by-boosting active learning, that requests human assistance over difficult samples in function of the classifier confidence, the second strategy refers to a memory-based learning which uses ¿ Exemplar-based Nearest Neighbors (¿ENN) to assist automatically the classifier. A pre-trained Convolutional Neural Network (CNN) is used to perform ¿ENN with high-level feature descriptors. The proposed approach is therefore fast (WilFs run in 1 FPS using a code not fully optimized), accurate (we obtain detection rates over 82% in complex datasets), and labor-saving (human assistance percentages of less than 20%). As a byproduct, we demonstrate that WiLFs also perform semi-automatic annotation during learning, as while the classifier is being computed, WiLFs are discovering faces instances in input images which are used subsequently for training online the classifier. The advantages of our approach are demonstrated in synthetic and publicly available databases, showing comparable detection rates as offline approaches that require larger amounts of handmade training data.Peer ReviewedPostprint (author's final draft

    Longitudinal performance analysis of machine learning based Android malware detectors

    Get PDF
    This paper presents a longitudinal study of the performance of machine learning classifiers for Android malware detection. The study is undertaken using features extracted from Android applications first seen between 2012 and 2016. The aim is to investigate the extent of performance decay over time for various machine learning classifiers trained with static features extracted from date-labelled benign and malware application sets. Using date-labelled apps allows for true mimicking of zero-day testing, thus providing a more realistic view of performance than the conventional methods of evaluation that do not take date of appearance into account. In this study, all the investigated machine learning classifiers showed progressive diminishing performance when tested on sets of samples from a later time period. Overall, it was found that false positive rate (misclassifying benign samples as malicious) increased more substantially compared to the fall in True Positive rate (correct classification of malicious apps) when older models were tested on newer app samples

    Asymmetric Pruning for Learning Cascade Detectors

    Full text link
    Cascade classifiers are one of the most important contributions to real-time object detection. Nonetheless, there are many challenging problems arising in training cascade detectors. One common issue is that the node classifier is trained with a symmetric classifier. Having a low misclassification error rate does not guarantee an optimal node learning goal in cascade classifiers, i.e., an extremely high detection rate with a moderate false positive rate. In this work, we present a new approach to train an effective node classifier in a cascade detector. The algorithm is based on two key observations: 1) Redundant weak classifiers can be safely discarded; 2) The final detector should satisfy the asymmetric learning objective of the cascade architecture. To achieve this, we separate the classifier training into two steps: finding a pool of discriminative weak classifiers/features and training the final classifier by pruning weak classifiers which contribute little to the asymmetric learning criterion (asymmetric classifier construction). Our model reduction approach helps accelerate the learning time while achieving the pre-determined learning objective. Experimental results on both face and car data sets verify the effectiveness of the proposed algorithm. On the FDDB face data sets, our approach achieves the state-of-the-art performance, which demonstrates the advantage of our approach.Comment: 14 page

    Fast and interpretable classification of small X-ray diffraction datasets using data augmentation and deep neural networks

    Full text link
    X-ray diffraction (XRD) data acquisition and analysis is among the most time-consuming steps in the development cycle of novel thin-film materials. We propose a machine-learning-enabled approach to predict crystallographic dimensionality and space group from a limited number of thin-film XRD patterns. We overcome the scarce-data problem intrinsic to novel materials development by coupling a supervised machine learning approach with a model agnostic, physics-informed data augmentation strategy using simulated data from the Inorganic Crystal Structure Database (ICSD) and experimental data. As a test case, 115 thin-film metal halides spanning 3 dimensionalities and 7 space-groups are synthesized and classified. After testing various algorithms, we develop and implement an all convolutional neural network, with cross validated accuracies for dimensionality and space-group classification of 93% and 89%, respectively. We propose average class activation maps, computed from a global average pooling layer, to allow high model interpretability by human experimentalists, elucidating the root causes of misclassification. Finally, we systematically evaluate the maximum XRD pattern step size (data acquisition rate) before loss of predictive accuracy occurs, and determine it to be 0.16{\deg}, which enables an XRD pattern to be obtained and classified in 5.5 minutes or less.Comment: Accepted with minor revisions in npj Computational Materials, Presented in NIPS 2018 Workshop: Machine Learning for Molecules and Material

    A discrete contextual stochastic model for the off-line recognition of handwritten Chinese characters

    Get PDF
    We study a discrete contextual stochastic (CS) model for complex and variant patterns like handwritten Chinese characters. Three fundamental problems of using CS models for character recognition are discussed, and several practical techniques for solving these problems are investigated. A formulation for discriminative training of CS model parameters is also introduced and its practical usage investigated. To illustrate the characteristics of the various algorithms, comparative experiments are performed on a recognition task with a vocabulary consisting of 50 pairs of highly similar handwritten Chinese characters. The experimental results confirm the effectiveness of the discriminative training for improving recognition performance.published_or_final_versio
    • …
    corecore