5 research outputs found

    Limits on transfer learning from photographic image data to X-ray threat detection

    BACKGROUND: X-ray imaging is a crucial and ubiquitous tool for detecting threats to transport security, but interpretation of the images presents a logistical bottleneck. Recent advances in Deep Learning image classification offer hope of improving throughput through automation. However, Deep Learning methods require large quantities of labelled training data. While photographic data is cheap and plentiful, comparable training sets are seldom available for the X-ray domain. OBJECTIVE: To determine whether and to what extent it is feasible to exploit the availability of photo data to supplement the training of X-ray threat detectors. METHODS: A new dataset was collected, consisting of 1901 matched pairs of photo & X-ray images of 501 common objects. Of these, 258 pairs were of 69 objects considered threats in the context of aviation. This data was used to test a variety of transfer learning approaches. A simple model of threat cue availability was developed to understand the limits of this transferability. RESULTS: Appearance features learned from photos provide a useful basis for training classifiers. Some transfer from the photo to the X-ray domain is possible, as ∼40% of danger cues are shared between the modalities, but the effectiveness of this transfer is limited since ∼60% of cues are not. CONCLUSIONS: Transfer learning is beneficial when X-ray data is very scarce (of the order of tens of training images in our experiments) but provides no significant benefit when hundreds or thousands of X-ray images are available.
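    As a rough illustration of the transfer-learning setup described above, a minimal sketch follows: a network pre-trained on photographic (ImageNet) data is frozen and only a new threat/benign head is fine-tuned on a small X-ray set. The dataset path, backbone choice and hyperparameters are assumptions for illustration, not details from the paper.

```python
# Minimal sketch, assuming PyTorch/torchvision are available and a small set of
# labelled X-ray crops is arranged for ImageFolder under a hypothetical
# "xray_train" directory (subfolders "threat" and "benign").
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

# Start from appearance features learned on photographic (ImageNet) data.
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
for p in model.parameters():
    p.requires_grad = False                       # freeze photo-learned features
model.fc = nn.Linear(model.fc.in_features, 2)     # new threat / benign head

tfm = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
train_ds = datasets.ImageFolder("xray_train", transform=tfm)  # hypothetical path
loader = torch.utils.data.DataLoader(train_ds, batch_size=16, shuffle=True)

opt = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
model.train()
for epoch in range(5):                            # fine-tune only the new head
    for x, y in loader:
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
```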

    A Pipeline for Classifying Deleterious Coding Mutations in Agricultural Plants

    The impact of deleterious variation on both plant fitness and crop productivity is not completely understood and remains a topic of active debate. Deleterious mutations in plants have so far been predicted solely using sequence-conservation methods rather than function-based classifiers, owing to the lack of well-annotated mutational datasets in these organisms. Here, we developed a machine learning classifier based on a dataset of deleterious and neutral mutations in Arabidopsis thaliana by extracting 18 informative features that discriminate deleterious mutations from neutral ones, including 9 novel features not used in previous studies. We examined linear SVM, Gaussian SVM, and Random Forest classifiers, with the latter performing best. The Random Forest classifier exhibited markedly higher accuracy than the popular PolyPhen-2 tool on the Arabidopsis dataset. Additionally, we tested whether the Random Forest classifier, trained on the Arabidopsis dataset, accurately predicts deleterious mutations in Oryza sativa and Pisum sativum, and observed satisfactory accuracy (87% and 93%, respectively), higher than that obtained with PolyPhen-2. Applying transfer learning to the classifiers did not improve their performance. To further test the performance of the Random Forest classifier across different angiosperm species, we applied it to annotate deleterious mutations in Cicer arietinum and validated them using population frequency data. Overall, we devised a classifier with the potential to improve the annotation of putative functional mutations in QTL and GWAS hit regions, as well as the evolutionary analysis of the proliferation of deleterious mutations during plant domestication, thus helping to optimize breeding improvement and the development of new cultivars.
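    The following is a minimal sketch of the kind of feature-based classification described above: a Random Forest (and the two SVM baselines) evaluated by cross-validation and then applied to another species. The file names, column layout and hyperparameters are hypothetical placeholders, not the study's actual 18 features or data.

```python
# Minimal sketch, assuming scikit-learn and pandas are available. The CSV files
# and their column layout are hypothetical placeholders.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

df = pd.read_csv("arabidopsis_mutations.csv")      # hypothetical training data
X = df.drop(columns=["label"]).values              # feature columns
y = df["label"].values                             # 1 = deleterious, 0 = neutral

classifiers = {
    "linear SVM": SVC(kernel="linear"),
    "Gaussian SVM": SVC(kernel="rbf"),
    "Random Forest": RandomForestClassifier(n_estimators=500, random_state=0),
}
for name, clf in classifiers.items():
    acc = cross_val_score(clf, X, y, cv=5, scoring="accuracy").mean()
    print(f"{name}: {acc:.3f}")

# Train on the Arabidopsis data, then apply to another species' feature matrix.
rf = classifiers["Random Forest"].fit(X, y)
X_rice = pd.read_csv("oryza_mutations.csv").drop(columns=["label"]).values
predictions = rf.predict(X_rice)
```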

    On What to Learn: Train or Adapt a Deeply Learned Profile?

    In recent years, many papers have shown that deep learning can be beneficial for profiled side-channel analysis. However, in order to obtain good performance with deep learning, an attacker needs a lot of training data. The training data should be as similar as possible to the data that will be obtained during the attack, a condition that may not be easily met in real-world scenarios. It is thus of interest to analyse scenarios where the attack makes use of "imperfect" training data. The typical situation in side-channel analysis is that the attacker has access to an unlabelled dataset of measurements from the target device (obtained with the key he actually wants to recover) and, depending on the context, he may also benefit from a labelled dataset (so-called profiling data) obtained on the same device with known or chosen key(s). In this paper, we extend the attacker models and investigate the situation where an attacker additionally has access to a neural network that has been pre-trained on some other dataset that does not fully correspond to the attack one. The attacker can then either directly use the pre-trained network to attack or, if profiling data is available, train a new network or adapt the pre-trained one using transfer learning. We carried out many experiments comparing the attack metrics obtained in these cases across various setups (different probe positions, channels, devices, and dataset sizes). Our results show that in many cases a lack of training data can be counterbalanced by additional imperfect data coming from another setup.
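    A minimal sketch of the "adapt a pre-trained profile" option described above: a network pre-trained on traces from another setup is fine-tuned on whatever profiling data is available. The architecture, checkpoint name, trace shapes and label encoding are illustrative assumptions, not the paper's models or datasets.

```python
# Minimal sketch, assuming PyTorch is available. Checkpoint file, architecture
# and trace dimensions are hypothetical.
import torch
import torch.nn as nn

class Profile(nn.Module):
    """Small MLP profile mapping a trace to 256 key-byte hypotheses."""
    def __init__(self, n_samples=700, n_classes=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(n_samples, 200), nn.ReLU(),
            nn.Linear(200, n_classes),
        )
    def forward(self, x):
        return self.net(x)

# A profile pre-trained on traces from a *different* setup (hypothetical file).
pretrained = Profile()
pretrained.load_state_dict(torch.load("pretrained_profile.pt"))

def adapt(model, traces, labels, lr=1e-4, epochs=20):
    """Transfer learning: continue training the pre-trained profile on the
    (possibly small) profiling set instead of training a network from scratch."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(traces), labels)
        loss.backward()
        opt.step()
    return model

# Placeholder profiling data: 500 traces of 700 samples with known labels.
prof_traces = torch.randn(500, 1, 700)
prof_labels = torch.randint(0, 256, (500,))
adapted = adapt(pretrained, prof_traces, prof_labels)
```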

    Detection of Driver Drowsiness and Distraction Using Computer Vision and Machine Learning Approaches

    Drowsiness and distracted driving are leading factors in most car crashes and near-crashes. This research study explores and investigates the application of both conventional computer vision and deep learning approaches to the detection of drowsiness and distraction in drivers. In the first part of this MPhil research study, conventional computer vision approaches were used to develop a robust drowsiness and distraction detection system based on yawning detection, head pose detection and eye blinking detection. These algorithms were implemented using existing hand-crafted features. Experiments were performed for detection and classification with small image datasets to evaluate and measure the performance of the system. It was observed that the use of hand-crafted features together with a robust classifier such as an SVM gives better performance than previous approaches. Although the results were satisfactory, there are many drawbacks and challenges associated with conventional computer vision approaches, such as the definition and extraction of hand-crafted features, which make these algorithms subjective in nature and less adaptive in practice. In contrast, deep learning approaches automate the feature selection process and can be trained to learn the most discriminative features without any human input. In the second half of this research study, the use of deep learning approaches for the detection of distracted driving was investigated. One advantage of the applied methodology for distraction detection is that CNN enhancement contributes to better pattern recognition accuracy and allows features to be learned from various regions of the human body simultaneously. The performance of four convolutional deep net architectures (AlexNet, ResNet, MobileNet and NASNet) was compared, triplet training was investigated, and the impact of combining a support vector classifier (SVC) with a trained deep net was explored. The images used in our experiments with the deep nets are from the State Farm Distracted Driver Detection dataset hosted on Kaggle, each of which captures the entire body of a driver. The best results were obtained with NASNet trained using triplet loss and combined with an SVC. This ability of deep learning approaches to learn discriminative features from various regions of the human body simultaneously has enabled them to reach human-level accuracy.
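    As a hedged illustration of the "deep net features + SVC" combination described above, the sketch below extracts embeddings with a pre-trained CNN and trains a support vector classifier on them. A ResNet backbone and plain ImageNet features stand in for the paper's NASNet and triplet-loss training, and the dataset path is a placeholder.

```python
# Minimal sketch, assuming PyTorch/torchvision and scikit-learn are available
# and the Kaggle images are arranged for ImageFolder under a hypothetical
# "statefarm_train" directory (one subfolder per posture class).
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms
from sklearn.svm import SVC

tfm = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
ds = datasets.ImageFolder("statefarm_train", transform=tfm)   # hypothetical path
loader = torch.utils.data.DataLoader(ds, batch_size=32)

# ResNet backbone stands in for NASNet; drop the classifier to get embeddings.
backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
backbone.fc = nn.Identity()
backbone.eval()

feats, labels = [], []
with torch.no_grad():
    for x, y in loader:
        feats.append(backbone(x))      # 512-d embedding per image
        labels.append(y)
X = torch.cat(feats).numpy()
y = torch.cat(labels).numpy()

# Train a support vector classifier on the deep embeddings.
svc = SVC(kernel="rbf").fit(X, y)
```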