
    SR-POD : sample rotation based on principal-axis orientation distribution for data augmentation in deep object detection

    Convolutional neural networks (CNNs) have outperformed most state-of-the-art methods in object detection. However, CNNs have difficulty detecting rotated objects, because the datasets used to train them often do not contain sufficient samples covering various orientation angles. In this paper, we propose a novel data-augmentation approach for handling rotated samples that exploits the distribution of object orientations without the time-consuming process of rotating the sample images. Firstly, we present an orientation descriptor, named the "principal-axis orientation", which describes the orientation of an object's principal axis in an image, and we estimate the distribution of the objects' principal-axis orientations (POD) over the whole dataset. Secondly, we define a similarity metric to calculate the POD similarity between the training set and an additional dataset built by randomly selecting images from the benchmark ImageNet ILSVRC2012 dataset. Finally, we optimize a cost function to obtain the rotation angle that yields the highest POD similarity between the two aforementioned datasets. To evaluate our data-augmentation method for object detection, we conducted experiments on the benchmark PASCAL VOC2007 dataset; with the training set augmented using our method, the average precision (AP) of Faster R-CNN on the TV-monitor class improved by 7.5%. In addition, our experimental results demonstrate that new samples generated by random rotation are more likely to degrade object-detection performance.
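    The abstract does not include code; as a rough illustration of the pipeline it describes, the sketch below estimates a principal-axis orientation per image from second-order central moments of a binary object mask, builds the orientation histogram (POD) over a dataset, and searches for the rotation angle that maximizes a histogram similarity. The moment-based orientation formula is standard; the cosine similarity metric, the bin count, and the helper names are assumptions for illustration, not the authors' exact definitions. A full implementation would then rotate the selected extra images by the chosen angle and add them to the training set.

```python
import numpy as np

def principal_axis_orientation(mask):
    """Orientation (radians, in [-pi/2, pi/2)) of the principal axis of a
    binary object mask, computed from second-order central image moments."""
    ys, xs = np.nonzero(mask)
    x_bar, y_bar = xs.mean(), ys.mean()
    mu20 = ((xs - x_bar) ** 2).mean()
    mu02 = ((ys - y_bar) ** 2).mean()
    mu11 = ((xs - x_bar) * (ys - y_bar)).mean()
    return 0.5 * np.arctan2(2.0 * mu11, mu20 - mu02)

def pod_histogram(orientations, bins=36):
    """Normalized histogram of principal-axis orientations (the POD)."""
    hist, _ = np.histogram(orientations, bins=bins, range=(-np.pi / 2, np.pi / 2))
    return hist / max(hist.sum(), 1)

def pod_similarity(p, q):
    """Cosine similarity between two PODs (assumed metric, for illustration)."""
    return float(np.dot(p, q) / (np.linalg.norm(p) * np.linalg.norm(q) + 1e-12))

def best_rotation_angle(train_orients, extra_orients, candidate_angles):
    """Pick the rotation angle (radians) that, applied to the extra dataset's
    orientations, best matches the training set's POD."""
    target = pod_histogram(train_orients)
    scores = []
    for angle in candidate_angles:
        # Shift orientations by the candidate angle and wrap back into [-pi/2, pi/2).
        rotated = (np.asarray(extra_orients) + angle + np.pi / 2) % np.pi - np.pi / 2
        scores.append(pod_similarity(target, pod_histogram(rotated)))
    return candidate_angles[int(np.argmax(scores))]
```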

    Hybrid Deep Neural Network for Facial Expressions Recognition

    Facial expressions are critical indicators of human emotions, and recognizing them has captured the attention of many researchers; recognition of expressions in natural situations remains a challenge due to differences in head pose, occlusion, and illumination. Several studies have focused on recognizing emotions from frontal images only, whereas in this paper wild images from the FER2013 dataset are used to build a more generalizable model despite the dataset's challenges; it is among the most difficult datasets, with a human-level accuracy of only 65.5%. This paper proposes a model for recognizing facial expressions using pre-trained deep convolutional neural networks and transfer learning. The hybrid model combines two pre-trained deep convolutional neural networks and is trained under multiple configurations to categorize facial expressions into seven classes more effectively. The results show that the best accuracy of the suggested models is 74.39% for the hybrid model and 73.33% for the fine-tuned single EfficientNetB0 model, while the highest accuracy reported by previous methods was 73.28%. Thus, the hybrid and single models outperform other state-of-the-art classification methods without using any additional data, ranking first and second among these methods. The hybrid model even outperforms the second-most-accurate method, which used extra data. Incorrectly labeled images in the dataset unfairly reduce the reported accuracy, but our best model recognized their actual classes correctly.
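    Only one of the two backbones (EfficientNetB0) is named in the abstract. As a hedged sketch of the general idea of fusing two pre-trained feature extractors via transfer learning, the Keras snippet below pairs EfficientNetB0 with ResNet50; the second backbone, the 96x96 input size, and all hyperparameters are illustrative assumptions, not the paper's configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import EfficientNetB0, ResNet50

NUM_CLASSES = 7            # FER2013 expression classes
INPUT_SHAPE = (96, 96, 3)  # assumed size; FER2013 images are upscaled 48x48 grayscale

inputs = layers.Input(shape=INPUT_SHAPE)

# Two ImageNet-pretrained backbones used as frozen feature extractors
# (they can later be unfrozen for fine-tuning). Each backbone's own
# preprocess_input is assumed to be applied to the images beforehand.
eff = EfficientNetB0(include_top=False, weights="imagenet", input_shape=INPUT_SHAPE)
res = ResNet50(include_top=False, weights="imagenet", input_shape=INPUT_SHAPE)
eff.trainable = False
res.trainable = False

feat_a = layers.GlobalAveragePooling2D()(eff(inputs))
feat_b = layers.GlobalAveragePooling2D()(res(inputs))

# Fuse the two feature streams and classify into seven expressions.
merged = layers.Concatenate()([feat_a, feat_b])
merged = layers.Dropout(0.5)(merged)
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(merged)

model = Model(inputs, outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()
```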

    Investigation of Dual-Flow Deep Learning Models LSTM-FCN and GRU-FCN Efficiency against Single-Flow CNN Models for the Host-Based Intrusion and Malware Detection Task on Univariate Times Series Data

    Intrusion and malware detection tasks at the host level are a critical part of the overall information security infrastructure of a modern enterprise. While classical host-based intrusion detection systems (HIDS) and antivirus (AV) approaches are based on change monitoring of critical files and on malware signatures, respectively, some recent research using relatively vanilla deep learning (DL) methods has demonstrated promising anomaly-based detection results that already have practical applicability due to a low false positive rate (FPR). More complex DL methods typically provide better results in natural language processing and image recognition tasks. In this paper, we analyze the applicability of more complex dual-flow DL methods, such as the long short-term memory fully convolutional network (LSTM-FCN), the gated recurrent unit (GRU)-FCN, and several others, to the task defined on the attack-caused Windows OS system calls traces dataset (AWSCTD), and compare them with vanilla single-flow convolutional neural network (CNN) models. The results obtained do not demonstrate any advantage of dual-flow models when processing univariate time series data; they introduce an unnecessary level of complexity and increase training and anomaly detection time, which is crucial in the intrusion containment process. On the other hand, the newly tested AWSCTD-CNN-static (S) single-flow model demonstrated three times better training and testing times while preserving the high detection accuracy.
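    For readers unfamiliar with the dual-flow architectures compared here, the sketch below shows a generic LSTM-FCN for univariate time-series classification, following the commonly published layout: an LSTM branch in parallel with a three-block temporal convolution branch, concatenated before the softmax. The layer sizes and input length are illustrative assumptions rather than the exact AWSCTD configurations used in the paper.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

SEQ_LEN = 100      # assumed length of a system-call trace window
NUM_CLASSES = 2    # e.g., benign vs. malicious

inputs = layers.Input(shape=(SEQ_LEN, 1))

# Recurrent flow: a small LSTM summarizing the whole sequence.
lstm = layers.LSTM(8)(inputs)
lstm = layers.Dropout(0.8)(lstm)

# Convolutional flow (FCN): three Conv1D blocks + global average pooling.
conv = layers.Conv1D(128, 8, padding="same")(inputs)
conv = layers.BatchNormalization()(conv)
conv = layers.Activation("relu")(conv)
conv = layers.Conv1D(256, 5, padding="same")(conv)
conv = layers.BatchNormalization()(conv)
conv = layers.Activation("relu")(conv)
conv = layers.Conv1D(128, 3, padding="same")(conv)
conv = layers.BatchNormalization()(conv)
conv = layers.Activation("relu")(conv)
conv = layers.GlobalAveragePooling1D()(conv)

# Dual-flow fusion and classification.
merged = layers.Concatenate()([lstm, conv])
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(merged)

model = Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```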

    Improving Person-Independent Facial Expression Recognition Using Deep Learning

    Over the past few years, deep learning methods, e.g., Convolutional Neural Networks (CNNs) and Generative Adversarial Networks (GANs), have shown promise in facial expression recognition. However, performance degrades dramatically, especially in close-to-real-world settings, due to high intra-class variations and high inter-class similarities introduced by subtle facial appearance changes, head pose variations, illumination changes, occlusions, and identity-related attributes, e.g., age, race, and gender. In this work, we developed two novel CNN frameworks and one novel GAN approach to learn discriminative features for facial expression recognition. First, a novel island loss is proposed to enhance the discriminative power of the learned deep features. Specifically, the island loss is designed to reduce the intra-class variations while simultaneously enlarging the inter-class differences. Experimental results on three posed facial expression datasets and, more importantly, two spontaneous facial expression datasets have shown that the proposed island loss outperforms the baseline CNNs with the traditional softmax loss or the center loss and achieves better or at least comparable performance compared with the state-of-the-art methods. Second, we proposed a novel Probabilistic Attribute Tree-CNN (PAT-CNN) to explicitly deal with the large intra-class variations caused by identity-related attributes. Specifically, a novel PAT module with an associated PAT loss was proposed to learn features in a hierarchical tree structure organized according to identity-related attributes, where the final features are less affected by the attributes. We further proposed a semi-supervised strategy to learn the PAT-CNN from limited attribute-annotated samples in order to make the best use of available data. Experimental results on three posed facial expression datasets as well as four spontaneous facial expression datasets have demonstrated that the proposed PAT-CNN achieves the best performance compared with state-of-the-art methods by explicitly modeling attributes. Impressively, the PAT-CNN using a single model achieves the best performance on the SFEW test dataset, compared with the state-of-the-art methods using an ensemble of hundreds of CNNs. Last, we present a novel Identity-Free conditional Generative Adversarial Network (IF-GAN) to explicitly reduce the high inter-subject variations caused by identity-related attributes, e.g., age, race, and gender, for facial expression recognition. Specifically, for any given input facial expression image, a conditional generative model was developed to transform it into an "average" identity expressive face with the same expression as the input face image. Since the generated images have the same synthetic "average" identity, they differ from each other only by the displayed expressions and thus can be used for identity-free facial expression classification. In this work, an end-to-end system was developed to perform facial expression generation and facial expression recognition within the IF-GAN framework. Experimental results on four well-known facial expression datasets, including a spontaneous facial expression dataset, have demonstrated that the proposed IF-GAN outperforms the baseline CNN model and achieves the best performance compared with the state-of-the-art methods for facial expression recognition.
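    As a concrete illustration of the first contribution, the snippet below sketches an island-style loss term: a center-loss component pulling features toward their class centers plus a penalty on the pairwise cosine similarity between different class centers, which is the behavior the abstract describes (smaller intra-class variation, larger inter-class separation). The exact weighting scheme and center-update rule of the published island loss are not reproduced here; the λ value and tensor shapes are assumptions.

```python
import tensorflow as tf

def island_style_loss(features, labels, centers, lambda_island=10.0):
    """features: (B, D) deep features, labels: (B,) int class ids,
    centers: (C, D) trainable class centers."""
    # Center-loss term: pull each feature toward its own class center.
    centers_batch = tf.gather(centers, labels)
    center_term = 0.5 * tf.reduce_mean(
        tf.reduce_sum(tf.square(features - centers_batch), axis=1))

    # Island term: penalize cosine similarity between *different* class
    # centers, pushing the "islands" of classes apart in feature space.
    normed = tf.math.l2_normalize(centers, axis=1)
    cos = tf.matmul(normed, normed, transpose_b=True)   # (C, C)
    off_diag = 1.0 - tf.eye(tf.shape(centers)[0])
    island_term = tf.reduce_sum((cos + 1.0) * off_diag)

    return center_term + lambda_island * island_term

# Typical use (assumed): total = softmax_cross_entropy + weight * island_style_loss(...)
```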

    Deep learning for time series classification

    Time series analysis is a field of data science concerned with analyzing sequences of numerical values ordered in time. Time series are particularly interesting because they allow us to visualize and understand the evolution of a process over time. Their analysis can reveal trends, relationships, and similarities across the data. Numerous fields contain data in the form of time series: health care (electrocardiogram, blood sugar, etc.), activity recognition, remote sensing, finance (stock market prices), industry (sensors), etc. Time series classification consists of constructing algorithms dedicated to automatically labeling time series data. The sequential aspect of time series data requires the development of algorithms that are able to harness this temporal property, making existing off-the-shelf machine learning models for traditional tabular data suboptimal for the underlying task. In this context, deep learning has emerged in recent years as one of the most effective methods for tackling the supervised classification task, particularly in the field of computer vision. The main objective of this thesis was to study and develop deep neural networks specifically constructed for the classification of time series data. We thus carried out the first large-scale experimental study comparing the existing deep methods and positioning them with respect to other non-deep-learning-based state-of-the-art methods. Subsequently, we made numerous contributions in this area, notably in the context of transfer learning, data augmentation, ensembling, and adversarial attacks. Finally, we also proposed a novel architecture, based on the famous Inception network (Google), which ranks among the most efficient to date.
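    The Inception-based architecture mentioned at the end corresponds to an Inception-style module adapted to one-dimensional sequences: a bottleneck convolution followed by parallel convolutions with different kernel lengths and a max-pooling branch. The sketch below is a hedged approximation of such a module in Keras; the filter counts, kernel sizes, series length, and class count are illustrative assumptions, not the thesis' exact settings.

```python
import tensorflow as tf
from tensorflow.keras import layers

def inception_module_1d(x, n_filters=32, kernel_sizes=(10, 20, 40), bottleneck=32):
    """One Inception-style block for time series: bottleneck, parallel
    convolutions with different receptive fields, and a max-pool branch."""
    bottleneck_out = layers.Conv1D(bottleneck, 1, padding="same", use_bias=False)(x)

    branches = [
        layers.Conv1D(n_filters, k, padding="same", use_bias=False)(bottleneck_out)
        for k in kernel_sizes
    ]

    # Max-pooling branch followed by a 1x1 convolution.
    pool = layers.MaxPooling1D(pool_size=3, strides=1, padding="same")(x)
    branches.append(layers.Conv1D(n_filters, 1, padding="same", use_bias=False)(pool))

    out = layers.Concatenate(axis=-1)(branches)
    out = layers.BatchNormalization()(out)
    return layers.Activation("relu")(out)

# Example: stack a few modules on a univariate series and classify.
inputs = layers.Input(shape=(128, 1))               # assumed series length
x = inception_module_1d(inputs)
x = inception_module_1d(x)
x = layers.GlobalAveragePooling1D()(x)
outputs = layers.Dense(5, activation="softmax")(x)  # assumed 5 classes
model = tf.keras.Model(inputs, outputs)
```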