208 research outputs found
Autoencoders and Generative Adversarial Networks for Imbalanced Sequence Classification
Generative Adversarial Networks (GANs) have been used in many different
applications to generate realistic synthetic data. We introduce a novel GAN
with Autoencoder (GAN-AE) architecture to generate synthetic samples for
variable length, multi-feature sequence datasets. In this model, we develop a
GAN architecture with an additional autoencoder component, where recurrent
neural networks (RNNs) are used for each component of the model in order to
generate synthetic data to improve classification accuracy for a highly
imbalanced medical device dataset. In addition to the medical device dataset,
we also evaluate the GAN-AE performance on two additional datasets and
demonstrate the application of GAN-AE to a sequence-to-sequence task where both
synthetic sequence inputs and sequence outputs must be generated. To evaluate
the quality of the synthetic data, we train encoder-decoder models both with
and without the synthetic data and compare the classification model
performance. We show that a model trained with GAN-AE generated synthetic data
outperforms models trained with synthetic data generated both with standard
oversampling techniques such as SMOTE and Autoencoders as well as with state of
the art GAN-based models
Resonant anomaly detection without background sculpting
We introduce a new technique named Latent CATHODE (LaCATHODE) for performing
"enhanced bump hunts", a type of resonant anomaly search that combines
conventional one-dimensional bump hunts with a model-agnostic anomaly score in
an auxiliary feature space where potential signals could also be localized. The
main advantage of LaCATHODE over existing methods is that it provides an
anomaly score that is well behaved when evaluating it beyond the signal region,
which is essential to prevent the sculpting of background distributions in the
bump hunt. LaCATHODE accomplishes this by constructing the anomaly score
directly in the latent space learned by a conditional normalizing flow trained
on sideband regions. We demonstrate the superior stability and comparable
performance of LaCATHODE for enhanced bump hunting in an illustrative toy
example as well as on the LHC Olympics R&D dataset.Comment: 11 pages, 8 figures; v2 (published version): referencing code and
minor style update
One-Class Classification: Taxonomy of Study and Review of Techniques
One-class classification (OCC) algorithms aim to build classification models
when the negative class is either absent, poorly sampled or not well defined.
This unique situation constrains the learning of efficient classifiers by
defining class boundary just with the knowledge of positive class. The OCC
problem has been considered and applied under many research themes, such as
outlier/novelty detection and concept learning. In this paper we present a
unified view of the general problem of OCC by presenting a taxonomy of study
for OCC problems, which is based on the availability of training data,
algorithms used and the application domains applied. We further delve into each
of the categories of the proposed taxonomy and present a comprehensive
literature review of the OCC algorithms, techniques and methodologies with a
focus on their significance, limitations and applications. We conclude our
paper by discussing some open research problems in the field of OCC and present
our vision for future research.Comment: 24 pages + 11 pages of references, 8 figure
Transfer Learning with Deep Convolutional Neural Network (CNN) for Pneumonia Detection using Chest X-ray
Pneumonia is a life-threatening disease, which occurs in the lungs caused by
either bacterial or viral infection. It can be life-endangering if not acted
upon in the right time and thus an early diagnosis of pneumonia is vital. The
aim of this paper is to automatically detect bacterial and viral pneumonia
using digital x-ray images. It provides a detailed report on advances made in
making accurate detection of pneumonia and then presents the methodology
adopted by the authors. Four different pre-trained deep Convolutional Neural
Network (CNN)- AlexNet, ResNet18, DenseNet201, and SqueezeNet were used for
transfer learning. 5247 Bacterial, viral and normal chest x-rays images
underwent preprocessing techniques and the modified images were trained for the
transfer learning based classification task. In this work, the authors have
reported three schemes of classifications: normal vs pneumonia, bacterial vs
viral pneumonia and normal, bacterial and viral pneumonia. The classification
accuracy of normal and pneumonia images, bacterial and viral pneumonia images,
and normal, bacterial and viral pneumonia were 98%, 95%, and 93.3%
respectively. This is the highest accuracy in any scheme than the accuracies
reported in the literature. Therefore, the proposed study can be useful in
faster-diagnosing pneumonia by the radiologist and can help in the fast airport
screening of pneumonia patients.Comment: 13 Figures, 5 tables. arXiv admin note: text overlap with
arXiv:2003.1314
Social media bot detection with deep learning methods: a systematic review
Social bots are automated social media accounts governed by software and controlled by humans at the backend. Some bots have good purposes, such as automatically posting information about news and even to provide help during emergencies. Nevertheless, bots have also been used for malicious purposes, such as for posting fake news or rumour spreading or manipulating political campaigns. There are existing mechanisms that allow for detection and removal of malicious bots automatically. However, the bot landscape changes as the bot creators use more sophisticated methods to avoid being detected. Therefore, new mechanisms for discerning between legitimate and bot accounts are much needed. Over the past few years, a few review studies contributed to the social media bot detection research by presenting a comprehensive survey on various detection methods including cutting-edge solutions like machine learning (ML)/deep learning (DL) techniques. This paper, to the best of our knowledge, is the first one to only highlight the DL techniques and compare the motivation/effectiveness of these techniques among themselves and over other methods, especially the traditional ML ones. We present here a refined taxonomy of the features used in DL studies and details about the associated pre-processing strategies required to make suitable training data for a DL model. We summarize the gaps addressed by the review papers that mentioned about DL/ML studies to provide future directions in this field. Overall, DL techniques turn out to be computation and time efficient techniques for social bot detection with better or compatible performance as traditional ML techniques
Evaluation of Classification Algorithms for Intrusion Detection System: A Review
Intrusion detection is one of the most critical network security problems in the technology world. Machine learning techniques are being implemented to improve the Intrusion Detection System (IDS). In order to enhance the performance of IDS, different classification algorithms are applied to detect various types of attacks. Choosing a suitable classification algorithm for building IDS is not an easy task. The best method is to test the performance of the different classification algorithms. This paper aims to present the result of evaluating different classification algorithms to build an IDS model in terms of confusion matrix, accuracy, recall, precision, f-score, specificity and sensitivity. Nevertheless, most researchers have focused on the confusion matrix and accuracy metric as measurements of classification performance. It also provides a detailed comparison with the dataset, data preprocessing, number of features selected, feature selection technique, classification algorithms, and evaluation performance of algorithms described in the intrusion detection system
- …