62 research outputs found
DeepProteomics: Protein family classification using Shallow and Deep Networks
Knowledge of protein function is necessary because it gives a clear picture of
biological processes. However, many protein sequences have been found and added
to the databases but lack functional annotation, and laboratory experiments
take a considerable amount of time to annotate them. This raises the need for
computational techniques that classify proteins based on their functions. In
our work, we collected data from Swiss-Prot containing 40433 proteins grouped
into 30 families. We pass the sequences to recurrent neural network (RNN), long
short-term memory (LSTM), and gated recurrent unit (GRU) models and compare
them with a deep neural network and a shallow neural network applied to
trigram features on the same dataset. With this approach, we achieve a maximum
accuracy of around 78% for the classification of protein families.
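As a rough illustration of the trigram representation mentioned in this abstract, the sketch below counts overlapping three-residue windows in a protein sequence and turns a set of sequences into count vectors. The abstract does not say whether the trigrams overlap or how they are encoded, so the overlapping windows, the function names, and the toy sequences are all assumptions made for this example.

```python
from collections import Counter


def trigram_counts(sequence):
    """Count overlapping 3-residue windows (assumed encoding, not confirmed by the abstract)."""
    seq = sequence.upper()
    return Counter(seq[i:i + 3] for i in range(len(seq) - 2))


def vectorize(sequences):
    """Build a shared trigram vocabulary and one count vector per sequence."""
    counts = [trigram_counts(s) for s in sequences]
    vocabulary = sorted(set().union(*counts))
    # A Counter returns 0 for trigrams that a sequence does not contain.
    matrix = [[c[t] for t in vocabulary] for c in counts]
    return vocabulary, matrix


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    vocab, rows = vectorize(["MKVLAAMKV", "KVLAGG"])
    print(vocab)
    print(rows)
```

Vectors of this kind could then be fed to a shallow or deep feed-forward classifier, while the recurrent models would more naturally consume the residue sequence itself.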
DeepImageSpam: Deep Learning based Image Spam Detection
Hackers and spammers are employing novel techniques to deceive novice and even
knowledgeable internet users. Image spam is one such technique, in which the
spammer alters some portion of an image so that it is indistinguishable from
the original, fooling users. This paper proposes a deep learning based approach
to image spam detection using convolutional neural networks. On a dataset of
810 natural images and 928 spam images, it achieves a classification accuracy
of 91.7%, outperforming existing image processing and machine learning
techniques.
Comment: 4 pages
A Compendium on Network and Host based Intrusion Detection Systems
Deep learning techniques have become the state-of-the-art methodology for
complicated tasks in computer vision, natural language processing, and several
other areas. Because of their rapid development and promising benchmarks in
those fields, researchers have started applying them to other areas,
especially intrusion detection related tasks. Deep learning is a subset and a
natural extension of classical machine learning and an evolved model of neural
networks. This paper discusses the methodologies related to leading-edge deep
learning and neural network models as applied to Intrusion Detection Systems.
Comment: 8 pages, Accepted for ICDSMLA 201
Vector Space Model as Cognitive Space for Text Classification
In this era of digitization, knowing a user's sociolect aspects has become
essential for building user-specific recommendation systems. These aspects can
be found by mining the language users share as text in social media and
reviews. This paper describes the experiment performed for the PAN Author
Profiling 2017 shared task. The objective of the task is to find the sociolect
aspects of users from their tweets. The sociolect aspects considered in this
experiment are the user's gender and native language. Here, users' tweets
written in a different language from their native language are represented as
a document-term matrix with document frequency as the constraint.
Classification is then done using a Support Vector Machine with gender and
native language as target classes. The experiment attains an average accuracy
of 73.42% in gender prediction and 76.26% in the native language
identification task.
Comment: 6 pages, 6 figures, 3 tables
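To make the "document-term matrix with document frequency as the constraint" concrete, here is a minimal sketch that keeps only terms appearing in at least `min_df` documents and counts them per document. The whitespace tokenizer, the `min_df` threshold, and the sample texts are assumptions for illustration; the abstract does not give the actual preprocessing or threshold.

```python
from collections import Counter


def document_term_matrix(documents, min_df=2):
    """Term-count matrix restricted to terms found in at least min_df documents."""
    tokenized = [doc.lower().split() for doc in documents]

    # Document frequency: in how many documents each term occurs at least once.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vocabulary = sorted(term for term, n in df.items() if n >= min_df)
    index = {term: j for j, term in enumerate(vocabulary)}

    matrix = []
    for tokens in tokenized:
        row = [0] * len(vocabulary)
        for tok in tokens:
            j = index.get(tok)
            if j is not None:  # terms below the threshold are dropped
                row[j] += 1
        matrix.append(row)
    return vocabulary, matrix


if __name__ == "__main__":
    # Invented example texts, not data from the shared task.
    docs = ["good morning all", "good night", "morning run morning"]
    vocab, rows = document_term_matrix(docs, min_df=2)
    print(vocab)
    print(rows)
```

Each row can serve as the feature vector for one author or tweet, and a linear classifier such as a Support Vector Machine can then be trained on the rows with gender or native language as the label.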
A short review on Applications of Deep learning for Cyber security
Deep learning is an advanced model of traditional machine learning. It can
extract optimal feature representations from raw input samples. It has been
applied to various use cases in cyber security such as intrusion detection,
malware classification, Android malware detection, spam and phishing
detection, and binary analysis. This paper surveys the works related to deep
learning based solutions for various cyber security use cases. Keywords: Deep
learning, intrusion detection, malware detection, Android malware detection,
spam & phishing detection, traffic analysis, binary analysis.
Comment: 15 pages
A Deep Learning Approach for Similar Languages, Varieties and Dialects
Deep learning mechanisms are now the prevailing approach for various tasks in
natural language processing, speech recognition, image processing, and many
other areas. To leverage this, we use deep learning based mechanisms,
specifically Bidirectional Long Short-Term Memory (B-LSTM) for the task of
dialect identification in Arabic and German broadcast speech and Long
Short-Term Memory (LSTM) for discriminating between similar languages. Two
B-LSTM models are created, one using Large-vocabulary Continuous Speech
Recognition (LVCSR) based lexical features and one using fixed-length
bottleneck features (400 per utterance) generated by the i-vector framework.
These models were evaluated on the VarDial 2017 datasets for Arabic and German
dialect identification, with the dialects Egyptian, Gulf, Levantine, North
African, and MSA for Arabic and Basel, Bern, Lucerne, and Zurich for German.
The evaluation also covered the task of Discriminating between Similar
Languages such as Bosnian, Croatian, and Serbian. The B-LSTM model showed an
accuracy of 0.246 on lexical features and an accuracy of 0.577 on bottleneck
features of the i-vector framework.
Comment: 17 pages
A Brief Survey on Autonomous Vehicle Possible Attacks, Exploits and Vulnerabilities
Advanced driver assistance systems are advancing at a rapid pace, and all
major companies have started investing in developing autonomous vehicles. But
their security and reliability are still uncertain and debatable. Imagine what
attackers could do with a compromised vehicle. An attacker can control braking,
acceleration, and even steering, which can lead to catastrophic consequences.
This paper gives a brief overview of most of the possible attacks on
autonomous vehicle software and hardware and their potential implications.
Comment: 5 Pages, 1 Figure
Deep Learning Approach for Enhanced Cyber Threat Indicators in Twitter Stream
Recently, the amount of Cyber Security text data shared via social media
resources, mainly Twitter, has increased. An accurate analysis of this data can
help to develop a cyber threat situational awareness framework. This work
proposes a deep learning based approach for tweet data analysis. To convert
the tweets into numerical representations, various text representations are
employed. These features are fed into a deep learning architecture for optimal
feature extraction as well as classification. Various hyperparameter tuning
approaches are used to identify the optimal text representation method as well
as the optimal network parameters and network structures for the deep learning
models. For comparative analysis, a classical text representation method with
a classical machine learning algorithm is employed. From the detailed analysis
of experiments, we found that the deep learning architecture with advanced
text representation methods performed better than the classical text
representation and classical machine learning algorithms. The primary reason
for this is that the advanced text representation methods can learn the
sequential properties that exist in textual data, and deep learning
architectures learn the optimal features while decreasing the feature size.
Comment: 11 pages
Deep Learning Approach for Intelligent Named Entity Recognition of Cyber Security
In recent years, the amount of Cyber Security data generated in the form of
unstructured text, for example in social media resources, blogs, and articles,
has increased exceptionally. Named Entity Recognition (NER) is an initial step
towards converting this unstructured data into structured data that can be
used by many applications. The existing methods for NER on Cyber Security data
are based on rules and linguistic characteristics. A Deep Learning (DL) based
approach embedded with Conditional Random Fields (CRFs) is proposed in this
paper. Several DL architectures are evaluated to find the best-performing
architecture. The combination of Bidirectional Gated Recurrent Unit (Bi-GRU),
Convolutional Neural Network (CNN), and CRF performed better than various
other DL frameworks on a publicly available benchmark dataset. This may be
because the bidirectional structures preserve the features related to the
future and previous words in a sequence.
Comment: 10 pages
Deep Learning based Frameworks for Handling Imbalance in DGA, Email, and URL Data Analysis
Deep learning is a state-of-the-art method for many applications. The main
issue is that most real-time data is highly imbalanced in nature. To avoid
bias in training, a cost-sensitive approach can be used. In this paper, we
propose cost-sensitive deep learning based frameworks and evaluate their
performance on three different Cyber Security use cases: Domain Generation
Algorithm (DGA), Electronic mail (Email), and Uniform Resource Locator (URL).
Various experiments were performed using cost-insensitive as well as
cost-sensitive methods, and the parameters for both are set by hyperparameter
tuning. In all experiments, the cost-sensitive deep learning methods performed
better than the cost-insensitive approaches. This is mainly because the
cost-sensitive approach gives more importance during training to the classes
that have very few samples, which helps the model learn all the classes more
effectively.
Comment: 12 pages
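One common way to make training cost-sensitive is to weight each class inversely to its frequency and apply those weights inside the loss. The sketch below shows that idea with a weighted cross-entropy; it is not taken from the paper, which does not specify its cost scheme here, and the weighting formula, function names, and the `benign`/`dga` labels are assumptions for illustration.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * m) for cls, m in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean negative log-likelihood.

    probs[i] maps each class to the predicted probability for sample i.
    Errors on rare classes cost more because their weights are larger.
    """
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    return total / sum(weights[y] for y in labels)


if __name__ == "__main__":
    # Hypothetical imbalanced label set: 8 benign domains, 2 DGA domains.
    labels = ["benign"] * 8 + ["dga"] * 2
    weights = balanced_class_weights(labels)
    print(weights)

    # A model that always predicts 0.9 for "benign" is penalized heavily
    # on the minority class once the weights are applied.
    probs = [{"benign": 0.9, "dga": 0.1}] * len(labels)
    print(weighted_cross_entropy(probs, labels, weights))
```

With these weights, each class contributes the same total weight to the loss regardless of how many samples it has, which is the effect the abstract attributes to cost-sensitive training.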