DeepHTTP: Semantics-Structure Model with Attention for Anomalous HTTP Traffic Detection and Pattern Mining
In the Internet age, cyber-attacks occur frequently and take many complex
forms. Traffic generated by access activities records website status and user
request information, which offers a great opportunity for network attack
detection. Among diverse network protocols, Hypertext Transfer Protocol (HTTP)
is widely used by governments, organizations, and enterprises. In this work, we
propose DeepHTTP, a semantics-structure integration model that uses
Bidirectional Long Short-Term Memory (Bi-LSTM) with an attention mechanism to
model HTTP traffic as a natural language sequence. In addition to extracting
traffic content information, we integrate structural information to enhance the
generalization capabilities of the model. Moreover, the attention mechanism
helps discover the critical parts of anomalous traffic and mine attack
patterns. Additionally, we demonstrate how to incrementally update the data set
and retrain the model so that it can be adapted to new anomalous traffic.
Extensive experimental evaluations over large traffic data show that DeepHTTP
performs well in both traffic detection and pattern discovery.
Joint Training of Candidate Extraction and Answer Selection for Reading Comprehension
While sophisticated neural-based techniques have been developed in reading
comprehension, most approaches model the answer in an independent manner,
ignoring its relations with other answer candidates. This problem can be even
worse in open-domain scenarios, where candidates from multiple passages should
be combined to answer a single question. In this paper, we formulate reading
comprehension as an extract-then-select two-stage procedure. We first extract
answer candidates from passages, then select the final answer by combining
information from all the candidates. Furthermore, we regard candidate
extraction as a latent variable and train the two-stage process jointly with
reinforcement learning. As a result, our approach has improved the
state-of-the-art performance significantly on two challenging open-domain
reading comprehension datasets. Further analysis demonstrates the effectiveness
of our model components, especially the information fusion of all the
candidates and the joint training of the extract-then-select procedure.
Comment: 10 pages, Accepted by ACL 201
Attention-based Natural Language Person Retrieval
Following the recent progress in image classification and captioning using
deep learning, we develop a novel natural language person retrieval system
based on an attention mechanism. More specifically, given the description of a
person, the goal is to localize the person in an image. To this end, we first
construct a benchmark dataset for natural language person retrieval. To do so,
we generate bounding boxes for persons in a public image dataset from the
segmentation masks, which are then annotated with descriptions and attributes
using Amazon Mechanical Turk. We then adopt a region proposal network in
Faster R-CNN as a candidate region generator. The cropped images based on the
region proposals as well as the whole images with attention weights are fed
into Convolutional Neural Networks for visual feature extraction, while the
natural language expression and attributes are input to Bidirectional Long
Short-Term Memory (BLSTM) models for text feature extraction. The visual and
text features are integrated to score region proposals, and the one with the
highest score is retrieved as the output of our system. The experimental
results show significant improvement over the state-of-the-art method for
generic object retrieval, and this line of research promises to benefit search
in surveillance video footage.
Comment: CVPR 2017 Workshop (vision meets cognition
Dataset Construction via Attention for Aspect Term Extraction with Distant Supervision
Aspect Term Extraction (ATE) detects opinionated aspect terms in sentences or
text spans, with the end goal of performing aspect-based sentiment analysis.
The small number of available datasets for supervised ATE and the fact that
they cover only a few domains raise the need for exploiting other data sources
in new and creative ways. Publicly available review corpora contain a plethora
of opinionated aspect terms and cover a larger domain spectrum. In this paper,
we first propose a method for using such review corpora for creating a new
dataset for ATE. Our method relies on an attention mechanism to select
sentences that have a high likelihood of containing actual opinionated aspects.
We thus improve the quality of the extracted aspects. We then use the
constructed dataset to train a model and perform ATE with distant supervision.
By evaluating on human-annotated datasets, we show that our method achieves
significantly better performance than various unsupervised and supervised
baselines. Finally, we demonstrate that sentence selection matters when
creating new datasets for ATE. Specifically, using a set of selected sentences
leads to higher ATE performance than using the whole sentence set.
An Attention-Based Word-Level Interaction Model: Relation Detection for Knowledge Base Question Answering
Relation detection plays a crucial role in Knowledge Base Question Answering
(KBQA) because of the high variance of relation expression in the question.
Traditional deep learning methods follow an encoding-comparing paradigm, where
the question and the candidate relation are represented as vectors to compare
their semantic similarity. The max- or average-pooling operation, which
compresses the sequence of words into fixed-dimensional vectors, becomes an
information bottleneck. In this paper, we propose to learn attention-based word-level
interactions between questions and relations to alleviate the bottleneck issue.
As in the traditional models, the question and relation are first
represented as sequences of vectors. Then, instead of merging each sequence into
a single vector with a pooling operation, soft alignments between words from the
question and the relation are learned. The aligned words are subsequently
compared using a convolutional neural network (CNN), and the comparison results
are finally merged. By performing the comparison on low-level
representations, the attention-based word-level interaction model (ABWIM)
relieves the information loss caused by merging the sequence into a
fixed-dimensional vector before the comparison. The experimental results of
relation detection on both the SimpleQuestions and WebQuestions datasets show that
ABWIM achieves state-of-the-art accuracy, demonstrating its effectiveness.
Comment: Paper submitted to Neurocomputing at 11.12.201
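The soft-alignment step described in this abstract can be illustrated with a minimal sketch. It assumes dot-product scoring and a softmax over question words, which the abstract does not specify; the function names and toy vectors are hypothetical, and the CNN comparison stage is omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def soft_align(question, relation):
    """For each relation word vector, build an attention-weighted mix of
    question word vectors (dot-product scoring is an assumption)."""
    dim = len(question[0])
    aligned = []
    for r in relation:
        weights = softmax([dot(r, q) for q in question])
        mixed = [sum(w * q[i] for w, q in zip(weights, question))
                 for i in range(dim)]
        aligned.append(mixed)
    return aligned


# Toy word vectors, invented for illustration only.
question = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
relation = [[2.0, 0.0], [0.0, 2.0]]
aligned = soft_align(question, relation)
```

Each relation word keeps its own aligned question vector, so the pair can be compared word by word rather than after pooling the whole sequence into one vector.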
LSTM-based Deep Learning Models for Non-factoid Answer Selection
In this paper, we apply a general deep learning (DL) framework for the answer
selection task, which does not depend on manually defined features or
linguistic tools. The basic framework is to build the embeddings of questions
and answers based on bidirectional long short-term memory (biLSTM) models, and
measure their closeness by cosine similarity. We further extend this basic
model in two directions. One direction is to define a more composite
representation for questions and answers by combining convolutional neural
network with the basic framework. The other direction is to utilize a simple
but efficient attention mechanism in order to generate the answer
representation according to the question context. Several variations of models
are provided. The models are evaluated on two datasets, TREC-QA and
InsuranceQA. Experimental results demonstrate that the proposed models
substantially outperform several strong baselines.
Comment: added new experiments on TREC-Q
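The scoring step of the basic framework, measuring closeness by cosine similarity, can be sketched as follows. The biLSTM encoder is not reproduced here: the vectors below are invented stand-ins for its output, and `rank_answers` is a hypothetical helper name.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def rank_answers(question_vec, answer_vecs):
    """Return candidate indices ordered from most to least similar."""
    scored = [(cosine(question_vec, a), i) for i, a in enumerate(answer_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored]


# Stand-in embeddings; a real system would obtain these from the encoder.
question_vec = [1.0, 0.0, 1.0]
answer_vecs = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.9], [1.0, 1.0, 1.0]]
ranking = rank_answers(question_vec, answer_vecs)
```

Because cosine similarity ignores vector length, the ranking depends only on the direction of the embeddings.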
Causality Extraction based on Self-Attentive BiLSTM-CRF with Transferred Embeddings
Causality extraction from natural language texts is a challenging open
problem in artificial intelligence. Existing methods utilize patterns,
constraints, and machine learning techniques to extract causality, heavily
depending on domain knowledge and requiring considerable human effort and time
for feature engineering. In this paper, we formulate causality extraction as a
sequence labeling problem based on a novel causality tagging scheme. On this
basis, we propose a neural causality extractor with the BiLSTM-CRF model as the
backbone, named SCITE (Self-attentive BiLSTM-CRF wIth Transferred Embeddings),
which can directly extract cause and effect without extracting candidate causal
pairs and identifying their relations separately. To address the problem of
data insufficiency, we transfer contextual string embeddings, also known as
Flair embeddings, which are trained on a large corpus, to our task. In addition,
to improve the performance of causality extraction, we introduce a multi-head
self-attention mechanism into SCITE to learn the dependencies between causal
words. We evaluate our method on a public dataset, and experimental results
demonstrate that our method achieves significant and consistent improvement
compared to baselines.
Comment: 39 pages, 11 figures, 6 table
Deep Learning for Sentiment Analysis : A Survey
Deep learning has emerged as a powerful machine learning technique that
learns multiple layers of representations or features of the data and produces
state-of-the-art prediction results. Along with its success in many other
application domains, deep learning has also become widely used in
sentiment analysis in recent years. This paper first gives an overview of deep
learning and then provides a comprehensive survey of its current applications
in sentiment analysis.
Comment: 34 pages, 9 figures, 2 table
CANDiS: Coupled & Attention-Driven Neural Distant Supervision
Distant Supervision for Relation Extraction uses heuristically aligned text
data with an existing knowledge base as training data. The unsupervised nature
of this technique allows it to scale to web-scale relation extraction tasks, at
the expense of noise in the training data. Previous work has explored
relationships among instances of the same entity-pair to reduce this noise, but
relationships among instances across entity-pairs have not been fully
exploited. We explore the use of inter-instance couplings based on verb-phrase
and entity type similarities. We propose a novel technique, CANDiS, which casts
distant supervision using inter-instance coupling into an end-to-end neural
network model. CANDiS incorporates an attention module at the instance-level to
model the multi-instance nature of this problem. CANDiS outperforms existing
state-of-the-art techniques on a standard benchmark dataset.
Comment: WiNLP 201
Neural Attention Models for Sequence Classification: Analysis and Application to Key Term Extraction and Dialogue Act Detection
Recurrent neural network architectures combined with an attention mechanism, or
neural attention models, have recently shown promising performance on tasks
including speech recognition, image caption generation, visual question
answering and machine translation. In this paper, a neural attention model is
applied to two sequence classification tasks: dialogue act detection and key
term extraction. In these sequence labeling tasks, the model input is a sequence,
and the output is the label of the input sequence. The major difficulty of
sequence labeling is that when the input sequence is long, it can include many
noisy or irrelevant parts. If the information in the whole sequence is treated
equally, the noisy or irrelevant parts may degrade the classification
performance. The attention mechanism is helpful for sequence classification
because it can highlight the important parts of the entire sequence. The
experimental results show that with the attention mechanism, discernible
improvements were achieved in the sequence labeling tasks considered here. The
roles of the attention mechanism in the tasks are further analyzed and
visualized in this paper.
Comment: 5 pages, 2 figure
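The contrast this abstract draws between treating the whole sequence equally and highlighting its important parts can be sketched as mean pooling versus attention pooling. This is an illustrative toy, not the paper's model: the `query` vector stands in for a learned parameter, and all names and values are hypothetical.

```python
import math


def attention_pool(sequence, query):
    """Weight each position by softmax(query . vector), then sum."""
    scores = [sum(q * x for q, x in zip(query, vec)) for vec in sequence]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(sequence[0])
    pooled = [sum(w * vec[i] for w, vec in zip(weights, sequence))
              for i in range(dim)]
    return weights, pooled


def mean_pool(sequence):
    """Treat every position equally."""
    dim = len(sequence[0])
    return [sum(vec[i] for vec in sequence) / len(sequence)
            for i in range(dim)]


# Position 2 carries the signal; the other positions are near-zero noise.
sequence = [[0.1, 0.0], [0.0, 0.1], [3.0, 0.2], [0.1, 0.1]]
query = [2.0, 0.0]  # hypothetical learned attention parameter

weights, attended = attention_pool(sequence, query)
averaged = mean_pool(sequence)
```

In this toy input the attention weights concentrate on the informative position, while the mean dilutes it with the uninformative ones.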