Search CORE

117,096 research outputs found

DeepHTTP: Semantics-Structure Model with Attention for Anomalous HTTP Traffic Detection and Pattern Mining

Author: Guan Hongchao
Yan Hanbing
Yu Yuqi
Zhou Hao
Publication venue
Publication date: 30/10/2018
Field of study

In the Internet age, cyber-attacks occur frequently with complex types. Traffic generated by access activities can record website status and user request information, which brings a great opportunity for network attack detection. Among diverse network protocols, Hypertext Transfer Protocol (HTTP) is widely used in government, organizations and enterprises. In this work, we propose DeepHTTP, a semantics structure integration model utilizing Bidirectional Long Short-Term Memory (Bi-LSTM) with attention mechanism to model HTTP traffic as a natural language sequence. In addition to extracting traffic content information, we integrate structural information to enhance the generalization capabilities of the model. Moreover, the application of attention mechanism can assist in discovering critical parts of anomalous traffic and further mining attack patterns. Additionally, we demonstrate how to incrementally update the data set and retrain model so that it can be adapted to new anomalous traffic. Extensive experimental evaluations over large traffic data have illustrated that DeepHTTP has outstanding performance in traffic detection and pattern discovery

arXiv.org e-Print Archive

Joint Training of Candidate Extraction and Answer Selection for Reading Comprehension

Author: Liu Jiachen
Lyu Yajuan
Wang Zhen
Wu Tian
Xiao Xinyan
Publication venue
Publication date: 16/05/2018
Field of study

While sophisticated neural-based techniques have been developed in reading comprehension, most approaches model the answer in an independent manner, ignoring its relations with other answer candidates. This problem can be even worse in open-domain scenarios, where candidates from multiple passages should be combined to answer a single question. In this paper, we formulate reading comprehension as an extract-then-select two-stage procedure. We first extract answer candidates from passages, then select the final answer by combining information from all the candidates. Furthermore, we regard candidate extraction as a latent variable and train the two-stage process jointly with reinforcement learning. As a result, our approach has improved the state-of-the-art performance significantly on two challenging open-domain reading comprehension datasets. Further analysis demonstrates the effectiveness of our model components, especially the information fusion of all the candidates and the joint training of the extract-then-select procedure.Comment: 10 pages, Accepted by ACL 201

arXiv.org e-Print Archive

Attention-based Natural Language Person Retrieval

Author: Chen Muhao
Terzopoulos Demetri
Yu Jie
Zhou Tao
Publication venue
Publication date: 24/05/2017
Field of study

Following the recent progress in image classification and captioning using deep learning, we develop a novel natural language person retrieval system based on an attention mechanism. More specifically, given the description of a person, the goal is to localize the person in an image. To this end, we first construct a benchmark dataset for natural language person retrieval. To do so, we generate bounding boxes for persons in a public image dataset from the segmentation masks, which are then annotated with descriptions and attributes using the Amazon Mechanical Turk. We then adopt a region proposal network in Faster R-CNN as a candidate region generator. The cropped images based on the region proposals as well as the whole images with attention weights are fed into Convolutional Neural Networks for visual feature extraction, while the natural language expression and attributes are input to Bidirectional Long Short- Term Memory (BLSTM) models for text feature extraction. The visual and text features are integrated to score region proposals, and the one with the highest score is retrieved as the output of our system. The experimental results show significant improvement over the state-of-the-art method for generic object retrieval and this line of research promises to benefit search in surveillance video footage.Comment: CVPR 2017 Workshop (vision meets cognition

arXiv.org e-Print Archive

Dataset Construction via Attention for Aspect Term Extraction with Distant Supervision

Author: Antognini Diego
Baeriswyl Michael
Giannakopoulos Athanasios
Hossmann Andreea
Musat Claudiu
Publication venue
Publication date: 26/09/2017
Field of study

Aspect Term Extraction (ATE) detects opinionated aspect terms in sentences or text spans, with the end goal of performing aspect-based sentiment analysis. The small amount of available datasets for supervised ATE and the fact that they cover only a few domains raise the need for exploiting other data sources in new and creative ways. Publicly available review corpora contain a plethora of opinionated aspect terms and cover a larger domain spectrum. In this paper, we first propose a method for using such review corpora for creating a new dataset for ATE. Our method relies on an attention mechanism to select sentences that have a high likelihood of containing actual opinionated aspects. We thus improve the quality of the extracted aspects. We then use the constructed dataset to train a model and perform ATE with distant supervision. By evaluating on human annotated datasets, we prove that our method achieves a significantly improved performance over various unsupervised and supervised baselines. Finally, we prove that sentence selection matters when it comes to creating new datasets for ATE. Specifically, we show that, using a set of selected sentences leads to higher ATE performance compared to using the whole sentence set

arXiv.org e-Print Archive

An Attention-Based Word-Level Interaction Model: Relation Detection for Knowledge Base Question Answering

Author: fu Kun
Huang Tinglei
Liang Xiao
Xu Guandong
Zhang Hongzhi
Publication venue
Publication date: 30/01/2018
Field of study

Relation detection plays a crucial role in Knowledge Base Question Answering (KBQA) because of the high variance of relation expression in the question. Traditional deep learning methods follow an encoding-comparing paradigm, where the question and the candidate relation are represented as vectors to compare their semantic similarity. Max- or average- pooling operation, which compresses the sequence of words into fixed-dimensional vectors, becomes the bottleneck of information. In this paper, we propose to learn attention-based word-level interactions between questions and relations to alleviate the bottleneck issue. Similar to the traditional models, the question and relation are firstly represented as sequences of vectors. Then, instead of merging the sequence into a single vector with pooling operation, soft alignments between words from the question and the relation are learned. The aligned words are subsequently compared with the convolutional neural network (CNN) and the comparison results are merged finally. Through performing the comparison on low-level representations, the attention-based word-level interaction model (ABWIM) relieves the information loss issue caused by merging the sequence into a fixed-dimensional vector before the comparison. The experimental results of relation detection on both SimpleQuestions and WebQuestions datasets show that ABWIM achieves state-of-the-art accuracy, demonstrating its effectiveness.Comment: Paper submitted to Neurocomputing at 11.12.201

arXiv.org e-Print Archive

LSTM-based Deep Learning Models for Non-factoid Answer Selection

Author: Santos Cicero dos
Tan Ming
Xiang Bing
Zhou Bowen
Publication venue
Publication date: 28/03/2016
Field of study

In this paper, we apply a general deep learning (DL) framework for the answer selection task, which does not depend on manually defined features or linguistic tools. The basic framework is to build the embeddings of questions and answers based on bidirectional long short-term memory (biLSTM) models, and measure their closeness by cosine similarity. We further extend this basic model in two directions. One direction is to define a more composite representation for questions and answers by combining convolutional neural network with the basic framework. The other direction is to utilize a simple but efficient attention mechanism in order to generate the answer representation according to the question context. Several variations of models are provided. The models are examined by two datasets, including TREC-QA and InsuranceQA. Experimental results demonstrate that the proposed models substantially outperform several strong baselines.Comment: added new experiments on TREC-Q

arXiv.org e-Print Archive

Causality Extraction based on Self-Attentive BiLSTM-CRF with Transferred Embeddings

Author: Li Qi
Li Zhaoning
Ren Jiangtao
Zou Xiaotian
Publication venue: 'Elsevier BV'
Publication date: 08/11/2020
Field of study

Causality extraction from natural language texts is a challenging open problem in artificial intelligence. Existing methods utilize patterns, constraints, and machine learning techniques to extract causality, heavily depending on domain knowledge and requiring considerable human effort and time for feature engineering. In this paper, we formulate causality extraction as a sequence labeling problem based on a novel causality tagging scheme. On this basis, we propose a neural causality extractor with the BiLSTM-CRF model as the backbone, named SCITE (Self-attentive BiLSTM-CRF wIth Transferred Embeddings), which can directly extract cause and effect without extracting candidate causal pairs and identifying their relations separately. To address the problem of data insufficiency, we transfer contextual string embeddings, also known as Flair embeddings, which are trained on a large corpus in our task. In addition, to improve the performance of causality extraction, we introduce a multihead self-attention mechanism into SCITE to learn the dependencies between causal words. We evaluate our method on a public dataset, and experimental results demonstrate that our method achieves significant and consistent improvement compared to baselines.Comment: 39 pages, 11 figures, 6 table

arXiv.org e-Print Archive

Deep Learning for Sentiment Analysis : A Survey

Author: Liu Bing
Wang Shuai
Zhang Lei
Publication venue
Publication date: 30/01/2018
Field of study

Deep learning has emerged as a powerful machine learning technique that learns multiple layers of representations or features of the data and produces state-of-the-art prediction results. Along with the success of deep learning in many other application domains, deep learning is also popularly used in sentiment analysis in recent years. This paper first gives an overview of deep learning and then provides a comprehensive survey of its current applications in sentiment analysis.Comment: 34 pages, 9 figures, 2 table

arXiv.org e-Print Archive

CANDiS: Coupled & Attention-Driven Neural Distant Supervision

Author: Nagarajan Tushar
Sharmistha
Talukdar Partha
Publication venue
Publication date: 26/10/2017
Field of study

Distant Supervision for Relation Extraction uses heuristically aligned text data with an existing knowledge base as training data. The unsupervised nature of this technique allows it to scale to web-scale relation extraction tasks, at the expense of noise in the training data. Previous work has explored relationships among instances of the same entity-pair to reduce this noise, but relationships among instances across entity-pairs have not been fully exploited. We explore the use of inter-instance couplings based on verb-phrase and entity type similarities. We propose a novel technique, CANDiS, which casts distant supervision using inter-instance coupling into an end-to-end neural network model. CANDiS incorporates an attention module at the instance-level to model the multi-instance nature of this problem. CANDiS outperforms existing state-of-the-art techniques on a standard benchmark dataset.Comment: WiNLP 201

arXiv.org e-Print Archive

Neural Attention Models for Sequence Classification: Analysis and Application to Key Term Extraction and Dialogue Act Detection

Author: Lee Hung-yi
Shen Sheng-syun
Publication venue
Publication date: 31/03/2016
Field of study

Recurrent neural network architectures combining with attention mechanism, or neural attention model, have shown promising performance recently for the tasks including speech recognition, image caption generation, visual question answering and machine translation. In this paper, neural attention model is applied on two sequence classification tasks, dialogue act detection and key term extraction. In the sequence labeling tasks, the model input is a sequence, and the output is the label of the input sequence. The major difficulty of sequence labeling is that when the input sequence is long, it can include many noisy or irrelevant part. If the information in the whole sequence is treated equally, the noisy or irrelevant part may degrade the classification performance. The attention mechanism is helpful for sequence classification task because it is capable of highlighting important part among the entire sequence for the classification task. The experimental results show that with the attention mechanism, discernible improvements were achieved in the sequence labeling task considered here. The roles of the attention mechanism in the tasks are further analyzed and visualized in this paper.Comment: 5 pages, 2 figure

arXiv.org e-Print Archive