Any-gram Kernels for Sentence Classification: A Sentiment Analysis Case Study
Any-gram kernels are a flexible and efficient way to employ bag-of-n-gram
features when learning from textual data. They are also compatible with the use
of word embeddings so that word similarities can be accounted for. While the
original any-gram kernels are implemented on top of tree kernels, we propose a
new approach which is independent of tree kernels and is more efficient. We
also propose a more effective way to make use of word embeddings than the
original any-gram formulation. When applied to the task of sentiment
classification, our new formulation achieves significantly better performance.
Relation Extraction: A Survey
With the advent of the Internet, large amounts of digital text are generated
every day in the form of news articles, research publications, blogs, question
answering forums and social media. It is important to develop techniques for
extracting information automatically from these documents, as a lot of
important information is hidden within them. This extracted information can be
used to improve access to and management of the knowledge hidden in large text
corpora. Several applications, such as Question Answering and Information
Retrieval, would benefit from this information. Entities like persons and
organizations form the most basic units of information. Occurrences of
entities in a sentence are often linked through well-defined relations; e.g.,
occurrences of a person and an organization in a sentence may be linked
through relations such as "employed at". The task of Relation Extraction (RE)
is to identify such relations automatically. In this paper, we survey several
important supervised, semi-supervised and unsupervised RE techniques. We also
cover the paradigms of Open Information Extraction (OIE) and Distant
Supervision. Finally, we describe some of the recent trends in RE techniques
and possible future research directions. This survey should be useful for
three kinds of readers: i) newcomers to the field who want to learn about RE
quickly; ii) researchers who want to know how the various RE techniques have
evolved over time and what the possible future research directions are; and
iii) practitioners who just need to know which RE technique works best in
various settings.
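The "employed at" example in the abstract can be made concrete with a toy pattern-based extractor. The pattern and sentence below are invented for illustration; the RE systems the survey covers use learned models, not a single regex:

```python
import re

# Toy illustration of extracting an "employed at" relation between a person
# and an organization. The regex and example sentence are made up.

def extract_employed_at(sentence):
    """Return (person, organization) pairs matched by a simple pattern."""
    pattern = re.compile(r"(\w+(?: \w+)?) works at (\w+(?: \w+)?)")
    return pattern.findall(sentence)

print(extract_employed_at("John Smith works at Acme Corp"))
# [('John Smith', 'Acme Corp')]
```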
Multi-lingual Dialogue Act Recognition with Deep Learning Methods
This paper deals with multi-lingual dialogue act (DA) recognition. The
proposed approaches are based on deep neural networks and use word2vec
embeddings for word representation. Two multi-lingual models are proposed for
this task. The first approach uses one general model trained on the embeddings
from all available languages. The second method trains the model on a single
pivot language, and a linear transformation is used to project the other
languages onto the pivot language. The popular convolutional neural network
and LSTM architectures with different set-ups are used as classifiers. To the
best of our knowledge, this is the first attempt at multi-lingual DA
recognition using neural networks. The multi-lingual models are validated
experimentally on two languages from the Verbmobil corpus.
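The pivot-language projection can be sketched as solving for a linear map from a few bilingual anchor pairs. This toy uses two anchor words and 2-d embeddings so the solve stays closed-form; the data and function names are illustrative, not the paper's actual method, which fits the map over a large dictionary by least squares:

```python
# Toy sketch of projecting one language's embeddings onto a pivot language
# via a linear map, in the spirit of Mikolov-style translation matrices.

def solve_2x2(X, Y):
    """Solve W in X @ W = Y for 2x2 X (two anchor words, 2-d embeddings)."""
    (a, b), (c, d) = X
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    return [[sum(inv[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def project(W, v):
    """Project a source-language vector into the pivot space."""
    return [sum(v[k] * W[k][j] for k in range(2)) for j in range(2)]

# Source-language vectors for two dictionary words, and their pivot-language
# counterparts (here the pivot space is just the source space with axes swapped).
X = [[1.0, 0.0], [0.0, 1.0]]
Y = [[0.0, 1.0], [1.0, 0.0]]
W = solve_2x2(X, Y)
print(project(W, [2.0, 3.0]))  # [3.0, 2.0]
```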
An Attention-Gated Convolutional Neural Network for Sentence Classification
The classification of sentences is very challenging, since sentences contain
limited contextual information. In this paper, we propose an Attention-Gated
Convolutional Neural Network (AGCNN) for sentence classification, which
generates attention weights from the feature's context windows of different
sizes using specialized convolution encoders. It makes full use of the
limited contextual information to extract and enhance the influence of
important features in predicting the sentence's category. Experimental
results demonstrate that our model can achieve up to 3.1% higher accuracy
than standard CNN models, and gains competitive results over the baselines on
four out of the six tasks. In addition, we design an activation function,
namely the Natural Logarithm rescaled Rectified Linear Unit (NLReLU).
Experiments show that NLReLU can outperform ReLU and is comparable to other
well-known activation functions on AGCNN.
Comment: Accepted for publication in the Intelligent Data Analysis journal;
19 pages, 4 figures and 5 tables.
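The NLReLU activation named in the abstract is commonly written as ln(beta * max(0, x) + 1). A minimal sketch, assuming beta = 1.0 (the actual value would be a tuned hyperparameter):

```python
import math

# Sketch of the NLReLU activation: clip like ReLU, then rescale with a
# natural logarithm. beta = 1.0 is an assumption, not the paper's setting.

def nlrelu(x, beta=1.0):
    return math.log(beta * max(0.0, x) + 1.0)

print(nlrelu(-2.0))                  # 0.0 (negatives are clipped like ReLU)
print(nlrelu(0.0))                   # 0.0
print(round(nlrelu(math.e - 1), 6))  # 1.0, since ln((e - 1) + 1) = 1
```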
Deep Cross-Modal Correlation Learning for Audio and Lyrics in Music Retrieval
Little research focuses on cross-modal correlation learning where temporal
structures of different data modalities such as audio and lyrics are taken into
account. Stemming from the characteristic of temporal structures of music in
nature, we are motivated to learn the deep sequential correlation between audio
and lyrics. In this work, we propose a deep cross-modal correlation learning
architecture involving two-branch deep neural networks for audio modality and
text modality (lyrics). Data from the different modalities are converted to
the same canonical space, where inter-modal canonical correlation analysis is
utilized as an objective function to calculate the similarity of temporal
structures. This is the first study on understanding the correlation between
language and music audio through deep architectures for learning the paired
temporal correlation of audio and lyrics. A pre-trained Doc2vec model
followed by fully-connected layers (a fully-connected deep neural network) is
used to represent lyrics. Two significant contributions are made in the audio
branch: i) a pre-trained CNN followed by fully-connected layers is
investigated for representing music audio; ii) we further suggest an
end-to-end architecture that simultaneously trains convolutional layers and
fully-connected layers to better learn the temporal structures of music
audio. In particular, our end-to-end deep architecture has two properties: it
simultaneously performs feature learning and cross-modal correlation
learning, and it learns a joint representation by considering temporal
structures. Experimental results, using audio to retrieve lyrics or using
lyrics to retrieve audio, verify the effectiveness of the proposed deep
correlation learning architectures in cross-modal music retrieval.
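Once both branches project into a shared canonical space, retrieval reduces to ranking by similarity. A toy sketch with made-up 2-d vectors and cosine similarity; the deep projection networks themselves are elided:

```python
# Minimal sketch of cross-modal retrieval in a shared space: rank lyrics
# vectors by cosine similarity to an audio query. Vectors are toy data.

def cosine(u, v):
    num = sum(a * b for a, b in zip(u, v))
    du = sum(a * a for a in u) ** 0.5
    dv = sum(b * b for b in v) ** 0.5
    return num / (du * dv)

def retrieve(audio_query, lyrics_db):
    """Return lyric ids sorted by decreasing similarity to the audio query."""
    return sorted(lyrics_db, key=lambda k: cosine(audio_query, lyrics_db[k]),
                  reverse=True)

lyrics_db = {"song_a": [0.9, 0.1], "song_b": [0.1, 0.9]}
print(retrieve([1.0, 0.0], lyrics_db))  # ['song_a', 'song_b']
```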
eXpose: A Character-Level Convolutional Neural Network with Embeddings For Detecting Malicious URLs, File Paths and Registry Keys
For years, security machine learning research has promised to obviate the need
for signature based detection by automatically learning to detect indicators of
attack. Unfortunately, this vision hasn't come to fruition: in fact, developing
and maintaining today's security machine learning systems can require
engineering resources comparable to those of signature-based detection
systems, due in part to the need to develop and continuously tune the
"features" these machine learning systems look at as attacks evolve. Deep
learning, a subfield of machine learning, promises to change this by operating
on raw input signals and automating the process of feature design and
extraction. In this paper we propose the eXpose neural network, which uses a
deep learning approach we have developed to take generic, raw short character
strings as input (a common case for security inputs, which include artifacts
like potentially malicious URLs, file paths, named pipes, named mutexes, and
registry keys), and learns to simultaneously extract features and classify
using character-level embeddings and a convolutional neural network. In addition
to completely automating the feature design and extraction process, eXpose
outperforms manual feature extraction based baselines on all of the intrusion
detection problems we tested it on, yielding a 5%-10% detection rate gain at a
0.1% false positive rate compared to these baselines.
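The character-level input featurization such a model needs can be sketched as mapping each character to an integer id and padding to a fixed length. The alphabet and maximum length below are arbitrary assumptions, not eXpose's actual configuration:

```python
# Sketch of character-level featurization for short security strings
# (URLs, file paths, registry keys): chars become integer ids, padded to
# a fixed length so a CNN can consume them. Alphabet and length are made up.

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789./:-_"
CHAR_TO_ID = {c: i + 1 for i, c in enumerate(ALPHABET)}  # 0 reserved for padding

def encode(s, max_len=16):
    """Map a string to a fixed-length list of character ids."""
    ids = [CHAR_TO_ID.get(c, 0) for c in s.lower()[:max_len]]
    return ids + [0] * (max_len - len(ids))

print(encode("evil.com/x"))
```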
Unsupervised Document Embedding With CNNs
We propose a new model for unsupervised document embedding. Leading existing
approaches either require complex inference or use recurrent neural networks
(RNN) that are difficult to parallelize. We take a different route and develop
a convolutional neural network (CNN) embedding model. Our CNN architecture is
fully parallelizable, resulting in an over 10x speedup in inference time over
RNN models. The parallelizable architecture enables training deeper models,
where each successive layer has an increasingly larger receptive field and
models longer-range semantic structure within the document. We additionally
propose a fully unsupervised learning algorithm to train this model based on
stochastic forward prediction. Empirical results on two public benchmarks
show that our approach produces accuracy comparable to the state of the art
at a fraction of the computational cost.
Comment: Major revision with additional experiments and model description.
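The claim that successive layers see increasingly large receptive fields can be checked with standard arithmetic for stride-1 convolutions; the helper and layer configurations below are illustrative, not the paper's architecture:

```python
# For stride-1 convolutions, each layer with kernel size k and dilation d
# adds (k - 1) * d positions to the receptive field.

def receptive_field(layers):
    """layers: list of (kernel_size, dilation) pairs for stride-1 convs."""
    rf = 1
    for k, d in layers:
        rf += (k - 1) * d
    return rf

print(receptive_field([(3, 1), (3, 1), (3, 1)]))  # 7: three plain 3-wide convs
print(receptive_field([(3, 1), (3, 2), (3, 4)]))  # 15: dilations doubled per layer
```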
Doc2Im: document to image conversion through self-attentive embedding
Text classification is a fundamental task in NLP applications. Recent
research in this field has largely been divided into two major sub-fields:
one learns representations, while the other learns deeper models, both
sequential and convolutional, which in turn connect back to the
representation. We posit that the stronger the representation, the simpler
the classifier needed to achieve high performance. In this paper we propose a
completely novel direction for text classification research, wherein we
convert text to a representation very similar to an image, such that any deep
network able to handle images is equally able to handle text. We take a
deeper look at the representation of documents as images and subsequently
utilize very simple convolution-based models taken as-is from the computer
vision domain. This image can be cropped, re-scaled, re-sampled and augmented
just like any other image, so it works with most state-of-the-art large
convolution-based models designed to handle large image datasets. We show
impressive results on some of the latest benchmarks in the related fields. We
perform transfer learning experiments, both from the text domain to the text
domain and from the image domain to the text domain. We believe this is a
paradigm shift in the way document understanding and text classification have
traditionally been done, and that it will drive numerous novel research ideas
in the community.
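One simplified reading of a text-to-image conversion is stacking per-word embedding vectors as rows of a fixed-size 2-D array. This is a hypothetical sketch; the actual Doc2Im construction uses self-attentive embeddings and is more involved:

```python
# Hypothetical sketch: turn a document (a list of word vectors) into a
# fixed-size 2-D array that an image model could consume, by cropping long
# documents and zero-padding short ones.

def text_to_image(word_vectors, height=4):
    width = len(word_vectors[0])
    rows = word_vectors[:height]                    # crop long documents
    rows += [[0.0] * width] * (height - len(rows))  # pad short ones
    return rows

doc = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
img = text_to_image(doc)
print(len(img), len(img[0]))  # 4 2
```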
Predicting Abnormal Returns From News Using Text Classification
We show how text from news articles can be used to predict intraday price
movements of financial assets using support vector machines. Multiple kernel
learning is used to combine equity returns with text as predictive features to
increase classification performance and we develop an analytic center cutting
plane method to solve the kernel learning problem efficiently. We observe that
while the direction of returns is not predictable using either text or
returns, their size is, with text features producing significantly better
performance than historical returns alone.
Comment: Larger data sets, results on the time-of-day effect, and use of
delta-hedged covered call options to trade on daily predictions.
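The multiple-kernel-learning combination can be sketched as a convex mixture of the two Gram matrices. Here the mixing weight mu is fixed by hand, whereas the paper learns it with an analytic center cutting plane method; the matrices are toy data:

```python
# Sketch of combining a text kernel and a returns kernel as a convex mixture.
# In real MKL the weight mu is optimized jointly with the classifier.

def combine_kernels(K_text, K_returns, mu=0.5):
    n = len(K_text)
    return [[mu * K_text[i][j] + (1 - mu) * K_returns[i][j]
             for j in range(n)] for i in range(n)]

K_text = [[1.0, 0.2], [0.2, 1.0]]
K_returns = [[1.0, 0.8], [0.8, 1.0]]
print(combine_kernels(K_text, K_returns))  # [[1.0, 0.5], [0.5, 1.0]]
```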
Scalable graph-based individual named entity identification
Named entity discovery (NED) is an important information retrieval problem
that can be decomposed into two sub-problems. The first sub-problem, named
entity recognition (NER), aims to tag pre-defined sets of words in a vocabulary
(called "named entities": names, places, locations, ...) when they appear in
natural language. The second sub-problem, named entity linking/identification
(NEL), considers these entity mentions as queries to be identified in a
pre-existing database. In this paper, we consider the NEL problem, and assume a
set of queries (or mentions) that have to be identified within a knowledge
base. This knowledge base is represented by a text database paired with a
semantic graph. We present state-of-the-art methods in NEL, and propose a
2-step method for individual identification of named entities. Our approach
is motivated by the limitations of recent deep learning approaches, which
lack interpretability and require extensive parameter tuning along with large
volumes of annotated data.
First, we propose a filtering algorithm designed with information retrieval
and text mining techniques, aiming to maximize precision at K (typically for
5 <= K <= 20). Then, we introduce two graph-based methods for named entity
identification to maximize precision at 1 by re-ranking the remaining top
entity candidates. The first identification method uses parametrized graph
mining, and the second uses similarity with graph kernels. Our
approach capitalizes on a fine-grained classification of entities from
annotated web data. We present our algorithms in detail, and show
experimentally on standard datasets (NIST TAC-KBP, CONLL/AIDA) that their
performance in terms of precision is better than that of any reported
graph-based method, and competitive with state-of-the-art systems. Finally,
we conclude on the advantages of our graph-based approach compared to recent
deep learning methods.
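Precision at K, the quantity the filtering step maximizes, is the fraction of queries whose gold entity survives among the top-K candidates. A minimal sketch with toy data:

```python
# Precision at K over a set of queries: a query counts as a hit when its
# gold entity appears among its top-K ranked candidates.

def precision_at_k(ranked_candidates, gold, k):
    hits = sum(1 for q, gold_entity in gold.items()
               if gold_entity in ranked_candidates[q][:k])
    return hits / len(gold)

ranked = {"q1": ["e1", "e2", "e3"], "q2": ["e9", "e4", "e5"]}
gold = {"q1": "e1", "q2": "e4"}
print(precision_at_k(ranked, gold, 1))  # 0.5
print(precision_at_k(ranked, gold, 2))  # 1.0
```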