190 research outputs found
A Deep Network Model for Paraphrase Detection in Short Text Messages
This paper is concerned with paraphrase detection. The ability to detect
similar sentences written in natural language is crucial for several
applications, such as text mining, text summarization, plagiarism detection,
authorship authentication and question answering. Given two sentences, the
objective is to detect whether they are semantically identical. An important
insight from this work is that existing paraphrase systems perform well when
applied on clean texts, but they do not necessarily deliver good performance
against noisy texts. Challenges with paraphrase detection on user generated
short texts, such as Twitter, include language irregularity and noise. To cope
with these challenges, we propose a novel deep neural network-based approach
that relies on coarse-grained sentence modeling using a convolutional neural
network and a long short-term memory model, combined with a specific
fine-grained word-level similarity matching model. Our experimental results
show that the proposed approach outperforms existing state-of-the-art
approaches on user-generated noisy social media data, such as Twitter texts,
and achieves highly competitive performance on a cleaner corpus
Syntax-Aware Multi-Sense Word Embeddings for Deep Compositional Models of Meaning
Deep compositional models of meaning acting on distributional representations
of words in order to produce vectors of larger text constituents are evolving
to a popular area of NLP research. We detail a compositional distributional
framework based on a rich form of word embeddings that aims at facilitating the
interactions between words in the context of a sentence. Embeddings and
composition layers are jointly learned against a generic objective that
enhances the vectors with syntactic information from the surrounding context.
Furthermore, each word is associated with a number of senses, the most
plausible of which is selected dynamically during the composition process. We
evaluate the produced vectors qualitatively and quantitatively with positive
results. At the sentence level, the effectiveness of the framework is
demonstrated on the MSRPar task, for which we report results within the
state-of-the-art range.Comment: Accepted for presentation at EMNLP 201
A Deep Relevance Matching Model for Ad-hoc Retrieval
In recent years, deep neural networks have led to exciting breakthroughs in
speech recognition, computer vision, and natural language processing (NLP)
tasks. However, there have been few positive results of deep models on ad-hoc
retrieval tasks. This is partially due to the fact that many important
characteristics of the ad-hoc retrieval task have not been well addressed in
deep models yet. Typically, the ad-hoc retrieval task is formalized as a
matching problem between two pieces of text in existing work using deep models,
and treated equivalent to many NLP tasks such as paraphrase identification,
question answering and automatic conversation. However, we argue that the
ad-hoc retrieval task is mainly about relevance matching while most NLP
matching tasks concern semantic matching, and there are some fundamental
differences between these two matching tasks. Successful relevance matching
requires proper handling of the exact matching signals, query term importance,
and diverse matching requirements. In this paper, we propose a novel deep
relevance matching model (DRMM) for ad-hoc retrieval. Specifically, our model
employs a joint deep architecture at the query term level for relevance
matching. By using matching histogram mapping, a feed forward matching network,
and a term gating network, we can effectively deal with the three relevance
matching factors mentioned above. Experimental results on two representative
benchmark collections show that our model can significantly outperform some
well-known retrieval models as well as state-of-the-art deep matching models.Comment: CIKM 2016, long pape
FVEC feature and Machine Learning Approach for Indonesian Opinion Mining on YouTube Comments
Mining opinions from Indonesian comments from YouTube videos are required to extract interesting patterns and valuable information from consumer feedback. Opinions can consist of a combination of sentiments and topics from comments. The features considered in the mining of opinion become one of the important keys to getting a quality opinion. This paper proposes to utilize FVEC and TF-IDF features to represent the comments. In addition, two popular machine learning approaches in the field of opinion mining, i.e., SVM and CNN, are explored separately to extract opinions in Indonesian comments of YouTube videos. The experimental results show that the use of FVEC features on SVM and CNN achieves a very significant effect on the quality of opinions obtained, in term of accuracy
Convolutional Neural Networks over Tree Structures for Programming Language Processing
Programming language processing (similar to natural language processing) is a
hot research topic in the field of software engineering; it has also aroused
growing interest in the artificial intelligence community. However, different
from a natural language sentence, a program contains rich, explicit, and
complicated structural information. Hence, traditional NLP models may be
inappropriate for programs. In this paper, we propose a novel tree-based
convolutional neural network (TBCNN) for programming language processing, in
which a convolution kernel is designed over programs' abstract syntax trees to
capture structural information. TBCNN is a generic architecture for programming
language processing; our experiments show its effectiveness in two different
program analysis tasks: classifying programs according to functionality, and
detecting code snippets of certain patterns. TBCNN outperforms baseline
methods, including several neural models for NLP.Comment: Accepted at AAAI-1
- …