Text Understanding from Scratch
This article demonstrates that we can apply deep learning to text
understanding from character-level inputs all the way up to abstract text
concepts, using temporal convolutional networks (ConvNets). We apply ConvNets
to various large-scale datasets, including ontology classification, sentiment
analysis, and text categorization. We show that temporal ConvNets can achieve
astonishing performance without knowledge of words, phrases, sentences, or
any other syntactic or semantic structure of a human language.
Evidence shows that our models can work for both English and Chinese.
Comment: This technical report is superseded by a paper entitled
"Character-level Convolutional Networks for Text Classification",
arXiv:1509.01626. It has considerably more experimental results and a
rewritten introduction.
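The character-level input the report describes can be sketched as a quantization step: each character becomes a one-hot vector over a fixed alphabet, and the resulting matrix is what temporal (1-D) convolutions consume. This is a minimal illustration; the alphabet and frame length below are assumptions for the sketch, not the paper's exact configuration.

```python
# Sketch of character quantization for a character-level ConvNet.
# The alphabet here is an illustrative assumption.
import string

ALPHABET = string.ascii_lowercase + string.digits + " "
CHAR_INDEX = {c: i for i, c in enumerate(ALPHABET)}

def quantize(text, max_len=16):
    """Return a max_len x len(ALPHABET) one-hot matrix; unknown chars map to all-zero rows."""
    frames = []
    for ch in text.lower()[:max_len]:
        row = [0] * len(ALPHABET)
        idx = CHAR_INDEX.get(ch)
        if idx is not None:
            row[idx] = 1
        frames.append(row)
    # pad with all-zero frames so every input has a fixed size
    while len(frames) < max_len:
        frames.append([0] * len(ALPHABET))
    return frames

m = quantize("ConvNet 7")
```

Each row of `m` is one character frame; a temporal convolution then slides filters over these frames exactly as an image convolution slides over pixels.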
An Empirical Study on Sentiment Classification of Chinese Review using Word Embedding
This article presents how word embeddings can be used as features in Chinese
sentiment classification. First, a Chinese opinion corpus is built with a
million comments from hotel review websites. Then the word embeddings that
represent each comment are used as input to different machine learning
methods for sentiment classification, including SVM, Logistic Regression,
Convolutional Neural Network (CNN), and ensemble methods. These methods
achieve better performance than N-gram models using Naive Bayes (NB) and
Maximum Entropy (ME). Finally, a combination of machine learning methods is
proposed that achieves outstanding precision, recall, and F1 score. After
selecting the most useful methods to construct the combined model and testing
over the corpus, the final F1 score is 0.920.
Comment: The 29th Pacific Asia Conference on Language, Information and
Computing
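A minimal sketch of the feature step the article describes: each comment is represented through its word embeddings (here simply averaged) before the vector is handed to a classifier such as SVM or logistic regression. The toy 3-d vectors are illustrative assumptions, not a trained embedding table.

```python
# Hypothetical pre-trained word vectors; a real system would load trained embeddings.
EMBEDDINGS = {
    "room":  [0.9, 0.1, 0.0],
    "clean": [0.8, 0.2, 0.1],
    "noisy": [0.1, 0.9, 0.3],
}

def comment_vector(tokens, dim=3):
    """Average the embeddings of known tokens; zero vector if none are known."""
    known = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]

vec = comment_vector(["room", "clean"])
```

The fixed-length `vec` is what a downstream classifier consumes, regardless of comment length.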
Character-level Convolutional Networks for Text Classification
This article offers an empirical exploration on the use of character-level
convolutional networks (ConvNets) for text classification. We constructed
several large-scale datasets to show that character-level convolutional
networks could achieve state-of-the-art or competitive results. Comparisons are
offered against traditional models such as bag of words, n-grams and their
TFIDF variants, and deep learning models such as word-based ConvNets and
recurrent neural networks.
Comment: An early version of this work entitled "Text Understanding from
Scratch" was posted in Feb 2015 as arXiv:1502.01710. The present paper has
considerably more experimental results and a rewritten introduction. Advances
in Neural Information Processing Systems 28 (NIPS 2015)
Modeling Text with Graph Convolutional Network for Cross-Modal Information Retrieval
Cross-modal information retrieval aims to find heterogeneous data of various
modalities from a given query of one modality. The main challenge is to map
different modalities into a common semantic space, in which the distance
between concepts of different modalities can be well modeled. For cross-modal
information retrieval between images and texts, existing work mostly uses
off-the-shelf Convolutional Neural Network (CNN) for image feature extraction.
For texts, word-level features such as bag-of-words or word2vec are employed to
build deep learning models to represent texts. Besides word-level semantics,
the semantic relations between words are also informative but less explored. In
this paper, we model texts by graphs using similarity measure based on
word2vec. A dual-path neural network model is proposed for coupled feature
learning in cross-modal information retrieval. One path utilizes Graph
Convolutional Network (GCN) for text modeling based on graph representations.
The other path uses a neural network with layers of nonlinearities for image
modeling based on off-the-shelf features. The model is trained by a pairwise
similarity loss function to maximize the similarity of relevant text-image
pairs and minimize the similarity of irrelevant pairs. Experimental results
show that the proposed model outperforms the state-of-the-art methods
significantly, with a 17% accuracy improvement in the best case.
Comment: 7 pages, 11 figures
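The pairwise training objective the paper describes can be sketched as a margin-based hinge loss: push the similarity of a relevant text-image pair above that of an irrelevant pair. Cosine similarity and the margin value below are illustrative choices, not necessarily the paper's exact formulation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def pairwise_loss(text_vec, pos_img, neg_img, margin=0.2):
    """Hinge loss: zero once the relevant pair beats the irrelevant one by `margin`."""
    return max(0.0, margin - cosine(text_vec, pos_img) + cosine(text_vec, neg_img))

# relevant image nearly parallel to the text vector, irrelevant one orthogonal
loss = pairwise_loss([1.0, 0.0], [1.0, 0.1], [0.0, 1.0], margin=0.2)
```

Minimizing this loss over many sampled pairs is what pulls relevant text-image pairs together in the common space while pushing irrelevant ones apart.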
Towards Accurate Deceptive Opinion Spam Detection based on Word Order-preserving CNN
Deep learning is now widely used; in natural language processing, its high
degree of flexibility has enabled the analysis of complex semantics.
Deceptive opinion detection is an important application of deep learning
models, and related mechanisms have received attention and study. Online
opinions are short and varied in type and content. To effectively identify
deceptive opinions, we need to comprehensively study their characteristics
and explore novel features beyond the textual semantics and emotional
polarity that have been widely used in text analysis. A detection mechanism
based on deep learning has better self-adaptability and can effectively
identify all kinds of deceptive opinions. In this paper, we optimize the
convolutional neural network model by embedding word-order characteristics in
its convolution and pooling layers, which makes the network more suitable for
various text classification and deceptive opinion detection tasks.
TensorFlow-based experiments demonstrate that the proposed detection
mechanism achieves more accurate deceptive opinion detection results.
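The key intuition behind word-order-preserving convolution is that a filter sees a sliding window of adjacent word positions, so reordering the words changes the output even when the bag of words is identical. This toy 1-D convolution (window of 2, single filter, scalar word scores) illustrates that mechanism only; the filter weights are illustrative assumptions, not the paper's model.

```python
def conv1d(seq, kernel):
    """Slide `kernel` over `seq`; the order of values inside each window matters."""
    k = len(kernel)
    return [sum(kernel[j] * seq[i + j] for j in range(k))
            for i in range(len(seq) - k + 1)]

# Same two word scores in opposite orders (e.g. "not good" vs "good not"):
# identical bag of words, different convolution outputs.
out_a = conv1d([1.0, 2.0], [1.0, -1.0])
out_b = conv1d([2.0, 1.0], [1.0, -1.0])
```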
A Deep Learning Approach for Expert Identification in Question Answering Communities
In this paper, we describe an effective convolutional neural network
framework for identifying experts in question answering communities. The
approach combines user feature representations with question feature
representations to compute scores; the user with the highest score is taken
to be the expert on the question. Unlike prior work, this method does not
identify experts by measuring answer content quality; it requires only the
question sentence and user embedding features. Remarkably, our model can be
applied to different languages and different domains. The proposed framework
is trained on two datasets: the first is Stack Overflow and the second is
Zhihu. The Top-1 accuracy results of our experiments show that our framework
outperforms the best baseline framework for expert identification.
Comment: 7 pages. arXiv admin note: text overlap with arXiv:1403.6652 by
other authors
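The scoring idea the abstract describes can be sketched very simply: combine a question representation with each candidate user's embedding and pick the top-scoring user as the expert. A plain dot product stands in for the paper's CNN scoring network here, and the vectors are illustrative assumptions.

```python
def score(question_vec, user_vec):
    """Compatibility score between a question and a user (dot product stand-in)."""
    return sum(q * u for q, u in zip(question_vec, user_vec))

def pick_expert(question_vec, users):
    """users: dict of user id -> embedding; return the top-scoring user id."""
    return max(users, key=lambda uid: score(question_vec, users[uid]))

# Hypothetical 2-d user embeddings for two candidates.
users = {"alice": [0.9, 0.1], "bob": [0.2, 0.8]}
expert = pick_expert([1.0, 0.0], users)
```

Because only question and user embeddings enter the score, nothing in this setup depends on answer text, which is what lets the approach transfer across languages and domains.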
Multi-task Learning for Chinese Word Usage Errors Detection
Chinese word usage errors often occur in non-native Chinese learners'
writing. It is very helpful for learners to have such errors detected
automatically as they practice writing. In this paper, we propose a novel
approach that takes advantage of different auxiliary tasks, such as
POS-tagging prediction and word log-frequency prediction, to help the task of
Chinese word usage error detection. With the help of these auxiliary tasks,
we achieve state-of-the-art performance on the HSK corpus without any extra
data.
Comment: 4 pages, 2 figures, 1 table; accepted as a conference paper at the
3rd IEEE International Conference on Computational Intelligence and
Applications (ICCIA 2018)
Convolutional Neural Network with Word Embeddings for Chinese Word Segmentation
The character-based sequence labeling framework is flexible and efficient for
Chinese word segmentation (CWS). Recently, many character-based neural models
have been applied to CWS. While they obtain good performance, they have two
obvious weaknesses. The first is that they heavily rely on manually designed
bigram features, i.e., they are not good at capturing n-gram features
automatically. The second is that they make no use of full word information.
For the first weakness, we propose a convolutional neural model, which is able
to capture rich n-gram features without any feature engineering. For the second
one, we propose an effective approach to integrate the proposed model with word
embeddings. We evaluate the model on two benchmark datasets: PKU and MSR.
Without any feature engineering, the model obtains competitive performance --
95.7% on PKU and 97.3% on MSR. Armed with word embeddings, the model achieves
state-of-the-art performance on both datasets -- 96.5% on PKU and 98.0% on MSR,
without using any external labeled resource.
Comment: will be published by IJCNLP 2017
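The character-based sequence labeling framework the paper builds on works by tagging each character (commonly B = word begin, M = middle, E = end, S = single-character word); the tags then determine the segmentation. The decoder below is the standard tag-to-word conversion used with such taggers, shown here with Latin letters for readability, and is not the paper's neural model itself.

```python
def tags_to_words(chars, tags):
    """Convert per-character BMES tags into a word segmentation."""
    words, cur = [], ""
    for ch, tag in zip(chars, tags):
        if tag == "S":            # single-character word
            words.append(ch)
        elif tag == "B":          # start accumulating a multi-character word
            cur = ch
        elif tag == "M":          # continue the current word
            cur += ch
        else:                     # "E": close the current word
            words.append(cur + ch)
            cur = ""
    return words

segmented = tags_to_words(list("ABCDE"), ["B", "E", "S", "B", "E"])
```

The model's job is only to predict the tag sequence; segmentation quality (e.g. the PKU/MSR F-scores quoted above) is measured on the words this decoding produces.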
DocTag2Vec: An Embedding Based Multi-label Learning Approach for Document Tagging
Tagging news articles or blog posts with relevant tags from a collection of
predefined ones is coined as document tagging in this work. Accurate tagging of
articles can benefit several downstream applications such as recommendation and
search. In this work, we propose a novel yet simple approach called DocTag2Vec
to accomplish this task. We substantially extend Word2Vec and Doc2Vec, two
popular models for learning distributed representations of words and
documents. In DocTag2Vec, we simultaneously learn the representations of
words, documents, and tags in a joint vector space during training, and
employ a simple k-nearest neighbor search to predict tags for unseen
documents. In contrast to previous multi-label learning methods, DocTag2Vec
directly deals with raw text instead of provided feature vectors and, in
addition, enjoys advantages such as learning tag representations and handling
newly created tags. To demonstrate the effectiveness of our approach, we
conduct experiments on several datasets and show promising results against
state-of-the-art methods.
Comment: 10 pages
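DocTag2Vec's prediction step, as the abstract describes it, reduces to nearest-neighbor search once documents and tags live in one joint space: the tags for an unseen document are those whose embeddings lie closest to the document's vector. Euclidean distance and the toy 2-d vectors below are illustrative assumptions.

```python
import math

# Hypothetical tag embeddings in the joint document-tag space.
TAG_VECS = {"sports": [1.0, 0.0], "politics": [0.0, 1.0], "tech": [0.9, 0.4]}

def nearest_tags(doc_vec, k=2):
    """Return the k tags whose embeddings lie closest to the document vector."""
    return sorted(TAG_VECS, key=lambda t: math.dist(doc_vec, TAG_VECS[t]))[:k]

tags = nearest_tags([1.0, 0.1], k=2)
```

Because prediction is just a lookup in the shared space, newly created tags can be handled by embedding them without retraining the whole model, one of the advantages the abstract highlights.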
A Question Answering Approach to Emotion Cause Extraction
Emotion cause extraction aims to identify the reasons behind a certain
emotion expressed in text. It is a much more difficult task compared to emotion
classification. Inspired by recent advances in using deep memory networks for
question answering (QA), we propose a new approach which considers emotion
cause identification as a reading comprehension task in QA. Inspired by
convolutional neural networks, we propose a new mechanism to store relevant
context in different memory slots to model context information. Our proposed
approach can extract both word level sequence features and lexical features.
Performance evaluation shows that our method achieves the state-of-the-art
performance on a recently released emotion cause dataset, outperforming a
number of competitive baselines by at least 3.01% in F-measure.
Comment: Accepted by EMNLP 2017