A Combined CNN and LSTM Model for Arabic Sentiment Analysis
Deep neural networks have shown good data modelling capabilities when dealing
with challenging and large datasets from a wide range of application areas.
Convolutional Neural Networks (CNNs) offer advantages in selecting good
features, and Long Short-Term Memory (LSTM) networks have proven able to
learn sequential data. Both approaches have been reported to provide
improved results in areas such as image processing, voice recognition, language
translation and other Natural Language Processing (NLP) tasks. Sentiment
classification for short text messages from Twitter is a challenging task, and
the complexity increases for Arabic language sentiment classification tasks
because Arabic is a morphologically rich language. In addition, the limited
availability of accurate pre-processing tools for Arabic is another current
limitation, along with the limited research available in this area. In this
paper, we investigate the benefits of integrating CNNs and LSTMs and report
improved accuracy for Arabic sentiment analysis on different datasets.
Additionally, we seek to consider the morphological diversity of particular
Arabic words by using different sentiment classification levels.
Comment: Authors accepted version of submission for CD-MAKE 201
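The abstract above does not spell out how the CNN and the LSTM are connected. The following is only a minimal sketch of one common arrangement, assuming the convolution runs over token embeddings and the LSTM reads the resulting feature sequence. All names (`conv1d_relu`, `lstm_step`, `init_params`, `cnn_lstm_score`), the layer sizes and the random, untrained weights are hypothetical and are not taken from the paper.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def conv1d_relu(seq, kernels, biases):
    """Valid 1D convolution over time, followed by ReLU.

    seq: list of T embedding vectors; kernels: one entry per filter,
    each a list of `width` weight vectors matching the embedding size.
    Returns T - width + 1 feature vectors, one value per filter.
    """
    width = len(kernels[0])
    out = []
    for t in range(len(seq) - width + 1):
        window = seq[t:t + width]
        feats = []
        for kernel, b in zip(kernels, biases):
            s = b + sum(w * x for wv, xv in zip(kernel, window)
                        for w, x in zip(wv, xv))
            feats.append(max(0.0, s))
        out.append(feats)
    return out


def lstm_step(x, h, c, weights):
    """One LSTM step; weights maps gate name -> (matrix over [x; h], bias)."""
    z = x + h  # list concatenation of input and previous hidden state

    def affine(name):
        matrix, bias = weights[name]
        return [b + sum(m * v for m, v in zip(row, z))
                for row, b in zip(matrix, bias)]

    i = [sigmoid(v) for v in affine("i")]    # input gate
    f = [sigmoid(v) for v in affine("f")]    # forget gate
    o = [sigmoid(v) for v in affine("o")]    # output gate
    g = [math.tanh(v) for v in affine("g")]  # candidate cell values
    c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
    h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
    return h_new, c_new


def init_params(vocab, dim, filters, width, hidden, seed=0):
    """Random, untrained weights; every size here is an illustrative assumption."""
    rng = random.Random(seed)

    def vec(n):
        return [rng.uniform(-0.1, 0.1) for _ in range(n)]

    def mat(rows, cols):
        return [vec(cols) for _ in range(rows)]

    return {
        "emb": mat(vocab, dim),
        "kernels": [mat(width, dim) for _ in range(filters)],
        "conv_bias": vec(filters),
        "lstm": {g: (mat(hidden, filters + hidden), vec(hidden)) for g in "ifog"},
        "out_w": vec(hidden),
        "out_b": 0.0,
        "hidden": hidden,
    }


def cnn_lstm_score(token_ids, p):
    """Embed tokens, extract local features with the CNN, read them with the LSTM."""
    seq = [p["emb"][t] for t in token_ids]
    feats = conv1d_relu(seq, p["kernels"], p["conv_bias"])
    h = [0.0] * p["hidden"]
    c = [0.0] * p["hidden"]
    for x in feats:
        h, c = lstm_step(x, h, c, p["lstm"])
    logit = p["out_b"] + sum(w * v for w, v in zip(p["out_w"], h))
    return sigmoid(logit)  # weights are untrained, so the value is not meaningful


params = init_params(vocab=20, dim=4, filters=3, width=2, hidden=5)
score = cnn_lstm_score([1, 7, 3, 12, 5], params)
```

In this sketch the convolution shortens a five-token message to four feature vectors, and the final LSTM hidden state feeds a single sigmoid unit that stands in for a positive/negative sentiment score.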
Reading Scene Text in Deep Convolutional Sequences
We develop a Deep-Text Recurrent Network (DTRN) that regards scene text
reading as a sequence labelling problem. We leverage recent advances in deep
convolutional neural networks to generate an ordered high-level sequence from a
whole word image, avoiding the difficult character segmentation problem. Then a
deep recurrent model, building on long short-term memory (LSTM), is developed
to robustly recognise the generated CNN sequences, departing from most existing
approaches, which recognise each character independently. Our model has a number of
appealing properties in comparison to existing scene text recognition methods:
(i) It can recognise highly ambiguous words by leveraging meaningful context
information, allowing it to work reliably without either pre- or
post-processing; (ii) the deep CNN feature is robust to various image
distortions; (iii) it retains the explicit order information in the word image,
which is essential to discriminate word strings; (iv) the model does not depend
on a pre-defined dictionary, and it can process unknown words and arbitrary
strings. Code for the DTRN will be made available.
Comment: To appear in the 13th AAAI Conference on Artificial Intelligence
(AAAI-16), 201
Learning Convolutional Text Representations for Visual Question Answering
Visual question answering is a recently proposed artificial intelligence task
that requires a deep understanding of both images and texts. In deep learning,
images are typically modeled through convolutional neural networks, and texts
are typically modeled through recurrent neural networks. While the requirement
for modeling images is similar to traditional computer vision tasks, such as
object recognition and image classification, visual question answering raises a
different need for textual representation as compared to other natural language
processing tasks. In this work, we perform a detailed analysis of natural
language questions in visual question answering. Based on the analysis, we
propose to rely on convolutional neural networks for learning textual
representations. By exploring the various properties of convolutional neural
networks specialized for text data, such as width and depth, we present our
"CNN Inception + Gate" model. We show that our model improves question
representations and thus the overall accuracy of visual question answering
models. We also show that the text representation requirement in visual
question answering is more complicated and comprehensive than that in
conventional natural language processing tasks, making it a better task to
evaluate textual representation methods. Shallow models like fastText, which
can obtain results comparable to deep learning models in tasks like text
classification, are not suitable for visual question answering.
Comment: Conference paper at SDM 2018. https://github.com/divelab/sva
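The abstract names a "CNN Inception + Gate" model but gives no equations. As a rough illustration only, the sketch below assumes that "Inception" means parallel convolutions of different widths over the question tokens and that "Gate" means a tanh content path multiplied by a sigmoid gate, with max-pooling over time in each branch. These choices, the function names (`gated_conv_maxpool`, `inception_gate_encode`, `random_branch`) and the random weights are assumptions, not the authors' implementation.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot_window(window, kernel):
    """Inner product between a window of token vectors and one filter."""
    return sum(w * x for wv, xv in zip(kernel, window) for w, x in zip(wv, xv))


def gated_conv_maxpool(seq, content_kernels, gate_kernels):
    """One branch: gated convolution of a fixed width, then max over time."""
    width = len(content_kernels[0])
    pooled = []
    for ck, gk in zip(content_kernels, gate_kernels):
        acts = []
        for t in range(len(seq) - width + 1):
            window = seq[t:t + width]
            # Assumed gate form: tanh content scaled by a sigmoid gate.
            acts.append(math.tanh(dot_window(window, ck))
                        * sigmoid(dot_window(window, gk)))
        pooled.append(max(acts))
    return pooled


def inception_gate_encode(seq, branches):
    """Concatenate the pooled outputs of all parallel branches."""
    out = []
    for content_kernels, gate_kernels in branches:
        out.extend(gated_conv_maxpool(seq, content_kernels, gate_kernels))
    return out


def random_branch(rng, width, dim, filters):
    """Untrained content and gate filters for one branch (illustrative sizes)."""
    def make():
        return [[[rng.uniform(-0.5, 0.5) for _ in range(dim)]
                 for _ in range(width)]
                for _ in range(filters)]
    return make(), make()


rng = random.Random(0)
# Three parallel branches with filter widths 1, 2 and 3, four filters each.
branches = [random_branch(rng, width, dim=3, filters=4) for width in (1, 2, 3)]
# A five-token "question", each token a 3-dimensional embedding.
question = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(5)]
question_vec = inception_gate_encode(question, branches)
```

Here three branches with four filters each yield a 12-dimensional question vector. The question must contain at least as many tokens as the widest filter, since the sketch uses no padding.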