Aspect-Based Sentiment Analysis Using a Two-Step Neural Network Architecture
The World Wide Web holds a wealth of information in the form of unstructured
texts such as customer reviews for products, events and more. By extracting and
analyzing the expressed opinions in customer reviews in a fine-grained way,
valuable opportunities and insights for customers and businesses can be gained.
We propose a neural network based system to address the task of Aspect-Based
Sentiment Analysis to compete in Task 2 of the ESWC-2016 Challenge on Semantic
Sentiment Analysis. Our proposed architecture divides the task into two subtasks:
aspect term extraction and aspect-specific sentiment extraction. This approach
is flexible in that it allows each subtask to be addressed independently. As a first
step, a recurrent neural network is used to extract aspects from a text by
framing the problem as a sequence labeling task. In a second step, a recurrent
network processes each extracted aspect with respect to its context and
predicts a sentiment label. The system uses pretrained semantic word embedding
features which we experimentally enhance with semantic knowledge extracted from
WordNet. Further features extracted from SenticNet prove to be beneficial for
the extraction of sentiment labels. As the best performing system in its
category, our proposed system proves to be an effective approach for the
Aspect-Based Sentiment Analysis
Weakly-supervised appraisal analysis
This article is concerned with the computational treatment of Appraisal, a Systemic Functional Linguistic theory of the types of language employed to communicate opinion in English. The theory considers aspects such as Attitude (how writers communicate their point of view), Engagement (how writers align themselves with respect to the opinions of others) and Graduation (how writers amplify or diminish their attitudes and engagements). To analyse text according to the theory we employ a weakly-supervised approach to text classification, which involves comparing the similarity of words with prototypical examples of classes. We evaluate the method's performance using a collection of book reviews annotated according to the Appraisal theory
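The idea of comparing words with prototypical examples of classes can be illustrated with a minimal sketch. It is not the article's method: the toy vectors in `VECTORS`, the seed lists in `PROTOTYPES`, the two class labels, and the choice of cosine similarity are all assumptions made for illustration, since the abstract does not specify the similarity measure or the class inventory.

```python
import math

# Toy 3-dimensional word vectors; hypothetical values for illustration only.
VECTORS = {
    "good": [0.8, 0.0, 0.1],
    "wonderful": [0.9, 0.1, 0.0],
    "bad": [-0.9, 0.1, 0.0],
    "dreadful": [-0.8, 0.2, 0.1],
    "delightful": [0.85, 0.15, 0.05],
}

# Prototypical example words per class (assumed seed lists, not from the article).
PROTOTYPES = {
    "positive": ["good", "wonderful"],
    "negative": ["bad", "dreadful"],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def classify(word):
    """Return the class whose prototypes are, on average, most similar to `word`."""
    vec = VECTORS[word]
    scores = {
        label: sum(cosine(vec, VECTORS[p]) for p in protos) / len(protos)
        for label, protos in PROTOTYPES.items()
    }
    return max(scores, key=scores.get)
```

Only the short prototype lists act as supervision here; no per-document labels are needed to assign a class, which is what makes this kind of approach weakly supervised.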
AudioPairBank: Towards A Large-Scale Tag-Pair-Based Audio Content Analysis
Recently, sound recognition has been used to identify sounds, such as car and
river. However, sounds have nuances that may be better described by
adjective-noun pairs such as slow car, and verb-noun pairs such as flying
insects, which are underexplored. Therefore, in this work we investigate the
relation between audio content and both adjective-noun pairs and verb-noun
pairs. Due to the lack of datasets with these kinds of annotations, we
collected and processed the AudioPairBank corpus consisting of a combined total
of 1,123 pairs and over 33,000 audio files. One contribution is the previously
unavailable documentation of the challenges and implications of collecting
audio recordings with these types of labels. A second contribution is to show
the degree of correlation between the audio content and the labels through
sound recognition experiments, which yielded results of 70% accuracy, hence
also providing a performance benchmark. The results and study in this paper
encourage further exploration of the nuances in audio and are meant to
complement similar research performed on images and text in multimedia
analysis. Comment: This paper is a revised version of "AudioSentibank: Large-scale
Semantic Ontology of Acoustic Concepts for Audio Content Analysis
Unleashing the Power of Hashtags in Tweet Analytics with Distributed Framework on Apache Storm
Twitter is a popular social network platform where users can interact and
post texts of up to 280 characters called tweets. Hashtags, hyperlinked words
in tweets, have increasingly become crucial for tweet retrieval and search.
Using hashtags for tweet topic classification is a challenging problem because
of context dependence among words, slang, abbreviations and emoticons in a short
tweet, along with the evolving use of hashtags. Since Twitter generates millions of
tweets daily, tweet analytics is a fundamental big data stream problem that
often requires real-time distributed processing. This paper proposes a
distributed online approach to tweet topic classification with hashtags. Being
implemented on Apache Storm, a distributed real time framework, our approach
incrementally identifies and updates a set of strong predictors in the Naïve
Bayes model for classifying each incoming tweet instance. Preliminary
experiments show promising results with up to 97% accuracy and 37% increase in
throughput on eight processors. Comment: IEEE International Conference on Big Data 201
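The incremental flavour of the classifier described above can be sketched as a multinomial Naïve Bayes model whose counts are updated one tweet at a time. This is a single-process illustration under stated assumptions, not the paper's implementation: the class name `OnlineNaiveBayes`, the Laplace smoothing parameter `alpha`, and the example tokens are hypothetical, and the sketch omits both the selection of strong predictors and the distribution over Apache Storm.

```python
import math
from collections import Counter, defaultdict


class OnlineNaiveBayes:
    """Multinomial Naive Bayes with per-instance count updates (illustrative)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                        # Laplace smoothing (assumed)
        self.class_counts = Counter()             # tweets seen per topic
        self.token_counts = defaultdict(Counter)  # token frequencies per topic
        self.vocab = set()                        # all tokens seen so far

    def update(self, tokens, label):
        """Fold one labelled tweet into the running counts."""
        self.class_counts[label] += 1
        self.token_counts[label].update(tokens)
        self.vocab.update(tokens)

    def predict(self, tokens):
        """Return the topic with the highest smoothed log-score, or None if untrained."""
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total)
            denom = (sum(self.token_counts[label].values())
                     + self.alpha * len(self.vocab))
            for tok in tokens:
                score += math.log((self.token_counts[label][tok] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Because `update` only increments counters, each incoming tweet can be folded in without retraining on earlier data, which is the property that makes a model of this kind suitable for a stream.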
Knowledge Graph semantic enhancement of input data for improving AI
Intelligent systems designed using machine learning algorithms require a
large amount of labeled data. Background knowledge provides complementary,
real-world factual information that can augment the limited labeled data to train a
machine learning algorithm. The term Knowledge Graph (KG) is in vogue because, for
many practical applications, it is convenient and useful to organize this
background knowledge in the form of a graph. Recent academic research and
implemented industrial intelligent systems have shown promising performance for
machine learning algorithms that combine training data with a knowledge graph.
In this article, we discuss the use of relevant KGs to enhance input data for
two applications that use machine learning -- recommendation and community
detection. The KG improves both accuracy and explainability
Volatility Prediction using Financial Disclosures Sentiments with Word Embedding-based IR Models
Volatility prediction--an essential concept in financial markets--has
recently been addressed using sentiment analysis methods. We investigate the
sentiment of annual disclosures of companies in stock markets to forecast
volatility. We specifically explore the use of recent Information Retrieval
(IR) term weighting models that are effectively extended by related terms using
word embeddings. In parallel to textual information, factual market data have
been widely used as the mainstream approach to forecast market risk. We
therefore study different fusion methods to combine text and market data
resources. Our word embedding-based approach significantly outperforms
state-of-the-art methods. In addition, we investigate the characteristics of
the reports of the companies in different financial sectors