2,894 research outputs found
A classification-based approach to economic event detection in Dutch news text
Breaking news on economic events such as stock splits or mergers and acquisitions has been shown to have a substantial impact on the financial markets. As it is important to be able to automatically identify events in news items accurately and in a timely manner, we present in this paper proof-of-concept experiments for a supervised machine learning approach to economic event detection in newswire text. For this purpose, we created a corpus of Dutch financial news articles in which 10 types of company-specific economic events were annotated. We trained classifiers using various lexical, syntactic and semantic features. We obtain good results based on a basic set of shallow features, thus showing that this method is a viable approach for economic event detection in news text
The good, the bad and the implicit: a comprehensive approach to annotating explicit and implicit sentiment
We present a fine-grained scheme for the annotation of polar sentiment in text, that accounts for explicit sentiment (so-called private states), as well as implicit expressions of sentiment (polar facts). Polar expressions are annotated below sentence level and classified according to their subjectivity status. Additionally, they are linked to one or more targets with a specific polar orientation and intensity. Other components of the annotation scheme include source attribution and the identification and classification of expressions that modify polarity. In previous research, little attention has been given to implicit sentiment, which represents a substantial amount of the polar expressions encountered in our data. An English and Dutch corpus of financial newswire, consisting of over 45,000 words each, was annotated using our scheme. A subset of this corpus was used to conduct an inter-annotator agreement study, which demonstrated that the proposed scheme can be used to reliably annotate explicit and implicit sentiment in real-world textual data, making the created corpora a useful resource for sentiment analysis
Predicting Market Direction from Direct Speech by Business Leaders
Direct quotations from business leaders can communicate to the wider public the latent state of their organization as well as the beliefs of the organization\u27s leaders. Candid quotes from business leaders can have dramatic effects upon the share price of their organization. For example, Gerald Ratner in 1991 stated that his company\u27s products were crap and consequently his company (Ratners) lost in excess of 500 million pounds in market value. Information in quotes from business leaders can be used to make an estimation of the organization\u27s immediate future financial prospects and therefore can form part of a trading strategy. This paper describes a contextual classification strategy to label direct quotes from business leaders contained in news stories. The quotes are labelled as either: 1. positive, 2. negative or 3. neutral. A trading strategy aggregates the quote classifications to issue a buy, sell or hold instruction. The quote based trading strategy is evaluated against trading strategies which are based upon whole news story classification on the NASDAQ market index. The evaluation shows a clear advantage for the quote classification strategy over the competing strategies
Predictive Analysis on Twitter: Techniques and Applications
Predictive analysis of social media data has attracted considerable attention
from the research community as well as the business world because of the
essential and actionable information it can provide. Over the years, extensive
experimentation and analysis for insights have been carried out using Twitter
data in various domains such as healthcare, public health, politics, social
sciences, and demographics. In this chapter, we discuss techniques, approaches
and state-of-the-art applications of predictive analysis of Twitter data.
Specifically, we present fine-grained analysis involving aspects such as
sentiment, emotion, and the use of domain knowledge in the coarse-grained
analysis of Twitter data for making decisions and taking actions, and relate a
few success stories
Econometrics meets sentiment : an overview of methodology and applications
The advent of massive amounts of textual, audio, and visual data has spurred the development of econometric methodology to transform qualitative sentiment data into quantitative sentiment variables, and to use those variables in an econometric analysis of the relationships between sentiment and other variables. We survey this emerging research field and refer to it as sentometrics, which is a portmanteau of sentiment and econometrics. We provide a synthesis of the relevant methodological approaches, illustrate with empirical results, and discuss useful software
BERT-Deep CNN: State-of-the-Art for Sentiment Analysis of COVID-19 Tweets
The free flow of information has been accelerated by the rapid development of
social media technology. There has been a significant social and psychological
impact on the population due to the outbreak of Coronavirus disease (COVID-19).
The COVID-19 pandemic is one of the current events being discussed on social
media platforms. In order to safeguard societies from this pandemic, studying
people's emotions on social media is crucial. As a result of their particular
characteristics, sentiment analysis of texts like tweets remains challenging.
Sentiment analysis is a powerful text analysis tool. It automatically detects
and analyzes opinions and emotions from unstructured data. Texts from a wide
range of sources are examined by a sentiment analysis tool, which extracts
meaning from them, including emails, surveys, reviews, social media posts, and
web articles. To evaluate sentiments, natural language processing (NLP) and
machine learning techniques are used, which assign weights to entities, topics,
themes, and categories in sentences or phrases. Machine learning tools learn
how to detect sentiment without human intervention by examining examples of
emotions in text. In a pandemic situation, analyzing social media texts to
uncover sentimental trends can be very helpful in gaining a better
understanding of society's needs and predicting future trends. We intend to
study society's perception of the COVID-19 pandemic through social media using
state-of-the-art BERT and Deep CNN models. The superiority of BERT models over
other deep models in sentiment analysis is evident and can be concluded from
the comparison of the various research studies mentioned in this article.Comment: 20 pages, 5 figure
Implicit emotion detection in text
In text, emotion can be expressed explicitly, using emotion-bearing words (e.g. happy, guilty) or implicitly without emotion-bearing words. Existing approaches focus on the detection of explicitly expressed emotion in text. However, there are various ways to express and convey emotions without the use of these emotion-bearing words. For example, given two sentences: “The outcome of my exam makes me happy” and “I passed my exam”, both sentences express happiness, with the first expressing it explicitly and the other implying it. In this thesis, we investigate implicit emotion detection in text. We propose a rule-based approach for implicit emotion detection, which can be used without labeled corpora for training. Our results show that our approach outperforms the lexicon matching method consistently and gives competitive performance in comparison to supervised classifiers. Given that emotions such as guilt and admiration which often require the identification of blameworthiness and praiseworthiness, we also propose an approach for the detection of blame and praise in text, using an adapted psychology model, Path model to blame. Lack of benchmarking dataset led us to construct a corpus containing comments of individuals’ emotional experiences annotated as blame, praise or others. Since implicit emotion detection might be useful for conflict-of-interest (CoI) detection in Wikipedia articles, we built a CoI corpus and explored various features including linguistic and stylometric, presentation, bias and emotion features. Our results show that emotion features are important when using Nave Bayes, but the best performance is obtained with SVM on linguistic and stylometric features only. Overall, we show that a rule-based approach can be used to detect implicit emotion in the absence of labelled data; it is feasible to adopt the psychology path model to blame for blame/praise detection from text, and implicit emotion detection is beneficial for CoI detection in Wikipedia articles
FINE-GRAINED EMOTION DETECTION IN MICROBLOG TEXT
Automatic emotion detection in text is concerned with using natural language processing techniques to recognize emotions expressed in written discourse. Endowing computers with the ability to recognize emotions in a particular kind of text, microblogs, has important applications in sentiment analysis and affective computing. In order to build computational models that can recognize the emotions represented in tweets we need to identify a set of suitable emotion categories. Prior work has mainly focused on building computational models for only a small set of six basic emotions (happiness, sadness, fear, anger, disgust, and surprise). This thesis describes a taxonomy of 28 emotion categories, an expansion of these six basic emotions, developed inductively from data. This set of 28 emotion categories represents a set of fine-grained emotion categories that are representative of the range of emotions expressed in tweets, microblog posts on Twitter.
The ability of humans to recognize these fine-grained emotion categories is characterized using inter-annotator reliability measures based on annotations provided by expert and novice annotators. A set of 15,553 human-annotated tweets form a gold standard corpus, EmoTweet-28. For each emotion category, we have extracted a set of linguistic cues (i.e., punctuation marks, emoticons, emojis, abbreviated forms, interjections, lemmas, hashtags and collocations) that can serve as salient indicators for that emotion category.
We evaluated the performance of automatic classification techniques on the set of 28 emotion categories through a series of experiments using several classifier and feature combinations. Our results shows that it is feasible to extend machine learning classification to fine-grained emotion detection in tweets (i.e., as many as 28 emotion categories) with results that are comparable to state-of-the-art classifiers that detect six to eight basic emotions in text. Classifiers using features extracted from the linguistic cues associated with each category equal or better the performance of conventional corpus-based and lexicon-based features for fine-grained emotion classification.
This thesis makes an important theoretical contribution in the development of a taxonomy of emotion in text. In addition, this research also makes several practical contributions, particularly in the creation of language resources (i.e., corpus and lexicon) and machine learning models for fine-grained emotion detection in text
- …