Numeral Understanding in Financial Tweets for Fine-grained Crowd-based Forecasting
Numerals in financial documents carry much information and are crucial for
financial decision making. They play different roles in financial analysis
processes. This paper is aimed at understanding the meanings of numerals in
financial tweets for fine-grained crowd-based forecasting. We propose a
taxonomy that classifies the numerals in financial tweets into 7 categories,
and further extend some of these categories into several subcategories. Neural
network-based models with word and character-level encoders are proposed for
7-way classification and 17-way classification. We perform a backtest to confirm
the effectiveness of the numeric opinions made by the crowd. This work is the
first attempt to understand numerals in financial social media data, and we
provide the first comparison of the fine-grained opinions of individual investors
and analysts based on their forecast prices. The numeral corpus used in our
experiments, called FinNum 1.0, is available for research purposes.
Comment: Accepted by the 2018 IEEE/WIC/ACM International Conference on Web
Intelligence (WI 2018), Santiago, Chile
Multilingual Cross-domain Perspectives on Online Hate Speech
In this report, we present a study of eight corpora of online hate speech and
demonstrate the NLP techniques that we used to collect and analyze the
jihadist, extremist, racist, and sexist content. Analysis of the multilingual
corpora shows that the different contexts share certain characteristics in
their hateful rhetoric. To expose the main features, we have focused on text
classification, text profiling, keyword and collocation extraction, along with
manual annotation and qualitative study.
Comment: 24 pages