Semantic Sentiment Analysis of Twitter Data
The Internet and the proliferation of smart mobile devices have changed the way
information is created, shared, and spread: microblogs such as Twitter,
weblogs such as LiveJournal, social networks such as Facebook, and instant
messengers such as Skype and WhatsApp are now commonly used to share thoughts
and opinions about anything in the surrounding world. This has resulted in the
proliferation of social media content, thus creating new opportunities to study
public opinion at a scale that was never possible before. Naturally, this
abundance of data has quickly attracted business and research interest from
various fields including marketing, political science, and social studies,
among many others, which are interested in questions like these: Do people like
the new Apple Watch? Do Americans support ObamaCare? How do the Scottish feel about
Brexit? Answering these questions requires studying the sentiment of
opinions people express in social media, which has given rise to the fast
growth of the field of sentiment analysis in social media, with Twitter being
especially popular for research due to its scale, representativeness, variety
of topics discussed, as well as ease of public access to its messages. Here we
present an overview of work on sentiment analysis on Twitter.
Comment: Microblog sentiment analysis; Twitter opinion mining; In the
Encyclopedia on Social Network Analysis and Mining (ESNAM), Second edition.
201
A coproduct structure on the formal affine Demazure algebra
In the present paper we generalize the coproduct structure on nil Hecke rings
introduced and studied by Kostant-Kumar to the context of an arbitrary
algebraic oriented cohomology theory and its associated formal group law. We
then construct an algebraic model of the T-equivariant oriented cohomology of
the variety of complete flags.
Comment: 28 pages; minor revision of the previous version
Comparison of Sentiment Analysis and User Ratings in Venue Recommendation
Venue recommendation aims to provide users with venues to visit, taking into account historical visits to venues. Many venue recommendation approaches make use of the provided users’ ratings to elicit the users’ preferences on the venues when making recommendations. In fact, many also consider the users’ ratings as the ground truth for assessing their recommendation performance. However, users are often reported to exhibit inconsistent rating behaviour, leading to less accurate preference information being collected for the recommendation task. To alleviate this problem, we consider instead the use of the sentiment information collected from comments posted by the users on the venues as a surrogate for the users’ ratings. We experiment with various sentiment analysis classifiers, including recent neural network-based sentiment analysers, to examine the effectiveness of replacing users’ ratings with sentiment information. We integrate the sentiment information into the widely used matrix factorization and GeoSoCa multi-feature-based venue recommendation models, thereby replacing the users’ ratings with the obtained sentiment scores. Our results, using three Yelp Challenge-based datasets, show that it is indeed possible to effectively replace users’ ratings with sentiment scores when state-of-the-art sentiment classifiers are used. Our findings show that the sentiment scores can provide accurate user preference information, thereby increasing the prediction accuracy. In addition, our results suggest that a simple binary rating with ‘like’ and ‘dislike’ is a sufficient substitute for the currently used multi-rating scales for venue recommendation in location-based social networks.
The Effects of Twitter Sentiment on Stock Price Returns
Social media are increasingly reflecting and influencing behavior of other
complex systems. In this paper we investigate the relations between the well-known
micro-blogging platform Twitter and financial markets. In particular, we
consider, in a period of 15 months, the Twitter volume and sentiment about the
30 stock companies that form the Dow Jones Industrial Average (DJIA) index. We
find a relatively low Pearson correlation and Granger causality between the
corresponding time series over the entire time period. However, we find a
significant dependence between the Twitter sentiment and abnormal returns
during the peaks of Twitter volume. This is valid not only for the expected
Twitter volume peaks (e.g., quarterly announcements), but also for peaks
corresponding to less obvious events. We formalize the procedure by adapting
the well-known "event study" from economics and finance to the analysis of
Twitter data. The procedure allows us to automatically identify events as Twitter
volume peaks, to compute the prevailing sentiment (positive or negative)
expressed in tweets at these peaks, and finally to apply the "event study"
methodology to relate them to stock returns. We show that sentiment polarity of
Twitter peaks implies the direction of cumulative abnormal returns. The amount
of cumulative abnormal returns is relatively low (about 1-2%), but the
dependence is statistically significant for several days after the events.
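The procedure described in this abstract (detect volume peaks, then relate them to abnormal returns) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the rolling-window peak rule, the threshold, the market-adjusted definition of abnormal return, and all the numbers are assumptions chosen only to make the idea concrete.

```python
from statistics import mean, pstdev


def find_volume_peaks(volume, window=3, threshold=2.0):
    """Return indices of days whose tweet volume exceeds the mean of the
    preceding `window` days by more than `threshold` standard deviations.
    (Hypothetical peak rule, assumed for illustration.)"""
    peaks = []
    for t in range(window, len(volume)):
        baseline = volume[t - window:t]
        mu, sigma = mean(baseline), pstdev(baseline)
        if sigma > 0 and volume[t] > mu + threshold * sigma:
            peaks.append(t)
    return peaks


def cumulative_abnormal_return(stock_returns, market_returns, event_day, horizon=2):
    """Sum the abnormal returns (stock minus market, an assumed
    market-adjusted model) from the event day through `horizon` days after."""
    end = min(event_day + horizon + 1, len(stock_returns))
    return sum(stock_returns[t] - market_returns[t] for t in range(event_day, end))


# Made-up daily data for one hypothetical stock.
volume = [100, 110, 90, 105, 400, 120, 95]
stock = [0.0, 0.001, -0.002, 0.0, 0.012, 0.006, 0.001]
market = [0.0, 0.001, -0.001, 0.0, 0.002, 0.001, 0.0]

peaks = find_volume_peaks(volume)
for day in peaks:
    car = cumulative_abnormal_return(stock, market, day)
    print(f"event on day {day}: CAR = {car:.4f}")
```

With a sentiment polarity attached to each detected peak, the sign of the cumulative abnormal return could then be compared against that polarity, which is the relationship the abstract reports.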
ASCOT: a text mining-based web-service for efficient search and assisted creation of clinical trials
Clinical trials are mandatory protocols describing medical research on humans and are among the most valuable sources of medical practice evidence. Searching for trials relevant to some query is laborious due to the immense number of existing protocols. Apart from search, writing new trials includes composing detailed eligibility criteria, which might be time-consuming, especially for new researchers. In this paper we present ASCOT, an efficient search application customised for clinical trials. ASCOT uses text mining and data mining methods to enrich clinical trials with metadata, which in turn serve as effective tools to narrow down search. In addition, ASCOT integrates a component for recommending eligibility criteria based on a set of selected protocols.
Benchmarking Ontologies: Bigger or Better?
A scientific ontology is a formal representation of knowledge within a domain, typically including central concepts, their properties, and relations. With the rise of computers and high-throughput data collection, ontologies have become essential to data mining and sharing across communities in the biomedical sciences. Powerful approaches exist for testing the internal consistency of an ontology, but not for assessing the fidelity of its domain representation. We introduce a family of metrics that describe the breadth and depth with which an ontology represents its knowledge domain. We then test these metrics using (1) four of the most common medical ontologies with respect to a corpus of medical documents and (2) seven of the most popular English thesauri with respect to three corpora that sample language from medicine, news, and novels. Here we show that our approach captures the quality of ontological representation and guides efforts to narrow the breach between ontology and collective discourse within a domain. Our results also demonstrate key features of medical ontologies, English thesauri, and discourse from different domains. Medical ontologies have a small intersection, as do English thesauri. Moreover, dialects characteristic of distinct domains vary strikingly, as many of the same words are used quite differently in medicine, news, and novels. As ontologies are intended to mirror the state of knowledge, our methods to tighten the fit between ontology and domain will increase their relevance for new areas of biomedical science and improve the accuracy and power of inferences computed across them.