A Study on Agreement in PICO Span Annotations
In evidence-based medicine, relevance of medical literature is determined by
predefined relevance conditions. The conditions are defined based on PICO
elements, namely, Patient, Intervention, Comparator, and Outcome. Hence, PICO
annotations in medical literature are essential for automatic relevant document
filtering. However, defining boundaries of text spans for PICO elements is not
straightforward. In this paper, we study the agreement of PICO annotations made
by multiple human annotators, including both experts and non-experts.
Agreements are estimated by a standard span agreement (i.e., matching both
labels and boundaries of text spans), and two types of relaxed span agreement
(i.e., matching labels without guaranteeing matching boundaries of spans).
Based on the analysis, we report two observations: (i) Boundaries of PICO span
annotations by individual human annotators are very diverse. (ii) Despite the
disagreement in span boundaries, general areas of the span annotations are
broadly agreed upon by annotators. Our results suggest that applying a standard
agreement alone may understate the agreement of PICO spans, and adopting both
standard and relaxed agreement is more suitable for PICO span evaluation.
Comment: Accepted in SIGIR 2019 (Short paper)
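The difference between standard and relaxed span agreement can be illustrated with a minimal sketch. The abstract does not specify the two relaxed variants, so the overlap-based rule below is an assumption, and the names `Span`, `exact_match`, `relaxed_match`, and `agreement`, as well as the sample annotations, are hypothetical.

```python
from typing import Callable, List, NamedTuple


class Span(NamedTuple):
    label: str  # PICO label, e.g. "P", "I", "C", "O"
    start: int  # inclusive token offset
    end: int    # exclusive token offset


def exact_match(a: Span, b: Span) -> bool:
    # Standard agreement: label and both boundaries must match.
    return a == b


def relaxed_match(a: Span, b: Span) -> bool:
    # Assumed relaxed agreement: same label and any overlap in tokens.
    return a.label == b.label and a.start < b.end and b.start < a.end


def agreement(spans_a: List[Span], spans_b: List[Span],
              match: Callable[[Span, Span], bool]) -> float:
    # Fraction of annotator A's spans matched by some span of annotator B.
    if not spans_a:
        return 0.0
    hits = sum(any(match(a, b) for b in spans_b) for a in spans_a)
    return hits / len(spans_a)


# Made-up annotations for illustration only.
annotator_a = [Span("P", 0, 5), Span("I", 8, 12), Span("O", 20, 24)]
annotator_b = [Span("P", 0, 5), Span("I", 9, 14), Span("O", 30, 33)]

standard = agreement(annotator_a, annotator_b, exact_match)
relaxed = agreement(annotator_a, annotator_b, relaxed_match)
```

In this toy case the "I" spans overlap but differ in boundaries, so they count only under the relaxed rule, which is the kind of gap the abstract describes.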
A Survey of Location Prediction on Twitter
Locations, e.g., countries, states, cities, and points of interest, are
central to news, emergency events, and people's daily lives. Automatic
identification of locations associated with or mentioned in documents has been
explored for decades. As one of the most popular online social network
platforms, Twitter has attracted a large number of users who send millions of
tweets on a daily basis. Due to the world-wide coverage of its users and
real-time freshness of tweets, location prediction on Twitter has gained
significant attention in recent years. Research efforts are spent on dealing
with new challenges and opportunities brought by the noisy, short, and
context-rich nature of tweets. In this survey, we aim at offering an overall
picture of location prediction on Twitter. Specifically, we concentrate on the
prediction of user home locations, tweet locations, and mentioned locations. We
first define the three tasks and review the evaluation metrics. By summarizing
Twitter network, tweet content, and tweet context as potential inputs, we then
structurally highlight how the problems depend on these inputs. Each dependency
is illustrated by a comprehensive review of the corresponding strategies
adopted in state-of-the-art approaches. In addition, we briefly review two
related problems, i.e., semantic location prediction and point-of-interest
recommendation. Finally, we list future research directions.
Comment: Accepted to TKDE. 30 pages, 1 figure
When is Eaton's Markov chain irreducible?
Consider a parametric statistical model $P(\mathrm{d}x|\theta)$ and an
improper prior distribution $\nu(\mathrm{d}\theta)$ that together yield a
(proper) formal posterior distribution $Q(\mathrm{d}\theta|x)$. The prior is
called strongly admissible if the generalized Bayes estimator of every bounded
function of $\theta$ is admissible under squared error loss. Eaton [Ann.
Statist. 20 (1992) 1147--1179] has shown that a sufficient condition for strong
admissibility of $\nu$ is the local recurrence of the Markov chain whose
transition function is $R(\theta,\mathrm{d}\eta)=\int
Q(\mathrm{d}\eta|x)P(\mathrm{d}x|\theta)$. Applications of this result and its
extensions are often greatly simplified when the Markov chain associated with
$R$ is irreducible. However, establishing irreducibility can be difficult. In
this paper, we provide a characterization of irreducibility for general state
space Markov chains and use this characterization to develop an easily checked,
necessary and sufficient condition for irreducibility of Eaton's Markov chain.
All that is required to check this condition is a simple examination of $P$ and
$\nu$. Application of the main result is illustrated using two examples.
Comment: Published at http://dx.doi.org/10.3150/07-BEJ6191 in the Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm
Deep Learning based Recommender System: A Survey and New Perspectives
With the ever-growing volume of online information, recommender systems have
been an effective strategy to overcome such information overload. The utility
of recommender systems cannot be overstated, given their widespread adoption in
many web applications, along with their potential to ameliorate many
problems related to over-choice. In recent years, deep learning has garnered
considerable interest in many research fields such as computer vision and
natural language processing, owing not only to stellar performance but also to the
attractive property of learning feature representations from scratch. The
influence of deep learning is also pervasive, recently demonstrating its
effectiveness when applied to information retrieval and recommender systems
research. Evidently, the field of deep learning in recommender systems is
flourishing. This article aims to provide a comprehensive review of recent
research efforts on deep learning based recommender systems. More concretely,
we devise a taxonomy of deep learning based recommendation models,
along with a comprehensive summary of the state of the art. Finally,
we expand on current trends and provide new perspectives pertaining to this
exciting new development of the field.
Comment: The paper has been accepted by ACM Computing Surveys.
https://doi.acm.org/10.1145/328502
From Counter-intuitive Observations to a Fresh Look at Recommender System
Recently, a few papers have reported counter-intuitive observations made from
experiments on recommender systems (RecSys). One observation is that users who
spend more time and users who have many interactions with a recommendation
system receive poorer recommendations. Another observation is that models
trained by using only the more recent parts of a dataset show significant
performance improvement. In this opinion paper, we interpret these
counter-intuitive observations from two perspectives. First, the observations
are made with respect to the global timeline of user-item interactions. Second,
the observations are considered counter-intuitive because they contradict our
expectation of a recommender: the more interactions a user has, the higher the
chance that the recommender better learns the user's preferences. For the first
perspective, we discuss the importance of the global timeline by using the
simplest baseline, Popularity, as a starting point. We answer two questions: (i)
why is the simplest model, popularity, often ill-defined in academic research?
and (ii) why is the popularity baseline evaluated in this way? The questions
lead to a detailed discussion on the data leakage issue in many offline
evaluations. As a result, model accuracies reported in many academic papers
are less meaningful and incomparable. For the second perspective, we try to
answer two more questions: (i) why do models trained by using only the more
recent parts of data demonstrate better performance? and (ii) why do more
interactions from users lead to poorer recommendations? The key to both
questions is user preference modeling. We then propose to have a fresh look at
RecSys. We discuss how to conduct more practical offline evaluations and
possible ways to effectively model user preferences. The discussion and
opinions in this paper are on top-N recommendation only, not on rating
prediction.
Comment: 11 pages, 5 figures
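One way to read the data leakage point for the popularity baseline is sketched below. It is an illustration under assumptions, not the paper's procedure: the interaction log, the helper `top_n_popular`, and the cutoff value are all hypothetical.

```python
from collections import Counter

# Hypothetical interaction log: (user, item, timestamp).
interactions = [
    ("u1", "A", 1), ("u2", "A", 2), ("u3", "B", 3),
    ("u1", "B", 4), ("u2", "B", 5), ("u4", "B", 6),
]


def top_n_popular(log, n, before=None):
    # Count item interactions, optionally only those strictly before a
    # cutoff on the global timeline.
    counts = Counter(
        item for _, item, ts in log if before is None or ts < before
    )
    return [item for item, _ in counts.most_common(n)]


# Popularity over the whole dataset uses interactions that happen after
# the moment a recommendation would be made (a form of leakage).
leaky = top_n_popular(interactions, 1)

# Popularity as observable at time 4 uses only earlier interactions.
timeline_aware = top_n_popular(interactions, 1, before=4)
```

Here the two definitions disagree on the most popular item, which suggests why a popularity baseline that ignores the global timeline can be ill-defined.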