TimeMachine: Timeline Generation for Knowledge-Base Entities
We present a method called TIMEMACHINE to generate a timeline of events and
relations for entities in a knowledge base. For example, for an actor, such a
timeline should show the most important professional and personal milestones
and relationships such as works, awards, collaborations, and family
relationships. We develop three orthogonal timeline quality criteria that an
ideal timeline should satisfy: (1) it shows events that are relevant to the
entity; (2) it shows events that are temporally diverse, so they distribute
along the time axis, avoiding visual crowding and allowing for easy user
interaction, such as zooming in and out; and (3) it shows events that are
content diverse, so they contain many different types of events (e.g., for an
actor, it should show movies and marriages and awards, not just movies). We
present an algorithm to generate such timelines for a given time period and
screen size, based on submodular optimization and web-co-occurrence statistics
with provable performance guarantees. A series of user studies using Mechanical
Turk shows that all three quality criteria are crucial to produce quality
timelines and that our algorithm significantly outperforms various baseline and
state-of-the-art methods.
Comment: To appear at ACM SIGKDD KDD'15. 12pp, 7 fig. With appendix. Demo and
other info available at http://cs.stanford.edu/~althoff/timemachine
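The abstract above names submodular optimization under a screen-size budget as the selection technique. The following is a minimal sketch of greedy selection for a monotone submodular objective, not the paper's actual objective or algorithm. The `Event` type, the decade-sized time buckets, the weights, and the toy data are all assumptions made for illustration. The objective adds summed relevance to two coverage terms (distinct time buckets and distinct event types), which loosely mirrors the three criteria of relevance, temporal diversity, and content diversity.

```python
from typing import List, NamedTuple


class Event(NamedTuple):
    # Hypothetical event record; field names are illustrative only.
    label: str
    year: int
    kind: str
    relevance: float


def objective(selected: List[Event], bucket_size: int = 10,
              w_time: float = 1.0, w_kind: float = 1.0) -> float:
    """Relevance sum plus coverage of time buckets and event types.

    A modular term plus coverage terms is monotone submodular, so the
    greedy rule below carries the classic (1 - 1/e) approximation bound
    under a cardinality constraint.
    """
    relevance = sum(e.relevance for e in selected)
    buckets = {e.year // bucket_size for e in selected}
    kinds = {e.kind for e in selected}
    return relevance + w_time * len(buckets) + w_kind * len(kinds)


def greedy_timeline(events: List[Event], k: int) -> List[Event]:
    """Pick up to k events, each time taking the largest marginal gain."""
    selected: List[Event] = []
    remaining = list(events)
    while remaining and len(selected) < k:
        base = objective(selected)
        best = max(remaining, key=lambda e: objective(selected + [e]) - base)
        selected.append(best)
        remaining.remove(best)
    # Present the chosen events in chronological order.
    return sorted(selected, key=lambda e: e.year)


# Toy data (invented): three highly relevant movies crowd the early 1990s.
events = [
    Event("Movie A", 1990, "movie", 0.9),
    Event("Movie B", 1991, "movie", 0.8),
    Event("Movie C", 1992, "movie", 0.7),
    Event("Marriage", 2005, "marriage", 0.3),
    Event("Award", 2012, "award", 0.4),
]
timeline = greedy_timeline(events, k=3)
print([e.label for e in timeline])
```

With a budget of three, ranking by relevance alone would return the three movies. The coverage terms instead lead the greedy rule to keep one movie and add the marriage and the award, spreading the selection over time and over event types.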
EveTAR: Building a Large-Scale Multi-Task Test Collection over Arabic Tweets
This article introduces a new language-independent approach for creating a
large-scale high-quality test collection of tweets that supports multiple
information retrieval (IR) tasks without running a shared-task campaign. The
adopted approach (demonstrated over Arabic tweets) designs the collection
around significant (i.e., popular) events, which enables the development of
topics that represent frequent information needs of Twitter users for which
rich content exists. That inherently facilitates the support of multiple tasks
that generally revolve around events, namely event detection, ad-hoc search,
timeline generation, and real-time summarization. The key highlights of the
approach include diversifying the judgment pool via interactive search and
multiple manually-crafted queries per topic, collecting high-quality
annotations via crowd-workers for relevancy and in-house annotators for
novelty, filtering out low-agreement topics and inaccessible tweets, and
providing multiple subsets of the collection for better availability. Applying
our methodology to Arabic tweets resulted in EveTAR, the first
freely-available tweet test collection for multiple IR tasks. EveTAR includes a
crawl of 355M Arabic tweets and covers 50 significant events for which about
62K tweets were judged with substantial average inter-annotator agreement
(Kappa value of 0.71). We demonstrate the usability of EveTAR by evaluating
existing algorithms in the respective tasks. Results indicate that the new
collection can support reliable ranking of IR systems that is comparable to
similar TREC collections, while providing strong baseline results for future
studies over Arabic tweets.
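The abstract above reports inter-annotator agreement as a Kappa value without saying which variant was used. As a hedged illustration only, the sketch below computes Cohen's kappa for two annotators, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The function name and the toy labels are assumptions and are not taken from the collection.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(labels_a: Sequence[Hashable],
                labels_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's label distribution.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    categories = count_a.keys() | count_b.keys()
    p_e = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if p_e == 1.0:
        # Both annotators used one identical label for every item.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


# Toy relevance judgments (invented): 1 = relevant, 0 = not relevant.
annotator_1 = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
annotator_2 = [1, 1, 0, 0, 0, 0, 1, 0, 1, 1]
kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 2))
```

In this toy case the annotators agree on 8 of 10 items and chance agreement is 0.5, so kappa is about 0.6.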
Enhanced information retrieval using domain-specific recommender models
The objective of an information retrieval (IR) system is to retrieve relevant items which meet a user’s information need. There is currently significant interest in personalized IR, which seeks to improve IR effectiveness by incorporating a model of the user’s interests. However, in some situations
there may be no opportunity to learn about the interests of a specific user on a certain topic. In our work, we propose an IR approach which combines a recommender algorithm with IR methods to improve retrieval in domains for which the user has not previously entered a query, so that the system has had no opportunity to learn about the user’s knowledge of the domain. We use search data from previous users interested in the same topic to build a
recommender model for this topic. When a user enters a query on a topic that is new to them, an appropriate recommender model is selected and used to predict a ranking which the user may find interesting, based on the behaviour of previous
users with similar queries. The recommender output is integrated with a standard IR method in a weighted linear combination to provide a final result for the user. Experiments using the INEX 2009 data collection with a simulated recommender training set show that our approach can improve on a baseline IR system.
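The weighted linear combination mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the min-max normalization, the weight `alpha`, the function names, and the toy scores are all hypothetical. Each document's final score is `alpha * ir + (1 - alpha) * rec`, and a document missing from one list contributes 0 for that component.

```python
from typing import Dict, List


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so the two sources are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse(ir_scores: Dict[str, float], rec_scores: Dict[str, float],
         alpha: float = 0.7) -> List[str]:
    """Rank documents by alpha * IR score + (1 - alpha) * recommender score."""
    ir_norm, rec_norm = min_max(ir_scores), min_max(rec_scores)
    docs = ir_norm.keys() | rec_norm.keys()
    combined = {
        doc: alpha * ir_norm.get(doc, 0.0)
        + (1.0 - alpha) * rec_norm.get(doc, 0.0)
        for doc in docs
    }
    # Highest combined score first; ties broken by document id.
    return sorted(docs, key=lambda doc: (-combined[doc], doc))


# Toy scores (invented) for one query.
ir_scores = {"d1": 12.0, "d2": 9.0, "d3": 3.0}
rec_scores = {"d2": 0.9, "d3": 0.5, "d4": 0.1}
ranking = fuse(ir_scores, rec_scores, alpha=0.5)
print(ranking)
```

With equal weights, `d2` overtakes `d1` because the recommender favours it strongly. Setting `alpha=1.0` reduces the ranking to the IR baseline.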
A pattern mining approach for information filtering systems
It is a big challenge to clearly identify the boundary between positive and negative streams for information filtering systems. Several attempts have used negative feedback to solve this challenge; however, there are two issues in using negative relevance feedback to improve the effectiveness of information filtering. The first is how to select constructive negative samples in order to reduce the space of negative documents. The second is how to decide which noisy extracted features should be updated based on the selected negative samples. This paper proposes a pattern mining based approach to select some offenders from the negative documents, where an offender can be used to reduce the side effects of noisy features. It also classifies extracted features (i.e., terms) into three categories: positive specific terms, general terms, and negative specific terms. In this way, multiple revising strategies can be used to update extracted features. An iterative learning algorithm is also proposed to implement this approach on the RCV1 data collection, and substantial experiments show that the proposed approach achieves encouraging performance, which is also consistent for adaptive filtering.
Second Screen User Profiling and Multi-level Smart Recommendations in the context of Social TVs
In the context of Social TV, the growing number of first and second screen
users who interact and post content online creates new business opportunities
and related technical challenges for enriching the user experience in such
environments. The SAM (Socializing Around Media) project uses a Social
Media-connected infrastructure to address these challenges, providing
intelligent user context management models and mechanisms that capture social
patterns in order to apply collaborative filtering techniques and personalized
recommendations. This paper presents the Context Management mechanism of SAM,
running in a Social TV environment to provide smart recommendations for first
and second screen content. The work presented is evaluated using a real movie
rating dataset found online, to validate SAM's approach in terms of
effectiveness as well as efficiency.
Comment: In: Wu TT., Gennari R., Huang YM., Xie H., Cao Y. (eds) Emerging
Technologies for Education. SETE 201