536 research outputs found
A Topic Recommender for Journalists
The way in which people acquire information on events and form their own
opinion on them has changed dramatically with the advent of social media. For many
readers, the news gathered from online sources become an opportunity to share points
of view and information within micro-blogging platforms such as Twitter, mainly
aimed at satisfying their communication needs. Furthermore, the need to deepen the
aspects related to news stimulates a demand for additional information which is often
met through online encyclopedias, such as Wikipedia. This behaviour has also
influenced the way in which journalists write their articles, requiring a careful assessment
of what actually interests the readers. The goal of this paper is to present
a recommender system, What to Write and Why, capable of suggesting to a journalist,
for a given event, the aspects still uncovered in news articles on which the
readers focus their interest. The basic idea is to characterize an event according to
the echo it receives in online news sources and associate it with the corresponding
readers’ communicative and informative patterns, detected through the analysis of
Twitter and Wikipedia, respectively. Our methodology temporally aligns the results
of this analysis and recommends the concepts that emerge as topics of interest from
Twitter and Wikipedia, either not covered or poorly covered in the published news
articles
Document Filtering for Long-tail Entities
Filtering relevant documents with respect to entities is an essential task in
the context of knowledge base construction and maintenance. It entails
processing a time-ordered stream of documents that might be relevant to an
entity in order to select only those that contain vital information.
State-of-the-art approaches to document filtering for popular entities are
entity-dependent: they rely on and are also trained on the specifics of
differentiating features for each specific entity. Moreover, these approaches
tend to use so-called extrinsic information such as Wikipedia page views and
related entities which is typically only available only for popular head
entities. Entity-dependent approaches based on such signals are therefore
ill-suited as filtering methods for long-tail entities. In this paper we
propose a document filtering method for long-tail entities that is
entity-independent and thus also generalizes to unseen or rarely seen entities.
It is based on intrinsic features, i.e., features that are derived from the
documents in which the entities are mentioned. We propose a set of features
that capture informativeness, entity-saliency, and timeliness. In particular,
we introduce features based on entity aspect similarities, relation patterns,
and temporal expressions and combine these with standard features for document
filtering. Experiments following the TREC KBA 2014 setup on a publicly
available dataset show that our model is able to improve the filtering
performance for long-tail entities over several baselines. Results of applying
the model to unseen entities are promising, indicating that the model is able
to learn the general characteristics of a vital document. The overall
performance across all entities---i.e., not just long-tail entities---improves
upon the state-of-the-art without depending on any entity-specific training
data.Comment: CIKM2016, Proceedings of the 25th ACM International Conference on
Information and Knowledge Management. 201
Entity-Oriented Search
This open access book covers all facets of entity-oriented search—where “search” can be interpreted in the broadest sense of information access—from a unified point of view, and provides a coherent and comprehensive overview of the state of the art. It represents the first synthesis of research in this broad and rapidly developing area. Selected topics are discussed in-depth, the goal being to establish fundamental techniques and methods as a basis for future research and development. Additional topics are treated at a survey level only, containing numerous pointers to the relevant literature. A roadmap for future research, based on open issues and challenges identified along the way, rounds out the book. The book is divided into three main parts, sandwiched between introductory and concluding chapters. The first two chapters introduce readers to the basic concepts, provide an overview of entity-oriented search tasks, and present the various types and sources of data that will be used throughout the book. Part I deals with the core task of entity ranking: given a textual query, possibly enriched with additional elements or structural hints, return a ranked list of entities. This core task is examined in a number of different variants, using both structured and unstructured data collections, and numerous query formulations. In turn, Part II is devoted to the role of entities in bridging unstructured and structured data. Part III explores how entities can enable search engines to understand the concepts, meaning, and intent behind the query that the user enters into the search box, and how they can provide rich and focused responses (as opposed to merely a list of documents)—a process known as semantic search. The final chapter concludes the book by discussing the limitations of current approaches, and suggesting directions for future research. Researchers and graduate students are the primary target audience of this book. A general background in information retrieval is sufficient to follow the material, including an understanding of basic probability and statistics concepts as well as a basic knowledge of machine learning concepts and supervised learning algorithms
Towards Ontological Support for Journalistic Angles
Journalism relies more and more on information and communication technology (ICT). New journalistic ICT platforms continuously harvest potentially news-related information from the internet and try to make it useful for journalists. Because the information sources and formats vary widely, knowledge graphs are emerging as a preferred technology for integrating, enriching, and preparing journalistic information. The paper explores how journalistic knowledge graphs can be augmented with support for news angles, in order to help journalists detect news-worthy events and present them in ways that will interest the intended audience. We argue that finding newsworthy angles on news-related information is important as an example of a more general problem in information science: that of finding the most interesting events and situations in big data sets and presenting those events and situations in the most interesting ways.acceptedVersio
- …