Explicit diversification of event aspects for temporal summarization

Author: Macdonald Craig
McCreadie Richard
Ounis Iadh
Santos Rodrygo L.T.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/04/2018

During major events, such as emergencies and disasters, a large volume of information is reported on newswire and social media platforms. Temporal summarization (TS) approaches are used to automatically produce concise overviews of such events by extracting text snippets from related articles over time. Current TS approaches rely on a combination of event relevance and textual novelty for snippet selection. However, for events that span multiple days, textual novelty is often a poor criterion for selecting snippets, since many snippets are textually unique but are semantically redundant or non-informative. In this article, we propose a framework for the diversification of snippets using explicit event aspects, building on recent work in search result diversification. In particular, we first propose two techniques to identify explicit aspects that a user might want to see covered in a summary for different types of event. We then extend a state-of-the-art explicit diversification framework to maximize the coverage of these aspects when selecting summary snippets for unseen events. Through experimentation over the TREC TS 2013, 2014, and 2015 datasets, we show that explicit diversification for temporal summarization significantly outperforms classical novelty-based diversification, as the use of explicit event aspects reduces the number of redundant and off-topic snippets returned, while also increasing summary timeliness.
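To illustrate the idea of selecting snippets by aspect coverage, here is a minimal sketch of a greedy explicit diversification loop. The abstract does not name the framework it extends, so the xQuAD-style objective used here is an assumption, and every name in the code (`greedy_explicit_diversify`, `relevance`, `coverage`, `aspect_weight`, `lam`) is hypothetical rather than taken from the article.

```python
from typing import Dict, List


def greedy_explicit_diversify(
    relevance: Dict[str, float],
    coverage: Dict[str, Dict[str, float]],
    aspect_weight: Dict[str, float],
    k: int,
    lam: float = 0.5,
) -> List[str]:
    """Greedily pick up to k snippets, trading relevance against aspect coverage.

    relevance[s]      -- assumed relevance score of snippet s to the event
    coverage[s][a]    -- assumed probability that snippet s covers aspect a
    aspect_weight[a]  -- assumed importance of aspect a for this event type
    lam               -- 0 means pure relevance, 1 means pure aspect coverage
    """
    selected: List[str] = []
    # not_covered[a]: probability that no selected snippet covers aspect a yet.
    not_covered = {a: 1.0 for a in aspect_weight}
    candidates = set(relevance)

    def score(snippet: str) -> float:
        covered = coverage.get(snippet, {})
        diversity = sum(
            aspect_weight[a] * covered.get(a, 0.0) * not_covered[a]
            for a in aspect_weight
        )
        return (1.0 - lam) * relevance[snippet] + lam * diversity

    while candidates and len(selected) < k:
        # Sorting first makes tie-breaking deterministic.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
        # Aspects covered by the chosen snippet contribute less from now on.
        for a in aspect_weight:
            not_covered[a] *= 1.0 - coverage.get(best, {}).get(a, 0.0)
    return selected


# Toy data: s1 and s2 report the same aspect, s3 reports a different one.
relevance = {"s1": 0.9, "s2": 0.85, "s3": 0.6}
coverage = {
    "s1": {"casualties": 0.9},
    "s2": {"casualties": 0.9},
    "s3": {"location": 0.8},
}
aspect_weight = {"casualties": 0.5, "location": 0.5}

diversified = greedy_explicit_diversify(relevance, coverage, aspect_weight, k=2)
relevance_only = greedy_explicit_diversify(
    relevance, coverage, aspect_weight, k=2, lam=0.0
)
```

On this toy input, the diversified selection prefers `s3` over the more relevant but redundant `s2`, whereas the relevance-only run keeps both snippets about the same aspect.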

Enlighten

Recommended from our members

A collaborative approach to IR evaluation

Author: Sheshadri Aashish
Publication date: 16/09/2014

In this thesis we investigate two main problems: 1) inferring consensus from disparate inputs to improve quality of crowd contributed data; and 2) developing a reliable crowd-aided IR evaluation framework. With regard to the first contribution, while many statistical label aggregation methods have been proposed, little comparative benchmarking has occurred in the community making it difficult to determine the state-of-the-art in consensus or to quantify novelty and progress, leaving modern systems to adopt simple control strategies. To aid the progress of statistical consensus and make state-of-the-art methods accessible, we develop a benchmarking framework in SQUARE, an open source shared task framework including benchmark datasets, defined tasks, standard metrics, and reference implementations with empirical results for several popular methods. Through the development of SQUARE we propose a crowd simulation model that emulates real crowd environments to enable rapid and reliable experimentation of collaborative methods with different crowd contributions. We apply the findings of the benchmark to develop reliable crowd contributed test collections for IR evaluation. As our second contribution, we describe a collaborative model for distributing relevance judging tasks between trusted assessors and crowd judges. Based on prior work's hypothesis of judging disagreements on borderline documents, we train a logistic regression model to predict assessor disagreement, prioritizing judging tasks by expected disagreement. Judgments are generated from different crowd models and intelligently aggregated. Given a priority queue, a judging budget, and a ratio for expert vs. crowd judging costs, critical judging tasks are assigned to trusted assessors with the crowd supplying remaining judgments. Results on two TREC datasets show significant judging burden can be confidently shifted to the crowd, achieving high rank correlation and often at lower cost vs. exclusive use of trusted assessors.
Computer Science

Texas ScholarWorks

Understanding Events: A Diversity-driven Human-Machine Approach

Author: Inel Oana
Publication date: 09/03/2022

VU Research Portal

PACRR: A Position-Aware Neural IR Model for Relevance Matching

Author: Berberich Klaus
de Melo Gerard
Hui Kai
Yates Andrew
Publication date: 01/01/2017

In order to adopt deep learning for information retrieval, models are needed that can capture all relevant information required to assess the relevance of a document to a given user query. While previous works have successfully captured unigram term matches, how to fully employ position-dependent information such as proximity and term dependencies has been insufficiently explored. In this work, we propose a novel neural IR model named PACRR aiming at better modeling position-dependent interactions between a query and a document. Extensive experiments on six years' TREC Web Track data confirm that the proposed model yields better results under multiple benchmarks. Comment: To appear in EMNLP201

arXiv.org e-Print Archive

Crossref

MPG.PuRe

Ranking for Web Data Search Using On-The-Fly Data Integration

Author: Herzig Daniel Markus
Publication venue: KIT Scientific Publishing
Publication date: 30/07/2019

Ranking - the algorithmic decision on how relevant an information artifact is for a given information need and the sorting of artifacts by their concluded relevancy - is an integral part of every search engine. In this book we investigate how structured Web data can be leveraged for ranking with the goal of improving the effectiveness of search. We propose new solutions for ranking using on-the-fly data integration and experimentally analyze and evaluate them against the latest baselines.

Directory of Open Access Books (DOAB)

Bayesian Methods for Intelligent Task Assignment in Crowdsourcing Systems

Author: Roberts Stephen
Simpson Edwin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 10/02/2015

Explore Bristol Research

Spoken content retrieval: A survey of techniques and technologies

Author: Ani Nenkova
C A. Nenkova
K. Mckeown
Kathleen Mckeown
Publication venue: 'Now Publishers'
Publication date: 01/01/2012

Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition, and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight on how these fields are integrated to support research and development, thus addressing the core challenges of SCR.

CiteSeerX

Crossref

Irish Universities

DCU Online Research Access Service

Standing in Your Shoes: External Assessments for Personalized Recommender Systems

Author: de Rijke M.
Liu Y.
Lu H.
Ma S.
Ma W.
Zhang M.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2021

International Migration, Integration and Social Cohesion online publications