Explicit diversification of event aspects for temporal summarization
During major events, such as emergencies and disasters, a large volume of information is reported on newswire and social media platforms. Temporal summarization (TS) approaches are used to automatically produce concise overviews of such events by extracting text snippets from related articles over time. Current TS approaches rely on a combination of event relevance and textual novelty for snippet selection. However, for events that span multiple days, textual novelty is often a poor criterion for selecting snippets, since many snippets are textually unique but semantically redundant or non-informative. In this article, we propose a framework for the diversification of snippets using explicit event aspects, building on recent work in search result diversification. In particular, we first propose two techniques to identify explicit aspects that a user might want to see covered in a summary for different types of event. We then extend a state-of-the-art explicit diversification framework to maximize the coverage of these aspects when selecting summary snippets for unseen events. Through experimentation over the TREC TS 2013, 2014, and 2015 datasets, we show that explicit diversification for temporal summarization significantly outperforms classical novelty-based diversification, as the use of explicit event aspects reduces the number of redundant and off-topic snippets returned, while also increasing summary timeliness.
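To make the idea of aspect-coverage diversification concrete, the following is a minimal sketch of a greedy selection loop in the general style of explicit search result diversification methods such as xQuAD. It is not the article's framework: the abstract does not name the framework it extends, and the function name `greedy_aspect_diversify`, the input dictionaries, the aspect labels, and the trade-off parameter `lam` are all assumptions made for illustration.

```python
from typing import Dict, List


def greedy_aspect_diversify(
    relevance: Dict[str, float],
    aspect_cover: Dict[str, Dict[str, float]],
    aspect_weight: Dict[str, float],
    k: int,
    lam: float = 0.5,
) -> List[str]:
    """Greedily pick up to k snippets, trading off relevance and aspect coverage.

    All inputs are hypothetical: relevance[s] is a relevance score in [0, 1],
    aspect_cover[s][a] is how well snippet s covers aspect a (in [0, 1]),
    and aspect_weight[a] is the importance of aspect a.
    """
    selected: List[str] = []
    # not_covered[a]: how much of aspect a is still unsatisfied by the summary.
    not_covered = {a: 1.0 for a in aspect_weight}
    candidates = set(relevance)

    def score(s: str) -> float:
        # Coverage gain counts only aspects the summary has not yet satisfied.
        diversity = sum(
            aspect_weight[a] * aspect_cover[s].get(a, 0.0) * not_covered[a]
            for a in aspect_weight
        )
        return (1.0 - lam) * relevance[s] + lam * diversity

    while candidates and len(selected) < k:
        # Sorting makes tie-breaking deterministic.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
        # Discount each aspect by how well the chosen snippet covered it.
        for a in aspect_weight:
            not_covered[a] *= 1.0 - aspect_cover[best].get(a, 0.0)
    return selected


# Toy data (invented): two near-duplicate snippets and one on a different aspect.
relevance = {"s1": 0.9, "s2": 0.85, "s3": 0.6}
aspect_cover = {
    "s1": {"casualties": 1.0},
    "s2": {"casualties": 1.0},
    "s3": {"response": 1.0},
}
aspect_weight = {"casualties": 0.5, "response": 0.5}

diversified = greedy_aspect_diversify(relevance, aspect_cover, aspect_weight, k=2)
relevance_only = greedy_aspect_diversify(
    relevance, aspect_cover, aspect_weight, k=2, lam=0.0
)
```

In this toy setting the relevance-only ranking keeps both snippets about the same aspect, while the diversified ranking swaps the second one for a snippet covering an aspect the summary has not yet addressed.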
BPMN task instance streaming for efficient micro-task crowdsourcing processes
The Business Process Model and Notation (BPMN) is a standard for modeling and executing business processes with human or machine tasks. The semantics of tasks is usually discrete: a task has exactly one start event and one end event; for multi-instance tasks, all instances must complete before an end event is emitted. We propose a new task type and streaming connector for crowdsourcing able to run hundreds or thousands of micro-task instances in parallel. The two constructs provide for task streaming semantics that is new to BPMN, enable the modeling and efficient enactment of complex crowdsourcing scenarios, and are applicable also beyond the special case of crowdsourcing. We implement the necessary design and runtime support on top of CrowdFlower, demonstrate the viability of the approach via a case study, and report on a set of runtime performance experiments.
A Citizen Science Approach for Analyzing Social Media With Crowdsourcing
Social media have the potential to provide timely information about emergency situations and sudden events. However, finding relevant information among the millions of posts being added every day can be difficult, and in current approaches developing an automatic data analysis project requires time and technical skills. This work presents a new approach for the analysis of social media posts, based on configurable automatic classification combined with Citizen Science methodologies. The process is facilitated by a set of flexible, automatic and open-source data processing tools called the Citizen Science Solution Kit. The kit provides a comprehensive set of tools that can be used and personalized in different situations, particularly during natural emergencies, starting from images and text contained in the posts. The tools can be employed by citizen scientists for filtering, classifying, and geolocating the content with a human-in-the-loop approach to support the data analyst, including feedback and suggestions on how to configure the automated tools, and techniques to gather inputs from citizens. Using a flooding scenario as a guiding example, this paper illustrates the structure and functioning of the different tools proposed to support citizen scientists in their projects, and a methodological approach to their use. The process is then validated by discussing three case studies based on the Albania earthquake of 2019, the Covid-19 pandemic, and the Thailand floods of 2021. The results suggest that a flexible approach to tool composition and configuration can support a timely setup of an analysis project by citizen scientists, especially in case of emergencies in unexpected locations. ISSN:2169-353
How reliable are online speech intelligibility studies with known listener cohorts?
Although the use of nontraditional settings for speech perception experiments is growing, there have been few controlled comparisons of online and laboratory modalities in the context of speech intelligibility. The current study compares outcomes from three web-based replications of recent laboratory studies involving distorted, masked, filtered, and enhanced speech, amounting to 40 separate conditions. Rather than relying on unrestricted crowdsourcing, this study made use of participants from the population that would normally volunteer to take part physically in laboratory experiments. In sentence transcription tasks, the web cohort produced intelligibility scores 3–6 percentage points lower than their laboratory counterparts, and test modality interacted with experimental condition. These disparities and interactions largely disappeared after the exclusion of those web listeners who self-reported the use of low-quality headphones, and the remaining listener cohort was also able to replicate key outcomes of each of the three laboratory studies. The laboratory and web modalities produced similar measures of experimental efficiency based on listener variability, response errors, and outlier counts. These findings suggest that the combination of known listener cohorts and moderate headphone quality provides a feasible alternative to traditional laboratory intelligibility studies. Basque Government Consolidados programme under Grant No. IT311-1