8,743 research outputs found

Improving the evaluation of web search systems

Author: Gurrin Cathal
Smeaton Alan F.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2003

Linkage analysis as an aid to web search has been assumed to be of significant benefit and we know that it is being implemented by many major search engines. Why then have few TREC participants been able to scientifically prove the benefits of linkage analysis over the past three years? In this paper we put forward reasons why disappointing results have been found and we identify the linkage density requirements of a dataset to faithfully support experiments into linkage analysis. We also report a series of linkage-based retrieval experiments on a more densely linked dataset culled from the TREC web documents.

CiteSeerX

Crossref

Irish Universities

DCU Online Research Access Service

Data-based fault detection in chemical processes: Managing records with operator intervention and uncertain labels

Author: Askarian Mahdieh
Benítez Iglesias Raúl
Graells Sobré Moisès
Zarghami Reza
Publication venue: 'Elsevier BV'
Publication date: 23/06/2016

Developing data-driven fault detection systems for chemical plants requires managing uncertain data labels and dynamic attributes due to operator-process interactions. Mislabeled data is a known problem in computer science that has received scarce attention from the process systems community. This work introduces and examines the effects of operator actions in records and labels, and the consequences in the development of detection models. Using a state space model, this work proposes an iterative relabeling scheme for retraining classifiers that continuously refines dynamic attributes and labels. Three case studies are presented: a reactor as a motivating example, flooding in a simulated de-Butanizer column as a complex case, and foaming in an absorber as an industrial challenge. For the first case, detection accuracy is shown to increase by 14% while operating costs are reduced by 20%. Moreover, regarding the de-Butanizer column, the performance of the proposed strategy is shown to be 10% higher than the filtering strategy. Promising results are finally reported with regard to efficient strategies to deal with the presented problem. Peer Reviewed. Postprint (author's final draft).

UPCommons. Portal del coneixement obert de la UPC

Automatic extraction of quotes and topics from news feeds

Author: Luís Sarmento
Sérgio Nunes
Publication date: 01/01/2009

The explosive growth in information production poses increasing challenges to consumers, confronted with problems often described as information overflow. We present verbatim, a software system that can be used as a personal information butler to help structure and filter information. We address a small part of the information landscape, namely quotes extraction from Portuguese news. This problem includes several challenges, specifically in the areas of information extraction and topic distillation. We present a full description of the problems and our adopted approach. verbatim is available online at http://irlab.fe.up.pt/p/verbatim

Repositório Aberto da Universidade do Porto

Aggregating Content and Network Information to Curate Twitter User Lists

Author: Cunningham Pádraig
Greene Derek
Sheridan Gavin
Smyth Barry
Publication date: 25/06/2012

Twitter introduced user lists in late 2009, allowing users to be grouped according to meaningful topics or themes. Lists have since been adopted by media outlets as a means of organising content around news stories. Thus the curation of these lists is important: they should contain the key information gatekeepers and present a balanced perspective on a story. Here we address this list curation process from a recommender systems perspective. We propose a variety of criteria for generating user list recommendations, based on content analysis, network analysis, and the "crowdsourcing" of existing user lists. We demonstrate that these types of criteria are often only successful for datasets with certain characteristics. To resolve this issue, we propose the aggregation of these different "views" of a news story on Twitter to produce more accurate user recommendations to support the curation process.

arXiv.org e-Print Archive

Research Repository UCD

Irish Universities