239 research outputs found

miraQA: Initial experiments in Question Answering

Author: García Serrano Ana
González Cristóbal José Carlos
Goñi Menoyo José Miguel
Martínez Fernández José Luis
Martínez Fernández Paloma
Pablo Sánchez César de
Villena Román Julio
Publication venue: E.T.S.I. Telecomunicación (UPM)
Publication date: 01/01/2004

We present the miraQA system, which constitutes MIRACLE's first experience in Question Answering for monolingual Spanish and has been developed for QA@CLEF 2004. The architecture of the system is described and details of our approach to Statistical Answer Extraction based on Hidden Markov Models are presented. One run that uses last year's question set for training purposes has been submitted. The results are presented together with ideas for improvement.

Archivo Digital UPM

Evaluation campaigns and TRECVid

Author: Kraaij Wessel
Over Paul
Smeaton Alan F.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2006

The TREC Video Retrieval Evaluation (TRECVid) is an international benchmarking activity to encourage research in video information retrieval by providing a large test collection, uniform scoring procedures, and a forum for organizations interested in comparing their results. TRECVid completed its fifth annual cycle at the end of 2005 and in 2006 TRECVid will involve almost 70 research organizations, universities and other consortia. Throughout its existence, TRECVid has benchmarked both interactive and automatic/manual searching for shots from within a video corpus, automatic detection of a variety of semantic and low-level video features, shot boundary detection and the detection of story boundaries in broadcast TV news. This paper gives an introduction to information retrieval (IR) evaluation from both a user and a system perspective, highlighting that system evaluation is by far the most prevalent type of evaluation carried out. We also include a summary of TRECVid as an example of a system evaluation benchmarking campaign, which allows us to discuss whether such campaigns are a good thing or a bad thing. There are arguments for and against these campaigns and we present some of them in the paper, concluding that on balance they have had a very positive impact on research progress.

CiteSeerX

Crossref

Irish Universities

DCU Online Research Access Service

Improving search effectiveness in sentence retrieval and novelty detection

Author: Teijeira Fernández Ronald
Publication venue: Universidade de Santiago de Compostela. Servizo de Publicacións e Intercambio Científico
Publication date: 01/01/2011

In this thesis we thoroughly study sentence retrieval and novelty detection. We analyze the strengths and weaknesses of current state-of-the-art methods and subsequently propose new mechanisms to address sentence retrieval and novelty detection. Retrieval and novelty detection are related tasks: usually, we first apply a retrieval model that estimates the relevance of passages (e.g. sentences) and generates a ranking of passages sorted by their relevance. Next, this ranking is used as the input of a novelty detection module, which tries to filter out redundant passages in the ranking. The estimation of relevance at sentence level is difficult. Standard methods used to estimate relevance are simply based on matching query and sentence terms. However, queries usually contain two or three terms and sentences are also short. Therefore, the matching between query and sentences is poor. In order to address this problem, we study how to enrich this process with additional information: the context. The context refers to the information provided by the surrounding sentences or the document where the sentence is located. Such context reduces ambiguity and supplies additional information not included in the sentence itself. Additionally, it is important to estimate how important (central) a sentence is within the document. These two components are studied following a formal framework based on Statistical Language Models. In this respect, we demonstrate that these components yield improvements in current sentence retrieval methods. In this thesis we work with collections of sentences that were extracted from news. News not only explains facts but also expresses opinions that people have about a particular event or topic. Therefore, the proper estimation of which passages are opinionated may help to further improve the estimation of relevance for sentences.
We apply a formal methodology that helps us to incorporate opinions into standard sentence retrieval methods. Additionally, we propose simple empirical alternatives to incorporate query-independent features into sentence retrieval models. We demonstrate that the incorporation of opinions to estimate relevance is an important factor that makes sentence retrieval methods more effective. Throughout this study, we also analyze query-independent features based on sentence length and named entities. The combination of the context-based approach with the incorporation of opinion-based features is straightforward. We study how to combine these two approaches and the impact of doing so. We demonstrate that context-based models are implicitly promoting sentences with opinions and, therefore, opinion-based features do not help to further improve context-based methods. The second part of this thesis is dedicated to novelty detection at sentence level. Because novelty is actually dependent on a retrieval ranking, we consider here two approaches: a) the perfect-relevance approach, which consists of using a ranking where all sentences are relevant; and b) the non-perfect relevance approach, which consists of first applying a sentence retrieval method. We first study which baseline performs best and, next, we propose a number of variations. One of the mechanisms proposed is based on vocabulary pruning. We demonstrate that considering terms from the top-ranked sentences in the original ranking helps to guide the estimation of novelty. The application of Language Models to support novelty detection is another challenge that we face in this thesis. We apply different smoothing methods in the context of alternative mechanisms to detect novelty. Additionally, we test a mechanism based on mixture models that uses the Expectation-Maximization algorithm to automatically obtain the novelty score of a sentence.
In the last part of this work we demonstrate that most novelty methods lead to a strong re-ordering of the initial ranking. However, we show that the top-ranked sentences in the initial list are usually novel and re-ordering them is often harmful. Therefore, we propose different mechanisms that determine the position threshold where novelty detection should be initiated. In this respect, we consider query-independent and query-dependent approaches. Summing up, we identify important limitations of current sentence retrieval and novelty methods, and propose novel and effective methods.
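
The two-stage pipeline described in this abstract (rank sentences by relevance, then filter redundant ones, starting novelty detection only after a position threshold) can be illustrated with a minimal Python sketch. This is not the thesis's method: the term-matching relevance score, the new-word-ratio novelty score, and the names `retrieve_then_filter`, `start_pos` and `min_novelty` are all assumptions chosen for illustration, standing in for the language-model approaches the thesis actually studies.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenization (a deliberately naive assumption)."""
    return text.lower().split()


def relevance(query, sentence):
    """Hypothetical relevance score: occurrences of query terms in the sentence."""
    counts = Counter(tokenize(sentence))
    return sum(counts[term] for term in set(tokenize(query)))


def new_word_ratio(sentence, seen_vocab):
    """Hypothetical novelty score: share of distinct terms not seen so far."""
    terms = set(tokenize(sentence))
    if not terms:
        return 0.0
    return len(terms - seen_vocab) / len(terms)


def retrieve_then_filter(query, sentences, start_pos=1, min_novelty=0.5):
    """Rank by relevance, then drop redundant sentences below the threshold.

    Sentences ranked above `start_pos` are always kept, reflecting the
    observation that top-ranked sentences are usually novel.
    """
    # Stage 1: relevance ranking (Python's sort is stable, so ties keep input order).
    ranked = sorted(sentences, key=lambda s: relevance(query, s), reverse=True)

    # Stage 2: novelty filtering against the vocabulary of all earlier sentences.
    kept, seen = [], set()
    for pos, sentence in enumerate(ranked):
        if pos < start_pos or new_word_ratio(sentence, seen) >= min_novelty:
            kept.append(sentence)
        # Dropped sentences still count as "seen" in this sketch.
        seen |= set(tokenize(sentence))
    return kept


sentences = [
    "the flood hit the city",
    "the flood hit the city today",
    "rescue teams arrived after the flood",
    "stock markets fell",
]
result = retrieve_then_filter("flood city", sentences, start_pos=1)
```

With `start_pos=1` the near-duplicate second sentence is filtered out; raising `start_pos` to 2 protects it from filtering, which shows how the position threshold changes the output.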

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Repositorio Institucional da Universidade de Santiago de Compostela

A systematic analysis of sentence update detection for temporal summarization

Author: Gârbacea C.
Kanoulas E.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2017

International Migration, Integration and Social Cohesion online publications

UvA-DARE