39 research outputs found
Evaluation Measures for Relevance and Credibility in Ranked Lists
Recent discussions on alternative facts, fake news, and post truth politics
have motivated research on creating technologies that allow people not only to
access information, but also to assess the credibility of the information
presented to them by information retrieval systems. Whereas technology is in
place for filtering information according to relevance and/or credibility, no
single measure currently exists for evaluating the accuracy or precision (and
more generally effectiveness) of both the relevance and the credibility of
retrieved results. One obvious way of doing so is to measure relevance and
credibility effectiveness separately, and then consolidate the two measures
into one. There are at least two problems with such an approach: (I) it is not
certain that the same criteria are applied to the evaluation of both relevance
and credibility (and applying different criteria introduces bias to the
evaluation); (II) many more and richer measures exist for assessing relevance
effectiveness than for assessing credibility effectiveness (hence risking
further bias).
Motivated by the above, we present two novel types of evaluation measures
that are designed to measure the effectiveness of both relevance and
credibility in ranked lists of retrieval results. Experimental evaluation on a
small human-annotated dataset (that we make freely available to the research
community) shows that our measures are expressive and intuitive in their
interpretation.
Minería de textos y de la web (Text and Web Mining)
This article briefly describes the research and development work under way in the research line "Text and Web Mining", within the project "Machine Learning and Decision Making in Intelligent Systems for the Web". The line covers several areas related to natural language engineering, such as Natural Language Processing (NLP), Computational Linguistics, Text Mining, Web Mining, and Web information retrieval. Within this project, the line therefore focuses on all the problems involved in developing intelligent tools for extracting, analyzing, and validating Web content, including: representation of Web documents and users, information quality measures for Web content, open information extraction techniques for the Web, supervised, semi-supervised, and unsupervised categorization algorithms, and user profiling, among others. Track: Databases and Data Mining. Red de Universidades con Carreras en Informática (RedUNCI).
Enabling the Discovery of Digital Cultural Heritage Objects through Wikipedia
In recent years, large digital cultural heritage collections have become increasingly available. While these collections provide adequate search functionality for expert users, that functionality may not offer the best support for non-expert or novice users. In this paper we propose a novel mechanism for introducing new users to the items in a collection by allowing them to browse Wikipedia articles that are augmented with items from the cultural heritage collection. Using Europeana as a case study, we demonstrate the effectiveness of our approach in encouraging users to spend longer exploring items in Europeana compared with the existing search provision.
CLEAR: a credible method to evaluate website archivability
Web archiving is crucial to ensure that cultural, scientific
and social heritage on the web remains accessible and usable
over time. A key aspect of the web archiving process is optimal data extraction from target websites. This procedure is
difficult for reasons such as website complexity, the plethora of
underlying technologies, and ultimately the open-ended nature of the web. The purpose of this work is to establish
the notion of Website Archivability (WA) and to introduce
the Credible Live Evaluation of Archive Readiness (CLEAR)
method to measure WA for any website. Website Archivability captures the core aspects of a website crucial in diagnosing whether it has the potential to be archived with completeness and accuracy. An appreciation of the archivability
of a website should provide archivists with a valuable tool
when assessing the possibilities of archiving material and
influence web design professionals to consider the implications
of their design decisions on the likelihood that their websites could be archived.
A prototype application, archiveready.com, has been established to demonstrate the viability of the proposed method
for assessing Website Archivability.
Comprehensive Review of Opinion Summarization
The abundance of opinions on the web has kindled the study of opinion summarization over the last few years. Researchers have introduced various techniques and paradigms for solving this special task. This survey attempts to systematically investigate the different techniques and approaches used in opinion summarization. We provide a multi-perspective classification of the approaches used and highlight some of their key weaknesses. This survey also covers the evaluation techniques and data sets used in studying the opinion summarization problem. Finally, we provide insights into some of the challenges that remain to be addressed, which will help set the direction for future research in this area. Unpublished; not peer reviewed.