13 research outputs found
Evaluating combinations of ranked lists and visualizations of inter-document similarity
We are interested in how ideas from document clustering can be used to improve the retrieval accuracy of ranked lists in interactive systems. In particular, we are interested in ways to evaluate the effectiveness of such systems to decide how they might best be constructed. In this study, we construct and evaluate systems that present the user with ranked lists and a visualization of inter-document similarities. We first carry out a user study to evaluate the clustering/ranked list combination on instance-oriented retrieval, the task of the TREC-6 Interactive Track. We find that although users generally prefer the combination, they are not able to use it to improve effectiveness. In the second half of this study, we develop and evaluate an approach that more directly combines the ranked list with information from inter-document similarities. Using the TREC collections and relevance judgments, we show that it is possible to realize substantial improvements in effectiveness by doing so, and that although users can use the combined information effectively, the system can provide hints that substantially improve on the user's solo effort. The resulting approach shares much in common with an interactive application of incremental relevance feedback. Throughout this study, we illustrate our work using two prototype systems constructed for these evaluations. The first, AspInQuery, is a classic information retrieval system augmented with a specialized tool for recording information about instances of relevance. The other system, Lighthouse, is a Web-based application that combines a ranked list with a portrayal of inter-document similarity. Lighthouse can work with collections such as TREC, as well as the results of Web search engines.
Classifying complex topics using spatial-semantic document visualization: An evaluation of an interaction model to support open-ended search tasks
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. In this dissertation we propose, test and develop a novel search interaction model to address two key problems associated with conducting an open-ended search task within a classical information retrieval system: (i) the need to reformulate the query within the context of a shifting conception of the problem and (ii) the need to integrate relevant results across a number of separate results sets. In our model the user issues just one high-recall query and then performs a sequence of more focused, distinct aspect searches by browsing the static structured context of a spatial-semantic visualization of this retrieved document set. Our thesis is that unsupervised spatial-semantic visualization can automatically classify retrieved documents into a two-level hierarchy of relevance. In particular we hypothesise that the locality of any given aspect exemplar will tend to comprise a sufficient proportion of same-aspect documents to support a visually guided strategy for focused, same-aspect searching that we term the aspect cluster growing strategy. We examine spatial-semantic classification and potential aspect cluster growing performance across three scenarios derived from topics and relevance judgements from the TREC test collection. Our analyses show that the expected classification can be represented in spatial-semantic structures created from document similarities computed by a simple vector space text analysis procedure. We compare two diametrically opposed approaches to layout optimisation: a global approach that focuses on preserving all similarities and a local approach that focuses only on the strongest similarities. We find that the local approach, based on a minimum spanning tree of similarities, produces a better classification and, as observed from strategy simulation, more efficient aspect cluster growing performance in most situations, compared to the global approach of multidimensional scaling. We show that a small but significant proportion of aspect cluster growing cases can be problematic, regardless of the layout algorithm used. We identify the characteristics of these cases and, on this basis, demonstrate a set of novel interactive tools that provide additional semantic cues to aid the user in locating same-aspect documents.
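As a rough illustration of the "local" idea in the abstract above, the sketch below computes cosine similarities between plain term-frequency vectors and links documents through their strongest similarities with a spanning tree. It is not the thesis's implementation: the whitespace tokenisation, the lack of term weighting, the use of Prim's algorithm and the function names (`tf_vector`, `cosine`, `similarity_spanning_tree`) are all assumptions made for this example.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector (assumed: whitespace tokens, no weighting)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_spanning_tree(docs):
    """Prim's algorithm that always adds the strongest remaining similarity.

    Maximising similarity is equivalent to building a minimum spanning
    tree over the distance 1 - similarity. Returns (i, j, similarity) edges.
    """
    vectors = [tf_vector(d) for d in docs]
    n = len(vectors)
    if n == 0:
        return []
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        best = None
        for i in in_tree:
            for j in range(n):
                if j in in_tree:
                    continue
                sim = cosine(vectors[i], vectors[j])
                if best is None or sim > best[2]:
                    best = (i, j, sim)
        edges.append(best)
        in_tree.add(best[1])
    return edges


# Hypothetical toy documents covering two different aspects.
docs = [
    "solar power panels energy",
    "solar energy panels cost",
    "river flood damage",
    "flood damage insurance river",
]
tree = similarity_spanning_tree(docs)
```

On these toy documents the tree joins the two solar documents directly and the two flood documents directly, with a single weak edge bridging the groups. Same-aspect documents ending up adjacent is the kind of locality the aspect cluster growing strategy relies on.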
Visualization of knowledge representation schemes for access to resources in digital repositories (Visualización de esquemas de representación de conocimiento para el acceso a recursos en repositorios digitales)
This document presents the results of research based on studies of the development and implementation of search interfaces for learning objects, using visualization techniques over digital repositories. A large number of digital resources are currently available on the Internet, and access to them depends largely on the strategies offered by conventional search engines or by specialized solutions that support their classification, management and administration, as digital repositories do. However, a number of factors affect access to these resources, starting with the definition of the metadata and the search strategies defined over large volumes of information. One of the most widely accepted areas in recent years is information visualization, a field that facilitates the visual presentation of complex information through the appropriate use of space and graphical structures, so that it can be assimilated and understood quickly. Therefore, for the specific purposes of this research, we approach information visualization through evaluation methodologies and design strategies for developing and implementing effective search interfaces for accessing collections of digital resources hosted in digital repositories. The fundamental purpose of this research is to offer access alternatives based on visualization techniques, in order to help creators of digital repositories analyze, develop and implement visual search interfaces.