Search CORE

48 research outputs found

A Longitudinal Study of Exploratory and Keyword Search

Author: schraefel m.c.
Wilson Max L.

Digital libraries are concerned with improving access to collections to make their service more effective and valuable to users. In this paper, we present the results of a four-week longitudinal study investigating the use of both exploratory and keyword forms of search within an online video archive, where both forms of search were available concurrently in a single user interface. While we expected early use to be more exploratory and subsequent use to be more directed, over the whole period there was a balance of exploratory and keyword searches, and they were often used together. Further, supporting the notion that facets aid exploration, there were more than five times as many facet clicks as uses of more complex forms of keyword search (boolean and advanced). From these results, we conclude that there is real value in investing in exploratory search support, which was shown to be both popular and useful for extended use of the system.

Southampton (e-Prints Soton)

A Validated Framework for Measuring Interface Support for Interactive Information Seeking

Author: schraefel m.c.
Wilson Max L.
Publication venue: s.n.

In this paper we present the validation of an evaluation framework that models the support provided by search systems for different types of users and their expected types of seeking behavior. Factors determining the types of users include previous knowledge and goals. After an overview is presented, the framework is validated in two ways. First, the novel integration of the two existing information-seeking models used in the framework is validated by correlating multiple expert and novice analyses. Second, the framework is validated against the results produced by two separate user studies. Further, the refinements made through the first validation technique are shown, using the second technique, to increase the accuracy of the framework. The successful validation process has shown that the framework can identify both strong and weak areas of search interface design in only a few hours. The results produced can be used either to revise and strengthen designs or to inform the structure of a user study.

Southampton (e-Prints Soton)

Distinción semántica de compuestos léxicos en recuperación de información [Semantic distinction of lexical compounds in information retrieval]

Author: Gonzalo Arroyo Julio
Peñas Padilla Anselmo
Verdejo Maillo María Felisa
Publication venue: Sociedad Española para el Procesamiento del Lenguaje Natural
Publication date: 01/01/2002

Taking phrases into account does not appear to produce significant improvements in classical Information Retrieval models. In general, it is accepted that proximity criteria yield better results than an adjacency criterion. This work explores the hypothesis that not all lexical compounds should be treated in the same way. We propose an automatic procedure for the semantic classification of WordNet lexical compounds based on their components, and study how this distinction affects Information Retrieval. This work was partially funded by the Ministerio de Ciencia y Tecnología through the Hermes project (TIC2000-0335-C03-01).

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Secretaría de Estado de Cultura

Extracting Conceptual Terms from Medical Documents

Author: Bot Razvan Stefan
Chen Xin
Li Quanzhi
Wu Yi-Fang Brook
Publication venue: AIS Electronic Library (AISeL)
Publication date: 01/01/2005

Automated biomedical concept recognition is important for biomedical document retrieval and text mining research. In this paper, we describe a two-step concept extraction technique for documents in the biomedical domain. Step one is noun phrase extraction, which automatically extracts noun phrases from medical documents. The extracted noun phrases serve as concept term candidates and become the input to the next step. Step two is keyphrase extraction, which automatically identifies important topical terms among the candidate terms. Experiments were conducted to evaluate the results of both steps. The experimental results show that our noun phrase extractor is effective in identifying noun phrases in medical documents, and that the keyphrase extractor is likewise effective in identifying a document's conceptual terms.

AIS Electronic Library (AISeL)

More Effective Web Search Using Bigrams and Trigrams

Author: Johnson D
Malhotra V
Vamplew P
Publication date: 01/01/2006

This paper investigates the effectiveness of quoted bigrams and trigrams as query terms for targeted web search. Prior research in this area has largely focused on static corpora, each containing only a few million documents, and has reported mixed (usually negative) results. We investigate the bigram/trigram extraction problem and present an extraction algorithm that shows promising results when applied to real-time web search. We also present a prototype augmented search software package that can leverage the results provided by a web search engine to help the web searcher identify important phrases and related documents quickly. This software has received favourable feedback in a recent user survey.

Directory of Open Access Journals

Federation ResearchOnline

University of Tasmania Open Access Repository
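
To make the idea in the entry above ("More Effective Web Search Using Bigrams and Trigrams") concrete, here is a minimal sketch of turning frequent bigrams or trigrams from a piece of text into quoted query terms. It is not the paper's extraction algorithm, which the abstract does not describe. The stopword list, the frequency-based ranking, and the names `ngrams` and `quoted_phrases` are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption, not from the paper).
STOPWORDS = {"a", "an", "and", "the", "of", "in", "on", "to", "for", "is", "are", "with"}


def ngrams(tokens, n):
    """Return all contiguous n-token tuples from a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def quoted_phrases(text, n=2, top_k=3):
    """Rank n-grams by raw frequency and return the top_k as quoted query terms.

    N-grams that start or end with a stopword are skipped. Frequency ranking is
    a stand-in heuristic, not the method evaluated in the paper.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(
        gram for gram in ngrams(tokens, n)
        if gram[0] not in STOPWORDS and gram[-1] not in STOPWORDS
    )
    return ['"' + " ".join(gram) + '"' for gram, _ in counts.most_common(top_k)]


if __name__ == "__main__":
    sample = "exploratory search and keyword search; exploratory search works"
    print(quoted_phrases(sample, n=2, top_k=1))
```

The quoted strings could then be appended to a search engine query so that each phrase is matched as a unit rather than as separate words.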

Evaluating the Potential of Explicit Phrases for Retrieval Quality

Author: Andreas Broschart
Klaus Berberich
Ralf Schenkel
Publication date: 30/04/2020

This paper evaluates the potential impact of explicit phrases on retrieval quality through a case study with the TREC Terabyte benchmark. It compares the performance of user- and system-identified phrases with a standard score and a proximity-aware score, and shows that an optimal choice of phrases, including term permutations, can significantly improve query performance.

CiteSeerX

Social Search with Missing Data: Which Ranking Algorithm?

Author: Denham Chris
Eisenstadt Marc
Goncalves Alexandre
Song Dawei
Uren Victoria
Zhu Jianhan
Publication date: 01/10/2007

Online social networking tools are extremely popular, but can miss potential discoveries latent in the social 'fabric'. Matchmaking services which can do naive profile matching with old database technology are too brittle in the absence of key data, and even modern ontological markup, though powerful, can be onerous at data-input time. In this paper, we present a system called BuddyFinder which can automatically identify buddies who can best match a user's search requirements specified in a term-based query, even in the absence of stored user-profiles. We deploy and compare five statistical measures, namely, our own CORDER, mutual information (MI), phi-squared, improved MI and Z score, and two TF/IDF based baseline methods to find online users who best match the search requirements based on 'inferred profiles' of these users in the form of scavenged web pages. These measures identify statistically significant relationships between online users and a term-based query. Our user evaluation on two groups of users shows that BuddyFinder can find users highly relevant to search queries, and that CORDER achieved the best average ranking correlations among all seven algorithms and improved the performance of both baseline methods

Open Access Institutional Repository at Robert Gordon University

Open Research Online (The Open University)
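
Two of the association measures named in the entry above ("Social Search with Missing Data: Which Ranking Algorithm?") can be sketched from a 2x2 contingency table of page counts. This is a textbook formulation, not BuddyFinder's implementation: the pointwise variant of mutual information, the way the counts are defined, and the function names are assumptions. CORDER, improved MI and the Z score are not shown because the abstract does not define them.

```python
import math

# Assumed contingency counts over a set of scavenged web pages:
#   a = pages mentioning both the user and the query term
#   b = pages mentioning the user only
#   c = pages mentioning the term only
#   d = pages mentioning neither


def pointwise_mi(a, b, c, d):
    """Pointwise mutual information in bits: log2(P(u,t) / (P(u) * P(t)))."""
    n = a + b + c + d
    if a == 0:
        return float("-inf")  # the user and the term never co-occur
    return math.log2((a * n) / ((a + b) * (a + c)))


def phi_squared(a, b, c, d):
    """Phi-squared coefficient of the 2x2 table, ranging from 0 to 1."""
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0  # degenerate table: one row or column is empty
    return (a * d - b * c) ** 2 / denominator


if __name__ == "__main__":
    print(pointwise_mi(10, 0, 0, 10), phi_squared(10, 0, 0, 10))
    print(pointwise_mi(5, 5, 5, 5), phi_squared(5, 5, 5, 5))
```

Ranking candidate users by either score for a given query term gives one simple way to order matches; how the paper combines scores across several query terms is not stated in the abstract.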

Extraction of Keyphrases from Text: Evaluation of Four Algorithms

Author: Turney Peter
Publication date: 01/01/1997

This report presents an empirical evaluation of four algorithms for automatically extracting keywords and keyphrases from documents. The four algorithms are compared using five different collections of documents. For each document, we have a target set of keyphrases, which were generated by hand. The target keyphrases were generated for human readers; they were not tailored for any of the four keyphrase extraction algorithms. Each algorithm was evaluated by the degree to which its keyphrases matched the manually generated keyphrases. The four algorithms were (1) the AutoSummarize feature in Microsoft's Word 97, (2) an algorithm based on Eric Brill's part-of-speech tagger, (3) the Summarize feature in Verity's Search 97, and (4) NRC's Extractor algorithm. For all five document collections, NRC's Extractor yields the best match with the manually generated keyphrases.

CiteSeerX

NRC Publications Archive

CogPrints Cognitive Sciences Eprint Archive
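
The evaluation described in the entry above ("Extraction of Keyphrases from Text: Evaluation of Four Algorithms") compares extracted keyphrases against a hand-made target set. A minimal sketch of such a comparison follows. The abstract does not give the report's exact matching criterion, so the case-insensitive exact match after whitespace normalization, the use of precision and recall as the scores, and the names `normalize` and `match_scores` are assumptions.

```python
def normalize(phrase):
    """Lowercase a phrase and collapse runs of whitespace."""
    return " ".join(phrase.lower().split())


def match_scores(extracted, target):
    """Return (precision, recall) of extracted keyphrases against target keyphrases.

    A phrase counts as matched only if its normalized form appears in both sets
    (an assumed criterion; the report may use a looser one, such as stemming).
    """
    extracted_set = {normalize(p) for p in extracted}
    target_set = {normalize(p) for p in target}
    matched = extracted_set & target_set
    precision = len(matched) / len(extracted_set) if extracted_set else 0.0
    recall = len(matched) / len(target_set) if target_set else 0.0
    return precision, recall


if __name__ == "__main__":
    machine = ["Keyphrase Extraction", "neural nets"]
    human = ["keyphrase  extraction", "machine learning", "text mining", "summarization"]
    print(match_scores(machine, human))
```

Averaging these scores over every document in a collection would give one per-collection figure for each algorithm under comparison.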