Search CORE

8,360 research outputs found

Recommended from our members

Local search: A guide for the information retrieval practitioner

Author: Abramson
Althofer
Andrew MacFarlane
Andrew Tuson
Baeck
Battiti
Boughanem
Cartwright
Chen
Chen
Chen
Cleverdon
Collins
Cordon
Cordon
Corne
Darwin
Dorigo
Downsland
Dueck
Fan
Fan
Fan
Fan
Feo
Fernandez-Villacanas Martin
Fogel
Fogel
Frakes
Frakes
Garey
Glover
Glover
Glover
Goldberg
Hajek
Harman
Harman
Harman
Harman
Hasan
Hawking
Hertz
Hertz
Holland
Hooker
Horng
Kekäläinen
Kirkpatrick
Koza
Kuflik
Lam
Lopez-Pujalte
Lopez-Pujalte
Lopez-Pujalte
Luke
Lundy
Martin-Bautisata
Masters
Michalewicz
Mock
Mock
Newell
Ogbu
Oliveira
Osman
Osman
Osman
Osman
Papadimitriou
Pohlheim
Rechenburg
Reeves
Reeves
Robertson
Sebastiani
Semet
Sinclair
Smith
Sparck Jones
Stefik
Tamine
Thangiah
Trotman
Van Laarhoven
Vrajitoru
Wartik
Yang
Zweben
Publication venue: 'Elsevier BV'
Publication date: 01/01/2009
Field of study

There are a number of combinatorial optimisation problems in information retrieval in which the use of local search methods are worthwhile. The purpose of this paper is to show how local search can be used to solve some well known tasks in information retrieval (IR), how previous research in the field is piecemeal, bereft of a structure and methodologically flawed, and to suggest more rigorous ways of applying local search methods to solve IR problems. We provide a query based taxonomy for analysing the use of local search in IR tasks and an overview of issues such as fitness functions, statistical significance and test collections when conducting experiments on combinatorial optimisation problems. The paper gives a guide on the pitfalls and problems for IR practitioners who wish to use local search to solve their research issues, and gives practical advice on the use of such methods. The query based taxonomy is a novel structure which can be used by the IR practitioner in order to examine the use of local search in IR

City Research Online

Crossref

A novel approach integrating ranking functions discovery, optimization and infernce to improve retrieval performance

Author: Adesina Ademola Olusola
Agbele Kehinde K.
Febba Ronald
Nyongesa Henry O.
Publication venue: Medwell Journals
Publication date: 01/01/2010
Field of study

The significant roles play by ranking function in the performance and success of Information Retrieval (IR) systems and search engines cannot be underestimated. Diverse ranking functions are available in IR literature. However, empirical studies show that ranking functions do not perform constantly well across different contexts (queries, collections, users). In this study, a novel three-stage integrated ranking framework is proposed for implementing discovering, optimizing and inference rankings used in IR systems. The first phase, discovery process is based on Genetic Programming (GP) approach which smartly combines structural and contents features in the documents while the second phase, optimization process is based on Genetic Algorithm (GA) which combines document retrieval scores of various well-known ranking functions. In the 3rd phase, Fuzzy inference proves as soft search constraints to be applied on documents. We demonstrate how these two features are combined to bring new tasks and processes within the three concept stages of integrated framework for effective IR

University of the Western Cape Research Repository

Intelligent Fusion of Structural and Citation-Based Evidence for Text Classification

Author: Calado Pavel
Chen Yuxin
Cristo Marco
Fan Weiguo
Fox Edward A.
Goncalves Marcos Andre
Zhang Baoping
Publication venue
Publication date: 01/01/2004
Field of study

This paper investigates how citation-based information and structural content (e.g., title, abstract) can be combined to improve classification of text documents into predefined categories. We evaluate different measures of similarity, five derived from the citation structure of the collection, and three measures derived from the structural content, and determine how they can be fused to improve classification effectiveness. To discover the best fusion framework, we apply Genetic Programming (GP) techniques. Our empirical experiments using documents from the ACM digital library and the ACM classification scheme show that we can discover similarity functions that work better than any evidence in isolation and whose combined performance through a simple majority voting is comparable to that of Support Vector Machine classifiers

Computer Science Technical Reports @Virginia Tech

Personalization of Search Engine Services for Effective Retrieval and Knowledge Management

Author: Fan Weiguo
Gordon Michael
Pathak Praveen
Publication venue: AIS Electronic Library (AISeL)
Publication date: 31/12/2000
Field of study

The Internet and corporate intranets provide far more information than anybody can absorb. People use search engines to find the information they require. However, these systems tend to use only one fixed term weighting strategy regardless of the context to which it applies, posing serious performance problems when characteristics of different users, queries, and text collections are taken into consideration. In this paper, we argue that the term weighting strategy should be context specific, that is, different term weighting strategies should be applied to different contexts, and we propose a new systematic approach that can automatically generate term weighting strategies for different contexts based on genetic programming (GP). The new proposed framework was tested on TREC data and the results are very promising

AIS Electronic Library (AISeL)

Evaluation of parameters for combining multiple textual sources of evidence for Web image retrieval using genetic programming

Author
Publication venue: Springer
Publication date
Field of study

Springer - Publisher Connector

Recuperação multimodal e interativa de informação orientada por diversidade

Author: Calumby Rodrigo Tripodi, 1985-
Publication venue: [s.n.]
Publication date: 29/08/2018
Field of study

Orientador: Ricardo da Silva TorresTese (doutorado) - Universidade Estadual de Campinas, Instituto de ComputaçãoResumo: Os métodos de Recuperação da Informação, especialmente considerando-se dados multimídia, evoluíram para a integração de múltiplas fontes de evidência na análise de relevância de itens em uma tarefa de busca. Neste contexto, para atenuar a distância semântica entre as propriedades de baixo nível extraídas do conteúdo dos objetos digitais e os conceitos semânticos de alto nível (objetos, categorias, etc.) e tornar estes sistemas adaptativos às diferentes necessidades dos usuários, modelos interativos que consideram o usuário mais próximo do processo de recuperação têm sido propostos, permitindo a sua interação com o sistema, principalmente por meio da realimentação de relevância implícita ou explícita. Analogamente, a promoção de diversidade surgiu como uma alternativa para lidar com consultas ambíguas ou incompletas. Adicionalmente, muitos trabalhos têm tratado a ideia de minimização do esforço requerido do usuário em fornecer julgamentos de relevância, à medida que mantém níveis aceitáveis de eficácia. Esta tese aborda, propõe e analisa experimentalmente métodos de recuperação da informação interativos e multimodais orientados por diversidade. Este trabalho aborda de forma abrangente a literatura acerca da recuperação interativa da informação e discute sobre os avanços recentes, os grandes desafios de pesquisa e oportunidades promissoras de trabalho. Nós propusemos e avaliamos dois métodos de aprimoramento do balanço entre relevância e diversidade, os quais integram múltiplas informações de imagens, tais como: propriedades visuais, metadados textuais, informação geográfica e descritores de credibilidade dos usuários. Por sua vez, como integração de técnicas de recuperação interativa e de promoção de diversidade, visando maximizar a cobertura de múltiplas interpretações/aspectos de busca e acelerar a transferência de informação entre o usuário e o sistema, nós propusemos e avaliamos um método multimodal de aprendizado para ranqueamento utilizando realimentação de relevância sobre resultados diversificados. Nossa análise experimental mostra que o uso conjunto de múltiplas fontes de informação teve impacto positivo nos algoritmos de balanceamento entre relevância e diversidade. Estes resultados sugerem que a integração de filtragem e re-ranqueamento multimodais é eficaz para o aumento da relevância dos resultados e também como mecanismo de potencialização dos métodos de diversificação. Além disso, com uma análise experimental minuciosa, nós investigamos várias questões de pesquisa relacionadas à possibilidade de aumento da diversidade dos resultados e a manutenção ou até mesmo melhoria da sua relevância em sessões interativas. Adicionalmente, nós analisamos como o esforço em diversificar afeta os resultados gerais de uma sessão de busca e como diferentes abordagens de diversificação se comportam para diferentes modalidades de dados. Analisando a eficácia geral e também em cada iteração de realimentação de relevância, nós mostramos que introduzir diversidade nos resultados pode prejudicar resultados iniciais, enquanto que aumenta significativamente a eficácia geral em uma sessão de busca, considerando-se não apenas a relevância e diversidade geral, mas também o quão cedo o usuário é exposto ao mesmo montante de itens relevantes e nível de diversidadeAbstract: Information retrieval methods, especially considering multimedia data, have evolved towards the integration of multiple sources of evidence in the analysis of the relevance of items considering a given user search task. In this context, for attenuating the semantic gap between low-level features extracted from the content of the digital objects and high-level semantic concepts (objects, categories, etc.) and making the systems adaptive to different user needs, interactive models have brought the user closer to the retrieval loop allowing user-system interaction mainly through implicit or explicit relevance feedback. Analogously, diversity promotion has emerged as an alternative for tackling ambiguous or underspecified queries. Additionally, several works have addressed the issue of minimizing the required user effort on providing relevance assessments while keeping an acceptable overall effectiveness. This thesis discusses, proposes, and experimentally analyzes multimodal and interactive diversity-oriented information retrieval methods. This work, comprehensively covers the interactive information retrieval literature and also discusses about recent advances, the great research challenges, and promising research opportunities. We have proposed and evaluated two relevance-diversity trade-off enhancement work-flows, which integrate multiple information from images, such as: visual features, textual metadata, geographic information, and user credibility descriptors. In turn, as an integration of interactive retrieval and diversity promotion techniques, for maximizing the coverage of multiple query interpretations/aspects and speeding up the information transfer between the user and the system, we have proposed and evaluated a multimodal learning-to-rank method trained with relevance feedback over diversified results. Our experimental analysis shows that the joint usage of multiple information sources positively impacted the relevance-diversity balancing algorithms. Our results also suggest that the integration of multimodal-relevance-based filtering and reranking was effective on improving result relevance and also boosted diversity promotion methods. Beyond it, with a thorough experimental analysis we have investigated several research questions related to the possibility of improving result diversity and keeping or even improving relevance in interactive search sessions. Moreover, we analyze how much the diversification effort affects overall search session results and how different diversification approaches behave for the different data modalities. By analyzing the overall and per feedback iteration effectiveness, we show that introducing diversity may harm initial results whereas it significantly enhances the overall session effectiveness not only considering the relevance and diversity, but also how early the user is exposed to the same amount of relevant items and diversityDoutoradoCiência da ComputaçãoDoutor em Ciência da ComputaçãoP-4388/2010140977/2012-0CAPESCNP

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositorio da Producao Cientifica e Intelectual da Unicamp

Science of Digital Libraries(SciDL)

Author: Carroll John
Cassel Lillian
Fan Patrick
Fox Edward
Halbert Martin
Maly Kurt
McMillan Gail
Ramakrishnan Naren
Zubair Mohammed
Publication venue
Publication date: 01/01/2003
Field of study

Our purpose is to ensure that people and institutions better manage information through digital libraries (DLs). Thus we address a fundamental human and social need, which is particularly urgent in the modern Information (and Knowledge) Age. Our goal is to significantly advance both the theory and state-of-theart of DLs (and other advanced information systems) - thoroughly validating our approach using highly visible testbeds. Our research objective is to leverage our formal, theory-based approach to the problems of defining, understanding, modeling, building, personalizing, and evaluating DLs. We will construct models and tools based on that theory so organizations and individuals can easily create and maintain fully functional DLs, whose components can interoperate with corresponding components of related DLs. This research should be highly meritorious intellectually. We bring together a team of senior researchers with expertise in information retrieval, human-computer interaction, scenario-based design, personalization, and componentized system development and expect to make important contributions in each of those areas. Of crucial import, however, is that we will integrate our prior research and experience to achieve breakthrough advances in the field of DLs, regarding theory, methodology, systems, and evaluation. We will extend the 5S theory, which has identified five key dimensions or onstructs underlying effective DLs: Streams, Structures, Spaces, Scenarios, and Societies. We will use that theory to describe and develop metamodels, models, and systems, which can be tailored to disciplines and/or groups, as well as personalized. We will disseminate our findings as well as provide toolkits as open source software, encouraging wide use. We will validate our work using testbeds, ensuring broad impact. We will put powerful tools into the hands of digital librarians so they may easily plan and configure tailored systems, to support an extensible set of services, including publishing, discovery, searching, browsing, recommending, and access control, handling diverse types of collections, and varied genres and classes of digital objects. With these tools, end-users will for be able to design personal DLs. Testbeds are crucial to validate scientific theories and will be thoroughly integrated into SciDL research and evaluation. We will focus on two application domains, which together should allow comprehensive validation and increase the significance of SciDL's impact on scholarly communities. One is education (through CITIDEL); the other is libraries (through DLA and OCKHAM). CITIDEL deals with content from publishers (e.g, ACM Digital Library), corporate research efforts e.g., CiteSeer), volunteer initiatives (e.g., DBLP, based on the database and logic rogramming literature), CS departments (e.g., NCSTRL, mostly technical reports), educational initiatives (e.g., Computer Science Teaching Center), and universities (e.g., theses and dissertations). DLA is a unit of the Virginia Tech library that virtually publishes scholarly communication such as faculty-edited journals and rare and unique resources including image collections and finding aids from Special Collections. The OCKHAM initiative, calling for simplicity in the library world, emphasizes a three-part solution: lightweightprotocols, component-based development, and open reference models. It provides a framework to research the deployment of the SciDL approach in libraries. Thus our choice of testbeds also will nsure that our research will have additional benefit to and impact on the fields of computing and library and information science, supporting transformations in how we learn and deal with information

Computer Science Technical Reports @Virginia Tech

Chemoinformatics Research at the University of Sheffield: A History and Citation Analysis

Author: Bishop N.
Gillet V.J.
Holliday J.D.
Willett P.
Publication venue: 'SAGE Publications'
Publication date: 01/07/2003
Field of study

This paper reviews the work of the Chemoinformatics Research Group in the Department of Information Studies at the University of Sheffield, focusing particularly on the work carried out in the period 1985-2002. Four major research areas are discussed, these involving the development of methods for: substructure searching in databases of three-dimensional structures, including both rigid and flexible molecules; the representation and searching of the Markush structures that occur in chemical patents; similarity searching in databases of both two-dimensional and three-dimensional structures; and compound selection and the design of combinatorial libraries. An analysis of citations to 321 publications from the Group shows that it attracted a total of 3725 residual citations during the period 1980-2002. These citations appeared in 411 different journals, and involved 910 different citing organizations from 54 different countries, thus demonstrating the widespread impact of the Group's work

Crossref

White Rose Research Online

Living analytics methods for the social web

Author: Diaz-Aviles Ernesto
Publication venue: Gottfried Wilhelm Leibniz Universität Hannover
Publication date: 01/01/2013
Field of study

[no abstract

Institutionelles Repositorium der Leibniz Universität Hannover