95 research outputs found

    Intelligent search for distributed information sources using heterogeneous neural networks

    Get PDF
    As the number and diversity of distributed information sources on the Internet exponentially increase, various search services are developed to help the users to locate relevant information. But they still exist some drawbacks such as the difficulty of mathematically modeling retrieval process, the lack of adaptivity and the indiscrimination of search. This paper shows how heteroge-neous neural networks can be used in the design of an intelligent distributed in-formation retrieval (DIR) system. In particular, three typical neural network models - Kohoren's SOFM Network, Hopfield Network, and Feed Forward Network with Back Propagation algorithm are introduced to overcome the above drawbacks in current research of DIR by using their unique properties. This preliminary investigation suggests that Neural Networks are useful tools for intelligent search for distributed information sources

    Combining link and content-based information in a Bayesian inference model for entity search

    No full text
    An architectural model of a Bayesian inference network to support entity search in semantic knowledge bases is presented. The model supports the explicit combination of primitive data type and object-level semantics under a single computational framework. A flexible query model is supported capable to reason with the availability of simple semantics in querie

    Preliminary Experiments using Subjective Logic for the Polyrepresentation of Information Needs

    Full text link
    According to the principle of polyrepresentation, retrieval accuracy may improve through the combination of multiple and diverse information object representations about e.g. the context of the user, the information sought, or the retrieval system. Recently, the principle of polyrepresentation was mathematically expressed using subjective logic, where the potential suitability of each representation for improving retrieval performance was formalised through degrees of belief and uncertainty. No experimental evidence or practical application has so far validated this model. We extend the work of Lioma et al. (2010), by providing a practical application and analysis of the model. We show how to map the abstract notions of belief and uncertainty to real-life evidence drawn from a retrieval dataset. We also show how to estimate two different types of polyrepresentation assuming either (a) independence or (b) dependence between the information objects that are combined. We focus on the polyrepresentation of different types of context relating to user information needs (i.e. work task, user background knowledge, ideal answer) and show that the subjective logic model can predict their optimal combination prior and independently to the retrieval process

    Optimal Information Retrieval with Complex Utility Functions

    Get PDF
    Existing retrieval models all attempt to optimize one single utility function, which is often based on the topical relevance of a document with respect to a query. In real applications, retrieval involves more complex utility functions that may involve preferences on several different dimensions. In this paper, we present a general optimization framework for retrieval with complex utility functions. A query language is designed according to this framework to enable users to submit complex queries. We propose an efficient algorithm for retrieval with complex utility functions based on the a-priori algorithm. As a case study, we apply our algorithm to a complex utility retrieval problem in distributed IR. Experiment results show that our algorithm allows for flexible tradeoff between multiple retrieval criteria. Finally, we study the efficiency issue of our algorithm on simulated data

    The relationship between IR and multimedia databases

    Get PDF
    Modern extensible database systems support multimedia data through ADTs. However, because of the problems with multimedia query formulation, this support is not sufficient.\ud \ud Multimedia querying requires an iterative search process involving many different representations of the objects in the database. The support that is needed is very similar to the processes in information retrieval.\ud \ud Based on this observation, we develop the miRRor architecture for multimedia query processing. We design a layered framework based on information retrieval techniques, to provide a usable query interface to the multimedia database.\ud \ud First, we introduce a concept layer to enable reasoning over low-level concepts in the database.\ud \ud Second, we add an evidential reasoning layer as an intermediate between the user and the concept layer.\ud \ud Third, we add the functionality to process the users' relevance feedback.\ud \ud We then adapt the inference network model from text retrieval to an evidential reasoning model for multimedia query processing.\ud \ud We conclude with an outline for implementation of miRRor on top of the Monet extensible database system

    A document management methodology based on similarity contents

    Get PDF
    The advent of the WWW and distributed information systems have made it possible to share documents between different users and organisations. However, this has created many problems related to the security, accessibility, right and most importantly the consistency of documents. It is important that the people involved in the documents management process have access to the most up-to-date version of documents, retrieve the correct documents and should be able to update the documents repository in such a way that his or her document are known to others. In this paper we propose a method for organising, storing and retrieving documents based on similarity contents. The method uses techniques based on information retrieval, document indexation and term extraction and indexing. This methodology is developed for the E-Cognos project which aims at developing tools for the management and sharing of documents in the construction domain

    Evaluation of a Bayesian inference network for ligand-based virtual screening

    Get PDF
    Background Bayesian inference networks enable the computation of the probability that an event will occur. They have been used previously to rank textual documents in order of decreasing relevance to a user-defined query. Here, we modify the approach to enable a Bayesian inference network to be used for chemical similarity searching, where a database is ranked in order of decreasing probability of bioactivity. Results Bayesian inference networks were implemented using two different types of network and four different types of belief function. Experiments with the MDDR and WOMBAT databases show that a Bayesian inference network can be used to provide effective ligand-based screening, especially when the active molecules being sought have a high degree of structural homogeneity; in such cases, the network substantially out-performs a conventional, Tanimoto-based similarity searching system. However, the effectiveness of the network is much less when structurally heterogeneous sets of actives are being sought. Conclusion A Bayesian inference network provides an interesting alternative to existing tools for ligand-based virtual screening

    Exploitation des connaissances d'UMLS pour la recherche d'information médicale. Vers un modèle bayésien d'indexation

    No full text
    National audienceKnowledge-based information retrieval is widely exploited, but still not very well successful. In this paper, we aim to study the impact of explorations of a knowledge source, named UMLS meta-thesaurus, in medical domain information retrieval. The integration of semantic labels of concepts in a multi-layer indexing proves quite encouraging results in ImageCLEF 2006 evaluation forum. In order to reach a more semantic rich information retrieval system, we tend to explore the hierarchy relationships between concepts in UMLS. For this purpose, we propose a Bayesian network based model to capture the general-specific links between query concepts and documents concepts.La recherche d'information à base de connaissances est largement exploitée, mais avec peu de succès. Dans cet article, nous étudions l'impact de l'exploration d'une base de connaissance, nommée meta-thésaurus UMLS, dans la recherche d'information médicale. L'intégration des étiquettes sémantiques des concepts dans une indexation multicouche donne des résultats encouragants dans ImageCLEF 2006 forum d'évaluation. Afin d'atteindre un système de recherche d'information sémantiquement plus riche , nous explorons les relations hiérarchiques entre concepts dans UMLS. Dans ce but, nous proposons donc un modèle basé sur le réseau Bayesien pour capturer les liens général-specifique entre concepts de la requête et ceux des documents

    Non-Compositional Term Dependence for Information Retrieval

    Full text link
    Modelling term dependence in IR aims to identify co-occurring terms that are too heavily dependent on each other to be treated as a bag of words, and to adapt the indexing and ranking accordingly. Dependent terms are predominantly identified using lexical frequency statistics, assuming that (a) if terms co-occur often enough in some corpus, they are semantically dependent; (b) the more often they co-occur, the more semantically dependent they are. This assumption is not always correct: the frequency of co-occurring terms can be separate from the strength of their semantic dependence. E.g. "red tape" might be overall less frequent than "tape measure" in some corpus, but this does not mean that "red"+"tape" are less dependent than "tape"+"measure". This is especially the case for non-compositional phrases, i.e. phrases whose meaning cannot be composed from the individual meanings of their terms (such as the phrase "red tape" meaning bureaucracy). Motivated by this lack of distinction between the frequency and strength of term dependence in IR, we present a principled approach for handling term dependence in queries, using both lexical frequency and semantic evidence. We focus on non-compositional phrases, extending a recent unsupervised model for their detection [21] to IR. Our approach, integrated into ranking using Markov Random Fields [31], yields effectiveness gains over competitive TREC baselines, showing that there is still room for improvement in the very well-studied area of term dependence in IR

    Information Retrieval Models

    Get PDF
    Many applications that handle information on the internet would be completely\ud inadequate without the support of information retrieval technology. How would\ud we find information on the world wide web if there were no web search engines?\ud How would we manage our email without spam filtering? Much of the development\ud of information retrieval technology, such as web search engines and spam\ud filters, requires a combination of experimentation and theory. Experimentation\ud and rigorous empirical testing are needed to keep up with increasing volumes of\ud web pages and emails. Furthermore, experimentation and constant adaptation\ud of technology is needed in practice to counteract the effects of people that deliberately\ud try to manipulate the technology, such as email spammers. However,\ud if experimentation is not guided by theory, engineering becomes trial and error.\ud New problems and challenges for information retrieval come up constantly.\ud They cannot possibly be solved by trial and error alone. So, what is the theory\ud of information retrieval?\ud There is not one convincing answer to this question. There are many theories,\ud here called formal models, and each model is helpful for the development of\ud some information retrieval tools, but not so helpful for the development others.\ud In order to understand information retrieval, it is essential to learn about these\ud retrieval models. In this chapter, some of the most important retrieval models\ud are gathered and explained in a tutorial style
    corecore