
    Exploiting semantics for improving clinical information retrieval

    Clinical information retrieval (IR) presents several challenges, including terminology mismatch and granularity mismatch. One of the main objectives in clinical IR is to bridge the semantic gap between queries and documents and to go beyond keyword matching. To address these issues, this study uses semantic information to improve the performance of clinical IR systems by representing queries in an expressive and meaningful context. We propose two novel approaches to modeling medical query contexts. The first mines semantic-based association rules (AR) to improve clinical text retrieval: the query context is derived from the rules that cover the query and is then weighted according to its semantic relatedness to the query concepts. In the second approach, we model a representative query context by developing a query domain ontology. To build this ontology, we extract all the concepts that have a semantic relationship with the query concept(s) in the UMLS ontologies. The query context then consists of the concepts extracted from the query domain ontology, weighted according to their semantic relatedness to the query concept(s). The query context is exploited for query expansion and re-ranking of patient records to improve clinical retrieval performance. We evaluate this approach on the TREC Medical Records dataset. Results show that the proposed approach significantly improves retrieval performance compared to a classic keyword-based IR model.
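
The weighted query context described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `expand_query` and `score`, the `top_k` cutoff, and the relatedness values are all hypothetical, and a real system would derive the relatedness scores from association rules or from UMLS rather than hard-coding them.

```python
# Illustrative sketch only: names and relatedness scores are hypothetical,
# not taken from the paper.

def expand_query(query_concepts, context, top_k=5):
    """Add the top_k context concepts to the query, weighted by relatedness.

    query_concepts: list of concept strings from the original query.
    context: dict mapping a related concept to an assumed relatedness in [0, 1].
    """
    # Original query concepts keep full weight.
    expanded = {concept: 1.0 for concept in query_concepts}
    # Sort by descending relatedness, breaking ties alphabetically.
    ranked = sorted(context.items(), key=lambda kv: (-kv[1], kv[0]))
    for concept, relatedness in ranked[:top_k]:
        # Never overwrite the weight of an original query concept.
        expanded.setdefault(concept, relatedness)
    return expanded


def score(record_concepts, expanded):
    """Score a patient record as the sum of weights of the concepts it mentions."""
    return sum(expanded.get(concept, 0.0) for concept in set(record_concepts))


query = ["myocardial infarction"]
context = {"heart attack": 0.9, "troponin": 0.6, "aspirin": 0.3}  # assumed scores
expanded = expand_query(query, context, top_k=2)
records = {
    "record_a": ["heart attack", "troponin"],
    "record_b": ["aspirin"],
}
reranked = sorted(records, key=lambda r: -score(records[r], expanded))
```

In this toy example a record that says "heart attack" is retrieved even though the query says "myocardial infarction", which is the kind of terminology mismatch the abstract refers to.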

    A survey on the use of relevance feedback for information access systems

    Users of online search engines often find it difficult to express their need for information in the form of a query. However, if users can identify examples of the kind of documents they require, they can employ a technique known as relevance feedback. Relevance feedback covers a range of techniques intended to improve a user's query and facilitate retrieval of information relevant to that user's information need. In this paper we survey relevance feedback techniques. We study both automatic techniques, in which the system modifies the user's query, and interactive techniques, in which the user has control over query modification. We also consider specific interfaces to relevance feedback systems and characteristics of searchers that can affect the use and success of relevance feedback systems.
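
As one concrete instance of an automatic technique, the sketch below uses the classic Rocchio update, in which the system moves the query vector toward documents judged relevant and away from those judged non-relevant. Rocchio is chosen here for illustration; the abstract does not say which techniques the survey emphasizes. The default coefficients are commonly quoted values and should be treated as assumptions, as should the sparse dict representation.

```python
from collections import defaultdict


def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return a modified query vector using the Rocchio update.

    query: dict mapping term -> weight.
    relevant / non_relevant: lists of document vectors (dicts term -> weight).
    alpha, beta, gamma: assumed default coefficients, tune per collection.
    """
    updated = defaultdict(float)
    # Keep the original query, scaled by alpha.
    for term, weight in query.items():
        updated[term] += alpha * weight
    # Add the centroid of relevant documents, subtract that of non-relevant ones.
    for docs, coeff in ((relevant, beta), (non_relevant, -gamma)):
        if not docs:
            continue
        for doc in docs:
            for term, weight in doc.items():
                updated[term] += coeff * weight / len(docs)
    # Terms whose weight falls to zero or below are dropped.
    return {term: weight for term, weight in updated.items() if weight > 0}


new_query = rocchio(
    query={"jaguar": 1.0},
    relevant=[{"jaguar": 1.0, "cat": 1.0}],
    non_relevant=[{"jaguar": 1.0, "car": 1.0}],
)
```

Here the term "cat" enters the query with a positive weight, while "car" receives a negative weight and is dropped.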

    Probability models for information retrieval based on divergence from randomness

    This thesis devises a novel methodology, based on probability theory, for constructing term-weighting models for Information Retrieval. Our term-weighting functions are created within a general framework made up of three components, each built independently of the others. We obtain the term-weighting functions from the general model in a purely theoretical way, by instantiating each component with different forms of probability distribution. The thesis begins by investigating the nature of the statistical inference involved in Information Retrieval. We explore the estimation problem underlying the process of sampling. De Finetti’s theorem is used to show how to convert the frequentist approach into Bayesian inference, and we present and employ the derived estimation techniques in the context of Information Retrieval. We initially pay close attention to the construction of the basic sample spaces of Information Retrieval. The notion of single or multiple sampling from different populations in the context of Information Retrieval is extensively discussed and used throughout the thesis. The language modelling approach and the standard probabilistic model are studied under the same foundational view and are experimentally compared to the divergence-from-randomness approach. In revisiting the main information retrieval models in the literature, we show that even the language modelling approach can be exploited to assign term-frequency normalization to the models of divergence from randomness. Finally, we introduce a novel framework for query expansion. This framework is based on the models of divergence from randomness and can be applied to arbitrary IR models, including divergence-based, language modelling and probabilistic models. We have conducted a very large number of experiments, and the results show that the framework generates highly effective Information Retrieval models.

    The application of semantic information contained in relevance feedback in the enhancement of document re-ranking

    Easily accessed publishing channels have resulted in the problem of information overload. Conventional information retrieval models, such as the vector model or the probability model, apply the lexical information contained in relevance feedback to enhance document re-ranking. Further improvement is possible by also applying semantic information. Previous studies have approached this semantic problem through concept extraction and application. So far, a perfect solution remains elusive, and research still has new ground to cover. We have therefore proposed and tested a method intended to deepen the understanding of this field of study. The results of formal tests show that the proposed method is more effective than the baseline ranking model.

    A study of relevance feedback in vector space model

    Information Retrieval is the science of searching a huge set of documents for the information or documents that satisfy an information need. It has been an active field of research since the early 19th century, and different retrieval models have come into existence to cater to that need. This thesis starts by reviewing some of the basic information retrieval models, followed by an implementation of one of the most popular statistical retrieval models, the Vector Space Model. This model ranks the documents in the collection by a similarity measure calculated between the query and each document. The user specifies the information need, more commonly known as a query, through the visual interface provided. The query is then processed and the results are displayed to the user in ranked order. We then focus on relevance feedback, a technique that modifies the user query based on the characteristics of the document collection to improve the results. In this thesis, we explore different types and models of relevance feedback that can be applied to the Vector Space Model and how they affect the performance of the model.
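
The ranking step of the Vector Space Model can be sketched as follows. This is a minimal illustration, not the thesis implementation: it assumes tf-idf term weights with one common idf variant (`log(N/df) + 1`) and cosine similarity as the similarity measure, and the function names are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build tf-idf vectors for a list of tokenized documents.

    Returns (vectors, idf). The idf variant log(N/df) + 1 is an assumption;
    other formulations are equally common.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n / count) + 1.0 for term, count in df.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(doc).items()}
        for doc in docs
    ]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query_tokens, docs):
    """Return document indices ordered by decreasing similarity to the query."""
    vectors, idf = tfidf_vectors(docs)
    # Query terms unseen in the collection are ignored.
    query = {t: tf * idf[t] for t, tf in Counter(query_tokens).items() if t in idf}
    scores = [(cosine(query, vec), i) for i, vec in enumerate(vectors)]
    # Ties are broken by document index to keep the order deterministic.
    return [i for _, i in sorted(scores, key=lambda pair: (-pair[0], pair[1]))]


docs = [["cat", "sat"], ["dog", "ran"], ["cat", "cat", "dog"]]
order = rank(["cat"], docs)
```

A relevance feedback step would then replace the query vector with a modified one and call the same ranking routine again.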

    Using biased support vector machine in image retrieval with self-organizing map.

    Chan Chi Hang. Thesis submitted in: August 2004. Thesis (M.Phil.)--Chinese University of Hong Kong, 2005. Includes bibliographical references (leaves 105-114). Abstracts in English and Chinese.

    Abstract
    Acknowledgement
    Chapter 1: Introduction
        1.1 Problem Statement
        1.2 Major Contributions
        1.3 Publication List
        1.4 Thesis Organization
    Chapter 2: Background Survey
        2.1 Relevance Feedback Framework
            2.1.1 Relevance Feedback Types
            2.1.2 Data Distribution
            2.1.3 Training Set Size
            2.1.4 Inter-Query Learning and Intra-Query Learning
        2.2 History of Relevance Feedback Techniques
        2.3 Relevance Feedback Approaches
            2.3.1 Vector Space Model
            2.3.2 Ad-hoc Re-weighting
            2.3.3 Distance Optimization Approach
            2.3.4 Probabilistic Model
            2.3.5 Bayesian Approach
            2.3.6 Density Estimation Approach
            2.3.7 Support Vector Machine
        2.4 Presentation Set Selection
            2.4.1 Most-probable strategy
            2.4.2 Most-informative strategy
    Chapter 3: Biased Support Vector Machine for Content-Based Image Retrieval
        3.1 Motivation
        3.2 Background
            3.2.1 Regular Support Vector Machine
            3.2.2 One-class Support Vector Machine
        3.3 Biased Support Vector Machine
        3.4 Interpretation of parameters in BSVM
        3.5 Soft Label Biased Support Vector Machine
        3.6 Interpretation of parameters in Soft Label BSVM
        3.7 Relevance Feedback Using Biased Support Vector Machine
            3.7.1 Advantages of BSVM in Relevance Feedback
            3.7.2 Relevance Feedback Algorithm By BSVM
        3.8 Experiments
            3.8.1 Synthetic Dataset
            3.8.2 Real-World Dataset
            3.8.3 Experimental Results
        3.9 Conclusion
    Chapter 4: Self-Organizing Map-based Inter-Query Learning
        4.1 Motivation
        4.2 Algorithm
            4.2.1 Initialization and Replication of SOM
            4.2.2 SOM Training for Inter-Query Learning
            4.2.3 Incorporate with Intra-Query Learning
        4.3 Experiments
            4.3.1 Synthetic Dataset
            4.3.2 Real-World Dataset
            4.3.3 Experimental Results
        4.4 Conclusion
    Chapter 5: Conclusion
    Bibliography