3,323 research outputs found
A survey on the use of relevance feedback for information access systems
Users of online search engines often find it difficult to express their need for information in the form of a query. However, if the user can identify examples of the kind of documents they require then they can employ a technique known as relevance feedback. Relevance feedback covers a range of techniques intended to improve a user's query and facilitate retrieval of information relevant to a user's information need. In this paper we survey relevance feedback techniques. We study both automatic techniques, in which the system modifies the user's query, and interactive techniques, in which the user has control over query modification. We also consider specific interfaces to relevance feedback systems and characteristics of searchers that can affect the use and success of relevance feedback systems
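One widely used automatic technique of this kind is the Rocchio update, in which the system reweights the query towards documents the user marked relevant and away from those marked non-relevant. The abstract does not name a specific method, so the sketch below is only an illustration: the function name `rocchio`, the dict-based term vectors, and the default coefficients `alpha`, `beta` and `gamma` are all assumptions.

```python
from collections import Counter


def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return a reweighted query (term -> weight) using the Rocchio formula.

    query:        dict term -> weight for the original query
    relevant:     list of dicts (term -> weight) judged relevant by the user
    non_relevant: list of dicts (term -> weight) judged non-relevant
    The coefficient defaults are illustrative assumptions, not tuned values.
    """
    new_query = Counter()
    for term, weight in query.items():
        new_query[term] += alpha * weight
    # Add the centroid of relevant documents, subtract that of non-relevant ones.
    for docs, coeff in ((relevant, beta), (non_relevant, -gamma)):
        if not docs:
            continue
        for doc in docs:
            for term, weight in doc.items():
                new_query[term] += coeff * weight / len(docs)
    # Terms whose weight falls to zero or below are dropped from the query.
    return {term: w for term, w in new_query.items() if w > 0}


if __name__ == "__main__":
    updated = rocchio(
        {"jaguar": 1.0},
        relevant=[{"jaguar": 1.0, "cat": 1.0}],
        non_relevant=[{"jaguar": 1.0, "car": 1.0}],
    )
    print(updated)
```

In the interactive variants the survey mentions, the same reweighted terms could instead be shown to the user as suggestions rather than applied automatically.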
Investigating ontology based query expansion using a probabilistic retrieval model
This research briefly outlines the problems of traditional information retrieval systems and discusses the different approaches to inferring context in document retrieval. By context we mean word disambiguation, which is achieved by exploring the generalisation-specialisation hierarchies within a given ontology. Specifically, we examine the use of ontology-based query expansion for defining query context. Query expansion can be done in many ways, and in this work we consider the use of relevance feedback and pseudo-relevance feedback for query expansion. We compare the two to ascertain whether their performance differs. The information retrieval system used is based on the probabilistic retrieval model, and the query expansion method is extended using information from a news domain ontology. The aim of this project is to assess the impact of the use of the ontology on the query expansion results. Our results show that ontology-based query expansion retrieves a higher number of relevant documents than the standard relevance feedback process. Overall, ontology-based query expansion improves recall but does not produce any significant improvement in precision. Pseudo-relevance feedback achieved better results than relevance feedback. We also found that reducing or increasing the relevance feedback parameters (number of terms or number of documents) does not correlate with the results. When comparing the effect of varying the number-of-terms parameter with that of varying the number-of-documents parameter, the former benefits the pseudo-relevance feedback results but the latter has an additional effect on the relevance feedback results. There are many factors which influence the success of ontology-based query expansion. The thesis discusses these factors and gives some guidelines on using ontologies for the purpose of query expansion
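The interaction between the two feedback parameters (number of documents and number of terms) and the ontology can be sketched roughly as follows. This is a simplified illustration, not the thesis's method: it selects feedback terms by raw frequency rather than by a probabilistic term weight, and the function name `expand_query`, the token-list documents, and the dict-based toy ontology are assumptions.

```python
from collections import Counter


def expand_query(query_terms, ranked_docs, ontology, num_docs=2, num_terms=3):
    """Expand a query using pseudo-relevance feedback plus a toy ontology.

    query_terms: list of query tokens
    ranked_docs: list of token lists, best-ranked first; the top `num_docs`
                 are simply assumed relevant (pseudo-relevance feedback)
    ontology:    dict term -> list of related (broader or narrower) terms
    """
    counts = Counter()
    for doc in ranked_docs[:num_docs]:
        counts.update(t for t in doc if t not in query_terms)
    # Raw frequency is a stand-in for a proper term-selection weight.
    feedback_terms = [t for t, _ in counts.most_common(num_terms)]
    # Ontology step: add terms related to the original query terms.
    ontology_terms = [r for t in query_terms for r in ontology.get(t, [])]

    expanded = list(query_terms)
    for term in feedback_terms + ontology_terms:
        if term not in expanded:
            expanded.append(term)
    return expanded


if __name__ == "__main__":
    docs = [
        ["election", "vote", "poll"],
        ["vote", "election", "ballot"],
        ["football"],
    ]
    toy_ontology = {"election": ["referendum"]}  # hypothetical news-domain entry
    print(expand_query(["election"], docs, toy_ontology, num_docs=2, num_terms=1))
    # prints ['election', 'vote', 'referendum']
```

For true relevance feedback, `ranked_docs[:num_docs]` would be replaced by the documents a user actually judged relevant.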
Ontology driven information retrieval.
Ontology-driven information retrieval deals with the use of entities specified in domain ontologies to enhance search and browse. The entities or concepts of lightweight ontological resources are traditionally used to index resources in specialised domains. Indexing with concepts is often achieved manually and reusing them to enhance search remains a challenge. Other challenges range from the difficulty in merging multiple ontologies for use in retrieval to the problem of integrating concept-based search into existing search systems. We mainly encounter these challenges in enterprise search environments, which have not kept pace with Web search engines and mostly rely on full-text search systems. Full-text search systems are keyword-based and suffer from well-known vocabulary mismatch problems. Ontologies model domain knowledge and have the potential for use in understanding the unstructured content of documents. In this thesis, we investigate the challenges of using domain ontologies for enhancing search in enterprise systems. Firstly, we investigate methods for annotating documents by identifying the best concepts that represent their contents. We explore ways to overcome the challenges of insufficient textual features in lightweight ontologies and introduce an unsupervised method for annotating documents based on generating concept descriptors from external resources. Specifically, we augment concepts with descriptive textual content by exploiting the taxonomic structure of an ontology to ensure that we generate useful descriptors. Secondly, the need often arises for cross-ontology reasoning when using multiple ontologies in ontology-driven search. Once again, we attempt to overcome the absence of rich features in lightweight ontologies by exploring the use of background knowledge for the alignment process. 
We propose novel ontology alignment techniques which integrate string metrics, semantic features, and term weights for discovering diverse correspondence types in supervised and unsupervised ontology alignment. Thirdly, we investigate different representational schemes for queries and documents and explore semantic ranking models using conceptual representations. Accordingly, we propose a semantic ranking model that incorporates the knowledge of concept relatedness and a predictive model to apply semantic ranking only when it is deemed beneficial for retrieval. Finally, we conduct comprehensive evaluations of the proposed methods and discuss our findings
Semantically enhanced information retrieval: an ontology-based approach
Unpublished doctoral thesis. Universidad Autónoma de Madrid, Escuela Politécnica Superior, January 2009. Bibliography: pp. [227]-240
Mining the Medical and Patent Literature to Support Healthcare and Pharmacovigilance
Recent advancements in healthcare practices and the increasing use of information technology in the medical domain have led to the rapid generation of free-text data in the form of scientific articles, e-health records, patents, and document inventories. This has urged the development of sophisticated information retrieval and information extraction technologies. A fundamental requirement for the automatic processing of biomedical text is the identification of information-carrying units such as concepts or named entities. In this context, this work focuses on the identification of medical disorders (such as diseases and adverse effects), which denote an important category of concepts in medical text. Two methodologies were investigated in this regard: dictionary-based and machine learning-based approaches. Furthermore, the capabilities of the concept recognition techniques were systematically exploited to build a semantic search platform for the retrieval of e-health records and patents. The system facilitates conventional text search as well as semantic and ontological searches. Performance of the adapted retrieval platform for e-health records and patents was evaluated within open assessment challenges (i.e. TRECMED and TRECCHEM respectively), wherein the system was best rated in comparison to several other competing information retrieval platforms. Finally, from the medico-pharma perspective, a strategy for the identification of adverse drug events from medical case reports was developed. Qualitative evaluation as well as an expert validation of the developed system's performance showed robust results. In conclusion, this thesis presents approaches for efficient information retrieval and information extraction from various biomedical literature sources in support of healthcare and pharmacovigilance. The applied strategies have the potential to enhance the literature searches performed by biomedical, healthcare, and patent professionals.
This can promote literature-based knowledge discovery, improve the safety and effectiveness of medical practices, and drive research and development in the medical and healthcare arena
Examining the Application of Modular and Contextualised Ontology in Query Expansions for Information Retrieval
This research considers the ongoing challenge of semantics-based search from the perspective of how to exploit Semantic Web languages for search in the current Web environment.
The purpose of the PhD was to use ontology-based query expansion (OQE) to improve search effectiveness by increasing search precision, i.e. retrieving relevant documents in the topmost ranked positions in a returned document list. Query experiments have required a novel search tool that can combine Semantic Web technologies in an otherwise traditional IR process using a Web document collection
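Precision in the topmost ranked positions is commonly measured as precision at rank k. The following is a minimal sketch of that measure, assuming document identifiers as strings and a set of known relevant identifiers; the function name `precision_at_k` is illustrative and is not taken from the thesis.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k ranked documents that are relevant.

    ranked_ids:   list of document ids in ranked order, best first
    relevant_ids: set of ids judged relevant for the query
    k:            cut-off rank (must be positive)
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    top_k = ranked_ids[:k]
    # Dividing by k (not len(top_k)) penalises result lists shorter than k.
    return sum(1 for doc_id in top_k if doc_id in relevant_ids) / k


if __name__ == "__main__":
    ranking = ["d3", "d1", "d7", "d2"]
    relevant = {"d1", "d2"}
    print(precision_at_k(ranking, relevant, 2))  # prints 0.5
```

Comparing this value before and after query expansion, at a small k, is one way to check whether expansion moves relevant documents towards the top of the list.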
PowerAqua: Open Question Answering on the Semantic Web
With the rapid growth of semantic information in the Web, the processes of searching and querying these very large amounts of heterogeneous content have become increasingly challenging. This research tackles the problem of supporting users in querying and exploring information across multiple and heterogeneous Semantic Web (SW) sources.
A review of literature on ontology-based Question Answering reveals the limitations of existing technology. Our approach is based on providing a natural language Question Answering interface for the SW, PowerAqua. The realization of PowerAqua represents a considerable advance with respect to other systems, which restrict their scope to an ontology-specific or homogeneous fraction of the publicly available SW content. To our knowledge, PowerAqua is the only system that is able to take advantage of the semantic data available on the Web to interpret and answer user queries posed in natural language. In particular, PowerAqua is uniquely able to answer queries by combining and aggregating information, which can be distributed across heterogeneous semantic resources.
Here, we provide a complete overview of our work on PowerAqua, including: the research challenges it addresses; its architecture; the techniques we have realised to map queries to semantic data, to integrate partial answers drawn from different semantic resources and to rank alternative answers; and the evaluation studies we have performed, to assess the performance of PowerAqua. We believe our experiences can be extrapolated to a variety of end-user applications that wish to open up to large scale and heterogeneous structured datasets, to be able to exploit effectively what possibly is the greatest wealth of data in the history of Artificial Intelligence
Formal concept matching and reinforcement learning in adaptive information retrieval
The superiority of the human brain in information retrieval (IR) tasks seems to come firstly from its ability to read and understand the concepts, ideas or meanings central to documents, in order to reason out the usefulness of documents to information needs, and secondly from its ability to learn from experience and be adaptive to the environment. In this work we attempt to incorporate these properties into the development of an IR model to improve document retrieval. We investigate the applicability of concept lattices, which are based on the theory of Formal Concept Analysis (FCA), to the representation of documents. This allows the use of more elegant representation units, as opposed to keywords, in order to better capture concepts/ideas expressed in natural language text. We also investigate the use of a reinforcement learning strategy to learn and improve document representations, based on the information present in query statements and user relevance feedback. Features or concepts of each document/query, formulated using FCA, are weighted separately with respect to the documents they are in, and organised into separate concept lattices according to a subsumption relation. Furthermore, each concept lattice is encoded in a two-layer neural network structure known as a Bidirectional Associative Memory (BAM), for efficient manipulation of the concepts in the lattice representation. This avoids implementation drawbacks faced by other FCA-based approaches. Retrieval of a document for an information need is based on concept matching between concept lattice representations of a document and a query. The learning strategy works by making the similarity of relevant documents stronger and non-relevant documents weaker for each query, depending on the relevance judgements of the users on retrieved documents. Our approach is radically different to existing FCA-based approaches in the following respects: concept formulation; weight assignment to object-attribute pairs; the representation of each document in a separate concept lattice; and encoding concept lattices in BAM structures. Furthermore, in contrast to the traditional relevance feedback mechanism, our learning strategy makes use of relevance feedback information to enhance document representations, thus making the document representations dynamic and adaptive to the user interactions. The results obtained on the CISI, CACM and ASLIB Cranfield collections are presented and compared with published results. In particular, the performance of the system is shown to improve significantly as the system learns from experience.
The School of Computing, University of Plymouth, UK
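For readers unfamiliar with Formal Concept Analysis, the sketch below derives the formal concepts of a small object-attribute context, the standard textbook construction that underlies a concept lattice. It is not the thesis's model, which builds a separate weighted lattice per document and encodes it in a BAM; the function name `formal_concepts` and the document-to-terms context are assumptions, and the brute-force enumeration is only practical for very small inputs.

```python
from itertools import combinations


def formal_concepts(context):
    """Enumerate all formal concepts of a small context.

    context: dict mapping each object to its set of attributes.
    Returns a set of (extent, intent) pairs of frozensets, where the extent
    is exactly the set of objects sharing every attribute in the intent.
    Brute force over object subsets: exponential, for illustration only.
    """
    objects = list(context)
    all_attrs = set().union(*context.values())
    concepts = set()
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            # Intent: attributes common to every object in the subset.
            intent = set(all_attrs)
            for obj in subset:
                intent &= context[obj]
            # Extent: every object that has all attributes of the intent.
            extent = {obj for obj in objects if intent <= context[obj]}
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


if __name__ == "__main__":
    toy_context = {  # hypothetical documents and index terms
        "d1": {"retrieval", "query"},
        "d2": {"retrieval", "lattice"},
    }
    for extent, intent in sorted(formal_concepts(toy_context), key=lambda c: len(c[0])):
        print(sorted(extent), sorted(intent))
```

Ordering the resulting concepts by inclusion of their extents gives the subsumption relation that a concept lattice is organised around.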