172 research outputs found

    Document ranking with quantum probabilities

    In this thesis we investigate the use of quantum probability theory for ranking documents. Quantum probability theory is used to estimate the probability of relevance of a document given a user's query. We posit that quantum probability theory can lead to a better estimation of the probability of a document being relevant to a user's query than the common approach, i.e. the Probability Ranking Principle (PRP), which is based upon Kolmogorovian probability theory. Following our hypothesis, we formulate an analogy between the document retrieval scenario and a physical scenario, that of the double slit experiment. Through the analogy, we propose a novel ranking approach, the quantum probability ranking principle (qPRP). Key to our proposal is the presence of quantum interference. Mathematically, this is the statistical deviation between empirical observations and expected values predicted by the Kolmogorovian rule of additivity of probabilities of disjoint events in configurations such as that of the double slit experiment. We propose an interpretation of quantum interference in the document ranking scenario, and examine how quantum interference can be effectively estimated for document retrieval. To validate our proposal and to gain more insights about approaches for document ranking, we (1) analyse PRP, qPRP and other ranking approaches, exposing the assumptions underlying their ranking criteria and formulating the conditions for the optimality of the two ranking principles, (2) empirically compare three ranking principles (i.e. PRP, interactive PRP, and qPRP) and two state-of-the-art ranking strategies in two retrieval scenarios, those of ad-hoc retrieval and diversity retrieval, (3) analytically contrast the ranking criteria of the examined approaches, exposing similarities and differences, (4) study the ranking behaviours of approaches alternative to PRP in terms of the kinematics they impose on relevant documents, i.e. by considering the extent and direction of the movements of relevant documents across the ranking recorded when comparing PRP against its alternatives. Our findings show that the effectiveness of the examined ranking approaches strongly depends upon the evaluation context. In the traditional evaluation context of ad-hoc retrieval, PRP is empirically shown to be better than or comparable to alternative ranking approaches. However, when we turn to examine evaluation contexts that account for interdependent document relevance (i.e. when the relevance of a document is assessed also with respect to other retrieved documents, as is the case in the diversity retrieval scenario) then the use of quantum probability theory and thus of qPRP is shown to improve retrieval and ranking effectiveness over the traditional PRP and alternative ranking strategies, such as Maximal Marginal Relevance, Portfolio theory, and Interactive PRP. This work represents a significant step forward regarding the use of quantum theory in information retrieval. It demonstrates in fact that the application of quantum theory to problems within information retrieval can lead to improvements both in modelling power and retrieval effectiveness, allowing the construction of models that capture the complexity of information retrieval situations. Furthermore, the thesis opens up a number of lines for future research. These include (1) investigating estimations and approximations of quantum interference in qPRP, (2) exploiting complex numbers for the representation of documents and queries, and (3) applying the concepts underlying qPRP to tasks other than document ranking.
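    The deviation from Kolmogorovian additivity described in the abstract above can be illustrated with a minimal sketch. It assumes the standard two-path form of the interference term, 2·sqrt(P(A)·P(B))·cos(theta); the function names and the phase parameter `theta` are hypothetical choices for illustration, and the abstract does not say how the thesis estimates the phase.

```python
import math


def classical_union(p_a: float, p_b: float) -> float:
    """Kolmogorovian additivity for two disjoint events."""
    return p_a + p_b


def interference(p_a: float, p_b: float, theta: float) -> float:
    """Assumed two-path interference term: 2 * sqrt(p_a * p_b) * cos(theta)."""
    return 2.0 * math.sqrt(p_a * p_b) * math.cos(theta)


def quantum_union(p_a: float, p_b: float, theta: float) -> float:
    """Classical sum plus the interference term (the deviation from additivity)."""
    return classical_union(p_a, p_b) + interference(p_a, p_b, theta)
```

    In this sketch, a phase of pi/2 makes the interference term vanish and recovers the classical sum, while phases of 0 and pi give fully constructive and fully destructive interference respectively.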

    Social tagging, guppy effect, and the role of interference: A quantum-inspired model for tags combination

    Social tagging systems are shown to exhibit a well-known cognitive heuristic, the guppy effect, which arises from the combination of different concepts. We present some empirical evidence of this effect, drawn from a popular social tagging Web service. The guppy effect is then described using a quantum-inspired formalism that has already been successfully applied to model the conjunction fallacy and probability judgement errors. Key to the formalism is the concept of interference, which is able to capture and quantify the strength of the guppy effect.

    Integrating understandability in the evaluation of consumer health search engines

    In this paper we propose a method that integrates the notion of understandability, as a factor of document relevance, into the evaluation of information retrieval systems for consumer health search. We consider the gain-discount evaluation framework (RBP, nDCG, ERR) and propose two understandability-based variants (uRBP) of rank biased precision, characterised by an estimation of understandability based on document readability and by different models of how readability influences user understanding of document content. The proposed uRBP measures are empirically contrasted to RBP by comparing system rankings obtained with each measure. The findings suggest that considering understandability along with topicality in the evaluation of information retrieval systems leads to different claims about system effectiveness than considering topicality alone.
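    As a rough illustration of the gain-discount idea, the sketch below computes standard rank biased precision and one possible understandability-weighted variant. The multiplicative weighting of relevance by a per-document understandability score in [0, 1] is an assumption made here for illustration; the abstract does not give the exact form of the two uRBP variants, and the names `rbp`, `urbp` and the persistence parameter `p` are hypothetical.

```python
from typing import Sequence


def rbp(relevance: Sequence[float], p: float = 0.8) -> float:
    """Rank biased precision: (1 - p) * sum_k r_k * p^(k-1), with ranks starting at 1."""
    return (1.0 - p) * sum(r * p ** k for k, r in enumerate(relevance))


def urbp(
    relevance: Sequence[float],
    understandability: Sequence[float],
    p: float = 0.8,
) -> float:
    """Assumed variant: each document's gain is scaled by its understandability."""
    if len(relevance) != len(understandability):
        raise ValueError("relevance and understandability must have equal length")
    return (1.0 - p) * sum(
        r * u * p ** k
        for k, (r, u) in enumerate(zip(relevance, understandability))
    )
```

    When every understandability score is 1, this variant reduces to plain RBP; lower scores discount relevant documents that are harder to read, which is how two measures can rank the same systems differently.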

    Advances in formal models of search and search behaviour

    Searching is performed in the context of a task, and as such the value of the information found is relative to that task. Recently, there has been a drive to develop formal models of information seeking and retrieval that consider the costs and benefits arising through the interaction with the interface/system and the information surfaced during that interaction. In this full-day tutorial we will focus on describing and explaining some of the more recent formal models of Information Seeking and Retrieval. The tutorial is structured into two parts. In the first part we will present a series of models that have been developed based on: (i) economic theory, (ii) decision theory, (iii) game theory and (iv) optimal foraging theory. The second part of the day will be dedicated to building models, where we will discuss different techniques to build and develop models from which we can draw testable hypotheses. During the tutorial participants will be challenged to develop various formal models, applying the techniques learnt during the day. We will then conclude with presentations on solutions followed by a summary and overview of challenges and future directions. This tutorial is aimed at participants wanting to know more about the various formal models of information seeking, search and retrieval that have been proposed. The tutorial will be presented at an intermediate level, and is designed to support participants who want to be able to understand and build such models.

    ChatGPT Hallucinates when Attributing Answers

    Can ChatGPT provide evidence to support its answers? Does the evidence it suggests actually exist and does it really support its answer? We investigate these questions using a collection of domain-specific knowledge-based questions, specifically prompting ChatGPT to provide both an answer and supporting evidence in the form of references to external sources. We also investigate how different prompts impact answers and evidence. We find that ChatGPT provides correct or partially correct answers in about half of the cases (50.6% of the time), but its suggested references exist only 14% of the time. We further provide insights on the generated references that reveal common traits among the references that ChatGPT generates, and show that even if a reference provided by the model does exist, this reference often does not support the claims ChatGPT attributes to it. Our findings are important because (1) they are the first systematic analysis of the references created by ChatGPT in its answers; (2) they suggest that the model may leverage good quality information in producing correct answers, but is unable to attribute real evidence to support its answers. Prompts, raw result files and manual analysis are made publicly available.

    Collagen Fiber Re-Alignment and Uncrimping in Response to Loading: Determining Structure-Function Relationships Using a Developmental Tendon Mouse Model

    Collagen fiber re-alignment and uncrimping are postulated mechanisms of structural response to load. It has been suggested that fibers re-orient in the direction of load and then uncrimp before collagen is tensioned and that, in general, the structure is a result of the function tendons perform. However, little is known about how fiber re-alignment and uncrimping change in response to load, how this change relates to tendon mechanical properties, and whether these changes are dependent on the underlying structure. Throughout postnatal development, dramatic structural and compositional changes occur in tendon. Postnatal tendons, with immature collagen networks, may respond to load in a different manner and timescale than mature collagen networks. Therefore, the overall objective of this study was to quantify the mechanical properties and structural response to load in a developmental mouse tendon model at 4, 10, 28 and 90 days old. Local collagen fiber re-alignment and crimp frequency were quantified throughout mechanical testing and local mechanical properties were measured. Throughout development, fiber re-alignment occurred at different points in the mechanical testing protocol. At early development, re-alignment was not identified until the linear region (4 days) or toe region (10 days) of the mechanical test, suggesting that fibers required a prolonged exposure to mechanical load before responding and that the immature collagen network present may delay re-alignment. The uncrimping of collagen fibers was identified during the toe region of the mechanical test at all ages, suggesting that crimp contributes to tendon nonlinear behavior. Additionally, results at 28 and 90 days suggested that collagen fiber crimp frequency decreased with increasing number of preconditioning cycles, which may affect toe-region properties. Mechanical properties and cross-sectional area increased throughout development. The insertion site demonstrated lower moduli values and a more disorganized fiber distribution compared to the midsubstance at all ages, suggesting it experiences multi-axial loads. Further, the tendon locations demonstrated different re-alignment and crimp behaviors, suggesting that locations may respond to load differently and develop at different rates. Results from this study suggest that structure affects the tendon's ability to respond to load and that the loading protocol applied may affect the measurement of mechanical properties.

    Choices in Knowledge-Base Retrieval for Consumer Health Search

    This paper investigates how retrieval using knowledge bases can be effectively translated to the consumer health search (CHS) domain. We posit that using knowledge bases for query reformulation may help to overcome some of the challenges in CHS. However, translating and implementing such approaches is nontrivial in CHS as it involves many design choices. We empirically evaluated the impact these different choices had on retrieval effectiveness. A state-of-the-art knowledge-base retrieval model, the Entity Query Feature Expansion model, was used to evaluate the following design choices: which knowledge base to use (specialised vs. generic), how to construct the knowledge base, how to extract entities from queries and map them to entities in the knowledge base, what part of the knowledge base to use for query expansion, and whether to augment the KB search process with relevance feedback. While knowledge base retrieval has been proposed as a solution for CHS, this paper delves into the finer details of doing this effectively, highlighting both pitfalls and payoffs. It aims to provide some lessons to others in advancing the state-of-the-art in CHS.

    Beyond CO2 Emissions: The Overlooked Impact of Water Consumption of Information Retrieval Models

    As in other fields of artificial intelligence, the information retrieval community has grown interested in investigating the power consumption associated with neural models, particularly models of search. This interest has become particularly relevant as the energy consumption of information retrieval models has risen with new neural models based on large language models, leading to an associated increase of CO2 emissions, albeit relatively low compared to fields such as natural language processing