5,042 research outputs found

    Explainable Information Retrieval: A Survey

    Full text link
    Explainable information retrieval is an emerging research area aiming to make transparent and trustworthy information retrieval systems. Given the increasing use of complex machine learning models in search systems, explainability is essential in building and auditing responsible information retrieval models. This survey fills a vital gap in the otherwise topically diverse literature of explainable information retrieval. It categorizes and discusses recent explainability methods developed for different application domains in information retrieval, providing a common framework and unifying perspectives. In addition, it reflects on the common concern of evaluating explanations and highlights open challenges and opportunities.Comment: 35 pages, 10 figures. Under revie

    Composing Measures for Computing Text Similarity

    Get PDF
    We present a comprehensive study of computing similarity between texts. We start from the observation that while the concept of similarity is well grounded in psychology, text similarity is much less well-defined in the natural language processing community. We thus define the notion of text similarity and distinguish it from related tasks such as textual entailment and near-duplicate detection. We then identify multiple text dimensions, i.e. characteristics inherent to texts that can be used to judge text similarity, for which we provide empirical evidence. We discuss state-of-the-art text similarity measures previously proposed in the literature, before continuing with a thorough discussion of common evaluation metrics and datasets. Based on the analysis, we devise an architecture which combines text similarity measures in a unified classification framework. We apply our system in two evaluation settings, for which it consistently outperforms prior work and competing systems: (a) an intrinsic evaluation in the context of the Semantic Textual Similarity Task as part of the Semantic Evaluation (SemEval) exercises, and (b) an extrinsic evaluation for the detection of text reuse. As a basis for future work, we introduce DKPro Similarity, an open source software package which streamlines the development of text similarity measures and complete experimental setups

    Complexity over Uncertainty in Generalized Representational\ud Information Theory (GRIT): A Structure-Sensitive General\ud Theory of Information

    Get PDF
    What is information? Although researchers have used the construct of information liberally to refer to pertinent forms of domain-specific knowledge, relatively few have attempted to generalize and standardize the construct. Shannon and Weaver(1949)offered the best known attempt at a quantitative generalization in terms of the number of discriminable symbols required to communicate the state of an uncertain event. This idea, although useful, does not capture the role that structural context and complexity play in the process of understanding an event as being informative. In what follows, we discuss the limitations and futility of any generalization (and particularly, Shannonā€™s) that is not based on the way that agents extract patterns from their environment. More specifically, we shall argue that agent concept acquisition, and not the communication of\ud states of uncertainty, lie at the heart of generalized information, and that the best way of characterizing information is via the relative gain or loss in concept complexity that is experienced when a set of known entities (regardless of their nature or domain of origin) changes. We show that Representational Information Theory perfectly captures this crucial aspect of information and conclude with the first generalization of Representational Information Theory (RIT) to continuous domains

    The use of data-mining for the automatic formation of tactics

    Get PDF
    This paper discusses the usse of data-mining for the automatic formation of tactics. It was presented at the Workshop on Computer-Supported Mathematical Theory Development held at IJCAR in 2004. The aim of this project is to evaluate the applicability of data-mining techniques to the automatic formation of tactics from large corpuses of proofs. We data-mine information from large proof corpuses to find commonly occurring patterns. These patterns are then evolved into tactics using genetic programming techniques
    • ā€¦
    corecore