    A Critical Look at the Evaluation of Knowledge Graph Question Answering

    PhD thesis in Information Technology.

    The field of information retrieval (IR) is concerned with systems that “make a given stored collection of information items available to a user population” [111]. The way in which information is made available to the user depends on the formulation of this broad concern of IR into specific tasks by which a system should address a user’s information need [85]. The specific IR task also dictates how the user may express their information need. The classic IR task is ad hoc retrieval, where the user issues a query to the system and gets in return a list of documents ranked by the estimated relevance of each document to the query [85]. However, it has long been acknowledged that users are often looking for answers to questions, rather than an entire document or a ranked list of documents [17, 141]. Question answering (QA) is thus another IR task; it comes in many flavors, but overall consists of taking in a user’s natural language (NL) question and returning an answer.

    This thesis describes work done within the scope of the QA task. The flavor of QA called knowledge graph question answering (KGQA) is taken as the primary focus; it enables QA with factual questions against structured data in the form of a knowledge graph (KG). This means the KGQA system addresses a structured representation of knowledge rather than, as in other QA flavors, an unstructured prose context. KGs have the benefit that, given some identified entities or predicates, all associated properties are available and relationships can be utilized. KGQA thus enables users to access structured data using only NL questions, without requiring formal query language expertise. Even so, the construction of satisfactory KGQA systems remains a challenge.

    Machine learning with deep neural networks (DNNs) is a far more promising approach than manually engineering retrieval models [29, 56, 130]. The current era dominated by DNNs began with seminal work on computer vision, where the deep learning paradigm demonstrated its first cases of “superhuman” performance [32, 71]. Subsequent work in other applications has also demonstrated “superhuman” performance with DNNs [58, 87]. As a result of its early position, and hence longer history, as a leading application of deep learning, computer vision with DNNs has been bolstered with much work on different approaches to augmenting [120] or synthesizing [94] additional training data. The difficulty with machine learning approaches to KGQA appears to rest in large part with the limited volume, quality, and variety of available datasets for this task. Compared to labeled image data for computer vision, the problems of data collection, augmentation, and synthesis are only to a limited extent solved for QA, and especially for KGQA. There are few datasets for KGQA overall, and little previous work has found unsupervised or semi-supervised learning approaches to address the sparsity of data. Instead, neural network approaches to KGQA rely on either fully or weakly supervised learning [29].

    We are thus concerned with neural models trained in a supervised setting to perform QA tasks, especially of the KGQA flavor. Given a clear task to delegate to a computational system, we want the task performed as well as possible. However, what methodological elements are important to ensure good system performance within the chosen scope? How should the quality of system performance be assessed?
    This thesis describes work done to address these overarching questions through a number of more specific research questions. Altogether, we designate the topic of this thesis as KGQA evaluation, which we address in a broad sense, encompassing four subtopics: (1) the impact on performance of the volume of training data provided; (2) the information leakage between training and test splits due to unhygienic data partitioning; (3) the naturalness of NL questions resulting from a common approach for generating KGQA datasets; and (4) the axiomatic analysis and development of evaluation measures for a specific flavor of the KGQA task. Each of the four subtopics is informed by previous work, but we aim in this thesis to critically examine the assumptions of previous work to uncover, verify, or address weaknesses in current practices surrounding KGQA evaluation.
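
    To make the KGQA task concrete, the following is a minimal sketch of answering a factual NL question over a toy knowledge graph. The entities, predicates, and the hand-written question-to-query mapping are all invented for illustration; real KGQA systems learn this mapping from data and target large KGs such as Wikidata or DBpedia.

```python
# A minimal, self-contained sketch of the KGQA task: a factual natural
# language question is mapped to a structured query over a small toy
# knowledge graph. All entity and predicate names here are invented.
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/")

# Toy knowledge graph: (subject, predicate, object) triples.
kg = Graph()
kg.add((EX.Oslo, EX.capitalOf, EX.Norway))
kg.add((EX.Oslo, EX.population, Literal(709037)))

# A drastically simplified "semantic parse": the question is matched to
# a query template by hand; neural KGQA systems learn this mapping.
question = "What is the capital of Norway?"
sparql = """
    SELECT ?city WHERE {
        ?city <http://example.org/capitalOf> <http://example.org/Norway> .
    }
"""
for row in kg.query(sparql):
    print(row.city)  # -> http://example.org/Oslo
```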

    A Defense of Pure Connectionism

    Connectionism is an approach to neural-networks-based cognitive modeling that encompasses the recent deep learning movement in artificial intelligence. It came of age in the 1980s, with its roots in cybernetics and earlier attempts to model the brain as a system of simple parallel processors. Connectionist models center on statistical inference within neural networks with empirically learnable parameters, which can be represented as graphical models. More recent approaches focus on learning and inference within hierarchical generative models. Contra influential and ongoing critiques, I argue in this dissertation that the connectionist approach to cognitive science possesses in principle (and, as is becoming increasingly clear, in practice) the resources to model even the most rich and distinctly human cognitive capacities, such as abstract, conceptual thought and natural language comprehension and production. Consonant with much previous philosophical work on connectionism, I argue that a core principle—that proximal representations in a vector space have similar semantic values—is the key to a successful connectionist account of the systematicity and productivity of thought, language, and other core cognitive phenomena. My work here differs from preceding work in philosophy in several respects: (1) I compare a wide variety of connectionist responses to the systematicity challenge and isolate two main strands that are both historically important and reflected in ongoing work today: (a) vector symbolic architectures and (b) (compositional) vector space semantic models; (2) I consider very recent applications of these approaches, including their deployment on large-scale machine learning tasks such as machine translation; (3) I argue, again mostly on the basis of recent developments, for a continuity in representation and processing across natural language, image processing, and other domains; (4) I explicitly link broad, abstract features of connectionist representation to recent proposals in cognitive science that are similar in spirit, such as hierarchical Bayesian and free energy minimization approaches, and offer a single rebuttal of criticisms of these related paradigms; (5) I critique recent alternative proposals that argue for a hybrid Classical (i.e., serial symbolic)/statistical model of mind; (6) I argue that defending the most plausible form of a connectionist cognitive architecture requires rethinking certain distinctions that have figured prominently in the history of the philosophy of mind and language, such as that between word- and phrase-level semantic content, and between inference and association.
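
    The core principle the author appeals to, that proximity in a vector space tracks similarity of semantic value, can be illustrated with a toy example. The vectors below are invented stand-ins for learned word embeddings (e.g., from word2vec or a transformer encoder); only the qualitative contrast matters.

```python
# A minimal sketch of the vector space semantics principle: nearby
# points in a vector space carry similar semantic values. The three toy
# vectors are invented; real systems learn such embeddings from data.
import numpy as np

def cosine(u, v):
    """Cosine similarity: 1.0 for identical directions, ~0 for unrelated."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

embeddings = {
    "king":  np.array([0.90, 0.75, 0.10]),
    "queen": np.array([0.88, 0.70, 0.20]),
    "apple": np.array([0.05, 0.10, 0.95]),
}

print(cosine(embeddings["king"], embeddings["queen"]))  # high: similar meaning
print(cosine(embeddings["king"], embeddings["apple"]))  # low: dissimilar
```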

    Unsupervised structure induction and multimodal grounding

    Structured representations build upon symbolic abstraction (e.g., words in natural language and visual concepts in natural images), offer a principled way of encoding our perceptions about the physical world, and enable the human-like generalization of machine learning systems. The predominant paradigm for learning structured representations of observed data has been supervised learning, but it is limited in several respects. First, supervised learning is challenging given the scarcity of labeled data. Second, conventional approaches to structured prediction have relied on a single modality (e.g., either images or text), ignoring the learning cues that may be specified in, and can be readily obtained from, other modalities of data. In this thesis, we investigate unsupervised approaches to structure induction in a multimodal setting. Unsupervised learning is inherently difficult in general, let alone inducing complex and discrete structures from data without direct supervision. By considering the multimodal setting, we leverage the alignments between different data modalities (e.g., text, audio, and images) to facilitate the learning of structure-induction models; e.g., knowing that the individual words in "a white pigeon" always appear with the same visual object, a language parser is likely to treat them as a whole (i.e., as a phrase). The multimodal learning setting is practically viable because multimodal alignments are generally abundant. For example, they can be found in online posts such as news and tweets that usually contain images and associated text, and in (YouTube) videos, where audio, scripts, and scenes are synchronized and grounded in each other. We develop structure-induction models, which are capable of exploiting bimodal image-text alignments, for two modalities: (1) for natural language, we consider unsupervised syntactic parsing with phrase-structure grammars and regularize the parser by using visual image groundings; and (2) for visual images, we induce scene graph representations by mapping arguments and predicates in the text to their visual counterparts (i.e., visual objects and relations among them) in an unsupervised manner. While useful, crossmodal alignments are not always abundantly available on the web, e.g., the alignments between non-speech audio and text. We tackle this challenge by sharing the visual modality between image-text alignment and image-audio alignment; images function as a pivot and connect audio and text. The contributions of this thesis span from model development to data collection. We demonstrate the feasibility of applying multimodal learning techniques to unsupervised structure induction and multimodal alignment collection. Our work opens up new avenues for multimodal and unsupervised structured representation learning.
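
    One common way to turn such image-text alignments into a learning signal without any labels, sketched below, is a symmetric contrastive objective that scores each image against every caption in a batch and pushes matched pairs together. The random features are stand-ins for real encoder outputs; this illustrates the general alignment idea, not the specific grammar-induction models developed in the thesis.

```python
# A minimal sketch of contrastive (InfoNCE-style) image-text alignment:
# paired image and text embeddings are pulled together and mismatched
# pairs pushed apart. Random vectors stand in for encoder outputs.
import numpy as np

rng = np.random.default_rng(0)
batch, dim = 4, 8
img = rng.normal(size=(batch, dim))  # image encoder outputs (stand-ins)
txt = rng.normal(size=(batch, dim))  # text encoder outputs (stand-ins)

# Normalize, then score every image against every caption in the batch.
img /= np.linalg.norm(img, axis=1, keepdims=True)
txt /= np.linalg.norm(txt, axis=1, keepdims=True)
logits = img @ txt.T  # (batch, batch); the diagonal holds the true pairs

# Cross-entropy with the diagonal as the target: each image should rank
# its own caption highest. Minimizing this aligns the two modalities.
log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
loss = -np.mean(np.diag(log_softmax))
print(f"contrastive alignment loss: {loss:.3f}")
```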

    A Promethean Philosophy of External Technologies, Empiricism, & the Concept: Second-Order Cybernetics, Deep Learning, and Predictive Processing

    Beginning with a survey of the shortcomings of theories of organology/media-as-externalization of mind/body—a philosophical-anthropological tradition that stretches from Plato through Ernst Kapp and finds its contemporary proponent in Bernard Stiegler—I propose that the phenomenological treatment of media as an outpouching and extension of mind qua intentionality is not sufficient to counter the 'black-box' mystification of today's deep learning algorithms. Focusing on a close study of Simondon's On the Existence of Technical Objects and Individuation, I argue that the process-philosophical work of Gilbert Simondon, with its critique of Norbert Wiener's first-order cybernetics, offers a precursor to the conception of second-order cybernetics (as endorsed by Francisco Varela, Humberto Maturana, and Ricardo B. Uribe) and, specifically, its autopoietic treatment of information. It has been argued by those such as Frank Pasquale that neuro-inferential deep learning systems premised on predictive patterning, such as AlphaGo Zero, have a veiled logic and, thus, are 'black boxes'. In detailing a philosophical-historical approach to demystifying predictive patterning/processing and the logic of such deep learning algorithms, this paper attempts to shine a light on such systems and their inner workings à la Simondon.

    Physics Avoidance & Cooperative Semantics: Inferentialism and Mark Wilson’s Engagement with Naturalism Qua Applied Mathematics

    Mark Wilson argues that the standard categorizations of "Theory T thinking"—logic-centered conceptions of scientific organization (canonized via the logical empiricists in the mid-twentieth century)—dampen the understanding and appreciation of the strategic subtleties at work within science. By "Theory T thinking," we mean the simplistic methodology in which mathematical science allegedly supplies 'processes' that parallel nature's own in a tidily isomorphic fashion, wherein "Theory T's" feigned rigor and methodological dogmas advance inadequate discrimination that fails to distinguish between explanatory structures that are architecturally distinct. One of Wilson's main goals is to reverse such premature exclusions, and thus early on Wilson returns to John Locke's original physical concerns regarding material science and the congeries of descriptive concerns involved in capturing the varied phenomena (i.e., cohesion, elasticity, fracture, and the transmission of coherent work) encountered amongst ordinary solids like wood and steel. Of course, Wilson methodologically updates this purview by appealing to the multiscalar techniques of modern computing, drawing from Robert Batterman's work on the greediness of scales and Jim Woodward's insights on causation.

    A Survey of AI Music Generation Tools and Models

    In this work, we provide a comprehensive survey of AI music generation tools, including both research projects and commercialized applications. To conduct our analysis, we classified music generation approaches into three categories: parameter-based, text-based, and visual-based. Our survey highlights the diverse possibilities and functional features of these tools, which cater to a wide range of users, from regular listeners to professional musicians. We observed that each tool has its own set of advantages and limitations. As a result, we have compiled a comprehensive list of these factors that should be considered during the tool selection process. Moreover, our survey offers critical insights into the underlying mechanisms and challenges of AI music generation.

    Cracking-Resistant Password Vaults using Natural Language Encoders

    Password vaults are increasingly popular applications that store multiple passwords encrypted under a single master password that the user memorizes. A password vault can greatly reduce the burden on a user of remembering passwords, but introduces a single point of failure. An attacker that obtains a user’s encrypted vault can mount offline brute-force attacks and, if successful, compromise all of the passwords in the vault. In this paper, we investigate the construction of encrypted vaults that resist such offline cracking attacks and force attackers instead to mount online attacks. Our contributions are as follows. We present an attack and supporting analysis showing that a previous design for cracking-resistant vaults—the only one of which we are aware—actually degrades security relative to conventional password-based approaches. We then introduce a new type of secure encoding scheme that we call a natural language encoder (NLE). An NLE permits the construction of vaults which, when decrypted with the wrong master password, produce plausible-looking decoy passwords. We show how to build NLEs using existing tools from natural language processing, such as n-gram models and probabilistic context-free grammars, and evaluate their ability to generate plausible decoys. Finally, we present, implement, and evaluate a full, NLE-based cracking-resistant vault system called NoCrack.
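
    To illustrate the n-gram half of this idea, the sketch below trains a character bigram model on a tiny invented password list and samples plausible-looking decoys. NoCrack's actual NLEs are invertible encode/decode schemes over such language models (so wrong master passwords decrypt to fresh decoys); this sketch shows only the generative side.

```python
# A minimal sketch of the n-gram idea behind a natural language encoder
# (NLE): a character bigram model over real passwords can sample
# plausible decoys. The training list here is a tiny invented stand-in.
import random
from collections import defaultdict

training = ["password1", "letmein", "sunshine", "dragon99", "qwerty12"]

# Count character bigram transitions; "^" marks start, "$" marks end.
counts = defaultdict(lambda: defaultdict(int))
for pw in training:
    prev = "^"
    for ch in pw + "$":
        counts[prev][ch] += 1
        prev = ch

def sample_decoy(rng=random.Random(42)):
    """Walk the bigram chain from "^" until the end marker is drawn."""
    out, prev = [], "^"
    while True:
        chars, weights = zip(*counts[prev].items())
        ch = rng.choices(chars, weights=weights)[0]
        if ch == "$":
            return "".join(out)
        out.append(ch)
        prev = ch

print([sample_decoy() for _ in range(3)])  # e.g., plausible decoy strings
```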