
    Contrastive Learning for Inference in Dialogue

    Inferences, especially those derived from inductive processes, are a crucial component of conversation, complementing the information implicitly or explicitly conveyed by a speaker. While recent large language models show remarkable advances on inference tasks, their performance in inductive reasoning, where not all information is present in the context, lags far behind that in deductive reasoning. In this paper, we analyze the behavior of the models in terms of task difficulty as defined by the semantic information gap, which distinguishes inductive from deductive reasoning (Johnson-Laird, 1988, 1993). Our analysis reveals that the disparity in information between dialogue contexts and desired inferences poses a significant challenge to the inductive inference process. To mitigate this information gap, we investigate a contrastive learning approach in which the model is fed negative samples. Our experiments suggest that negative samples help models understand what is wrong and improve their inference generation.
    Comment: Accepted to EMNLP202
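
    The contrastive objective described in this abstract is standard enough to sketch. Below is a minimal InfoNCE-style loss in Python/NumPy that rewards a dialogue-context embedding for scoring the gold inference above negative (wrong) inferences; the encoder, vector sizes, and all names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def info_nce_loss(context_vec, gold_vec, negative_vecs, temperature=0.1):
    """Contrastive loss for one (context, gold inference, wrong inferences) set."""
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    # Score the gold inference and each negative against the dialogue context.
    scores = np.array([cosine(context_vec, gold_vec)] +
                      [cosine(context_vec, n) for n in negative_vecs]) / temperature
    # Softmax cross-entropy with the gold inference as the target class:
    # it pulls the gold inference toward the context and pushes negatives away.
    log_probs = scores - np.log(np.exp(scores).sum())
    return -log_probs[0]

# Toy usage: random vectors standing in for encoder outputs.
rng = np.random.default_rng(0)
context, gold = rng.normal(size=64), rng.normal(size=64)
negatives = [rng.normal(size=64) for _ in range(4)]
print(info_nce_loss(context, gold, negatives))
```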

    More is more in language learning: reconsidering the less-is-more hypothesis

    The Less-is-More hypothesis was proposed to explain age-of-acquisition effects in first language (L1) acquisition and second language (L2) attainment. We scrutinize different renditions of the hypothesis by examining how learning outcomes are affected by (1) limited cognitive capacity, (2) reduced interference resulting from less prior knowledge, and (3) simplified language input. While there is little to no evidence of benefits of limited cognitive capacity, there is ample support for a More-is-More account linking enhanced capacity with better L1- and L2-learning outcomes, and reduced capacity with childhood language disorders. Instead, reduced prior knowledge (relative to adults) may afford children greater flexibility in inductive inference; this contradicts the idea that children benefit from a more constrained hypothesis space. Finally, studies of child-directed speech (CDS) confirm benefits from less complex input at early stages, but also emphasize how greater lexical and syntactic complexity of the input confers benefits in L1 attainment.

    Combining Models of Approximation with Partial Learning

    In Gold's framework of inductive inference, the model of partial learning requires the learner to output exactly one correct index for the target object infinitely often, and no other index infinitely often. Since infinitely many of the learner's hypotheses may be incorrect, it is not obvious whether a partial learner can be modified to "approximate" the target object. Fulk and Jain (Approximate inference and scientific method. Information and Computation 114(2):179--191, 1994) introduced a model of approximate learning of recursive functions. The present work extends their research and solves an open problem of Fulk and Jain by showing that there is a learner which approximates and partially identifies every recursive function by outputting a sequence of hypotheses which, in addition, are almost all finite variants of the target function. The subsequent study is dedicated to the question of how these findings generalise to the learning of r.e. languages from positive data. Here, three variants of approximate learning are introduced and investigated with respect to whether they can be combined with partial learning. Following the line of Fulk and Jain's research, further investigations provide conditions under which partial language learners can eventually output only finite variants of the target language. The combinability of other partial learning criteria is also briefly studied.
    Comment: 28 page
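
    For readers unfamiliar with Gold's framework, the toy below illustrates its simplest criterion, identification in the limit by enumeration; the partial and approximate criteria studied in this paper relax it in ways the toy cannot capture. The four-hypothesis class is an illustrative assumption.

```python
# Fixed enumeration of candidate hypotheses (an illustrative assumption).
HYPOTHESES = [
    ("constant zero", lambda x: 0),
    ("identity",      lambda x: x),
    ("square",        lambda x: x * x),
    ("double",        lambda x: 2 * x),
]

def learner(sample):
    """Conjecture the index of the first hypothesis consistent with the sample."""
    for i, (_, h) in enumerate(HYPOTHESES):
        if all(h(x) == y for x, y in sample):
            return i
    return None  # nothing fits; a real learner searches an infinite enumeration

target = lambda x: x * x
sample = []
for x in range(4):
    sample.append((x, target(x)))
    print(f"after {len(sample)} data points -> conjecture {learner(sample)}")
# The conjectures converge to index 2 ("square") and never change afterwards:
# that stabilization is what identification in the limit requires, whereas a
# partial learner may interleave wrong conjectures forever, as long as exactly
# one correct index recurs infinitely often.
```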

    Transductive Learning for Textual Few-Shot Classification in API-based Embedding Models

    Proprietary, closed APIs are becoming increasingly common for processing natural language, and they are affecting practical applications of natural language processing, including few-shot classification. Few-shot classification involves training a model to perform a new classification task with a handful of labeled data. This paper makes three contributions. First, we introduce a scenario where the embeddings of a pre-trained model are served through a gated API with compute-cost and data-privacy constraints. Second, we propose transductive inference, a learning paradigm that has been overlooked by the NLP community. Unlike traditional inductive learning, transductive inference leverages the statistics of unlabeled data. We also introduce a new parameter-free transductive regularizer based on the Fisher-Rao loss, which can be used on top of the gated API embeddings. This method fully utilizes unlabeled data, shares no labels with the third-party API provider, and could serve as a baseline for future research. Third, we propose an improved experimental setting and compile a benchmark of eight datasets involving multiclass classification in four different languages, with up to 151 classes. We evaluate our methods using eight backbone models, with an episodic evaluation over 1,000 episodes, which demonstrates the superiority of transductive inference over the standard inductive setting.
    Comment: EMNLP 202
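
    As a rough illustration of transductive inference over frozen API embeddings, the sketch below fits class prototypes on the labeled support set and adds an unlabeled-data regularizer over the query set. Note the hedge: the paper's regularizer is based on the Fisher-Rao loss, while the Shannon-entropy term here is a generic stand-in, and all names and shapes are assumptions.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def transductive_objective(support_x, support_y, query_x, n_classes, lam=1.0):
    # Class prototypes from the labeled support embeddings (one per class).
    protos = np.stack([support_x[support_y == c].mean(axis=0)
                       for c in range(n_classes)])
    # Inductive term: cross-entropy of the support set under a
    # nearest-prototype (negative squared distance) classifier.
    logits_s = -((support_x[:, None, :] - protos[None]) ** 2).sum(-1)
    probs_s = softmax(logits_s)
    ce = -np.log(probs_s[np.arange(len(support_y)), support_y]).mean()
    # Transductive term: mean prediction entropy on the *unlabeled* queries;
    # minimizing it pushes query predictions toward confident assignments.
    logits_q = -((query_x[:, None, :] - protos[None]) ** 2).sum(-1)
    probs_q = softmax(logits_q)
    entropy = -(probs_q * np.log(probs_q + 1e-12)).sum(-1).mean()
    return ce + lam * entropy

# Toy usage with random vectors standing in for gated-API embeddings.
rng = np.random.default_rng(0)
support_x = rng.normal(size=(10, 16))
support_y = np.array([0] * 5 + [1] * 5)
query_x = rng.normal(size=(20, 16))
print(transductive_objective(support_x, support_y, query_x, n_classes=2))
```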

    Human-like Few-Shot Learning via Bayesian Reasoning over Natural Language

    A core tension in models of concept learning is that the model must carefully balance the tractability of inference against the expressivity of the hypothesis class. Humans, however, can efficiently learn a broad range of concepts. We introduce a model of inductive learning that seeks to be human-like in that sense. It implements a Bayesian reasoning process where a language model first proposes candidate hypotheses expressed in natural language, which are then re-weighed by a prior and a likelihood. By estimating the prior from human data, we can predict human judgments on learning problems involving numbers and sets, spanning concepts that are generative, discriminative, propositional, and higher-order.Comment: NeurIPS 2023 ora
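
    The propose-then-reweight loop lends itself to a compact sketch: a finite set of natural-language hypotheses (standing in here for language-model proposals) is re-weighted by prior times likelihood, using the classic size-principle likelihood on a toy number-concept domain. Everything below is illustrative, not the paper's code.

```python
# Toy domain: concepts over the numbers 1..100, as in classic number-game
# tasks. In the paper these hypotheses would be proposed by a language model.
HYPOTHESES = {
    "even numbers":     lambda x: x % 2 == 0,
    "powers of two":    lambda x: x in {1, 2, 4, 8, 16, 32, 64},
    "multiples of ten": lambda x: x % 10 == 0,
}

def size_principle_likelihood(data, name):
    """P(data | h): uniform draws from the concept's extension (size principle)."""
    concept = HYPOTHESES[name]
    if not all(concept(x) for x in data):
        return 0.0
    extension_size = sum(1 for x in range(1, 101) if concept(x))
    return (1.0 / extension_size) ** len(data)

def posterior(hypothesis_names, data, prior, likelihood):
    """Normalize prior * likelihood over the finite set of proposed hypotheses."""
    weights = [prior(h) * likelihood(data, h) for h in hypothesis_names]
    total = sum(weights)
    return {h: w / total for h, w in zip(hypothesis_names, weights)}

uniform_prior = lambda name: 1.0 / len(HYPOTHESES)  # paper estimates this from humans
data = [2, 8, 64]
for name, p in posterior(list(HYPOTHESES), data, uniform_prior,
                         size_principle_likelihood).items():
    print(f"P({name} | {data}) = {p:.4f}")
# "Powers of two" dominates: it fits the data and has the smallest extension.
```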

    A comparative analysis of the effects of teaching writing in a foreign language with the application of the deductive and the inductive approach

    The aim of this paper is to present and analyse the results of a study that measured the effectiveness of the deductive and inductive approaches in teaching writing in a foreign language. This aim is achieved through the introduction of a relevant theoretical background, the presentation of the research design, a brief description of the research, and finally the presentation and analysis of the outcomes.

    Theory completion using inverse entailment

    The main real-world applications of Inductive Logic Programming (ILP) to date involve the "Observation Predicate Learning" (OPL) assumption, in which both the examples and the hypotheses define the same predicate. However, in both scientific discovery and language learning there are potential applications in which OPL does not hold. OPL is ingrained within the theory and performance testing of machine learning. A general ILP technique called "Theory Completion using Inverse Entailment" (TCIE) is introduced which is applicable to non-OPL applications. TCIE is based on inverse entailment and is closely allied to abductive inference. The implementation of TCIE within Progol5.0 is described. The implementation uses contra-positives in a similar way to Stickel's Prolog Technology Theorem Prover. Progol5.0 is tested on two datasets. The first involves a grammar which translates numbers into their English representation. The second involves hypothesising the function of unknown genes within a network of metabolic pathways. On both datasets, near-complete recovery of performance is achieved after relearning when randomly chosen portions of background knowledge are removed. Progol5.0's running times for the experiments in this paper were typically under 6 seconds on a standard laptop PC.
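
    To give the abductive flavor of theory completion a concrete form, the toy sketch below forward-chains propositional Horn rules and tests which candidate fact, when added to an incomplete background theory, entails the observation. This is a drastic simplification of TCIE and inverse entailment, and the pathway names are invented for illustration only.

```python
# A toy abduction loop: find which candidate fact completes a background
# theory so that the observation becomes derivable. Real TCIE works with
# first-order clauses via inverse entailment; this is propositional only.
def forward_chain(facts, rules):
    """Close a set of ground facts under propositional Horn rules (body -> head)."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if set(body) <= facts and head not in facts:
                facts.add(head)
                changed = True
    return facts

# Background: a two-step metabolic pathway with one enzyme unobserved.
rules = [(("metabolite_a", "enzyme_1"), "metabolite_b"),
         (("metabolite_b", "enzyme_2"), "metabolite_c")]
facts = {"metabolite_a", "enzyme_1"}
observation = "metabolite_c"

candidates = ["enzyme_2", "enzyme_3"]  # hypothetical abducibles
for h in candidates:
    entailed = observation in forward_chain(facts | {h}, rules)
    print(f"adding {h}: observation entailed = {entailed}")
# Only enzyme_2 completes the theory, loosely mirroring how TCIE hypothesizes
# the function of unknown genes in a metabolic network.
```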