    Incorporating inductive biases into machine learning algorithms

    Recently, significant advances in artificial intelligence (AI) have surpassed what was imaginable even five years ago. Today, we can instruct diffusion-based models to generate high-quality videos from human descriptions, or prompt large language models (LLMs) to assist with writing, translation, and even mathematical reasoning. These remarkable abilities arise from training massive deep-learning models on huge amounts of data. However, we do not always have enough data. In some tasks, such as mathematical reasoning or molecule generation, available data are very limited. Furthermore, despite current LLMs utilizing nearly all available data on the Internet, they remain imperfect. Thus, a critical question is how to enhance the performance of AI systems when it is difficult to increase the amount of training data. In this thesis, we address this challenge from the perspective of inductive biases. Specifically, we investigate how to effectively use human knowledge about data or tasks to optimize the behavior of a machine learning algorithm, without requiring extra data. We first give a brief review of research on inductive biases, and then show how to incorporate inductive biases during the structure design, training, and inference of a machine learning model, respectively. We also performed extensive experiments demonstrating that incorporating appropriate inductive biases can greatly boost model performance on a variety of tasks without the need for additional data.

    SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning

    The recent progress in large language models (LLMs), especially the invention of chain-of-thought (CoT) prompting, has made it possible to solve reasoning problems. However, even the strongest LLMs still struggle with more complicated problems that require non-linear thinking and multi-step reasoning. In this work, we explore whether LLMs can recognize their own errors, without resorting to external resources. In particular, we investigate whether they can be used to identify individual errors within a step-by-step reasoning chain. To this end, we propose a zero-shot verification scheme to recognize such errors. We then use this verification scheme to improve question-answering performance, by using it to perform weighted voting on different generated answers. We test the method on three math datasets (GSM8K, MathQA, and MATH) and find that it successfully recognizes errors and, in turn, increases final predictive performance.
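    The verification-weighted voting described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes each sampled answer has already been assigned a verification confidence (e.g. by the SelfCheck scheme), and simply aggregates confidences per distinct answer before picking the highest-scoring one. All names here are hypothetical.

    ```python
    from collections import defaultdict

    def weighted_vote(samples):
        """Aggregate (answer, confidence) pairs: sum each distinct answer's
        verification confidence, then return the highest-scoring answer."""
        scores = defaultdict(float)
        for answer, confidence in samples:
            scores[answer] += confidence
        return max(scores, key=scores.get)

    # Three sampled solutions to the same question, each paired with a
    # (hypothetical) verification confidence from the checking step.
    samples = [("42", 0.9), ("41", 0.2), ("42", 0.1)]
    print(weighted_vote(samples))  # -> "42" (total 1.0 vs. 0.2)
    ```

    With uniform confidences this reduces to plain majority voting; the verification scores let a single high-confidence answer outweigh several low-confidence agreeing ones.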

    CGMH: Constrained Sentence Generation by Metropolis-Hastings Sampling

    In real-world applications of natural language generation, there are often constraints on the target sentences in addition to fluency and naturalness requirements. Existing language generation techniques are usually based on recurrent neural networks (RNNs). However, it is non-trivial to impose constraints on RNNs while maintaining generation quality, since RNNs generate sentences sequentially (or with beam search) from the first word to the last. In this paper, we propose CGMH, a novel approach using Metropolis-Hastings sampling for constrained sentence generation. CGMH allows complicated constraints, such as the occurrence of multiple keywords in the target sentences, which cannot be handled by traditional RNN-based approaches. Moreover, CGMH works in the inference stage and does not require parallel corpora for training. We evaluate our method on a variety of tasks, including keywords-to-sentence generation, unsupervised sentence paraphrasing, and unsupervised sentence error correction. CGMH achieves high performance compared with previous supervised methods for sentence generation. Our code is released at https://github.com/NingMiao/CGMH. Comment: AAAI 2019.
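    The sampling idea behind CGMH can be illustrated with a toy Metropolis-Hastings loop. This is a heavily simplified sketch, not the released implementation: where CGMH scores candidate sentences with a language model times a constraint indicator, this sketch substitutes a hypothetical `toy_score` that rewards short word sequences containing all required keywords, and it approximates the proposal as symmetric. Word-level insert/delete/replace proposals and the accept/reject step mirror the general scheme.

    ```python
    import math
    import random

    def toy_score(sentence, keywords):
        """Stand-in for the target density pi(x). CGMH uses an LM probability
        times a constraint indicator; here we reward short sentences (word
        lists) that contain every required keyword."""
        if not all(k in sentence for k in keywords):
            return 1e-9  # constraint violated -> near-zero density
        return math.exp(-0.1 * len(sentence))

    def propose(sentence, vocab):
        """One word-level edit: insert, delete, or replace at a random slot."""
        s = sentence[:]
        op = random.choice(["insert", "delete", "replace"])
        if op == "insert" or not s:
            s.insert(random.randrange(len(s) + 1), random.choice(vocab))
        elif op == "delete":
            s.pop(random.randrange(len(s)))
        else:
            s[random.randrange(len(s))] = random.choice(vocab)
        return s

    def mh_sampler(init, vocab, keywords, steps=2000, seed=0):
        random.seed(seed)
        cur, cur_score = init[:], toy_score(init, keywords)
        for _ in range(steps):
            cand = propose(cur, vocab)
            cand_score = toy_score(cand, keywords)
            # Metropolis acceptance ratio (symmetric-proposal approximation)
            if random.random() < min(1.0, cand_score / cur_score):
                cur, cur_score = cand, cand_score
        return cur

    vocab = ["the", "cat", "sat", "on", "mat", "dog"]
    out = mh_sampler(["the", "dog"], vocab, keywords=["cat", "mat"])
    print(out)  # a short word list containing both required keywords
    ```

    Because constraint-violating states all share near-zero density, the chain random-walks freely until both keywords appear, after which moves that drop a keyword are almost surely rejected; this is why such a sampler can enforce hard keyword constraints that left-to-right RNN decoding cannot.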

    Physiological responses of mycorrhizal symbiosis to drought stress in white clover

    The aim of the present study was to analyze the effects of two arbuscular mycorrhizal fungi (AMF), Funneliformis mosseae and Paraglomus occultum, on leaf water status, root morphology, root sugar accumulation, root abscisic acid (ABA) levels, root malondialdehyde (MDA) content, and root antioxidant enzyme activities in white clover (Trifolium repens L.) exposed to well-watered (WW) and drought stress (DS) conditions. The results showed that root colonization by F. mosseae and P. occultum was significantly decreased by a 7-week soil drought treatment. Under drought stress conditions, mycorrhizal fungal treatment considerably stimulated root total length, surface area, and volume, as compared with non-mycorrhizal controls. In addition, inoculation with arbuscular mycorrhizal fungi also increased leaf relative water content and accelerated the accumulation of root glucose and fructose under drought stress. Mycorrhizal plants under drought stress registered higher activities of superoxide dismutase (SOD), catalase (CAT), and peroxidase (POD) and higher ABA levels in roots, together with lower MDA content, relative to non-mycorrhizal plants. As a result, mycorrhiza-inoculated plants exhibited better physiological performance (e.g. antioxidant defense systems, root morphology, and sugar accumulation) than non-inoculated plants in response to soil drought, whilst P. occultum had superior effects to F. mosseae.