Incorporating inductive biases into machine learning algorithms
Recently, significant advances in artificial intelligence (AI) have surpassed what was imaginable even five years ago. Today, we can instruct diffusion-based models to generate high-quality videos from human descriptions or prompt large language models (LLMs) to assist with writing, translation, and even mathematical reasoning. These remarkable abilities arise from training massive deep-learning models on huge amounts of data. However, we do not always have enough data. In some tasks, such as mathematical reasoning or molecule generation, available data are very limited. Furthermore, although current LLMs use nearly all available data on the Internet, they remain imperfect. A critical question is therefore how to enhance the performance of AI systems when it is difficult to increase the amount of training data.
In this thesis, we address this challenge from the perspective of inductive biases. Specifically, we investigate how to effectively use human knowledge about data or tasks to optimize the behavior of a machine learning algorithm, without requiring extra data. We first give a brief review of research on inductive biases, and then show how to incorporate inductive biases during the structure design, training, and inference of a machine learning model, respectively. We also present extensive experiments demonstrating that incorporating appropriate inductive biases can greatly boost model performance on a variety of tasks without the need for additional data.
SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning
The recent progress in large language models (LLMs), especially the invention
of chain-of-thought (CoT) prompting, makes it possible to solve reasoning
problems. However, even the strongest LLMs still struggle with more
complicated problems that require non-linear thinking and multi-step reasoning.
In this work, we explore whether LLMs have the ability to recognize their own
errors, without resorting to external resources. In particular, we investigate
whether they can be used to identify individual errors within step-by-step
reasoning. To this end, we propose a zero-shot verification scheme to recognize
such errors. We then use this verification scheme to improve question-answering
performance, by using it to perform weighted voting on different generated
answers. We test the method on three math datasets (GSM8K, MathQA, and MATH) and
find that it successfully recognizes errors and, in turn, increases final
predictive performance.
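The weighted-voting step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `weighted_vote` and the confidence weights are hypothetical, and the sketch assumes that a verifier has already assigned each generated answer a non-negative confidence score.

```python
from collections import defaultdict


def weighted_vote(candidates):
    """Return the answer with the largest total confidence weight.

    candidates: iterable of (answer, weight) pairs. Each weight is a
    non-negative confidence score from a verifier (hypothetical here).
    """
    totals = defaultdict(float)
    for answer, weight in candidates:
        totals[answer] += weight
    if not totals:
        raise ValueError("no candidate answers were provided")
    # Pick the answer whose summed weight is highest.
    return max(totals, key=totals.get)


# Illustrative values only: two answers sampled twice each.
samples = [("42", 0.9), ("41", 0.4), ("41", 0.3), ("42", 0.2)]
best = weighted_vote(samples)
```

With uniform weights this reduces to plain majority voting. The weights matter when a single high-confidence answer should outweigh several low-confidence ones.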
CGMH: Constrained Sentence Generation by Metropolis-Hastings Sampling
In real-world applications of natural language generation, there are often
constraints on the target sentences in addition to fluency and naturalness
requirements. Existing language generation techniques are usually based on
recurrent neural networks (RNNs). However, it is non-trivial to impose
constraints on RNNs while maintaining generation quality, since RNNs generate
sentences sequentially (or with beam search) from the first word to the last.
In this paper, we propose CGMH, a novel approach using Metropolis-Hastings
sampling for constrained sentence generation. CGMH allows complicated
constraints such as the occurrence of multiple keywords in the target
sentences, which cannot be handled in traditional RNN-based approaches.
Moreover, CGMH works in the inference stage, and does not require parallel
corpora for training. We evaluate our method on a variety of tasks, including
keywords-to-sentence generation, unsupervised sentence paraphrasing, and
unsupervised sentence error correction. CGMH achieves high performance compared
with previous supervised methods for sentence generation. Our code is released
at https://github.com/NingMiao/CGMH.
Comment: AAAI1
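As a rough illustration of how Metropolis-Hastings sampling can enforce a keyword constraint, the toy sketch below samples fixed-length word sequences. It is not the CGMH algorithm itself: the vocabulary `VOCAB`, the `score` function (a hard keyword constraint times a made-up repetition penalty standing in for a language model), and the single word-replacement proposal are all assumptions chosen for brevity.

```python
import math
import random

VOCAB = ["the", "cat", "dog", "sat", "ran", "here"]  # toy vocabulary (assumption)


def score(sentence, keywords):
    """Unnormalized target: zero unless every keyword occurs in the sentence."""
    if not all(k in sentence for k in keywords):
        return 0.0
    # Toy stand-in for fluency: penalize immediately repeated words.
    repeats = sum(a == b for a, b in zip(sentence, sentence[1:]))
    return math.exp(-repeats)


def mh_step(sentence, keywords, rng):
    """One Metropolis-Hastings step with a symmetric word-replacement proposal."""
    pos = rng.randrange(len(sentence))
    proposal = list(sentence)
    proposal[pos] = rng.choice(VOCAB)
    # The proposal is symmetric, so the acceptance ratio is the target ratio.
    accept = min(1.0, score(proposal, keywords) / score(sentence, keywords))
    return proposal if rng.random() < accept else sentence


def sample(keywords, length=4, steps=200, seed=0):
    """Run the chain from a state that already satisfies the constraint."""
    if len(keywords) > length or any(k not in VOCAB for k in keywords):
        raise ValueError("keywords must fit in the sentence and be in VOCAB")
    rng = random.Random(seed)
    sentence = (list(keywords) + [VOCAB[0]] * length)[:length]
    for _ in range(steps):
        sentence = mh_step(sentence, keywords, rng)
    return sentence


result = sample(["cat", "dog"])
```

Because a proposal that drops a keyword has a score of zero, it is never accepted, so every state of the chain satisfies the constraint. No constraint-specific training is involved; the constraint enters only through the target used at sampling time.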
Physiological responses of mycorrhizal symbiosis to drought stress in white clover
The aim of the present study was to analyze the effects of two arbuscular mycorrhizal fungi (AMF), Funneliformis mosseae and Paraglomus occultum, on leaf water status, root morphology, root sugar accumulation, root abscisic acid (ABA) levels, root malondialdehyde (MDA) content, and root antioxidant enzyme activities in white clover (Trifolium repens L.) exposed to well-watered (WW) and drought stress (DS) conditions. The results showed that root colonization by F. mosseae and P. occultum was significantly decreased by a 7-week soil drought treatment. Under drought stress conditions, mycorrhizal fungal treatment considerably increased root total length, surface area, and volume compared with non-mycorrhizal controls. In addition, inoculation with arbuscular mycorrhizal fungi increased leaf relative water content and accelerated the accumulation of root glucose and fructose under drought stress. Mycorrhizal plants under drought stress showed higher activities of superoxide dismutase (SOD), catalase (CAT), and peroxidase (POD) and higher ABA levels in roots, but lower MDA contents, relative to non-mycorrhizal plants. As a result, mycorrhiza-inoculated plants exhibited better physiological activities (e.g. antioxidant defense systems, root morphology, and sugar accumulation) than non-inoculated plants in response to soil drought, and P. occultum had greater effects than F. mosseae.