
    A Model of Inductive Bias Learning

    A major problem in machine learning is that of inductive bias: how to choose a learner's hypothesis space so that it is large enough to contain a solution to the problem being learnt, yet small enough to ensure reliable generalization from reasonably sized training sets. Typically such bias is supplied by hand through the skill and insights of experts. In this paper a model for automatically learning bias is investigated. The central assumption of the model is that the learner is embedded within an environment of related learning tasks. Within such an environment the learner can sample from multiple tasks, and hence it can search for a hypothesis space that contains good solutions to many of the problems in the environment. Under certain restrictions on the set of all hypothesis spaces available to the learner, we show that a hypothesis space that performs well on a sufficiently large number of training tasks will also perform well when learning novel tasks in the same environment. Explicit bounds are also derived demonstrating that learning multiple tasks within an environment of related tasks can potentially give much better generalization than learning a single task.
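    To make the idea concrete, here is a minimal sketch (ours, not Baxter's formal construction): candidate hypothesis spaces are polynomial feature maps of increasing degree, tasks are regressions sampled from a shared environment, and the selected "bias" is the degree with the lowest average held-out error across training tasks.

```python
# A minimal sketch of bias learning via multi-task sampling (illustrative
# only). Candidate hypothesis spaces = polynomial degrees; each task is a
# noisy cubic drawn from a shared environment; we keep the space with the
# lowest average held-out error across the sampled tasks.
import numpy as np

rng = np.random.default_rng(0)

def sample_task(n=30):
    """One task from the environment: a noisy cubic with random coefficients."""
    coefs = rng.normal(size=4)
    x = rng.uniform(-1, 1, n)
    y = sum(c * x**k for k, c in enumerate(coefs)) + 0.1 * rng.normal(size=n)
    return x, y

def task_error(degree, x, y):
    """Fit a polynomial of the given degree; return held-out squared error."""
    half = len(x) // 2
    fit = np.polyfit(x[:half], y[:half], degree)
    return np.mean((np.polyval(fit, x[half:]) - y[half:]) ** 2)

tasks = [sample_task() for _ in range(20)]
avg_err = {d: np.mean([task_error(d, x, y) for x, y in tasks])
           for d in range(1, 10)}
print("selected hypothesis space: degree", min(avg_err, key=avg_err.get))
```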

    How to shift bias: Lessons from the Baldwin effect

    An inductive learning algorithm takes a set of data as input and generates a hypothesis as output. A set of data is typically consistent with an infinite number of hypotheses; therefore, there must be factors other than the data that determine the output of the learning algorithm. In machine learning, these other factors are called the bias of the learner. Classical learning algorithms have a fixed bias, implicit in their design. Recently developed learning algorithms dynamically adjust their bias as they search for a hypothesis. Algorithms that shift bias in this manner are not as well understood as classical algorithms. In this paper, we show that the Baldwin effect has implications for the design and analysis of bias-shifting algorithms. The Baldwin effect was proposed in 1896 to explain how phenomena that might appear to require Lamarckian evolution (inheritance of acquired characteristics) can arise from purely Darwinian evolution. Hinton and Nowlan presented a computational model of the Baldwin effect in 1987. We explore a variation on their model, which we constructed explicitly to illustrate the lessons that the Baldwin effect has for research in bias-shifting algorithms. The main lesson is that a good strategy for shifting bias in a learning algorithm appears to be to begin with a weak bias and gradually shift to a strong bias.
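    For readers unfamiliar with the computational model, the following is a compact simulation in the spirit of Hinton and Nowlan's 1987 setup (the population size, trial count, and fitness scaling below are illustrative choices, not the original parameters): genomes carry fixed alleles and learnable '?' alleles, and evolution rewards genomes that can learn the target quickly.

```python
# Baldwin-effect sketch: '?' positions are filled by random guessing during
# a "lifetime", so genomes that can *learn* the target are rewarded, and
# over generations the population hardens '?' alleles into fixed values.
import random

L, POP, TRIALS, GENS = 20, 100, 500, 30
TARGET = [1] * L

def new_genome():
    # Roughly half '?' alleles, the rest split between 0 and 1.
    return [random.choice([0, 1, '?', '?']) for _ in range(L)]

def fitness(g):
    # A wrong fixed allele can never be repaired by lifetime learning.
    if any(a != '?' and a != t for a, t in zip(g, TARGET)):
        return 1.0
    unknowns = sum(1 for a in g if a == '?')
    for trial in range(TRIALS):
        if all(random.random() < 0.5 for _ in range(unknowns)):
            # Target found by guessing: earlier discovery -> higher fitness.
            return 1.0 + 19.0 * (TRIALS - trial) / TRIALS
    return 1.0

pop = [new_genome() for _ in range(POP)]
for gen in range(GENS):
    fits = [fitness(g) for g in pop]
    children = []
    for _ in range(POP):
        a = random.choices(pop, weights=fits)[0]
        b = random.choices(pop, weights=fits)[0]
        cut = random.randrange(1, L)
        children.append(a[:cut] + b[cut:])   # single-point crossover
    pop = children
    frac_q = sum(g.count('?') for g in pop) / (POP * L)
    print(f"gen {gen:2d}: fraction of '?' alleles = {frac_q:.2f}")
```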

    Contextuality and inductive bias in quantum machine learning

    Generalisation in machine learning often relies on the ability to encode structures present in data into an inductive bias of the model class. To understand the power of quantum machine learning, it is therefore crucial to identify the types of data structures that lend themselves naturally to quantum models. In this work we look to quantum contextuality -- a form of nonclassicality with links to computational advantage -- for answers to this question. We introduce a framework for studying contextuality in machine learning, which leads us to a definition of what it means for a learning model to be contextual. From this, we connect a central concept of contextuality, called operational equivalence, to the ability of a model to encode a linearly conserved quantity in its label space. A consequence of this connection is that contextuality is tied to expressivity: contextual model classes that encode the inductive bias are generally more expressive than their noncontextual counterparts. To demonstrate this, we construct an explicit toy learning problem -- based on learning the payoff behaviour of a zero-sum game -- for which this is the case. By leveraging tools from geometric quantum machine learning, we then describe how to construct quantum learning models with the associated inductive bias, and show through our toy problem that they outperform their corresponding classical surrogate models. This suggests that understanding learning problems of this form may lead to useful insights about the power of quantum machine learning.
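    The quantum constructions are beyond a short snippet, but the flavour of the inductive bias, predictions constrained to a linearly conserved quantity in label space, can be illustrated classically. The sketch below is our illustration, not the paper's model: it fits the payoffs of a toy zero-sum game and projects the learned map so that predicted payoffs always sum to zero.

```python
# Classical illustration of a conservation-law inductive bias: project a
# learned linear map so that, as in a zero-sum game, predicted payoffs
# always sum to zero across players.
import numpy as np

rng = np.random.default_rng(1)
n_players, n_feats, n = 3, 5, 200

# Ground-truth payoff map whose outputs sum to zero across players.
W_true = rng.normal(size=(n_players, n_feats))
W_true -= W_true.mean(axis=0)                      # conservation law
X = rng.normal(size=(n, n_feats))                  # game contexts
Y = X @ W_true.T + 0.05 * rng.normal(size=(n, n_players))

# Ordinary least squares, then projection onto the zero-sum subspace.
W_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)      # shape (n_feats, n_players)
W_proj = W_hat - W_hat.mean(axis=1, keepdims=True)

pred = X @ W_proj
print("max |sum of predicted payoffs|:", np.abs(pred.sum(axis=1)).max())  # ~0
```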

    The Fast and the Flexible: training neural networks to learn to follow instructions from small data

    Learning to follow human instructions is a long-pursued goal in artificial intelligence. The task becomes particularly challenging if no prior knowledge of the employed language is assumed while relying only on a handful of examples to learn from. Work in the past has relied on hand-coded components or manually engineered features to provide strong inductive biases that make learning in such situations possible. In contrast, here we seek to establish whether this knowledge can be acquired automatically by a neural network system through a two-phase training procedure: a (slow) offline learning stage where the network learns about the general structure of the task, and a (fast) online adaptation phase where the network learns the language of a new given speaker. Controlled experiments show that when the network is exposed to familiar instructions that contain novel words, the model adapts very efficiently to the new vocabulary. Moreover, even for human speakers whose language usage can depart significantly from our artificial training language, our network can still make use of its automatically acquired inductive bias to learn to follow instructions more effectively.
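    As a schematic of the two-phase recipe (module names and sizes below are invented for illustration; the paper's architecture differs), one can train all parameters offline, then freeze the shared core and adapt only the word embeddings online for a new speaker:

```python
# Two-phase training sketch: offline, everything trains; online, only the
# speaker-specific word embeddings adapt from a handful of examples.
import torch
import torch.nn as nn

class InstructionFollower(nn.Module):
    def __init__(self, vocab=100, dim=32, n_actions=4):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)            # speaker-specific
        self.core = nn.GRU(dim, dim, batch_first=True)   # shared task knowledge
        self.head = nn.Linear(dim, n_actions)
    def forward(self, tokens):
        h, _ = self.core(self.embed(tokens))
        return self.head(h[:, -1])

model = InstructionFollower()

# Phase 1 (slow, offline): train all parameters on the artificial language.
offline_opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Phase 2 (fast, online): freeze everything except the embeddings.
for p in model.parameters():
    p.requires_grad = False
model.embed.weight.requires_grad = True
online_opt = torch.optim.Adam([model.embed.weight], lr=1e-2)

# One online adaptation step on a toy batch from the new speaker.
tokens = torch.randint(0, 100, (8, 5))
actions = torch.randint(0, 4, (8,))
loss = nn.functional.cross_entropy(model(tokens), actions)
loss.backward()                        # gradients reach the embeddings only
online_opt.step()
```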

    Variability, negative evidence, and the acquisition of verb argument constructions

    We present a hierarchical Bayesian framework for modeling the acquisition of verb argument constructions. It embodies a domain-general approach to learning higher-level knowledge in the form of inductive constraints (or overhypotheses), and has been used to explain other aspects of language development such as the shape bias in learning object names. Here, we demonstrate that the same model captures several phenomena in the acquisition of verb constructions. Our model, like adults in a series of artificial language learning experiments, makes inferences about the distributional statistics of verbs on several levels of abstraction simultaneously. It also produces the qualitative learning patterns displayed by children over the time course of acquisition. These results suggest that the patterns of generalization observed in both children and adults could emerge from basic assumptions about the nature of learning. They also provide an example of a broad class of computational approaches that can resolve Baker's Paradox.
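    The following toy sketch conveys the overhypothesis idea with a Beta-Binomial stand-in for the paper's hierarchical model (the counts, grid, and two-construction setup are our invented simplifications): hyperparameters shared across verbs capture how variable verb preferences are overall, and they transfer to a novel verb.

```python
# Hierarchical overhypothesis sketch: each verb has a latent preference
# theta_v for construction A vs. B; the shared hyperparameters (alpha, beta)
# are the learned overhypothesis, fit by maximizing the marginal likelihood.
import numpy as np
from itertools import product
from scipy.special import betaln

# counts[v] = (uses of verb v in construction A, uses in construction B).
counts = np.array([[9, 1], [8, 2], [1, 9], [0, 10], [5, 5]])

def log_marginal(alpha, beta):
    # Beta-Binomial evidence summed over verbs (binomial coefficients
    # omitted since they do not depend on alpha, beta).
    a, b = counts[:, 0], counts[:, 1]
    return np.sum(betaln(a + alpha, b + beta) - betaln(alpha, beta))

# Grid search over hyperparameters: the overhypothesis that gets learned.
grid = np.linspace(0.1, 5.0, 50)
alpha, beta = max(product(grid, grid), key=lambda ab: log_marginal(*ab))

# Transfer to a novel verb observed once in construction A.
print("P(A) for the novel verb:", (1 + alpha) / (1 + alpha + beta))
```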