    Learning in the Machine: To Share or Not to Share?

    Weight-sharing is one of the pillars behind Convolutional Neural Networks and their successes. However, in physical neural systems such as the brain, weight-sharing is implausible. This discrepancy raises the fundamental question of whether weight-sharing is necessary. If so, to what degree of precision? If not, what are the alternatives? The goal of this study is to investigate these questions, primarily through simulations in which the weight-sharing assumption is relaxed. Taking inspiration from neural circuitry, we explore the use of Free Convolutional Networks and neurons with variable connection patterns. Using Free Convolutional Networks, we show that while weight-sharing is a pragmatic optimization approach, it is not a necessity in computer vision applications. Furthermore, Free Convolutional Networks match the performance of standard architectures when trained on properly translated data (akin to video). Under the assumption of translationally augmented data, Free Convolutional Networks learn translationally invariant representations that yield an approximate form of weight-sharing.
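
    As a concrete illustration of what relaxing weight-sharing can look like, the sketch below implements a locally connected layer whose filters are untied across spatial positions, one simple reading of a "free" convolution. The class name FreeConv2d and all shapes are assumptions for illustration, not the paper's implementation.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class FreeConv2d(nn.Module):
        """Convolution-like layer with untied (position-specific) filters.

        A standard Conv2d shares one filter bank across all spatial
        positions; here every output position owns its own filters, which
        is one simple way to relax the weight-sharing assumption.
        Hypothetical sketch, not the paper's implementation.
        """
        def __init__(self, in_ch, out_ch, kernel_size, in_h, in_w):
            super().__init__()
            self.k = kernel_size
            self.out_h = in_h - kernel_size + 1   # 'valid' padding, stride 1
            self.out_w = in_w - kernel_size + 1
            n_pos = self.out_h * self.out_w
            # One filter bank per output position: (positions, out_ch, in_ch*k*k).
            self.weight = nn.Parameter(
                0.01 * torch.randn(n_pos, out_ch, in_ch * kernel_size ** 2))

        def forward(self, x):
            # x: (B, in_ch, H, W) -> patches: (B, in_ch*k*k, positions)
            patches = F.unfold(x, self.k)
            # Apply a different filter bank at every position p.
            out = torch.einsum('bcp,pco->bop', patches,
                               self.weight.transpose(1, 2))
            return out.reshape(x.shape[0], -1, self.out_h, self.out_w)

    # Example: layer = FreeConv2d(3, 16, 5, 32, 32)
    # layer(torch.randn(8, 3, 32, 32)).shape  # -> (8, 16, 28, 28)

    Unlike a standard convolution, the parameter count here grows with the number of spatial positions, which is exactly the cost the paper weighs against biological plausibility.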

    Share or Not? Learning to Schedule Language-Specific Capacity for Multilingual Translation

    Using a mix of shared and language-specific (LS) parameters has shown promise in multilingual neural machine translation (MNMT), but the question of when and where LS capacity matters most is still understudied. We offer such a study by proposing conditional language-specific routing (CLSR). CLSR employs hard binary gates conditioned on token representations to dynamically select LS or shared paths. By manipulating these gates, it can schedule LS capacity across sub-layers in MNMT, subject to the guidance of translation signals and budget constraints. Moreover, CLSR scales easily to massively multilingual settings. Experiments with Transformer on OPUS-100 and WMT datasets show that: 1) MNMT is sensitive to both the amount and the position of LS modeling: distributing 10%-30% of LS computation to the top and/or bottom encoder/decoder layers delivers the best performance; and 2) one-to-many translation benefits more from CLSR than many-to-one translation, particularly with unbalanced training data. Our study further verifies the trade-off between shared and LS capacity for multilingual translation, and these findings underpin our improved multilingual Transformers. Source code and models are available at https://github.com/bzhangGo/zero/tree/iclr2021_clsr.
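
    The routing mechanism can be pictured with a small sketch: a hard binary gate, conditioned on each token representation, picks between a shared path and a language-specific path, with a straight-through estimator keeping training differentiable. The module below is a hypothetical reading of the abstract, not the code from the linked repository.

    import torch
    import torch.nn as nn

    class CLSRSublayer(nn.Module):
        """Route each token through a shared or a language-specific path.

        A hard gate g in {0, 1}, conditioned on the token representation,
        mixes the two paths: out = g * ls(x) + (1 - g) * shared(x).
        Hypothetical sketch of conditional language-specific routing.
        """
        def __init__(self, d_model, num_langs):
            super().__init__()
            self.shared = nn.Linear(d_model, d_model)   # shared capacity
            self.ls = nn.ModuleList(                    # one path per language
                [nn.Linear(d_model, d_model) for _ in range(num_langs)])
            self.gate = nn.Linear(d_model, 1)           # token-conditioned gate

        def forward(self, x, lang_id):
            # x: (batch, seq, d_model); lang_id: integer language index.
            p = torch.sigmoid(self.gate(x))             # soft gate in (0, 1)
            hard = (p > 0.5).float()
            g = hard + p - p.detach()                   # straight-through estimator
            return g * self.ls[lang_id](x) + (1 - g) * self.shared(x)

    A budget constraint of the kind the abstract mentions could then be added by penalizing the mean soft gate value toward a target fraction (e.g., the 10%-30% range the experiments identify as best).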

    A learning design toolkit to create pedagogically effective learning activities

    Despite the plethora of Information and Communication Technologies (ICT) tools and resources available, practitioners are still not making effective use of e-learning to enrich the student experience. This article describes a learning design toolkit that guides practitioners through the process of creating pedagogically informed learning activities which make effective use of appropriate tools and resources. This work is part of a digital libraries project in which teaching staff at two universities in the UK and two in the USA are collaborating to share e-learning resources in the subject domains of Physical, Environmental and Human Geography. Finding, or creating, suitable e-learning resources and embedding them in well-designed learning activities can be both challenging and time-consuming; sharing and adapting effective designs and solutions is both a stimulant and a time saver. This article describes the background to the specification of a learning activities design toolkit to support teachers as they create or adapt e-learning activities. The toolkit uses a model of pedagogical approaches as a basis for developing effective learning design plans, and the article illustrates its use. The authors share their definition of a learning activity and taxonomies for its constituent elements, and discuss real examples to illustrate their approach.

    Learning in Conferences

    Excerpt: The true value of a conference lies in its effects on participants. Conferences exist to generate and share knowledge that changes behavior and links to results; this will not happen if the state of the art of conference evaluation remains immature and event planners do not shine a light on the conditions for learning outcomes.

    Learning and Noisy Equilibrium Behavior in an Experimental Study of Imperfect Price Competition

    This paper considers a duopoly price-choice game in which the unique Nash equilibrium is the Bertrand outcome. Price competition, however, is imperfect in the sense that the market share of the high-price firm is not zero. Economic intuition suggests that price levels should be positively related to the market share of the high-price firm. Although this relationship is not predicted by standard game theory, it is implied by a generalization of the Nash equilibrium that results when players make noisy (logit) best responses to expected payoff differences. This logit equilibrium model was used to design a laboratory experiment with treatments that correspond to changing the market share of the high-price firm. The model predicts the final-period price averages for both treatments with remarkable accuracy. Moreover, computer simulations of a naive learning model were used, ex ante, to predict the observed differences in the time paths of average prices.
    Keywords: laboratory experiments, simulation, decision error, learning, logit equilibrium.
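
    The logit best-response idea the abstract relies on can be sketched in a few lines: each price is chosen with probability proportional to exp(lam * expected payoff), so lam -> infinity recovers the exact (Nash) best response and smaller lam injects decision noise. The payoff function, market-share rule, and parameter values below are illustrative assumptions, not the paper's experimental design.

    import numpy as np

    def logit_response(expected_payoffs, lam):
        """Logit choice probabilities: p_i proportional to exp(lam * payoff_i)."""
        z = lam * (expected_payoffs - expected_payoffs.max())  # stabilize exp
        w = np.exp(z)
        return w / w.sum()

    def expected_payoff(prices, rival_dist, alpha=0.3, cost=0.0):
        """Illustrative duopoly payoffs: the high-price firm keeps share alpha."""
        pay = np.zeros(len(prices))
        for i, p in enumerate(prices):
            for j, q in enumerate(prices):
                share = 1.0 - alpha if p < q else (alpha if p > q else 0.5)
                pay[i] += rival_dist[j] * share * (p - cost)
        return pay

    prices = np.linspace(0.1, 1.0, 10)
    dist = np.ones(len(prices)) / len(prices)   # start from uniform beliefs
    for _ in range(200):                        # naive iterated logit responses
        dist = logit_response(expected_payoff(prices, dist), lam=10.0)
    print(prices @ dist)                        # average price under noisy play

    Raising alpha (the high-price firm's market share) makes high prices more profitable, so iterated logit responses settle on higher average prices, matching the economic intuition the abstract describes.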

    Learning from networked examples

    Many machine learning algorithms are based on the assumption that training examples are drawn independently. However, this assumption no longer holds when learning from a networked sample, because two or more training examples may share some common objects, and hence share the features of those objects. We show that the classic approach of ignoring this problem can harm the accuracy of estimated statistics, and then consider alternatives. One of these is to use only independent examples, discarding all other information; this is clearly suboptimal. We analyze sample error bounds in this networked setting, providing significantly improved results. An important component of our approach is a set of efficient sample weighting schemes, which lead to novel concentration inequalities.
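
    One simple weighting scheme in the spirit the abstract describes: bound the influence of any shared object by down-weighting each example by the largest multiplicity among its objects. This particular rule is an assumption for illustration; the paper's actual schemes may differ.

    from collections import Counter

    def overlap_weights(examples):
        """Weight each example by 1 / (max multiplicity of its objects).

        `examples` is a list of sets of object ids; examples that share an
        object share features, violating independence. Capping the total
        weight touching any object at 1 is one simple way to recover
        concentration-style guarantees. Illustrative assumption only.
        """
        counts = Counter(obj for ex in examples for obj in ex)
        return [1.0 / max(counts[obj] for obj in ex) for ex in examples]

    # Three examples; object 'a' is shared by the first two.
    examples = [{'a', 'b'}, {'a', 'c'}, {'d'}]
    print(overlap_weights(examples))  # [0.5, 0.5, 1.0]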

    Learning Redundant Motor Tasks With and Without Overlapping Dimensions: Facilitation and Interference Effects

    Prior learning of a motor skill creates motor memories that can facilitate or interfere with learning of new, but related, motor skills. One hypothesis of motor learning posits that for a sensorimotor task with redundant degrees of freedom, the nervous system learns the geometric structure of the task and improves performance by selectively operating within that task space. We tested this hypothesis by examining whether transfer of learning between two tasks depends on shared dimensionality between their respective task spaces. Human participants wore a data glove and learned to manipulate a computer cursor by moving their fingers. Separate groups of participants learned two tasks: a prior task that was unique to each group and a criterion task that was common to all groups. We manipulated the mapping between finger motions and cursor positions in the prior task to define task spaces that either shared or did not share the task-space dimensions (x-y axes) of the criterion task. We found that if the prior task shared task dimensions with the criterion task, there was an initial facilitation in criterion-task performance. However, if the prior task did not share task dimensions with the criterion task, there was prolonged interference in learning the criterion task because participants found inefficient task solutions. These results show that the nervous system learns the task space through practice, and that the degree of shared task-space dimensionality influences the extent to which prior experience transfers to subsequent learning of related motor skills.
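
    To make the notion of shared task-space dimensions concrete, the sketch below builds linear finger-to-cursor mappings in which a prior task either remixes the criterion task's two cursor axes (shared dimensions) or introduces a direction orthogonal to them (non-overlapping). The matrices and the number of glove degrees of freedom are illustrative assumptions, not the study's calibrated mappings.

    import numpy as np

    rng = np.random.default_rng(0)

    n_dof = 19                                    # assumed glove joint angles
    fingers = rng.standard_normal(n_dof)          # one finger posture

    # Criterion task: a 2 x n_dof matrix maps finger posture to (x, y) cursor;
    # with n_dof >> 2, the mapping is redundant (large null space).
    A_criterion = rng.standard_normal((2, n_dof))

    # Prior task sharing both cursor dimensions: its rows are a remix of the
    # criterion rows, so both tasks span the same x-y task space.
    A_shared = np.array([[1.0, 0.5], [-0.5, 1.0]]) @ A_criterion

    # Prior task with a non-overlapping dimension: one row is made orthogonal
    # to both criterion rows, so part of its task space is new.
    q, _ = np.linalg.qr(A_criterion.T)            # basis of criterion row space
    ortho = rng.standard_normal(n_dof)
    ortho -= q @ (q.T @ ortho)                    # project out criterion rows
    A_nonshared = np.vstack([A_criterion[0], ortho])

    print(A_criterion @ fingers)                  # cursor position, criterion task
    print(A_shared @ fingers, A_nonshared @ fingers)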