Learning in the Machine: To Share or Not to Share?
Weight-sharing is one of the pillars behind Convolutional Neural Networks and their successes. However, in physical neural systems such as the brain, weight-sharing is implausible. This discrepancy raises the fundamental question of whether weight-sharing is necessary. If so, to what degree of precision? If not, what are the alternatives? The goal of this study is to investigate these questions, primarily through simulations in which the weight-sharing assumption is relaxed. Taking inspiration from neural circuitry, we explore the use of Free Convolutional Networks and neurons with variable connection patterns. Using Free Convolutional Networks, we show that while weight-sharing is a pragmatic optimization approach, it is not a necessity in computer vision applications. Furthermore, Free Convolutional Networks match the performance observed in standard architectures when trained with properly translated data (akin to video). Under the assumption of translationally augmented data, Free Convolutional Networks learn translationally invariant representations that yield an approximate form of weight-sharing.
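The core object of this abstract, an untied ("free") convolution, can be sketched in a few lines. In the minimal example below (a hypothetical illustration, not the paper's implementation), every output position gets its own filter; a standard shared-weight convolution is recovered as the special case where all per-position filters are identical.

```python
import numpy as np

def locally_connected_2d(x, weights):
    """Apply an untied ("free") convolution: each output position has its
    own filter, unlike a standard convolution where one filter is shared.

    x:       input of shape (H, W)
    weights: per-position filters of shape (H_out, W_out, k, k)
    returns: output of shape (H_out, W_out)
    """
    h_out, w_out, k, _ = weights.shape
    out = np.zeros((h_out, w_out))
    for i in range(h_out):
        for j in range(w_out):
            patch = x[i:i + k, j:j + k]
            out[i, j] = np.sum(patch * weights[i, j])
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 6))
k = 3
# A shared-weight convolution is the special case where all filters are equal.
shared = rng.normal(size=(k, k))
tied = np.tile(shared, (4, 4, 1, 1))       # same filter at every position
free = rng.normal(size=(4, 4, k, k))       # an independent filter per position
print(locally_connected_2d(x, tied).shape)  # (4, 4)
print(locally_connected_2d(x, free).shape)  # (4, 4)
```

The free layer has many more parameters, which is exactly the trade-off the paper studies: translated training data can push these untied filters toward the approximate weight-sharing that CNNs build in by construction.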
Share or Not? Learning to Schedule Language-Specific Capacity for Multilingual Translation
Using a mix of shared and language-specific (LS) parameters has shown promise in multilingual neural machine translation (MNMT), but the question of when and where LS capacity matters most is still under-studied. We offer such a study by proposing conditional language-specific routing (CLSR). CLSR employs hard binary gates conditioned on token representations to dynamically select LS or shared paths. By manipulating these gates, it can schedule LS capacity across sub-layers in MNMT subject to the guidance of translation signals and budget constraints. Moreover, CLSR can easily scale up to massively multilingual settings. Experiments with Transformer on OPUS-100 and WMT datasets show that: 1) MNMT is sensitive to both the amount and the position of LS modeling: distributing 10%-30% of LS computation to the top and/or bottom encoder/decoder layers delivers the best performance; and 2) one-to-many translation benefits more from CLSR than many-to-one translation, particularly with unbalanced training data. Our study further verifies the trade-off between shared capacity and LS capacity for multilingual translation, and we use these findings as the foundation of improved multilingual Transformers. Source code and models are available at https://github.com/bzhangGo/zero/tree/iclr2021_clsr.
One-sentence Summary: We investigate and improve parameter-sharing strategies in multilingual Transformers by utilizing conditional computation.
A learning design toolkit to create pedagogically effective learning activities
Despite the plethora of Information and Communication Technologies (ICT) tools and resources available, practitioners are still not making effective use of e-learning to enrich the student experience. This article describes a learning design toolkit which guides practitioners through the process of creating pedagogically informed learning activities that make effective use of appropriate tools and resources. This work is part of a digital libraries project in which teaching staff at two universities in the UK and two in the USA are collaborating to share e-learning resources in the subject domains of Physical, Environmental and Human Geography. Finding, or creating, suitable e-learning resources and embedding them in well-designed learning activities can be both challenging and time consuming. Sharing and adapting effective designs and solutions is both a stimulant and a time saver. This article describes the background to the specification of a learning activities design toolkit to support teachers as they create or adapt e-learning activities. The toolkit uses a model of pedagogical approaches as a basis for developing effective learning design plans, and its use is illustrated. The authors share their definition of a learning activity and taxonomies for the constituent elements. Real examples are discussed to illustrate their approach.
Learning in Conferences
Excerpt: The true value of a conference lies in its effects on participants. Conferences are meant to generate and share knowledge that impacts behavior and links to results; this will not happen if the state of the art of conference evaluation remains immature and event planners do not shine a light on the conditions for learning outcomes.
Learning and Noisy Equilibrium Behavior in an Experimental Study of Imperfect Price Competition
This paper considers a duopoly price-choice game in which the unique Nash equilibrium is the Bertrand outcome. Price competition, however, is imperfect in the sense that the market share of the high-price firm is not zero. Economic intuition suggests that price levels should be positively related to the market share of the high-price firm. Although this relationship is not predicted by standard game theory, it is implied by a generalization of the Nash equilibrium that results when players make noisy (logit) best responses to expected payoff differences. This logit equilibrium model was used to design a laboratory experiment with treatments that correspond to changing the market share of the high-price firm. The model predicts the final-period price averages for both treatments with remarkable accuracy. Moreover, computer simulations of a naive learning model were used, ex ante, to predict the observed differences in the time paths of average prices.
Keywords: laboratory experiments, simulation, decision error, learning, logit equilibrium.
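The noisy best-response idea behind the logit equilibrium is easy to make concrete: choice probabilities are proportional to exp(lambda x expected payoff), so better replies are more likely but not certain. The sketch below iterates symmetric logit responses in a small price game; the payoff numbers and the market-share parameter are illustrative assumptions, not the paper's experimental values.

```python
import numpy as np

def logit_response(payoff_matrix, opp_probs, lam):
    """Noisy (logit) best response: choice probabilities are proportional
    to exp(lam * expected payoff), instead of putting all probability on
    the exact best reply. As lam grows, this approaches the standard
    best response."""
    expected = payoff_matrix @ opp_probs
    z = np.exp(lam * (expected - expected.max()))  # numerically stable softmax
    return z / z.sum()

# Illustrative symmetric price game (hypothetical payoffs):
# the low-price firm earns share (1 - alpha), the high-price firm earns alpha.
prices = np.array([1.0, 2.0, 3.0])
alpha = 0.3  # market share of the high-price firm (assumed value)
payoff = np.empty((3, 3))
for i, p_i in enumerate(prices):
    for j, p_j in enumerate(prices):
        if p_i < p_j:
            payoff[i, j] = (1 - alpha) * p_i
        elif p_i > p_j:
            payoff[i, j] = alpha * p_i
        else:
            payoff[i, j] = 0.5 * p_i

# Iterate symmetric logit responses to approximate a logit equilibrium.
probs = np.ones(3) / 3
for _ in range(200):
    probs = logit_response(payoff, probs, lam=2.0)
print(probs)  # a full distribution over prices, not a point prediction
```

Raising alpha shifts probability mass toward higher prices in this sketch, which is the qualitative comparative static the experiment tests.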
Learning from networked examples
Many machine learning algorithms are based on the assumption that training examples are drawn independently. However, this assumption no longer holds when learning from a networked sample, because two or more training examples may share some common objects, and hence share the features of these shared objects. We show that the classic approach of ignoring this problem can have a harmful effect on the accuracy of statistics, and then consider alternatives. One of these is to use only independent examples, discarding the other information; however, this is clearly suboptimal. We analyze sample error bounds in this networked setting, providing significantly improved results. An important component of our approach is formed by efficient sample weighting schemes, which lead to novel concentration inequalities.
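One simple sample-weighting scheme of the kind the abstract alludes to can be sketched as follows. The rule below (an illustration, not necessarily the paper's optimal scheme) gives each example weight 1 / (largest degree among the objects it touches), which guarantees that the total weight of examples sharing any one object is at most 1.

```python
from collections import Counter

def fractional_weights(examples):
    """Weight each networked example by 1 / (largest degree among the
    objects it contains). For every shared object o with degree d(o),
    each incident example then has weight <= 1/d(o), so the weights of
    the examples containing o sum to at most 1 -- the kind of bounded-
    dependence condition that concentration inequalities for networked
    samples rely on. (Illustrative rule, not the paper's exact scheme.)"""
    degree = Counter(obj for ex in examples for obj in ex)
    return [1.0 / max(degree[obj] for obj in ex) for ex in examples]

# Three examples all sharing object "a", plus one independent example.
examples = [{"a", "b"}, {"a", "c"}, {"a", "d"}, {"e"}]
w = fractional_weights(examples)
print(w)  # the three dependent examples get weight 1/3; the independent one gets 1.0
```

The effective sample size is then the sum of the weights (here 2.0 rather than 4), which is where the improved error bounds come from: dependent examples count for less, but are not discarded outright.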
Learning Redundant Motor Tasks With and Without Overlapping Dimensions: Facilitation and Interference Effects
Prior learning of a motor skill creates motor memories that can facilitate or interfere with learning of new, but related, motor skills. One hypothesis of motor learning posits that for a sensorimotor task with redundant degrees of freedom, the nervous system learns the geometric structure of the task and improves performance by selectively operating within that task space. We tested this hypothesis by examining whether transfer of learning between two tasks depends on shared dimensionality between their respective task spaces. Human participants wore a data glove and learned to manipulate a computer cursor by moving their fingers. Separate groups of participants learned two tasks: a prior task that was unique to each group and a criterion task that was common to all groups. We manipulated the mapping between finger motions and cursor positions in the prior task to define task spaces that either shared or did not share the task space dimensions (x-y axes) of the criterion task. We found that if the prior task shared task dimensions with the criterion task, there was an initial facilitation in criterion task performance. However, if the prior task did not share task dimensions with the criterion task, there was prolonged interference in learning the criterion task due to participants finding inefficient task solutions. These results show that the nervous system learns the task space through practice, and that the degree of shared task space dimensionality influences the extent to which prior experience transfers to subsequent learning of related motor skills.
The danger of impersonalisation in mass personalised learning
This paper discusses the dichotomy between socialisation and personalisation, and questions whether the two can coexist. It presents evidence that socialisation does lead to improved student achievement, and that there is a significant issue with personalisation: it limits social discovery because it does not cater for the development of an energetic learning community in which to share and exchange information. This is particularly relevant in the context of mass personalisation and must be a key consideration when developing personalised learning environments.