
    EMI: Exploration with Mutual Information

    Reinforcement learning algorithms struggle when the reward signal is very sparse. In these cases, naive random exploration methods essentially rely on a random walk to stumble onto a rewarding state. Recent works use intrinsic motivation to guide exploration via generative models, predictive forward models, or discriminative modeling of novelty. We propose EMI, an exploration method that constructs embedding representations of states and actions without relying on generative decoding of the full observation; instead, it extracts predictive signals that guide exploration through forward prediction in the representation space. Our experiments show competitive results on challenging continuous-control locomotion tasks and on image-based exploration tasks with discrete actions on Atari. The source code is available at https://github.com/snu-mllab/EMI. Comment: Accepted and to appear at ICML 2019.
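    A minimal sketch of the general pattern described above, using forward-prediction error in a learned embedding space as an exploration bonus, is given below. The module names, network sizes, and the simple squared-error bonus are illustrative assumptions; EMI's actual objective and architecture differ (see the repository above).

```python
# Illustrative sketch only, not the authors' implementation: an intrinsic
# reward computed as forward-prediction error in a learned embedding space.
import torch
import torch.nn as nn

class EmbeddingForwardModel(nn.Module):
    def __init__(self, obs_dim, act_dim, emb_dim=32):
        super().__init__()
        # Separate embeddings for states and actions (sizes are assumptions).
        self.phi = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, emb_dim))
        self.psi = nn.Sequential(nn.Linear(act_dim, 64), nn.ReLU(), nn.Linear(64, emb_dim))
        # Forward model operating purely in the representation space,
        # with no generative decoding of the full observation.
        self.forward_model = nn.Linear(2 * emb_dim, emb_dim)

    def intrinsic_reward(self, obs, act, next_obs):
        z, u = self.phi(obs), self.psi(act)
        z_next = self.phi(next_obs)
        z_pred = self.forward_model(torch.cat([z, u], dim=-1))
        # Large prediction error in embedding space marks novel transitions.
        return ((z_pred - z_next.detach()) ** 2).mean(dim=-1)
```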

    Communication-Efficient On-Device Machine Learning: Federated Distillation and Augmentation under Non-IID Private Data

    On-device machine learning (ML) enables the training process to exploit a massive amount of user-generated private data samples. To enjoy this benefit, inter-device communication overhead should be minimized. To this end, we propose federated distillation (FD), a distributed model training algorithm whose communication payload size is much smaller than that of a benchmark scheme, federated learning (FL), particularly when the model size is large. Moreover, user-generated data samples are likely to be non-IID across devices, which commonly degrades performance compared to the IID case. To cope with this, we propose federated augmentation (FAug), in which the devices collectively train a generative model that each device uses to augment its local data towards an IID dataset. Empirical studies demonstrate that FD with FAug yields around 26x less communication overhead while achieving 95-98% test accuracy relative to FL. Comment: Presented at the 32nd Conference on Neural Information Processing Systems (NIPS 2018), 2nd Workshop on Machine Learning on the Phone and other Consumer Devices (MLPCD 2), Montréal, Canada.
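    The communication saving comes from exchanging per-label model outputs rather than model weights. The sketch below illustrates that idea under simplifying assumptions; the function names, the aggregation step, and the distillation loss are placeholders, not the paper's exact protocol.

```python
# Illustrative sketch of the federated-distillation idea: each device uploads
# its average per-label logits, and later distills from the global average.
import numpy as np

def local_average_logits(logits, labels, num_classes):
    """Average this device's logits per label; the upload is a small
    classes-by-classes table, independent of the model size."""
    out = np.zeros((num_classes, logits.shape[1]))
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            out[c] = logits[mask].mean(axis=0)
    return out

def global_average(per_device_tables):
    """Server-side aggregation: average the per-label logit tables across devices."""
    return np.mean(np.stack(per_device_tables), axis=0)

# Each device would then add a distillation term to its local loss, e.g.
#   loss = CE(model(x), y) + beta * MSE(model(x), global_table[y])
# where beta and the exact distillation loss are assumptions for illustration.
```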

    A new intrinsically knotted graph with 22 edges

    A graph is called intrinsically knotted if every embedding of the graph contains a knotted cycle. Johnson, Kidwell and Michael showed that intrinsically knotted graphs have at least 21 edges. Recently Lee, Kim, Lee and Oh, and, independently, Barsotti and Mattman, showed that $K_7$ and the 13 graphs obtained from $K_7$ by $\nabla Y$ moves are the only intrinsically knotted graphs with 21 edges. In this paper we present the following results: there are exactly three triangle-free intrinsically knotted graphs with 22 edges having at least two vertices of degree 5. Two are the cousins 94 and 110 of the $E_9+e$ family, and the third is a previously unknown graph named $M_{11}$. These graphs are shown in Figures 3 and 4. Furthermore, there is no triangle-free intrinsically knotted graph with 22 edges that has a vertex of degree larger than 5.
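    For readers unfamiliar with the $\nabla Y$ move, the hypothetical helper below illustrates it on a graph structure: a triangle is replaced by a new degree-3 vertex joined to its corners, which preserves the edge count and, notably, intrinsic knottedness.

```python
# Illustrative helper (not from the paper): apply a nabla-Y move to a graph.
import networkx as nx

def nabla_y_move(G, triangle, new_vertex):
    """Return a copy of G with the triangle's edges removed and
    new_vertex joined to the triangle's three corners."""
    a, b, c = triangle
    H = G.copy()
    H.remove_edges_from([(a, b), (b, c), (a, c)])
    H.add_edges_from([(new_vertex, a), (new_vertex, b), (new_vertex, c)])
    return H

# Example: one move applied to K_7 keeps 21 edges and adds one vertex.
K7 = nx.complete_graph(7)
H = nabla_y_move(K7, (0, 1, 2), "v")
print(H.number_of_nodes(), H.number_of_edges())  # 8 vertices, 21 edges
```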

    Classification of scale-free networks

    While the emergence of a power-law degree distribution in complex networks is intriguing, the degree exponent is not universal. Here we show that the betweenness centrality displays a power-law distribution with an exponent $\eta$ which is robust, and we use it to classify scale-free networks. We have observed two universality classes with $\eta \approx 2.2(1)$ and $2.0$, respectively. Real-world networks in the former class are the protein interaction networks, the metabolic networks for eukaryotes and bacteria, and the co-authorship network; those in the latter are the Internet, the world-wide web, and the metabolic networks for archaea. Distinct features of the mass-distance relation, the generic topology of geodesics, and resilience under attack of the two classes are identified. Various model networks also belong to either of the two classes, while their degree exponents are tunable. Comment: 6 pages, 6 figures, 1 table.
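    A rough illustration of the classification procedure, measuring the betweenness-centrality distribution of a network and estimating its tail exponent $\eta$, is sketched below; the network generator and the crude log-log fit are assumptions for illustration, not the paper's methodology.

```python
# Illustrative sketch: estimate the betweenness-centrality exponent of a
# scale-free model network via a simple log-log histogram fit.
import numpy as np
import networkx as nx

G = nx.barabasi_albert_graph(n=2000, m=2, seed=0)          # scale-free model network
bc = np.array(list(nx.betweenness_centrality(G).values()))
bc = bc[bc > 0]                                            # drop zero-betweenness nodes

# Crude tail-exponent estimate from a logarithmically binned histogram.
bins = np.logspace(np.log10(bc.min()), np.log10(bc.max()), 20)
hist, edges = np.histogram(bc, bins=bins, density=True)
centers = np.sqrt(edges[:-1] * edges[1:])
mask = hist > 0
slope, _ = np.polyfit(np.log(centers[mask]), np.log(hist[mask]), 1)
print("estimated betweenness exponent eta ~", -slope)
```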

    Optimal Schedules in Multitask Motor Learning

    Although scheduling multiple tasks in motor learning to maximize long-term retention of performance is of great practical importance in sports training and motor rehabilitation after brain injury, it is unclear how to do so. We propose a novel theoretical approach that uses optimal control theory and computational models of motor adaptation to predictively determine schedules that maximize long-term retention. Using Pontryagin’s maximum principle, we derived a control law that determines the trial-by-trial task choice that maximizes overall delayed retention for all tasks, as predicted by the state-space model. Simulations of a single session of adaptation with two tasks show that when task interference is high, there exists a threshold in relative task difficulty below which the alternating schedule is optimal. Only for large differences in task difficulty do optimal schedules assign more trials to the harder task. However, over the parameter range tested, alternating schedules yield long-term retention performance only slightly inferior to that of the true optimal schedules. Our results thus predict that, in the large number of learning situations in which tasks interfere, intermixing tasks with an equal number of trials is an effective strategy for enhancing long-term retention.
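    As a rough illustration of the kind of simulation described above, the sketch below runs a simple two-task state-space adaptation model under an alternating schedule. The interference model, parameter values, and retention measure are placeholder assumptions, not the paper's calibrated model or its Pontryagin-based optimal schedule.

```python
# Illustrative sketch: discrete state-space model of adaptation,
# x[t+1] = A*x[t] + B*e[t] with error e = target - x, for two interfering
# tasks practiced in strict alternation during one session.
import numpy as np

A, B = 0.99, 0.2                 # retention and learning-rate parameters (assumed)
interference = 0.5               # fraction of an update that leaks to the other task
targets = np.array([1.0, -1.0])  # the two tasks' adaptation targets
x = np.zeros(2)                  # internal estimates for task 0 and task 1

for t in range(60):              # one session of 60 trials
    task = t % 2                 # alternating schedule: 0, 1, 0, 1, ...
    error = targets[task] - x[task]
    update = B * error
    x = A * x                    # trial-to-trial forgetting for both tasks
    x[task] += update
    x[1 - task] -= interference * update  # interference between opposing tasks

retention = 1.0 - np.mean(np.abs(targets - x))  # crude end-of-session retention proxy
print("end-of-session states:", x, "retention proxy:", round(retention, 3))
```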