Born Again Neural Networks

Author: Anandkumar Anima
Furlanello Tommaso
Itti Laurent
Lipton Zachary C.
Tschannen Michael
Publication date: 29/06/2018

Knowledge distillation (KD) consists of transferring knowledge from one machine learning model (the teacher) to another (the student). Commonly, the teacher is a high-capacity model with formidable performance, while the student is more compact. By transferring knowledge, one hopes to benefit from the student's compactness. We study KD from a new perspective: rather than compressing models, we train students parameterized identically to their teachers. Surprisingly, these Born-Again Networks (BANs) outperform their teachers significantly, both on computer vision and language modeling tasks. Our experiments with BANs based on DenseNets demonstrate state-of-the-art performance on the CIFAR-10 (3.5%) and CIFAR-100 (15.5%) datasets, by validation error. Additional experiments explore two distillation objectives: (i) Confidence-Weighted by Teacher Max (CWTM) and (ii) Dark Knowledge with Permuted Predictions (DKPP). Both methods elucidate the essential components of KD, demonstrating a role of the teacher outputs on both predicted and non-predicted classes. We present experiments with students of various capacities, focusing on the under-explored case where students overpower teachers. Our experiments show significant advantages from transferring knowledge between DenseNets and ResNets in either direction. Comment: Published @ICML 201
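
As an illustration of the teacher-to-student transfer described in this abstract, the following is a minimal sketch of a commonly used soft-target distillation loss: the cross-entropy between the teacher's and the student's softened output distributions. The abstract does not specify the loss, so this is an assumption and not necessarily the objective used in the paper; the function names and the `temperature` parameter are hypothetical.

```python
import numpy as np

def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over the last axis (hypothetical helper)."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def distillation_loss(student_logits, teacher_logits, temperature=1.0):
    """Mean cross-entropy between teacher and student output distributions.

    Assumption: a generic soft-target KD loss, not the paper's exact objective.
    """
    p_teacher = np.exp(log_softmax(teacher_logits, temperature))
    log_p_student = log_softmax(student_logits, temperature)
    return float(-(p_teacher * log_p_student).sum(axis=-1).mean())

# Toy logits for one example with three classes (made-up values).
teacher = np.array([[2.0, 1.0, 0.1]])
student = np.array([[0.1, 1.0, 2.0]])
loss_mismatch = distillation_loss(student, teacher)
loss_match = distillation_loss(teacher, teacher)
```

The loss is smallest when the student reproduces the teacher's distribution exactly, so `loss_match` is lower than `loss_mismatch` here.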

arXiv.org e-Print Archive

Caltech Authors

Distribution of Damages in Car Accidents through the Use of Neural Networks

Author: Philipps Lothar
Publication date: 01/01/1991

After a traffic accident the damage has to be fairly divided among the parties involved, and a ratio has to be determined. There are many precedents for this, and judges have developed catalogues suggesting ratios for common types of accidents. The problem that "every case is different," however, remains. Many cases have familiar aspects, but also unfamiliar ones. Even if a case is composed of several familiar aspects with established ratios, the question remains as to how these are to be combined into one ratio. The first thought would be to invent a mathematical formula, but such formulae are rigid and speculative. The body of law has grown organically and must not be forced into a sleek system. The distant consequences of using a mathematical formula cannot be foreseen; they might well be grossly unjust. I suggest using a neural network instead. Precedents may be fed into the network directly as learning patterns. This has the advantage that court rulings can be transferred directly and not via a formula. Future modifications in court rulings can also be adopted by the network. As far as the effect of the learning patterns on new cases is concerned, a relatively safe assumption is that they will fit in harmoniously with the precedents. This is due to the network's structure: a number of simple decisional units, which are interconnected, tune their activity to each other, thus achieving a state of equilibrium. When the conditions of such an equilibrium are translated back into the terms of the case, the solution can hardly be totally unjust.

Open Access LMU

Distributed representations accelerate evolution of adaptive behaviours

Author: James V Stone
Karl J Friston
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2007

Animals with rudimentary innate abilities require substantial learning to transform those abilities into useful skills, where a skill can be considered as a set of sensory-motor associations. Using linear neural network models, it is proved that if skills are stored as distributed representations, then within-lifetime learning of part of a skill can induce automatic learning of the remaining parts of that skill. More importantly, it is shown that this "free-lunch" learning (FLL) is responsible for accelerated evolution of skills, when compared with networks which either 1) cannot benefit from FLL or 2) cannot learn. Specifically, it is shown that FLL accelerates the appearance of adaptive behaviour, both in its innate form and as FLL-induced behaviour, and that FLL can accelerate the rate at which learned behaviours become innate.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

White Rose Research Online

Training Passive Photonic Reservoirs with Integrated Optical Readout

Author: Bienstman Peter
Dambre Joni
Freiberger Matthias
Katumba Andrew
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 08/10/2018

As Moore's law comes to an end, neuromorphic approaches to computing are on the rise. One of these, passive photonic reservoir computing, is a strong candidate for computing at high bitrates (> 10 Gbps) and with low energy consumption. Currently, though, both benefits are limited by the necessity to perform training and readout operations in the electrical domain. Thus, efforts are underway in the photonic community to design an integrated optical readout, which would allow all operations to be performed in the optical domain. In addition to the technological challenge of designing such a readout, new algorithms have to be designed in order to train it. Foremost, suitable algorithms need to be able to deal with the fact that the actual on-chip reservoir states are not directly observable. In this work, we investigate several options for such a training algorithm and propose a solution in which the complex states of the reservoir can be observed by appropriately setting the readout weights, while iterating over a predefined input sequence. We perform numerical simulations in order to compare our method with an ideal baseline requiring full observability as well as with an established black-box optimization approach (CMA-ES). Comment: Accepted for publication in IEEE Transactions on Neural Networks and Learning Systems (TNNLS-2017-P-8539.R1), copyright 2018 IEEE. This research was funded by the EU Horizon 2020 PHRESCO Grant (Grant No. 688579) and the BELSPO IAP P7-35 program Photonics@be. 11 pages, 9 figures

arXiv.org e-Print Archive

Ghent University Academic Bibliography

A new neural network technique for the design of multilayered microwave shielded bandpass filters

Author: Cañete Rebenaque David
Gómez Díaz Juan Sebastián
Pascual García Juan
Quesada Pereira Fernando Daniel
Álvarez Melcón Alejandro
Publication venue: 'Wiley'
Publication date: 01/05/2009

In this work, we propose a novel technique based on neural networks for the design of microwave filters in shielded printed technology. The technique uses radial basis function neural networks to represent the nonlinear relations between the quality factors and coupling coefficients, and the geometrical dimensions of the resonators. The radial basis function neural networks are employed for the first time in the design task of shielded printed filters, and permit a fast and precise operation with only a limited set of training data. Thanks to a new cascade configuration, a set of two neural networks provides the dimensions of the complete filter in a fast and accurate way. To improve the calculation of the geometrical dimensions, the neural networks can take as inputs both electrical parameters and physical dimensions computed by other neural networks. The neural network technique is combined with gradient-based optimization methods to further improve the response of the filters. Results are presented to demonstrate the usefulness of the proposed technique for the design of practical microwave printed coupled line and hairpin filters.
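
To make the idea of a radial basis function network fitted from a small training set concrete, here is a minimal sketch of a generic Gaussian RBF regressor trained by linear least squares. It is an assumption-laden illustration, not the cascade configuration from the paper: the function names, the kernel `width`, and the toy data (a sine curve standing in for a mapping from an electrical parameter to a dimension) are all hypothetical.

```python
import numpy as np

def rbf_design(x, centers, width):
    """Gaussian RBF activations for inputs x (n, d) and centers (m, d)."""
    d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * width ** 2))

def fit_rbf(x, y, centers, width):
    """Solve for the output weights by linear least squares."""
    phi = rbf_design(x, centers, width)
    weights, *_ = np.linalg.lstsq(phi, y, rcond=None)
    return weights

def predict_rbf(x, centers, width, weights):
    """Evaluate the fitted RBF network at new inputs."""
    return rbf_design(x, centers, width) @ weights

# Toy training set: 10 samples of a smooth 1-D function (made-up data).
x_train = np.linspace(0.0, 1.0, 10).reshape(-1, 1)
y_train = np.sin(2.0 * np.pi * x_train[:, 0])

# One center per training sample; the width is a hand-picked assumption.
weights = fit_rbf(x_train, y_train, centers=x_train, width=0.1)
y_fit = predict_rbf(x_train, x_train, 0.1, weights)
```

With one center per training point the network interpolates the training data; in practice the number of centers and the width would be tuned for the problem at hand.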

Repositorio Digital de la Universidad Politécnica de Cartagena