56 research outputs found

    How important is weight symmetry in backpropagation?

    Gradient backpropagation (BP) requires symmetric feedforward and feedback connections: the same weights must be used for the forward and backward passes. This "weight transport problem" (Grossberg 1987) is thought to be one of the main reasons to doubt BP's biological plausibility. Using 15 different classification datasets, we systematically investigate to what extent BP really depends on weight symmetry. In a study that turned out to be surprisingly similar in spirit to Lillicrap et al.'s demonstration (Lillicrap et al. 2014) but orthogonal in its results, our experiments indicate that: (1) the magnitudes of feedback weights do not matter to performance; (2) the signs of feedback weights do matter: the more concordant the signs between feedforward connections and their corresponding feedback connections, the better; (3) with feedback weights having random magnitudes and 100% concordant signs, we were able to achieve the same or even better performance than SGD; (4) some normalizations/stabilizations are indispensable for such asymmetric BP to work, namely Batch Normalization (BN) (Ioffe and Szegedy 2015) and/or a "Batch Manhattan" (BM) update rule. National Science Foundation (U.S.) (STC Award CCF 1231216)
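    Finding (3) can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a single layer with a small hand-picked weight matrix and builds feedback weights that copy only the signs of the feedforward weights while drawing their magnitudes at random. The names `sign_concordant_feedback` and `backward`, the magnitude range, and the example values are all hypothetical.

```python
import random


def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def sign_concordant_feedback(W, rng):
    """Feedback weights with the signs of W but random magnitudes.

    The magnitude range (0.1, 1.0) is an arbitrary choice for illustration.
    """
    return [[sign(w) * rng.uniform(0.1, 1.0) for w in row] for row in W]


def backward(V, delta_out):
    """Propagate an error vector through feedback weights V (outputs x inputs).

    Computes the transpose product V^T * delta_out, as exact BP would do
    with the feedforward weights W in place of V.
    """
    n_out, n_in = len(V), len(V[0])
    return [sum(V[k][j] * delta_out[k] for k in range(n_out)) for j in range(n_in)]


rng = random.Random(0)                       # fixed seed for repeatability
W = [[0.5, -1.2, 0.3], [-0.7, 0.9, -0.1]]    # hypothetical feedforward weights
V = sign_concordant_feedback(W, rng)         # asymmetric feedback weights
delta_out = [1.0, -1.0]                      # hypothetical output error

delta_exact = backward(W, delta_out)         # symmetric BP
delta_asym = backward(V, delta_out)          # sign-concordant asymmetric BP
```

    In this particular example every term of each sum keeps its sign, so the asymmetric error signal has the same sign pattern as the exact one even though the magnitudes differ. That is not guaranteed for arbitrary weights and errors.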

    How Important is Weight Symmetry in Backpropagation?

    Gradient backpropagation (BP) requires symmetric feedforward and feedback connections: the same weights must be used for the forward and backward passes. This “weight transport problem” [1] is thought to be one of the main reasons for BP’s biological implausibility. Using 15 different classification datasets, we systematically study to what extent BP really depends on weight symmetry. In a study that turned out to be surprisingly similar in spirit to Lillicrap et al.’s demonstration [2] but orthogonal in its results, our experiments indicate that: (1) the magnitudes of feedback weights do not matter to performance; (2) the signs of feedback weights do matter: the more concordant the signs between feedforward connections and their corresponding feedback connections, the better; (3) with feedback weights having random magnitudes and 100% concordant signs, we were able to achieve the same or even better performance than SGD; (4) some normalizations/stabilizations are indispensable for such asymmetric BP to work, namely Batch Normalization (BN) [3] and/or a “Batch Manhattan” (BM) update rule. This work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216.

    Hands-on Bayesian Neural Networks -- a Tutorial for Deep Learning Users

    Modern deep learning methods constitute incredibly powerful tools to tackle a myriad of challenging problems. However, since deep learning methods operate as black boxes, the uncertainty associated with their predictions is often challenging to quantify. Bayesian statistics offer a formalism to understand and quantify the uncertainty associated with deep neural network predictions. This tutorial provides an overview of the relevant literature and a complete toolset to design, implement, train, use and evaluate Bayesian Neural Networks, i.e. Stochastic Artificial Neural Networks trained using Bayesian methods. Comment: 35 pages, 15 figures

    Is Evolution an Algorithm? Effects of local entropy in unsupervised learning and protein evolution

    The abstract is in the attachment.

    Demystifying ANN with Mathematical and Graphical Insights: An Algorithmic Review for Beginners

    Developments in deep learning with ANNs (Artificial Neural Networks) are paving the way for revolutionizing all application areas, especially those related to non-linear regression and classification problems in predictive modelling and forecasting. Although their explainability is more complicated and challenging, deep neural networks are preferred over conventional machine learning methods for their high accuracy on non-linear and complex problems. However, machine learning and data science practitioners often use ANNs like a black box. The present article concisely overviews the mathematics and computations involved in simple feed-forward neural networks (FNNs) or multilayer perceptrons (MLPs). The purpose is to shed light on what the learning (or training) of deep neural networks is and how it works. The article includes simplified derivations of the expressions for the main workhorse of neural networks (backpropagation) and an example that explains how it works, with graphical insights. An algorithm for a basic ANN application is presented in both component form and matrix form, together with a detailed note on the relevant data structures, to elaborate the scheme comprehensively. A Python implementation of the basic algorithm is presented, and its performance results are compared with those produced using the TensorFlow library functions that implement neural networks. The article discusses various techniques to improve the generalization capability of neural networks and how to address various training challenges. Finally, some well-established optimization approaches based on the Gradient Descent method are also discussed. The article may serve as a comprehensive primer giving undergraduate and graduate students a sound understanding of deep learning before they engage in the relevant industry practices, so that they can make sustainable progress in the field.

    Machine Learning methodology for the study of a disulfide exchange reaction

    The thiol-disulfide exchange reaction is a nucleophilic substitution that occurs in a large class of proteins. It plays an important role in the tertiary and quaternary structure of proteins and in the catalysis of biological reactions. In addition, thiol-disulfide exchange can regulate the activity of certain proteins. This thesis discusses the structural and environmental factors that influence this reaction. Owing to its computational efficiency, the Density-Functional based Tight-Binding (DFTB) method has established itself as a popular and reliable quantum mechanical method for condensed-phase applications. With DFTB it is possible to generate the free energy surface of complex reactions, because the phase space can be sampled sufficiently. These savings in computational cost can, however, come at the expense of lower accuracy. In thiol-disulfide exchange, for example, the transition states have an incorrect structure and energy. The literature review shows that highly accurate ab initio methods must be used to describe this reaction correctly. The motivation of this work was therefore to correct the DFTB errors with a machine learning approach. To achieve this, we used a Behler-Parrinello type neural network that learns the energy differences between the ab initio and DFTB methods for a given molecular structure. The machine-learned energy correction was then implemented in the DFTB+ software. With this new approach we were able to perform hybrid Quantum Mechanics/Molecular Mechanics (QM/MM) simulations of thiol-disulfide exchange at Coupled Cluster and B3LYP accuracy with a computational cost comparable to that of DFTB.
This correction algorithm is also implemented in a pipeline with a graphical interface that helps the user generate and arrange training data and export the machine learning model to DFTB+ for further use in QM/MM simulations. The introduction of this pipeline is intended to broaden the applicability of the neural network code by favoring knowledge of quantum modeling over a programming background. In addition, we present initial work on a machine-learned force field for describing the disulfide exchange reaction using Coupled Cluster reference data.

    Boolean Variation and Boolean Logic BackPropagation

    The notion of variation is introduced for the Boolean set, and on this basis a Boolean logic backpropagation principle is developed. Using this concept, deep models can be built with weights and activations that are Boolean numbers and that are operated on with Boolean logic instead of real arithmetic. In particular, Boolean deep models can be trained directly in the Boolean domain without latent weights. Logic, rather than a gradient, is synthesized and backpropagated through the layers.

    Numerical evaluation of aerodynamic roughness of the built environment and complex terrain

    Aerodynamic drag in the atmospheric boundary layer (ABL) is affected by the structure and density of obstacles (surface roughness) and the nature of the terrain (topography). In building codes and standards, average roughness is usually determined somewhat subjectively by examination of aerial photographs. For detailed wind mapping, boundary layer wind tunnel (BLWT) testing is usually recommended. This may not be cost effective for many projects, in which case numerical studies become good alternatives. This thesis examines Computational Fluid Dynamics (CFD) for evaluation of the aerodynamic roughness of the built environment and complex terrain. The present study started from the development of in-house CFD software tailored for ABL simulations. A three-dimensional finite-volume code was developed using flexible polyhedral elements as building blocks. The program is parallelized using MPI to run on clusters of processors so that micro-scale simulations can be conducted quickly. The program can also utilize the latest technology in high performance computing, namely GPUs. Various turbulence models, including mixing-length, RANS, and LES models, are implemented and their suitability for ABL simulations assessed. Then the effect of surface roughness alone on wind profiles is assessed using CFD. Cases with various levels of complexity are considered, including simplified models with roughness blocks in different arrangements, multiple roughness patches, a semi-idealized urban model, and a real built environment. Comparison with BLWT data for the first three cases showed good agreement, thereby justifying the explicit three-dimensional numerical approach. Due to a lack of validation data, the real built environment case served only to demonstrate the use of CFD for such purposes. Finally, the effect of topographic features on wind profiles was investigated using CFD.
This work extends prior work done by the research team on multiple idealized two-dimensional topographic features to more elaborate three-dimensional simulations. It is found that two-dimensional simulations overestimate speed-up over the crests of hills and also show larger recirculation zones. The current study also emphasized turbulence characterization behind hills. Lastly, a real complex terrain case, the well-known Askervein hill, was simulated and the results validated against published field observations. In general, the results obtained from the current simulations compared well with those reported in the literature.