
    Pushing Stochastic Gradient towards Second-Order Methods -- Backpropagation Learning with Transformations in Nonlinearities

    Recently, we proposed to transform the outputs of each hidden neuron in a multi-layer perceptron network to have zero output and zero slope on average, and to use separate shortcut connections to model the linear dependencies instead. We continue that work, first by introducing a third transformation that normalizes the scale of each hidden neuron's output, and second by analyzing the connections to second-order optimization methods. We show, both in theory and in experiments, that the transformations make simple stochastic gradient descent behave more like a second-order optimization method and thus speed up learning. The experiments on the third transformation show that while it further increases the speed of learning, it can also hurt performance by converging to a worse local optimum, in which both the inputs and outputs of many hidden neurons are close to zero.

    Comment: 10 pages, 5 figures, ICLR201
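    To make the idea concrete, here is a minimal sketch of the three transformations applied to a batch of tanh activations (not code from the paper; the batch-based estimation and the 1e-8 stabilizer are assumptions): a linear term cancels the average slope, a constant cancels the average output, and a final factor normalizes the scale.

    ```python
    import numpy as np

    def transformed_tanh(x):
        """Apply the three output transformations to a batch of
        pre-activations x: cancel the average slope, cancel the
        average output, and normalize the output scale."""
        g, g_prime = np.tanh(x), 1.0 - np.tanh(x) ** 2
        a = -g_prime.mean()            # zero slope on average
        b = -(g + a * x).mean()        # zero output on average
        y = g + a * x + b
        c = 1.0 / (y.std() + 1e-8)     # unit output scale (third transformation)
        return c * y

    # The linear dependencies removed here would be modelled by
    # separate shortcut connections in the network itself.
    x = np.random.randn(1000)
    y = transformed_tanh(x)
    print(y.mean(), y.std())           # approximately 0 and 1
    ```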

    A Survey on Bayesian Deep Learning

    A comprehensive artificial intelligence system needs not only to perceive the environment with different 'senses' (e.g., seeing and hearing) but also to infer the world's conditional (or even causal) relations and the corresponding uncertainty. The past decade has seen major advances in many perception tasks, such as visual object recognition and speech recognition, using deep learning models. For higher-level inference, however, probabilistic graphical models, with their Bayesian nature, remain more powerful and flexible. In recent years, Bayesian deep learning has emerged as a unified probabilistic framework that tightly integrates deep learning and Bayesian models. In this general framework, the perception of text or images using deep learning can boost the performance of higher-level inference, and in turn the feedback from the inference process can enhance the perception of text or images. This survey provides a comprehensive introduction to Bayesian deep learning and reviews its recent applications to recommender systems, topic models, control, and more. We also discuss the relationship and differences between Bayesian deep learning and other related topics, such as the Bayesian treatment of neural networks.

    Comment: To appear in ACM Computing Surveys (CSUR) 202
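    As one concrete instance of treating a deep model probabilistically (an illustration of the territory the survey covers, not the survey's own method), the sketch below uses Monte Carlo dropout, a standard approximation to Bayesian inference in neural networks, to obtain a predictive mean and an uncertainty estimate; the layer sizes, weights, and sample count are arbitrary assumptions.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # A tiny two-layer network with fixed (here random, stand-in) weights.
    W1, b1 = rng.normal(size=(16, 1)), np.zeros(16)
    W2, b2 = rng.normal(size=(1, 16)), np.zeros(1)

    def forward(x, p_drop=0.5):
        """One stochastic forward pass: keeping dropout on at test time
        approximates sampling from a posterior over the weights."""
        h = np.maximum(0.0, W1 @ x + b1)                  # ReLU hidden layer
        mask = rng.random(h.shape) > p_drop               # Bernoulli dropout mask
        h = h * mask / (1.0 - p_drop)
        return W2 @ h + b2

    x = np.array([0.3])
    samples = np.stack([forward(x) for _ in range(200)])  # 200 MC samples
    print("predictive mean:", samples.mean())
    print("predictive std :", samples.std())              # uncertainty proxy
    ```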

    Parallelization of Support Vector Machines

    The support vector machine (SVM) is a machine learning method used for data classification. The binary classification problem consists in finding a function or model that can predict which class a given point x belongs to; the model is fitted on training data. In this work we compared iterative and parallel implementations of support vector machine algorithms. As expected, we found that the parallel algorithms run much faster than the iterative ones, while the number of misclassified points does not increase.

    One of the techniques used for data classification is the support vector machine (SVM). SVM takes binary classification as the fundamental problem and follows a geometrically intuitive approach: find a hyperplane that divides the objects into two separate classes. Training an SVM aims both to maximize the width of the margin that surrounds the separating hyperplane and to minimize the occurrence of classification errors. The goal of this thesis is to study the performance gains obtained by solving the SVM problem in parallel, comparing the proposed parallelization techniques in accuracy and computation speed. See the sketch below for one common parallelization scheme.
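    To illustrate one common way such parallel SVM training can be organized (a sketch of the general cascade-style idea, not necessarily the implementations compared in the thesis; the dataset and partition count are assumptions), the snippet below trains sub-SVMs on data partitions in parallel and then retrains a final SVM on the pooled support vectors.

    ```python
    import numpy as np
    from joblib import Parallel, delayed
    from sklearn.datasets import make_classification
    from sklearn.svm import SVC

    def fit_partition(X, y):
        """Train an SVM on one data partition and return the indices
        of its support vectors (relative to the partition)."""
        clf = SVC(kernel="linear").fit(X, y)
        return clf.support_

    X, y = make_classification(n_samples=4000, n_features=20, random_state=0)
    parts = np.array_split(np.arange(len(X)), 4)   # four data partitions

    # Stage 1: fit the partitions in parallel; only support vectors survive.
    sv_idx = Parallel(n_jobs=4)(
        delayed(fit_partition)(X[p], y[p]) for p in parts
    )
    keep = np.concatenate([p[idx] for p, idx in zip(parts, sv_idx)])

    # Stage 2: retrain a single SVM on the pooled support vectors.
    final = SVC(kernel="linear").fit(X[keep], y[keep])
    print("final accuracy:", final.score(X, y))
    ```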

    Phase space techniques in neural network models
