1,222 research outputs found

    Epidemiological Prediction using Deep Learning

    Get PDF
    Accurate and real-time epidemic disease prediction plays a significant role in the health system and is of great importance for policy making, vaccine distribution and disease control. Since the SIR model of Kermack and McKendrick in the early 1900s, researchers have developed various mathematical models to forecast the spread of disease. Despite all these attempts, however, epidemic prediction remains an open scientific problem, because current models either lack flexibility or show poor performance. Owing to the temporal and spatial aspects of epidemiological data, the problem fits into the category of time-series forecasting. To capture both aspects of the data, this paper proposes a combination of recent Deep Learning models and applies it to ILI (influenza-like illness) data in the United States. Specifically, a graph convolutional network (GCN) is used to capture the geographical structure of the U.S. regions, and a gated recurrent unit (GRU) is used to capture the temporal dynamics of ILI. The results were compared with Deep Learning models proposed by other researchers, demonstrating that the proposed model outperforms previous methods.
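    The abstract describes the architecture only at a high level; the sketch below is a minimal PyTorch illustration of a GCN-plus-GRU hybrid of this kind. The module names, layer sizes, and the normalized adjacency matrix are assumptions for illustration, not the authors' implementation.

        import torch
        import torch.nn as nn

        class GraphConv(nn.Module):
            """One graph convolution: mix each region's features with its
            neighbours' via a (normalized) adjacency matrix, then apply a
            learned linear map."""
            def __init__(self, in_dim, out_dim):
                super().__init__()
                self.linear = nn.Linear(in_dim, out_dim)

            def forward(self, x, adj):
                # x: (batch, regions, in_dim); adj: (regions, regions)
                return torch.relu(self.linear(adj @ x))

        class GCNGRU(nn.Module):
            """Spatio-temporal sketch: a GCN encodes the spatial structure
            of each weekly snapshot, a GRU models the resulting sequence."""
            def __init__(self, num_regions, in_dim, hidden_dim):
                super().__init__()
                self.gcn = GraphConv(in_dim, hidden_dim)
                self.gru = nn.GRU(num_regions * hidden_dim, hidden_dim,
                                  batch_first=True)
                self.head = nn.Linear(hidden_dim, num_regions)

            def forward(self, x_seq, adj):
                # x_seq: (batch, seq_len, regions, in_dim)
                b, t, n, d = x_seq.shape
                frames = [self.gcn(x_seq[:, i], adj).reshape(b, -1)
                          for i in range(t)]
                h, _ = self.gru(torch.stack(frames, dim=1))
                return self.head(h[:, -1])  # next-step ILI level per region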

    Factorized second order methods in neural networks

    Full text link
    First-order optimization methods (gradient descent) have enabled impressive successes in training artificial neural networks. Second-order methods can, in theory, accelerate the optimization of a function, but in the case of neural networks the number of variables is far too large. In this master's thesis, I present the second-order methods usually applied in optimization, as well as approximate methods that make it possible to apply them to deep neural networks. I introduce a new algorithm based on an approximation of second-order methods, and I validate empirically that it is of practical interest. I also introduce a modification of the backpropagation algorithm, used to efficiently compute the gradients required by these optimization methods.
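    The thesis's own algorithm is not reproduced in the abstract; as a generic illustration of why approximation is needed, the sketch below estimates only the diagonal of the Hessian (a Hutchinson-style estimator) and uses it for a damped Newton-like step. This is one standard family of approximations, not the thesis's method.

        import torch

        def diagonal_newton_step(params, loss_fn, lr=1.0, damping=1e-3):
            """Newton-like step using a stochastic estimate of diag(H).

            A full Hessian has n^2 entries for n parameters, which is
            intractable for deep networks; the diagonal has only n.
            """
            loss = loss_fn()
            grads = torch.autograd.grad(loss, params, create_graph=True)
            for p, g in zip(params, grads):
                # Hutchinson estimator: E[z * (Hz)] = diag(H) for Rademacher z.
                z = torch.randint_like(p, 2) * 2.0 - 1.0
                hz = torch.autograd.grad(g, p, grad_outputs=z,
                                         retain_graph=True)[0]
                diag_h = z * hz
                with torch.no_grad():
                    # Damping keeps the step stable where curvature is tiny.
                    p -= lr * g / (diag_h.abs() + damping)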

    DeepOBS: A Deep Learning Optimizer Benchmark Suite

    Full text link
    Because the choice and tuning of the optimizer affect the speed, and ultimately the performance, of deep learning, there is significant past and recent research in this area. Yet, perhaps surprisingly, there is no generally agreed-upon protocol for the quantitative and reproducible evaluation of optimization strategies for deep learning. We suggest routines and benchmarks for stochastic optimization, with a special focus on the unique aspects of deep learning, such as stochasticity, tunability and generalization. As the primary contribution, we present DeepOBS, a Python package of deep learning optimization benchmarks. The package addresses key challenges in the quantitative assessment of stochastic optimizers and automates most steps of benchmarking. The library includes a wide and extensible set of ready-to-use, realistic optimization problems, such as training Residual Networks for image classification on ImageNet or character-level language prediction models, as well as popular classics like MNIST and CIFAR-10. The package also provides realistic baseline results for the most popular optimizers on these test problems, ensuring a fair comparison when benchmarking new optimizers, without having to run costly experiments. It comes with output back-ends that directly produce LaTeX code for inclusion in academic publications. It supports TensorFlow and is available open source.
    Comment: Accepted at ICLR 2019. 9 pages, 3 figures, 2 tables
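    A usage sketch along the lines of the package's quick-start: wrap an existing TensorFlow optimizer class in a runner and launch a test problem. The exact class and argument names here are assumptions based on the paper's description of the TensorFlow back-end; consult the package documentation for the real API.

        import tensorflow as tf
        from deepobs import tensorflow as tfobs

        # Benchmark plain momentum SGD: DeepOBS supplies the test problem,
        # data pipeline, baseline comparisons, and logging.
        optimizer_class = tf.train.MomentumOptimizer
        hyperparams = [{"name": "learning_rate", "type": float},
                       {"name": "momentum", "type": float, "default": 0.99}]

        runner = tfobs.runners.StandardRunner(optimizer_class, hyperparams)
        # Remaining settings (test problem, batch size, epochs) are read
        # from the command line by the runner.
        runner.run(train_log_interval=10)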

    Representation Learning: A Review and New Perspectives

    Full text link
    The success of machine learning algorithms generally depends on data representation, and we hypothesize that this is because different representations can entangle and hide more or less the different explanatory factors of variation behind the data. Although specific domain knowledge can be used to help design representations, learning with generic priors can also be used, and the quest for AI is motivating the design of more powerful representation-learning algorithms implementing such priors. This paper reviews recent work in the area of unsupervised feature learning and deep learning, covering advances in probabilistic models, auto-encoders, manifold learning, and deep networks. This motivates longer-term unanswered questions about the appropriate objectives for learning good representations, for computing representations (i.e., inference), and the geometrical connections between representation learning, density estimation and manifold learning.
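    Of the model families the review covers, the auto-encoder is the simplest to show concretely. Below is a minimal PyTorch sketch (sizes chosen arbitrarily for MNIST-like vectors); it is a generic illustration, not tied to any specific method in the review.

        import torch.nn as nn

        class AutoEncoder(nn.Module):
            """A bottleneck forces a compact code from which the input
            must be reconstructed, yielding a learned representation."""
            def __init__(self, in_dim=784, code_dim=32):
                super().__init__()
                self.encoder = nn.Sequential(
                    nn.Linear(in_dim, 256), nn.ReLU(),
                    nn.Linear(256, code_dim))
                self.decoder = nn.Sequential(
                    nn.Linear(code_dim, 256), nn.ReLU(),
                    nn.Linear(256, in_dim), nn.Sigmoid())

            def forward(self, x):
                return self.decoder(self.encoder(x))

        # Training minimizes reconstruction error (e.g. nn.MSELoss());
        # the encoder output then serves as the learned feature vector.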

    Convolutional Support Vector Machines For Image Classification

    Get PDF
    The Convolutional Neural Network (CNN) is a machine learning model which excels in tasks that exhibit spatially local correlation of features, for example, image classification. However, as a model, it is susceptible to the issues caused by local minima, largely due to the fully-connected neural network which is typically used in the final layers for classification. This work investigates the effect of replacing the fully-connected neural network with a Support Vector Machine (SVM). It names the resulting model the Convolutional Support Vector Machine (CSVM) and proposes two methods for training. The first method uses a linear SVM and is described in the primal. The second method can be used to learn an SVM with a non-linear kernel by casting the optimisation as a Multiple Kernel Learning problem. Both methods learn the convolutional filter weights in conjunction with the SVM parameters. The linear CSVM (L-CSVM) and kernelised CSVM (K-CSVM) in this work each use a single convolutional filter; however, approaches are described which may be used to extend the K-CSVM with multiple filters per layer and with multiple convolutional layers. The L-CSVM and K-CSVM show promising results on the MNIST and CIFAR-10 benchmark datasets.
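    The primal linear variant maps naturally onto a convolutional feature extractor trained end-to-end under a multi-class hinge loss. The PyTorch sketch below illustrates that idea only; the layer sizes, MNIST input shape, and optimiser are assumptions, not the thesis's code.

        import torch
        import torch.nn as nn

        class LinearCSVM(nn.Module):
            """A single convolutional filter feeding a linear SVM head:
            hinge loss replaces the usual softmax classifier, and the
            filter and SVM weights are learned jointly."""
            def __init__(self, num_classes=10):
                super().__init__()
                self.conv = nn.Conv2d(1, 1, kernel_size=5, padding=2)
                self.pool = nn.MaxPool2d(2)
                self.svm = nn.Linear(14 * 14, num_classes)  # 28x28 input

            def forward(self, x):
                feat = self.pool(torch.relu(self.conv(x))).flatten(1)
                return self.svm(feat)

        model = LinearCSVM()
        hinge = nn.MultiMarginLoss()  # multi-class hinge (primal SVM) loss
        # weight_decay supplies the SVM's L2 margin regularisation term.
        opt = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=1e-4)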

    An Introduction to Variational Autoencoders

    Full text link
    Variational autoencoders provide a principled framework for learning deep latent-variable models and corresponding inference models. In this work, we provide an introduction to variational autoencoders and some important extensions.
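    To make the framework concrete, the sketch below is a minimal PyTorch VAE with a Gaussian encoder, the reparameterization trick, and the standard (negative) ELBO objective; the dimensions are illustrative choices, not from the paper.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class VAE(nn.Module):
            """Encoder outputs the mean and log-variance of q(z|x); a
            sample is drawn via the reparameterization trick; a decoder
            models p(x|z)."""
            def __init__(self, in_dim=784, z_dim=20):
                super().__init__()
                self.enc = nn.Linear(in_dim, 400)
                self.mu = nn.Linear(400, z_dim)
                self.logvar = nn.Linear(400, z_dim)
                self.dec = nn.Sequential(nn.Linear(z_dim, 400), nn.ReLU(),
                                         nn.Linear(400, in_dim))

            def forward(self, x):
                h = F.relu(self.enc(x))
                mu, logvar = self.mu(h), self.logvar(h)
                z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
                return self.dec(z), mu, logvar

        def negative_elbo(logits, x, mu, logvar):
            # Reconstruction term plus KL(q(z|x) || N(0, I)), both in nats.
            rec = F.binary_cross_entropy_with_logits(logits, x, reduction="sum")
            kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
            return rec + kl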