318 research outputs found

    Commercial risk management in the electricity supply industry


    Architecture of Advanced Numerical Analysis Systems

    This unique open access book applies the functional OCaml programming language to numerical and computational data science, engineering, and scientific applications. It is based on the authors' first-hand experience building and maintaining Owl, an OCaml-based numerical computing library. You will first learn the components of a modern numerical computation library, then how these components are designed and built, and how to optimize their performance. After reading and using this book, you will have the knowledge required to design and build real-world complex systems that effectively leverage the advantages of the OCaml functional programming language.
    What You Will Learn:
    - Optimize core operations based on N-dimensional arrays
    - Design and implement an industry-level algorithmic differentiation module
    - Implement mathematical optimization, regression, and deep neural network functionality on top of algorithmic differentiation
    - Design and optimize a computation graph module, and understand the benefits it brings to a numerical computing library
    - Accommodate the growing number of hardware accelerators (e.g. GPU, TPU) and execution backends (e.g. web browser, unikernel) for numerical computation
    - Use the Zoo system for efficient scripting, code sharing, service deployment, and composition
    - Design and implement a distributed computing engine that works with a numerical computing library, providing convenient APIs and high performance
    Who This Book Is For: Readers with prior programming experience, especially with the OCaml programming language, or with scientific computing experience who may be new to OCaml. Most importantly, it is for those who are eager to understand not only how to use something, but also how it is built.
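
    To make the algorithmic differentiation topic in the list above concrete, here is a minimal, self-contained OCaml sketch of forward-mode differentiation with dual numbers. It illustrates the underlying idea only and does not use Owl's actual Algodiff API; all names below are invented for the example. Owl generalises this idea to N-dimensional arrays and also supports reverse mode, on which the deep neural network functionality is built.

```ocaml
(* Minimal forward-mode algorithmic differentiation with dual numbers.
   A dual number carries a value and its derivative; arithmetic on duals
   propagates derivatives by the chain rule. *)

type dual = { v : float; d : float }

let const x = { v = x; d = 0. }   (* constants have zero derivative *)
let var x = { v = x; d = 1. }     (* the input we differentiate with respect to *)

let add a b = { v = a.v +. b.v; d = a.d +. b.d }
let mul a b = { v = a.v *. b.v; d = (a.d *. b.v) +. (a.v *. b.d) }
let sin_d a = { v = sin a.v; d = a.d *. cos a.v }

(* Derivative at x of a function built from the dual primitives. *)
let diff f x = (f (var x)).d

let () =
  (* d/dx (x * sin x) = sin x + x * cos x; check at x = 1.0 *)
  let f x = mul x (sin_d x) in
  Printf.printf "f'(1.0) = %f (expected %f)\n" (diff f 1.0) (sin 1.0 +. cos 1.0)
```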

    Electrical flexibility in the chemical process industry


    Scalability of Parallel Batch Pattern Neural Network Training Algorithm

    This paper presents the development of a parallel batch pattern back propagation training algorithm for the multilayer perceptron and an investigation of its scalability on a general-purpose parallel computer. The multilayer perceptron model and the batch pattern training algorithm are described formally, and an algorithmic description of the parallel batch pattern training method is given. The scalability of the developed parallel algorithm is studied by progressively increasing the dimension of the parallelized problem on the general-purpose parallel computer NEC TX-7.
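
    The essence of the batch pattern scheme described above is that each processor accumulates the gradient over its own subset of training patterns and the partial sums are then reduced into a single batch update. The toy OCaml sketch below simulates that structure sequentially; the data, learning rate, and use of a single linear neuron instead of a full multilayer perceptron are simplifications for the example, and a real implementation would run each partition on its own processor (as on the NEC TX-7) rather than in a loop.

```ocaml
(* Toy simulation of batch pattern training: the patterns are split across
   "workers", each worker sums the error gradient over its own patterns,
   the partial gradients are reduced, and one batch update is applied per
   epoch.  A single linear neuron with squared error stands in for the MLP. *)

let num_workers = 4
let lr = 0.5

(* synthetic patterns whose targets follow y = 2x *)
let xs = Array.init 100 (fun i -> float_of_int i /. 100.)
let ys = Array.map (fun x -> 2. *. x) xs

(* partial gradient of the squared error over patterns [lo, hi) for weight w *)
let partial_grad w lo hi =
  let g = ref 0. in
  for i = lo to hi - 1 do
    let err = (w *. xs.(i)) -. ys.(i) in
    g := !g +. (err *. xs.(i))
  done;
  !g

let () =
  let n = Array.length xs in
  let chunk = n / num_workers in
  let w = ref 0. in
  for _epoch = 1 to 300 do
    (* each "worker" computes a partial gradient over its own slice ... *)
    let partials =
      List.init num_workers (fun k ->
          let lo = k * chunk in
          let hi = if k = num_workers - 1 then n else lo + chunk in
          partial_grad !w lo hi)
    in
    (* ... the partials are reduced and a single batch update is applied *)
    let g = List.fold_left ( +. ) 0. partials in
    w := !w -. (lr *. g /. float_of_int n)
  done;
  Printf.printf "learned weight: %f (target 2.0)\n" !w
```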

    Robust and efficient inference and learning algorithms for generative models

    Generative modelling is a popular paradigm in machine learning due to its natural ability to describe uncertainty in data and models and for its applications, including data compression (Ho et al., 2020), missing data imputation (Valera et al., 2018), synthetic data generation (Lin et al., 2020), representation learning (Kingma and Welling, 2014), robust classification (Li et al., 2019b), and more. For generative models, the task of finding the distribution of unobserved variables conditioned on observed ones is referred to as inference. Finding the optimal model that makes the model distribution close to the data distribution according to some discrepancy measure is called learning. In practice, existing learning and inference methods can fall short on robustness and efficiency. A method that is more robust to its hyper-parameters or to different types of data can be more easily adapted to various real-world applications. How efficient a method is with regard to the size and dimensionality of data determines the scale at which it can be applied. This thesis presents four pieces of my original work that improve these properties in generative models.
    First, I introduce two novel Bayesian inference algorithms. One is coupled multinomial Hamiltonian Monte Carlo (Xu et al., 2021a); it builds on Heng and Jacob (2019), a recent method in unbiased Markov chain Monte Carlo (MCMC) (Jacob et al., 2019b) that has been found to be sensitive to hyper-parameters and less efficient than standard, biased MCMC. These issues are solved by establishing couplings to the widely used multinomial Hamiltonian Monte Carlo, leading to a statistically more efficient and robust method. The other method is roulette-based variational expectation (RAVE; Xu et al., 2019), which applies amortised inference to Bayesian non-parametric models, a model family in which the number of parameters is allowed to grow unboundedly as the data becomes more complex. Unlike previous sampling-based methods, which are slow, or variational inference methods, which rely on truncation, RAVE combines the advantages of both to achieve flexible inference that is also computationally efficient.
    Second, I introduce two novel learning methods. One is generative ratio matching (Srivastava et al., 2019), a learning algorithm that makes kernel-based deep generative models applicable to high-dimensional data. The key innovation is learning a projection of the data to a lower-dimensional space in which the density ratio is preserved, so that learning can be done in the lower-dimensional space where kernel methods are effective. The other method is Bayesian symbolic physics, which combines Bayesian inference and symbolic regression in the context of naïve physics, the study of how humans understand and learn physics. Unlike classic generative models, for which the structure of the generative process is predefined, or deep generative models, where the process is represented by data-hungry neural networks, Bayesian-symbolic generative processes are defined by functions over a hypothesis space specified by a context-free grammar. This formulation allows these models to incorporate domain knowledge in learning, which greatly improves sample efficiency. For all four pieces of work, I provide theoretical analyses and/or empirical results to validate that the algorithmic advances lead to improvements in robustness and efficiency for generative models.
    Lastly, I summarise my contributions to free and open-source software for generative modelling. These include a set of Julia packages that I contributed and that are currently used by the Turing probabilistic programming language (Ge et al., 2018). These packages, which are highly reusable components for building probabilistic programming languages, together form a probabilistic programming ecosystem in Julia. An important package primarily developed by me is AdvancedHMC.jl (Xu et al., 2020), which provides robust and efficient implementations of HMC methods and has been adopted as the backend of Turing. Importantly, the design of this package provides an intuitive abstraction for constructing HMC samplers in a way that mirrors how they are mathematically defined. The promise of these open-source packages is to make generative modelling techniques more accessible to domain experts from various backgrounds and to make the relevant research more reproducible, helping to advance the field.
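
    As a rough illustration of the multinomial HMC building block behind the coupled method and AdvancedHMC.jl mentioned above (the coupling itself is not shown), the OCaml sketch below draws each new state from an entire leapfrog trajectory with probability proportional to exp(-H). The one-dimensional Gaussian target, step size, and trajectory length are illustrative choices, not settings from the thesis.

```ocaml
(* One multinomial HMC transition on a 1-D standard Gaussian target:
   build a leapfrog trajectory that contains the current state at a
   uniformly random position, then pick the next state from the whole
   trajectory with weight proportional to exp(-H). *)

let u q = 0.5 *. q *. q                 (* potential energy, -log N(0,1) *)
let grad_u q = q
let hamiltonian (q, p) = u q +. (0.5 *. p *. p)

let leapfrog eps (q, p) =
  let p = p -. (0.5 *. eps *. grad_u q) in
  let q = q +. (eps *. p) in
  let p = p -. (0.5 *. eps *. grad_u q) in
  (q, p)

(* standard normal momentum via Box-Muller *)
let std_normal () =
  sqrt (-2. *. log (1. -. Random.float 1.))
  *. cos (2. *. Float.pi *. Random.float 1.)

let multinomial_hmc_step ~eps ~steps q0 =
  let j = Random.int (steps + 1) in     (* position of the current state *)
  let traj = Array.make (steps + 1) (q0, std_normal ()) in
  for i = j + 1 to steps do traj.(i) <- leapfrog eps traj.(i - 1) done;
  for i = j - 1 downto 0 do traj.(i) <- leapfrog (-.eps) traj.(i + 1) done;
  let w = Array.map (fun z -> exp (-. hamiltonian z)) traj in
  (* draw an index with probability proportional to its weight *)
  let r = ref (Random.float (Array.fold_left ( +. ) 0. w)) in
  let k = ref 0 in
  Array.iteri
    (fun i wi ->
      if !r >= 0. then begin
        r := !r -. wi;
        if !r < 0. then k := i
      end)
    w;
  fst traj.(!k)

let () =
  Random.self_init ();
  let n = 20_000 and q = ref 0. and sum = ref 0. and sumsq = ref 0. in
  for _ = 1 to n do
    q := multinomial_hmc_step ~eps:0.2 ~steps:10 !q;
    sum := !sum +. !q;
    sumsq := !sumsq +. (!q *. !q)
  done;
  let mean = !sum /. float_of_int n in
  Printf.printf "sample mean %.3f, variance %.3f (target 0 and 1)\n"
    mean ((!sumsq /. float_of_int n) -. (mean *. mean))
```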

    Improving the performance of dataflow systems for deep neural network training

    Deep neural networks (DNNs) have led to significant advances in machine learning. With deep structure and flexible model parameterisation, they achieve state-of-the-art accuracy on many complex tasks, e.g. image recognition. To achieve this, models are trained iteratively over large datasets. This process involves expensive matrix operations, making it time-consuming to obtain converged models. To accelerate training, dataflow systems parallelise the computation. A scalable approach is the parameter server framework: workers train model replicas in parallel, and parameter servers synchronise the replicas to ensure convergence. In distributed DNN systems, three challenges determine the training completion time; this thesis proposes practical and effective techniques to address each of them.
    Since frequent model synchronisation results in high network utilisation, the parameter server approach can suffer from network bottlenecks and thus requires careful resource-allocation decisions. Our idea is to use all available network bandwidth and to synchronise subject to that bandwidth. We present Ako, a DNN system that uses partial gradient exchange to synchronise replicas in a peer-to-peer fashion. We show that this technique achieves a 25% lower convergence time than hand-tuned parameter-server deployments.
    For long training runs, the compute efficiency of worker nodes is important. We argue that the processing hardware should be fully utilised for the best speed-up. The key observation is that the execution of several matrix operations can be overlapped with other workloads. We describe Crossbow, a GPU-based system that maximises hardware utilisation: using a multi-streaming scheduler, it trains multiple models in parallel on a GPU and achieves a 2.3x speed-up over a state-of-the-art system.
    The choice of model configuration for the replicas also directly determines convergence quality. Dataflow systems are used to explore promising configurations but provide little support for efficient exploratory workflows. We present Meta-dataflow (MDF), a dataflow model that expresses complex exploratory workflows; by treating all configurations as a unified workflow, MDFs reduce the time spent on configuration exploration.
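
    The abstract above describes Ako's partial gradient exchange only at a high level, so the OCaml sketch below is a loose, made-up illustration of the general idea rather than the actual algorithm: each worker applies its own full gradient locally but, per round, sends just one rotating partition of that gradient to its peer replicas, so synchronisation traffic per round is roughly 1/P of a full gradient while every parameter partition still gets exchanged over P rounds. The worker count, model, data, and learning rate are all invented for the example.

```ocaml
(* Toy simulation of partial gradient exchange between model replicas.
   Each worker holds its own replica and data shard.  Per round it applies
   its full local gradient to its own replica, but sends only one rotating
   slice of that gradient to the other replicas, so per-round traffic is
   about 1/num_workers of a full gradient. *)

let dim = 8
let num_workers = 4
let lr = 0.1
let slice = dim / num_workers            (* parameters per exchanged partition *)

let true_w = Array.init dim (fun i -> float_of_int (i + 1))

(* each worker's data shard: (x, y) pairs with y = true_w . x *)
let shard seed n =
  Random.init seed;
  Array.init n (fun _ ->
      let x = Array.init dim (fun _ -> Random.float 1.) in
      let y =
        Array.fold_left ( +. ) 0. (Array.mapi (fun i xi -> true_w.(i) *. xi) x)
      in
      (x, y))

let shards = Array.init num_workers (fun k -> shard (k + 1) 64)

(* mean gradient of the squared error over a shard, for replica weights w *)
let gradient w data =
  let g = Array.make dim 0. in
  Array.iter
    (fun (x, y) ->
      let pred = ref 0. in
      Array.iteri (fun i xi -> pred := !pred +. (w.(i) *. xi)) x;
      let err = !pred -. y in
      Array.iteri (fun i xi -> g.(i) <- g.(i) +. (err *. xi)) x)
    data;
  Array.map (fun gi -> gi /. float_of_int (Array.length data)) g

let () =
  let replicas = Array.init num_workers (fun _ -> Array.make dim 0.) in
  for round = 0 to 2999 do
    let part = round mod num_workers in  (* which gradient slice is exchanged *)
    let grads = Array.init num_workers (fun k -> gradient replicas.(k) shards.(k)) in
    Array.iteri
      (fun k g ->
        (* apply the full local gradient to the local replica *)
        Array.iteri (fun i gi -> replicas.(k).(i) <- replicas.(k).(i) -. (lr *. gi)) g;
        (* send only slice [part] of the gradient to the peer replicas *)
        Array.iteri
          (fun j r ->
            if j <> k then
              for i = part * slice to ((part + 1) * slice) - 1 do
                r.(i) <- r.(i) -. (lr *. g.(i))
              done)
          replicas)
      grads
  done;
  let l1_err =
    Array.fold_left ( +. ) 0.
      (Array.mapi (fun i wi -> abs_float (wi -. true_w.(i))) replicas.(0))
  in
  Printf.printf "replica 0 L1 error vs true weights: %f\n" l1_err
```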