80 research outputs found

    DeepOBS: A Deep Learning Optimizer Benchmark Suite

    Because the choice and tuning of the optimizer affects the speed, and ultimately the performance, of deep learning, there is significant past and recent research in this area. Yet, perhaps surprisingly, there is no generally agreed-upon protocol for the quantitative and reproducible evaluation of optimization strategies for deep learning. We suggest routines and benchmarks for stochastic optimization, with a special focus on the unique aspects of deep learning, such as stochasticity, tunability and generalization. As the primary contribution, we present DeepOBS, a Python package of deep learning optimization benchmarks. The package addresses key challenges in the quantitative assessment of stochastic optimizers, and automates most steps of benchmarking. The library includes a wide and extensible set of ready-to-use realistic optimization problems, such as training Residual Networks for image classification on ImageNet or character-level language prediction models, as well as popular classics like MNIST and CIFAR-10. The package also provides realistic baseline results for the most popular optimizers on these test problems, ensuring a fair comparison to the competition when benchmarking new optimizers, without having to run costly experiments. It comes with output back-ends that directly produce LaTeX code for inclusion in academic publications. It supports TensorFlow and is available open source. Comment: Accepted at ICLR 2019. 9 pages, 3 figures, 2 tables.
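
    As a rough illustration of the workflow the package automates, the sketch below shows how a new optimizer might be handed to a DeepOBS-style runner. It is a minimal sketch, not verified against the package: the import path, runner class, hyperparameter format, and test-problem identifier are assumptions recalled loosely from the project's documentation.

    # Hedged sketch of benchmarking an optimizer with a DeepOBS-style runner.
    # Every name marked "assumed" is illustrative, not a guaranteed API surface.
    import tensorflow as tf                      # TF1-era API, matching the paper's setting
    from deepobs import tensorflow as tfobs      # assumed import path

    # The optimizer under test is given as a class plus a description of its
    # tunable hyperparameters; the runner handles data, training, and logging.
    optimizer_class = tf.train.MomentumOptimizer
    hyperparams = [{"name": "momentum", "type": float}]          # assumed format

    runner = tfobs.runners.StandardRunner(optimizer_class, hyperparams)  # assumed class
    runner.run(
        testproblem="cifar10_3c3d",   # assumed identifier for a ready-made test problem
        learning_rate=1e-2,
        num_epochs=50,
        momentum=0.9,
    )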

    MLOS: An Infrastructure for Automated Software Performance Engineering

    Developing modern systems software is a complex task that combines business logic programming and Software Performance Engineering (SPE). The latter is an experimental and labor-intensive activity focused on optimizing the system for a given hardware, software, and workload (hw/sw/wl) context. Today's SPE is performed during build/release phases by specialized teams, and is cursed by: 1) a lack of standardized and automated tools, 2) significant repeated work as the hw/sw/wl context changes, 3) fragility induced by "one-size-fits-all" tuning (where improvements on one workload or component may impact others). The net result: despite costly investments, system software is often outside its optimal operating point - anecdotally leaving 30% to 40% of performance on the table. Recent developments in Data Science (DS) hint at an opportunity: combining DS tooling and methodologies with a new developer experience to transform the practice of SPE. In this paper we present MLOS, an ML-powered infrastructure and methodology to democratize and automate Software Performance Engineering. MLOS enables continuous, instance-level, robust, and trackable systems optimization. MLOS is being developed and employed within Microsoft to optimize SQL Server performance. Early results indicate that component-level optimizations can lead to 20%-90% improvements when custom-tuning for a specific hw/sw/wl, hinting at a significant opportunity. However, several research challenges remain that will require community involvement. To this end, we are in the process of open-sourcing the MLOS core infrastructure, and we are engaging with academic institutions to create an educational program around Software 2.0 and MLOS ideas. Comment: 4 pages, DEEM 2020.
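
    The sketch below is not MLOS code; it is a minimal illustration, under stated assumptions, of the kind of data-driven, component-level tuning loop the paper argues for: a search over configuration knobs, scored by a benchmark run in the target hw/sw/wl context. The knob names and the measure_throughput stub are hypothetical.

    # Hypothetical tuning loop: plain random search over two made-up knobs.
    # In a real deployment, measure_throughput would apply the configuration to
    # the component under test and run the workload against it.
    import random

    SEARCH_SPACE = {
        "buffer_pool_mb": (256, 8192),           # hypothetical knob ranges
        "max_degree_of_parallelism": (1, 16),
    }

    def measure_throughput(config):
        # Placeholder objective standing in for a real benchmark run.
        return (config["max_degree_of_parallelism"]
                - ((config["buffer_pool_mb"] - 4096) ** 2) / 1e6)

    def random_search(n_trials=50, seed=0):
        rng = random.Random(seed)
        best_cfg, best_score = None, float("-inf")
        for _ in range(n_trials):
            cfg = {k: rng.randint(lo, hi) for k, (lo, hi) in SEARCH_SPACE.items()}
            score = measure_throughput(cfg)
            if score > best_score:
                best_cfg, best_score = cfg, score
        return best_cfg, best_score

    print(random_search())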

    Hyperparameter Tuning for Machine and Deep Learning with R

    This open access book provides a wealth of hands-on examples that illustrate how hyperparameter tuning can be applied in practice and gives deep insights into the working mechanisms of machine learning (ML) and deep learning (DL) methods. The aim of the book is to equip readers with the ability to achieve better results with significantly less time, cost, effort and resources using the methods described here. The case studies presented in this book can be run on a regular desktop or notebook computer. No high-performance computing facilities are required. The idea for the book originated in a study conducted by Bartz & Bartz GmbH for the Federal Statistical Office of Germany (Destatis). Building on that study, the book is addressed to practitioners in industry as well as researchers, teachers and students in academia. The content focuses on the hyperparameter tuning of ML and DL algorithms, and is divided into two main parts: theory (Part I) and application (Part II). Essential topics covered include: a survey of important model parameters; four parameter tuning studies and one extensive global parameter tuning study; statistical analysis of the performance of ML and DL methods based on severity; and a new, consensus-ranking-based way to aggregate and analyze results from multiple algorithms. The book presents analyses of more than 30 hyperparameters from six relevant ML and DL methods, and provides source code so that users can reproduce the results. Accordingly, it serves as a handbook and textbook alike.
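
    The book's own code is in R; as a language-neutral illustration of the consensus-ranking idea it mentions, the Python sketch below aggregates per-problem scores of several tuned algorithms by average rank. The numbers are made-up placeholders and the scheme is one simple variant, not the book's exact procedure.

    # Consensus ranking by mean rank across test problems (higher score = better).
    results = {
        "random_forest": [0.91, 0.84, 0.77],   # hypothetical scores per problem
        "xgboost":       [0.93, 0.82, 0.80],
        "svm":           [0.89, 0.86, 0.75],
    }

    def consensus_ranking(results):
        algos = list(results)
        n_problems = len(next(iter(results.values())))
        mean_ranks = {a: 0.0 for a in algos}
        for p in range(n_problems):
            # rank 1 = best score on this problem
            ordered = sorted(algos, key=lambda a: results[a][p], reverse=True)
            for rank, a in enumerate(ordered, start=1):
                mean_ranks[a] += rank / n_problems
        return sorted(mean_ranks.items(), key=lambda kv: kv[1])

    print(consensus_ranking(results))  # algorithms ordered by mean rank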

    A Structural Approach to the Design of Domain Specific Neural Network Architectures

    This is a master's thesis concerning the theoretical ideas of geometric deep learning. Geometric deep learning aims to provide a structured characterization of neural network architectures, specifically focused on the ideas of invariance and equivariance of data with respect to given transformations. This thesis aims to provide a theoretical evaluation of geometric deep learning, compiling theoretical results that characterize the properties of invariant neural networks with respect to learning performance. Comment: 94 pages and 16 figures. Upload of my master's thesis; not peer reviewed and may contain errors.
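
    To make the central notion concrete (this is not code from the thesis), the sketch below builds a DeepSets-style permutation-invariant network: because the only interaction between set elements is a symmetric sum, reordering the input cannot change the output. Layer sizes and weights are arbitrary.

    # Permutation invariance via symmetric (sum) pooling.
    import numpy as np

    rng = np.random.default_rng(0)
    W_phi = rng.normal(size=(3, 8))   # per-element feature map (arbitrary sizes)
    W_rho = rng.normal(size=(8, 1))   # readout applied after pooling

    def invariant_net(x):
        # x: (n_elements, 3) set of feature vectors
        h = np.tanh(x @ W_phi)        # phi applied to each element independently
        pooled = h.sum(axis=0)        # symmetric aggregation => invariance
        return np.tanh(pooled @ W_rho)

    x = rng.normal(size=(5, 3))
    perm = rng.permutation(5)
    assert np.allclose(invariant_net(x), invariant_net(x[perm]))  # order does not matter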

    QFAST: Conflating Search and Numerical Optimization for Scalable Quantum Circuit Synthesis

    We present a quantum synthesis algorithm designed to produce short circuits and to scale well in practice. The main contribution is a novel representation of circuits able to encode placement and topology using generic "gates", which allows the QFAST algorithm to replace expensive searches over circuit structures with a few steps of numerical optimization. When compared against optimal-depth, search-based state-of-the-art techniques, QFAST produces comparable results: circuits 1.19x longer for up to four qubits, with a 3.6x increase in compilation speed. In addition, QFAST scales up to seven qubits. When compared with the state-of-the-art "rule"-based decomposition techniques in Qiskit, QFAST produces circuits shorter by up to two orders of magnitude (331x), albeit 5.6x slower. We also demonstrate composability with other techniques and the tunability of our formulation in terms of circuit depth and running time.
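
    As a toy illustration of the core idea (not the QFAST implementation): instead of searching over discrete circuit structures, fix a generic parameterized gate and let a numerical optimizer drive its parameters toward a target unitary. The sketch below recovers a Hadamard from a single-qubit U3 gate by minimizing a Hilbert-Schmidt-style distance; QFAST itself works with multi-qubit generic gates that also encode placement and topology.

    # Numerical synthesis of a single-qubit unitary (toy version of the idea).
    import numpy as np
    from scipy.optimize import minimize

    def u3(theta, phi, lam):
        # Generic single-qubit gate parameterized by three angles.
        return np.array([
            [np.cos(theta / 2), -np.exp(1j * lam) * np.sin(theta / 2)],
            [np.exp(1j * phi) * np.sin(theta / 2),
             np.exp(1j * (phi + lam)) * np.cos(theta / 2)],
        ])

    target = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # Hadamard as example target

    def infidelity(params):
        # 1 - |Tr(V^dagger U)| / d : zero iff V matches the target up to global phase
        v = u3(*params)
        return 1.0 - abs(np.trace(v.conj().T @ target)) / 2.0

    best = None
    for seed in range(5):                                # a few random restarts
        x0 = np.random.default_rng(seed).uniform(0, 2 * np.pi, size=3)
        res = minimize(infidelity, x0, method="Nelder-Mead")
        if best is None or res.fun < best.fun:
            best = res
    print(best.x, best.fun)   # near-zero infidelity: angles reproducing H up to phase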