80 research outputs found
DeepOBS: A Deep Learning Optimizer Benchmark Suite
Because the choice and tuning of the optimizer affect the speed, and
ultimately the performance, of deep learning, there is significant past and
recent research in this area. Yet, perhaps surprisingly, there is no generally
agreed-upon protocol for the quantitative and reproducible evaluation of
optimization strategies for deep learning. We suggest routines and benchmarks
for stochastic optimization, with special focus on the unique aspects of deep
learning, such as stochasticity, tunability and generalization. As the primary
contribution, we present DeepOBS, a Python package of deep learning
optimization benchmarks. The package addresses key challenges in the
quantitative assessment of stochastic optimizers, and automates most steps of
benchmarking. The library includes a wide and extensible set of ready-to-use
realistic optimization problems, such as training Residual Networks for image
classification on ImageNet or character-level language prediction models, as
well as popular classics like MNIST and CIFAR-10. The package also provides
realistic baseline results for the most popular optimizers on these test
problems, ensuring a fair comparison to the competition when benchmarking new
optimizers, and without having to run costly experiments. It comes with output
back-ends that directly produce LaTeX code for inclusion in academic
publications. It supports TensorFlow and is available open source.
Comment: Accepted at ICLR 2019. 9 pages, 3 figures, 2 tables
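To make the intended workflow concrete, the sketch below shows what a
benchmark run could look like: an optimizer class and a declared
hyperparameter schema are handed to a runner, which trains on one of the
ready-made test problems and logs results. Module paths, argument names, and
the test-problem identifier follow the documented examples but may differ
between DeepOBS versions, so treat them as assumptions.

    # Sketch: benchmarking Momentum with DeepOBS (TensorFlow 1.x).
    # NOTE: API details are assumed from the docs and may vary by version.
    import tensorflow as tf
    from deepobs import tensorflow as tfobs

    # The optimizer under test and the hyperparameters DeepOBS may tune.
    optimizer_class = tf.train.MomentumOptimizer
    hyperparams = [{"name": "learning_rate", "type": float},
                   {"name": "momentum", "type": float, "default": 0.9}]

    # The runner wires the optimizer to a test problem, trains it, and
    # writes results in the package's standard output format.
    runner = tfobs.runners.StandardRunner(optimizer_class, hyperparams)
    runner.run(testproblem="quadratic_deep",
               hyperparams={"learning_rate": 1e-2},
               num_epochs=10)
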
MLOS: An Infrastructure for Automated Software Performance Engineering
Developing modern systems software is a complex task that combines business
logic programming and Software Performance Engineering (SPE). The latter is an
experimental and labor-intensive activity focused on optimizing the system for
a given hardware, software, and workload (hw/sw/wl) context.
Today's SPE is performed during build/release phases by specialized teams,
and cursed by: 1) lack of standardized and automated tools, 2) significant
repeated work as hw/sw/wl context changes, 3) fragility induced by a
"one-size-fits-all" tuning (where improvements on one workload or component may
impact others). The net result: despite costly investments, system software is
often outside its optimal operating point - anecdotally leaving 30% to 40% of
performance on the table.
Recent developments in Data Science (DS) hint at an opportunity:
combining DS tooling and methodologies with a new developer experience to
transform the practice of SPE. In this paper we present MLOS, an ML-powered
infrastructure and methodology to democratize and automate Software Performance
Engineering. MLOS enables continuous, instance-level, robust, and trackable
systems optimization. MLOS is being developed and employed within Microsoft to
optimize SQL Server performance. Early results indicated that component-level
optimizations can lead to 20%-90% improvements when custom-tuning for a
specific hw/sw/wl, hinting at a significant opportunity. However, several
research challenges remain that will require community involvement. To this
end, we are in the process of open-sourcing the MLOS core infrastructure, and
we are engaging with academic institutions to create an educational program
around Software 2.0 and MLOS ideas.
Comment: 4 pages, DEEM 2020
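The abstract fixes no API, so the following is a deliberately generic sketch
of the kind of instance-level tuning loop MLOS automates: propose a
configuration for a component's knobs, measure the workload under it, and
keep the best. Every name here (the knobs, their ranges, measure_throughput)
is hypothetical and stands in for real deployment and measurement machinery;
this is not the MLOS API.

    # Hypothetical sketch of a component-level tuning loop (not MLOS code).
    import random

    # Illustrative knobs for one hw/sw/wl context.
    SPACE = {"buffer_pool_mb": (128, 8192),
             "worker_threads": (1, 64)}

    def sample(space):
        return {k: random.randint(lo, hi) for k, (lo, hi) in space.items()}

    def measure_throughput(config):
        # Placeholder objective; in practice, deploy the config and run
        # the target workload, returning the measured performance.
        return (-(config["buffer_pool_mb"] - 4096) ** 2
                - (config["worker_threads"] - 32) ** 2)

    best_cfg, best_score = None, float("-inf")
    for _ in range(100):  # fixed experiment budget
        cfg = sample(SPACE)
        score = measure_throughput(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score

    print(best_cfg, best_score)
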
Hyperparameter Tuning for Machine and Deep Learning with R
This open access book provides a wealth of hands-on examples that illustrate how hyperparameter tuning can be applied in practice and gives deep insights into the working mechanisms of machine learning (ML) and deep learning (DL) methods. The aim of the book is to equip readers with the ability to achieve better results with significantly less time, costs, effort and resources using the methods described here. The case studies presented in this book can be run on a regular desktop or notebook computer. No high-performance computing facilities are required. The idea for the book originated in a study conducted by Bartz & Bartz GmbH for the Federal Statistical Office of Germany (Destatis). Building on that study, the book is addressed to practitioners in industry as well as researchers, teachers and students in academia. The content focuses on the hyperparameter tuning of ML and DL algorithms, and is divided into two main parts: theory (Part I) and application (Part II). Essential topics covered include: a survey of important model parameters; four parameter tuning studies and one extensive global parameter tuning study; statistical analysis of the performance of ML and DL methods based on severity; and a new, consensus-ranking-based way to aggregate and analyze results from multiple algorithms. The book presents analyses of more than 30 hyperparameters from six relevant ML and DL methods, and provides source code so that users can reproduce the results. Accordingly, it serves as a handbook and textbook alike.
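As a rough illustration of the consensus-ranking idea mentioned above, the
sketch below ranks each algorithm within every benchmark problem and then
averages the ranks across problems. The scores and algorithm names are
invented, and the book's actual procedure (and its R implementation) may
differ; this only shows the aggregation principle.

    # Sketch: consensus ranking by mean rank (Borda-style aggregation).
    import numpy as np

    algorithms = ["SVM", "RandomForest", "XGBoost"]
    # Rows: benchmark problems; columns: algorithms (higher is better).
    scores = np.array([[0.81, 0.85, 0.88],
                       [0.92, 0.90, 0.93],
                       [0.70, 0.78, 0.75]])

    # Within each problem, rank 1 = best score (assumes no ties).
    ranks = scores.shape[1] - scores.argsort(axis=1).argsort(axis=1)
    consensus = ranks.mean(axis=0)  # lower mean rank = better overall

    for name, r in sorted(zip(algorithms, consensus), key=lambda t: t[1]):
        print(f"{name}: mean rank {r:.2f}")
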
A Structural Approach to the Design of Domain Specific Neural Network Architectures
This is a master's thesis concerning the theoretical ideas of geometric deep
learning. Geometric deep learning aims to provide a structured characterization
of neural network architectures, specifically focused on the ideas of
invariance and equivariance of data with respect to given transformations.
This thesis aims to provide a theoretical evaluation of geometric deep
learning, compiling theoretical results that characterize the properties of
invariant neural networks with respect to learning performance.
Comment: 94 pages and 16 figures. Upload of my Master's thesis. Not peer
reviewed and potentially contains errors.
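For readers unfamiliar with the invariance property the thesis studies, here
is a toy example, not taken from the thesis: a DeepSets-style network in
which sum-pooling over the set dimension makes the output invariant to any
permutation of the inputs.

    # Toy illustration of permutation invariance (DeepSets-style pooling).
    import numpy as np

    rng = np.random.default_rng(0)
    W_phi = rng.normal(size=(4, 8))  # per-element encoder weights
    W_rho = rng.normal(size=(8, 1))  # readout weights

    def f(X):
        # phi acts on each element independently; summing over the set
        # dimension discards ordering, so f(X) == f(P @ X) for any
        # permutation matrix P.
        H = np.tanh(X @ W_phi)
        return (H.sum(axis=0) @ W_rho).item()

    X = rng.normal(size=(5, 4))          # a "set" of five elements
    perm = rng.permutation(5)
    print(np.isclose(f(X), f(X[perm])))  # True: order does not matter
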
QFAST: Conflating Search and Numerical Optimization for Scalable Quantum Circuit Synthesis
We present a quantum synthesis algorithm designed to produce short circuits
and to scale well in practice. The main contribution is a novel representation
of circuits able to encode placement and topology using generic "gates", which
allows the QFAST algorithm to replace expensive searches over circuit
structures with a few steps of numerical optimization. When compared against
optimal-depth, search-based state-of-the-art techniques, QFAST produces
comparable results: circuits at most 1.19x longer up to four qubits, while
compiling 3.6x faster. In addition, QFAST scales up to seven qubits.
When compared with the state-of-the-art "rule" based decomposition techniques
in Qiskit, QFAST produces circuits shorter by up to two orders of magnitude
(331x), albeit 5.6x slower. We also demonstrate composability with other
techniques and the tunability of our formulation in terms of circuit depth and
running time.
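The core idea of replacing structural search with numerical optimization can
be shown on a deliberately tiny case: fitting the parameters of a generic
single-qubit gate to a target unitary by minimizing an infidelity based on
the Hilbert-Schmidt inner product. This is a toy example of the general
technique, not the QFAST algorithm or its generic-gate encoding.

    # Toy synthesis-by-optimization: fit a U3-style gate to a Hadamard.
    import numpy as np
    from scipy.optimize import minimize

    def u3(theta, phi, lam):
        # Generic single-qubit gate in the usual U3 parameterization.
        return np.array([
            [np.cos(theta / 2), -np.exp(1j * lam) * np.sin(theta / 2)],
            [np.exp(1j * phi) * np.sin(theta / 2),
             np.exp(1j * (phi + lam)) * np.cos(theta / 2)]])

    target = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # Hadamard

    def infidelity(params):
        # 0 when the candidate matches the target up to a global phase.
        v = u3(*params)
        return 1 - abs(np.trace(target.conj().T @ v)) / target.shape[0]

    res = minimize(infidelity, x0=[1.0, 0.5, 2.5], method="BFGS")
    print(res.fun, res.x)  # infidelity near 0, e.g. at (pi/2, 0, pi)
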