Neural Nets with a Newton Conjugate Gradient Method on Multiple GPUs
Training deep neural networks consumes an increasing share of computational
resources in many compute centers. Often, a brute-force approach is employed to
obtain hyperparameter values. Our goal is (1) to improve on this by enabling
second-order optimization methods with fewer hyperparameters for large-scale
neural networks and (2) to survey the performance of optimizers on
specific tasks in order to suggest the best one to users for their problem. We introduce a
novel second-order optimization method that requires the effect of the Hessian
on a vector only and avoids the huge cost of explicitly setting up the Hessian
for large-scale networks.
We compare the proposed second-order method with two state-of-the-art
optimizers on five representative neural network problems, including regression
and very deep networks from computer vision or variational autoencoders. For
the largest setup, we efficiently parallelized the optimizers with Horovod and
applied them to an 8-GPU NVIDIA P100 (DGX-1) machine.
Comment: Accepted to PPAM conferenc
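The idea of using only the effect of the Hessian on a vector can be illustrated with a minimal sketch. The code below is not the authors' implementation: it assumes a central finite difference of the gradient as the Hessian-vector product and a plain conjugate gradient loop for the Newton system, and the names `hessian_vector_product`, `newton_cg_step`, and `quad_grad` are hypothetical.

```python
def dot(a, b):
    """Inner product of two equally long lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def hessian_vector_product(grad_fn, x, v, eps=1e-5):
    """Approximate H(x) v by a central difference of the gradient.

    The Hessian matrix itself is never formed (assumption: finite
    differences stand in for an exact automatic-differentiation product).
    """
    x_plus = [xi + eps * vi for xi, vi in zip(x, v)]
    x_minus = [xi - eps * vi for xi, vi in zip(x, v)]
    g_plus, g_minus = grad_fn(x_plus), grad_fn(x_minus)
    return [(a - b) / (2.0 * eps) for a, b in zip(g_plus, g_minus)]


def newton_cg_step(grad_fn, x, max_iter=50, tol=1e-10):
    """One Newton step: solve H d = -g by conjugate gradients, return x + d."""
    g = grad_fn(x)
    d = [0.0] * len(x)
    r = [-gi for gi in g]  # residual of H d = -g at d = 0
    p = list(r)
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs < tol:
            break
        hp = hessian_vector_product(grad_fn, x, p)
        curvature = dot(p, hp)
        if curvature <= 0.0:
            break  # non-positive curvature: keep the step found so far
        alpha = rs / curvature
        d = [di + alpha * pi for di, pi in zip(d, p)]
        r = [ri - alpha * hi for ri, hi in zip(r, hp)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return [xi + di for xi, di in zip(x, d)]


def quad_grad(x):
    """Gradient of the toy objective 0.5 x^T A x - b^T x, A = [[4, 1], [1, 3]], b = [1, 2]."""
    return [4.0 * x[0] + 1.0 * x[1] - 1.0, 1.0 * x[0] + 3.0 * x[1] - 2.0]


x_new = newton_cg_step(quad_grad, [2.0, 1.0])
print(x_new)
```

On a quadratic objective a single such step should land close to the minimizer; for a neural network the gradient function would come from backpropagation and the step would typically be damped or combined with a line search.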
Free-Surface Lattice-Boltzmann Simulation on Many-Core Architectures
Current advances in many-core technologies demand simulation algorithms suited to the corresponding architectures; at the same time, the accompanying increase in computational power makes real-time and interactive simulations both possible and desirable. We present an OpenCL implementation of a Lattice-Boltzmann-based free-surface solver for GPU architectures. Massively parallel execution requires special techniques to keep the interface region consistent, which we address with a novel multipass method. We further compare the performance of different memory layouts for both a basic driven-cavity implementation and the free-surface method, point out the capabilities of our implementation in real-time and interactive scenarios, and briefly present visualizations of the flow obtained in real time
Octrees for Cooperative Work in a Network-Based Environment
Assuring global consistency in a cooperative working environment is the main focus of many current research projects in civil engineering and other fields. This paper discusses a new approach based on octrees. It shows that octrees not only optimise the management and control of processes in a network-based working environment but also provide an efficient integration platform for processes from various disciplines, such as architecture and civil engineering. A client-server architecture built on octree-based collision detection and consistency assurance is described, together with sophisticated information services that further support cooperative work
Multi-fidelity Constrained Optimization for Stochastic Black Box Simulators
Constrained optimization of the parameters of a simulator plays a crucial
role in a design process. These problems become challenging when the simulator
is stochastic, computationally expensive, and the parameter space is
high-dimensional. One can efficiently perform optimization only by utilizing
the gradient with respect to the parameters, but these gradients are
unavailable in many legacy, black-box codes. We introduce the algorithm
Scout-Nd (Stochastic Constrained Optimization for N dimensions) to tackle the
issues mentioned earlier by efficiently estimating the gradient, reducing the
noise of the gradient estimator, and applying multi-fidelity schemes to further
reduce computational effort. We validate our approach on standard benchmarks,
demonstrating its effectiveness in optimizing parameters and highlighting better
performance compared to existing methods
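Estimating a gradient from a stochastic black-box code, with some noise reduction, can be sketched as follows. This is not the Scout-Nd algorithm itself: the sketch assumes a score-function estimator under a Gaussian search distribution with a mean-value baseline, and it omits constraints and multi-fidelity entirely. The simulator `noisy_simulator` and the function `estimate_gradient` are hypothetical.

```python
import random


def noisy_simulator(x, rng):
    """Hypothetical stand-in for a stochastic black-box code: a quadratic plus noise."""
    return sum(xi * xi for xi in x) + rng.gauss(0.0, 0.1)


def estimate_gradient(simulator, mu, sigma, n_samples, rng):
    """Estimate the gradient of E[f(x)], x ~ N(mu, sigma^2 I), with respect to mu.

    Uses the score-function identity grad = E[f(x) (x - mu) / sigma^2].
    Subtracting the sample mean of f as a baseline reduces the variance
    of the estimate (at the cost of a small bias of order 1/n_samples).
    """
    samples = [[m + sigma * rng.gauss(0.0, 1.0) for m in mu]
               for _ in range(n_samples)]
    values = [simulator(x, rng) for x in samples]
    baseline = sum(values) / n_samples
    grad = [0.0] * len(mu)
    for x, f in zip(samples, values):
        for j in range(len(mu)):
            grad[j] += (f - baseline) * (x[j] - mu[j]) / sigma ** 2
    return [g / n_samples for g in grad]


rng = random.Random(0)
grad = estimate_gradient(noisy_simulator, [1.0, -1.0], 0.5, 20000, rng)
print(grad)
```

For this toy quadratic the exact gradient of the smoothed objective at `mu` is `2 * mu`, so the printed estimate can be checked by eye; only simulator evaluations are used, never derivatives of the simulator.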
Efficient Quantification of Model Uncertainties When De-boarding a Train
It is difficult to provide live simulation systems for decision support. Time is limited and uncertainty quantification requires many simulation runs. We combine a surrogate model with the stochastic collocation method to overcome time and storage restrictions and show a proof of concept for a de-boarding scenario of a train
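Stochastic collocation on top of a cheap surrogate can be sketched in a few lines. The example assumes a single normally distributed uncertain parameter and a three-point Gauss-Hermite rule; the surrogate `surrogate_deboarding_time` and its coefficients are hypothetical and are not taken from the paper.

```python
import math

# Three-point Gauss-Hermite rule for a standard normal variable
# (exact for polynomials up to degree 5).
NODES = [-math.sqrt(3.0), 0.0, math.sqrt(3.0)]
WEIGHTS = [1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0]


def collocation_moments(model, mu, sigma):
    """Mean and variance of model(p) for p ~ N(mu, sigma^2).

    The model is evaluated only at the collocation points mu + sigma * node,
    so three runs replace a large Monte Carlo sample.
    """
    outputs = [model(mu + sigma * node) for node in NODES]
    mean = sum(w * y for w, y in zip(WEIGHTS, outputs))
    second_moment = sum(w * y * y for w, y in zip(WEIGHTS, outputs))
    return mean, second_moment - mean * mean


def surrogate_deboarding_time(walking_speed):
    """Hypothetical surrogate: de-boarding time in seconds as a function of speed in m/s."""
    return 120.0 - 40.0 * walking_speed + 10.0 * walking_speed ** 2


mean_time, var_time = collocation_moments(surrogate_deboarding_time, 1.3, 0.2)
print(mean_time, var_time)
```

With several uncertain parameters the same idea would use a tensor-product or sparse grid of nodes, and the surrogate keeps each evaluation cheap enough for live use.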