22,285 research outputs found
GeNN: a code generation framework for accelerated brain simulations
Large-scale numerical simulations of detailed brain circuit models are important for identifying hypotheses on brain functions and testing their consistency and plausibility. An ongoing challenge for simulating realistic models is, however, computational speed. In this paper, we present the GeNN (GPU-enhanced Neuronal Networks) framework, which aims to facilitate the use of graphics accelerators for computational models of large-scale neuronal networks to address this challenge. GeNN is an open source library that generates code to accelerate the execution of network simulations on NVIDIA GPUs, through a flexible and extensible interface, which does not require in-depth technical knowledge from the users. We present performance benchmarks showing that 200-fold speedup compared to a single core of a CPU can be achieved for a network of one million conductance based Hodgkin-Huxley neurons but that for other models the speedup can differ.
GeNN is available for Linux, Mac OS X and Windows platforms. The source code, user manual, tutorials,
Wiki, in-depth example projects and all other related information can be found on the project website http://genn-team.github.io/genn/
Finding Competitive Network Architectures Within a Day Using UCT
The design of neural network architectures for a new data set is a laborious
task which requires human deep learning expertise. In order to make deep
learning available for a broader audience, automated methods for finding a
neural network architecture are vital. Recently proposed methods can already
achieve human expert level performances. However, these methods have run times
of months or even years of GPU computing time, ignoring hardware constraints as
faced by many researchers and companies. We propose the use of Monte Carlo
planning in combination with two different UCT (upper confidence bound applied
to trees) derivations to search for network architectures. We adapt the UCT
algorithm to the needs of network architecture search by proposing two ways of
sharing information between different branches of the search tree. In an
empirical study we are able to demonstrate that this method is able to find
competitive networks for MNIST, SVHN and CIFAR-10 in just a single GPU day.
Extending the search time to five GPU days, we are able to outperform human
architectures and our competitors which consider the same types of layers
Distributed-memory large deformation diffeomorphic 3D image registration
We present a parallel distributed-memory algorithm for large deformation
diffeomorphic registration of volumetric images that produces large isochoric
deformations (locally volume preserving). Image registration is a key
technology in medical image analysis. Our algorithm uses a partial differential
equation constrained optimal control formulation. Finding the optimal
deformation map requires the solution of a highly nonlinear problem that
involves pseudo-differential operators, biharmonic operators, and pure
advection operators both forward and back- ward in time. A key issue is the
time to solution, which poses the demand for efficient optimization methods as
well as an effective utilization of high performance computing resources. To
address this problem we use a preconditioned, inexact, Gauss-Newton- Krylov
solver. Our algorithm integrates several components: a spectral discretization
in space, a semi-Lagrangian formulation in time, analytic adjoints, different
regularization functionals (including volume-preserving ones), a spectral
preconditioner, a highly optimized distributed Fast Fourier Transform, and a
cubic interpolation scheme for the semi-Lagrangian time-stepping. We
demonstrate the scalability of our algorithm on images with resolution of up to
on the "Maverick" and "Stampede" systems at the Texas Advanced
Computing Center (TACC). The critical problem in the medical imaging
application domain is strong scaling, that is, solving registration problems of
a moderate size of ---a typical resolution for medical images. We are
able to solve the registration problem for images of this size in less than
five seconds on 64 x86 nodes of TACC's "Maverick" system.Comment: accepted for publication at SC16 in Salt Lake City, Utah, USA;
November 201
- …