Optimization of micropillar sequences for fluid flow sculpting
Inertial fluid flow deformation around pillars in a microchannel is a new
method for controlling fluid flow. Sequences of pillars have been shown to
produce a rich phase space with a wide variety of flow transformations.
Previous work has successfully demonstrated manual design of pillar sequences
to achieve desired transformations of the flow cross-section, with experimental
validation. However, such a method is not ideal for seeking out complex
sculpted shapes as the search space quickly becomes too large for efficient
manual discovery. We explore fast, automated optimization methods to solve this
problem. We formulate the inertial flow physics in microchannels with different
micropillar configurations as a set of state transition matrix operations.
These state transition matrices are constructed from experimentally validated
streamtraces. This facilitates modeling the effect of a sequence of
micropillars as nested matrix-matrix products, which have very efficient
numerical implementations. With this new forward model, arbitrary micropillar
sequences can be rapidly simulated with various inlet configurations, allowing
optimization routines quick access to a large search space. We integrate this
framework with the genetic algorithm and showcase its applicability by
designing micropillar sequences for various useful transformations. We
computationally discover micropillar sequences for complex transformations that
are substantially shorter than manually designed sequences. We also determine
sequences for novel transformations that were difficult to manually design.
Finally, we experimentally validate these computational designs by fabricating
devices and comparing predictions with the results from confocal microscopy.
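To make the forward model concrete, here is a minimal Python sketch of the idea: a library of (here, randomly generated) column-stochastic transition matrices stands in for the experimentally derived ones, a pillar sequence is simulated as a chain of matrix-vector products, and a toy genetic algorithm searches for a sequence that sculpts an inlet profile toward a target. All names and parameters are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the forward model and GA search described above.
# N_BINS, N_PILLARS, the random transition matrices, and the fitness
# are illustrative assumptions, not the paper's actual implementation.
import numpy as np

rng = np.random.default_rng(0)
N_BINS = 100          # discretized bins across the channel cross-section
N_PILLARS = 32        # size of the pillar "library" (positions/diameters)

# One column-stochastic state transition matrix per pillar configuration;
# in the paper these are built from experimentally validated streamtraces.
LIBRARY = [rng.dirichlet(np.ones(N_BINS), size=N_BINS).T
           for _ in range(N_PILLARS)]

def simulate(sequence, inlet):
    """Apply a pillar sequence to an inlet profile via matrix products."""
    state = inlet
    for idx in sequence:
        state = LIBRARY[idx] @ state
    return state

def fitness(sequence, inlet, target):
    """Negative L2 distance between sculpted and target cross-sections."""
    return -np.linalg.norm(simulate(sequence, inlet) - target)

# Toy genetic algorithm over sequences of 10 pillars.
inlet = np.zeros(N_BINS); inlet[40:60] = 1.0      # a centered stream
target = np.zeros(N_BINS); target[10:30] = 1.0    # shift the stream left
pop = rng.integers(0, N_PILLARS, size=(50, 10))
for gen in range(200):
    scores = np.array([fitness(s, inlet, target) for s in pop])
    parents = pop[np.argsort(scores)[-25:]]        # truncation selection
    children = parents[rng.integers(0, 25, 25)].copy()
    mut = rng.random(children.shape) < 0.1         # point mutation
    children[mut] = rng.integers(0, N_PILLARS, mut.sum())
    pop = np.vstack([parents, children])
best = pop[np.argmax([fitness(s, inlet, target) for s in pop])]
print("best sequence:", best)
```

Because each pillar is just one matrix multiply, evaluating a candidate sequence is cheap, which is what lets the genetic algorithm explore the large search space quickly.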
Paraiso: An Automated Tuning Framework for Explicit Solvers of Partial Differential Equations
We propose Paraiso, a domain-specific language embedded in the functional
programming language Haskell, for automated tuning of explicit solvers of
partial differential equations (PDEs) on GPUs as well as multicore CPUs. In
Paraiso, one can describe PDE-solving algorithms succinctly using tensor
equation notation. Hydrodynamic properties, interpolation methods, and other
building blocks are described in abstract, modular, reusable, and combinable
forms, which lets us generate versatile solvers from a small amount of Paraiso
source code.
We demonstrate Paraiso by implementing a compressible hydrodynamics solver. A
single source file of less than 500 lines can be used to generate solvers of
arbitrary dimension, for both multicore CPUs and GPUs. We demonstrate both
manual annotation-based tuning and evolutionary-computing-based automated
tuning of the program.
Comment: 52 pages, 14 figures, accepted for publication in Computational
Science and Discovery
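Paraiso itself is a Haskell-embedded DSL, so the Python sketch below does not reproduce its API; it only illustrates the general pattern of evolutionary autotuning of an explicit solver, here by searching over a hypothetical chunk-size parameter of a 1D heat-equation stepper. The solver and the tunable knob are both assumptions made for the demo.

```python
# Conceptual sketch of evolutionary autotuning for an explicit PDE solver.
# This is NOT Paraiso's (Haskell-embedded) API; the tunable parameter (a
# chunk size for the update sweep) and the solver are illustrative
# assumptions chosen to show the tune-by-measurement loop.
import time
import numpy as np

def heat_step(u, alpha=0.1):
    """One explicit finite-difference step of the 1D heat equation (in place)."""
    u[1:-1] += alpha * (u[2:] - 2 * u[1:-1] + u[:-2])

def benchmark(chunk, n=200_000, steps=5):
    """Time the solver when the grid is swept in chunks of a given size.

    Chunking exists here purely to create a tunable knob for the demo;
    chunks share boundary cells, so this is a timing model rather than a
    bit-exact reimplementation of the unchunked sweep.
    """
    u = np.random.rand(n)
    t0 = time.perf_counter()
    for _ in range(steps):
        for lo in range(0, n, chunk):
            heat_step(u[max(lo - 1, 0):lo + chunk + 1])
    return time.perf_counter() - t0

# (1+1)-style evolutionary search over the chunk size: mutate, measure, keep.
rng = np.random.default_rng(0)
best_chunk, best_time = 4096, benchmark(4096)
for _ in range(15):
    cand = max(64, int(best_chunk * rng.uniform(0.5, 2.0)))
    t = benchmark(cand)
    if t < best_time:
        best_chunk, best_time = cand, t
print(f"tuned chunk size: {best_chunk} ({best_time:.3f} s)")
```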
Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis
Deep Neural Networks (DNNs) are becoming an important tool in modern
computing applications. Accelerating their training is a major challenge and
techniques range from distributed algorithms to low-level circuit design. In
this survey, we describe the problem from a theoretical perspective, followed
by approaches for its parallelization. We present trends in DNN architectures
and the resulting implications on parallelization strategies. We then review
and model the different types of concurrency in DNNs: from the single operator,
through parallelism in network inference and training, to distributed deep
learning. We discuss asynchronous stochastic optimization, distributed system
architectures, communication schemes, and neural architecture search. Based on
those approaches, we extrapolate potential directions for parallelism in deep
learning.
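As a concrete instance of one concurrency pattern the survey covers, the numpy sketch below simulates synchronous data-parallel SGD: a global minibatch is sharded across workers, each computes a local gradient, and the gradients are averaged (the role an allreduce plays in real systems). The model, worker count, and data are illustrative assumptions.

```python
# Minimal numpy simulation of synchronous data-parallel SGD.
# Worker count, model, and data are illustrative assumptions; real
# systems perform the averaging with an allreduce across devices.
import numpy as np

rng = np.random.default_rng(0)
W = 4                                  # number of simulated workers
w_true = rng.normal(size=8)            # ground-truth linear model
X = rng.normal(size=(1024, 8))
y = X @ w_true + 0.01 * rng.normal(size=1024)

w = np.zeros(8)
lr = 0.1
for step in range(100):
    idx = rng.choice(1024, size=W * 32, replace=False)   # global minibatch
    shards = np.split(idx, W)                            # one shard per worker
    grads = []
    for shard in shards:                                 # "parallel" workers
        Xb, yb = X[shard], y[shard]
        grads.append(2 * Xb.T @ (Xb @ w - yb) / len(shard))
    w -= lr * np.mean(grads, axis=0)                     # allreduce = average
print("parameter error:", np.linalg.norm(w - w_true))
```

Because the update uses the average of all local gradients, the result is equivalent to one large-batch SGD step, which is the consistency property that distinguishes synchronous data parallelism from the asynchronous schemes the survey also discusses.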
EvoX: A Distributed GPU-accelerated Library towards Scalable Evolutionary Computation
During the past decades, evolutionary computation (EC) has demonstrated
promising potential in solving various complex optimization problems of
relatively small scales. Nowadays, however, ongoing developments in modern
science and engineering are bringing increasingly grave challenges to the
conventional EC paradigm in terms of scalability. As problem scales increase,
on the one hand, the encoding spaces (i.e., dimensions of the decision vectors)
are intrinsically larger; on the other hand, EC algorithms often require
growing numbers of function evaluations (and probably larger population sizes
as well) to work properly. Meeting such emerging challenges requires not only
delicate algorithm designs but, more importantly, a high-performance
computing framework. Hence, we develop a distributed
GPU-accelerated algorithm library -- EvoX. First, we propose a generalized
workflow for implementing general EC algorithms. Second, we design a scalable
computing framework for running EC algorithms on distributed GPU devices.
Third, we provide user-friendly interfaces to both researchers and
practitioners for benchmark studies as well as extended real-world
applications. To comprehensively assess the performance of EvoX, we conduct a
series of experiments, including: (i) scalability test via numerical
optimization benchmarks with problem dimensions/population sizes up to
millions; (ii) acceleration test via a neuroevolution task with multiple GPU
nodes; (iii) extensibility demonstration via the application to reinforcement
learning tasks on the OpenAI Gym. The code of EvoX is available at
https://github.com/EMI-Group/EvoX
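A generalized EC workflow of this kind can be pictured as an ask/evaluate/tell loop in which every step is batched over the whole population; the numpy sketch below illustrates that pattern on CPU with a simple (mu, lambda) evolution strategy on the sphere function. It is an illustration of the concept only, not EvoX's actual API, which executes the same kind of batched workflow on distributed GPU devices.

```python
# Vectorized ask/evaluate/tell sketch of a generalized EC workflow.
# Illustrative only: not EvoX's API. The algorithm (a simple (mu, lambda)
# evolution strategy) and the sphere problem are assumptions for the demo.
import numpy as np

rng = np.random.default_rng(0)
DIM, LAM, MU = 1000, 256, 32           # high-dimensional toy problem

def ask(mean, sigma):
    """Sample a whole population in one batched operation."""
    return mean + sigma * rng.normal(size=(LAM, DIM))

def evaluate(pop):
    """Batched fitness: one vectorized call for the entire population."""
    return np.sum(pop ** 2, axis=1)    # sphere function, to be minimized

def tell(pop, fit):
    """Update the search distribution from the best mu individuals."""
    elite = pop[np.argsort(fit)[:MU]]
    return elite.mean(axis=0)

mean, sigma = rng.normal(size=DIM), 1.0
for gen in range(200):
    pop = ask(mean, sigma)
    fit = evaluate(pop)
    mean = tell(pop, fit)
    sigma *= 0.99                      # simple step-size annealing
print("best fitness:", evaluate(mean[None, :])[0])
```

Because ask, evaluate, and tell each operate on the full population as one array, the same loop maps naturally onto accelerators: swapping numpy for a GPU array library parallelizes every step without restructuring the algorithm.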