Efficient Explicit Time Stepping of High Order Discontinuous Galerkin Schemes for Waves
This work presents algorithms for the efficient implementation of
discontinuous Galerkin methods with explicit time stepping for acoustic wave
propagation on unstructured meshes of quadrilaterals or hexahedra. A crucial
step towards efficiency is to evaluate operators in a matrix-free way with
sum-factorization kernels. The method allows for general curved geometries and
variable coefficients. Temporal discretization is carried out by low-storage
explicit Runge-Kutta schemes and the arbitrary derivative (ADER) method. For
ADER, we propose a flexible basis change approach that combines cheap face
integrals with cell evaluation using collocated nodes and quadrature points.
Additionally, a degree reduction for the optimized cell evaluation is presented
to decrease the computational cost when evaluating higher order spatial
derivatives as required in ADER time stepping. We analyze and compare the
performance of state-of-the-art Runge-Kutta schemes and ADER time stepping with
the proposed optimizations. ADER involves fewer operations and additionally
reaches higher throughput through higher arithmetic intensity, and hence decreases
the required computational time significantly. Comparing Runge-Kutta and
ADER at their respective CFL stability limits shows ADER to be especially beneficial
for higher orders, where the Butcher barrier implies a disproportionate number
of stages. Moreover, vector updates in explicit Runge-Kutta schemes are shown
to take a substantial share of the computational time due to their memory
intensity.
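To make the "low-storage" idea concrete, here is a minimal sketch of a two-register (2N-storage) explicit Runge-Kutta step. It assumes Williamson's classical three-stage, third-order coefficients, which are not necessarily the schemes analyzed in the paper; the names `lsrk3_step` and `integrate` are illustrative only.

```python
import math

# Assumption: Williamson's three-stage, third-order 2N-storage coefficients.
A = (0.0, -5.0 / 9.0, -153.0 / 128.0)
B = (1.0 / 3.0, 15.0 / 16.0, 8.0 / 15.0)
C = (0.0, 1.0 / 3.0, 3.0 / 4.0)  # stage times as fractions of dt


def lsrk3_step(rhs, t, u, dt):
    """Advance the list u in place by one step of size dt.

    Only the solution u and one auxiliary register du persist across stages.
    For simplicity rhs returns a fresh list here; a production code would
    accumulate the operator result directly into du.
    """
    du = [0.0] * len(u)
    for a, b, c in zip(A, B, C):
        f = rhs(t + c * dt, u)
        for i in range(len(u)):
            du[i] = a * du[i] + dt * f[i]  # vector update 1
            u[i] += b * du[i]              # vector update 2
    return u


def integrate(rhs, u0, t_end, n_steps):
    """Integrate from t = 0 to t_end with n_steps equal steps."""
    u, dt = list(u0), t_end / n_steps
    for k in range(n_steps):
        lsrk3_step(rhs, k * dt, u, dt)
    return u


# Usage: scalar decay u' = -u, compared with the exact solution exp(-t).
decay = lambda t, u: [-u[0]]
error = abs(integrate(decay, [1.0], 1.0, 20)[0] - math.exp(-1.0))
```

Each stage performs two vector updates per operator evaluation, which is the kind of memory-bound work the abstract identifies as a substantial cost.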
Improving multivariate Horner schemes with Monte Carlo tree search
Optimizing the cost of evaluating a polynomial is a classic problem in
computer science. For polynomials in one variable, Horner's method provides a
scheme for producing a computationally efficient form. For multivariate
polynomials it is possible to generalize Horner's method, but this leaves
freedom in the order of the variables. Traditionally, greedy schemes like
most-occurring variable first are used. This simple textbook algorithm has
given remarkably efficient results. Finding better algorithms has proved
difficult. In trying to improve upon the greedy scheme we have implemented
Monte Carlo tree search, a recent search method from the field of artificial
intelligence. This results in better Horner schemes and reduces the cost of
evaluating polynomials, sometimes by a factor of up to two.
Comment: 5 pages
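The role of the variable order can be illustrated with a small sketch. It assumes a polynomial stored as a dictionary from exponent tuples to coefficients and counts multiplications with every coefficient treated as generic; the function names and the example polynomial are hypothetical, and the sketch covers only the greedy ordering and an exhaustive comparison, not the Monte Carlo tree search itself.

```python
from itertools import permutations


def greedy_order(poly, nvars):
    """Most-occurring-variable-first: sort variables by how many terms contain them."""
    counts = [sum(1 for exps in poly if exps[v] > 0) for v in range(nvars)]
    return tuple(sorted(range(nvars), key=lambda v: -counts[v]))


def horner_eval(poly, order, point):
    """Evaluate poly at point by nested Horner in the given variable order.

    Returns (value, mults), where mults counts multiplications with every
    coefficient treated as generic (a product with 1 is still counted).
    """
    if not order:
        return sum(poly.values()), 0  # a single constant term remains
    v, rest = order[0], order[1:]
    groups = {}  # power of variable v -> sub-polynomial in the remaining variables
    for exps, coeff in poly.items():
        groups.setdefault(exps[v], {})[exps] = coeff
    value, mults, prev = 0, 0, None
    for power in sorted(groups, reverse=True):
        coeff_value, coeff_mults = horner_eval(groups[power], rest, point)
        mults += coeff_mults
        if prev is not None:
            value *= point[v] ** (prev - power)  # step down to the next power
            mults += prev - power
        value += coeff_value
        prev = power
    value *= point[v] ** prev  # remaining factor of the lowest power
    mults += prev
    return value, mults


# Hypothetical example: 2xy + 3xz + 5x + 7yz with variables (x, y, z).
poly = {(1, 1, 0): 2, (1, 0, 1): 3, (1, 0, 0): 5, (0, 1, 1): 7}
point = (2, 3, 5)
naive_mults = sum(sum(exps) for exps in poly)     # expanded form: 7
greedy = horner_eval(poly, greedy_order(poly, 3), point)  # → (157, 5)
costs = {order: horner_eval(poly, order, point)[1] for order in permutations(range(3))}
```

The dictionary `costs` shows how the operation count varies with the ordering; a search method such as the one in the abstract explores this space instead of committing to the greedy choice.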
Computing Real Roots of Real Polynomials ... and now For Real!
Very recent work introduces an asymptotically fast subdivision algorithm,
denoted ANewDsc, for isolating the real roots of a univariate real polynomial.
The method combines Descartes' Rule of Signs to test intervals for the
existence of roots, Newton iteration to speed up convergence towards clusters
of roots, and approximate computation to decrease the required precision. It
achieves record bounds on the worst-case complexity for the considered problem,
matching the complexity of Pan's method for computing all complex roots and
improving upon the complexity of other subdivision methods by several
orders of magnitude.
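As an illustration of the interval test only, the following sketch counts sign variations after mapping an open interval (a, b) onto the positive real axis. It is the textbook Descartes test with exact arithmetic, not ANewDsc or RS; the helper names are hypothetical. A count of 0 certifies that the interval contains no root, and a count of 1 certifies exactly one.

```python
from fractions import Fraction


def sign_variations(coeffs):
    """Number of sign changes in the coefficient sequence, ignoring zeros."""
    signs = [c > 0 for c in coeffs if c != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def compose_linear(coeffs, c0, c1):
    """Coefficients (low to high) of p(c0 + c1*x), built by Horner's rule."""
    result = [0]
    for a in reversed(coeffs):
        shifted = [0] + [c1 * r for r in result]  # result * c1 * x
        scaled = [c0 * r for r in result] + [0]   # result * c0
        result = [s + t for s, t in zip(shifted, scaled)]
        result[0] += a
    return result[:len(coeffs)]  # the extra leading entry is always zero


def descartes_bound(coeffs, a, b):
    """Upper bound (with equal parity) on the number of roots of p in (a, b)."""
    p1 = compose_linear(coeffs, a, b - a)  # map (0, 1) onto (a, b)
    r = p1[::-1]                           # x^n * p1(1/x): (0, 1) -> (1, inf)
    q = compose_linear(r, 1, 1)            # shift by 1: (1, inf) -> (0, inf)
    return sign_variations(q)


# Usage: p(x) = (x - 1)(x - 2)(x - 3) = x^3 - 6x^2 + 11x - 6, low to high.
p = [-6, 11, -6, 1]
no_root = descartes_bound(p, 4, 5)                            # → 0
one_root = descartes_bound(p, Fraction(3, 2), Fraction(5, 2)) # → 1
```

A subdivision method bisects any interval whose count exceeds 1 and repeats the test on each half.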
In the article at hand, we report on an implementation of ANewDsc on top of
the RS root isolator. RS is a highly efficient realization of the classical
Descartes method and currently serves as the default real root solver in Maple.
We describe crucial design changes within ANewDsc and RS that led to a
high-performance implementation without harming the theoretical complexity of
the underlying algorithm.
With an excerpt of our extensive collection of benchmarks, available online
at http://anewdsc.mpi-inf.mpg.de/, we illustrate that the theoretical gain in
performance of ANewDsc over other subdivision methods also transfers into
practice. These experiments also show that our new implementation outperforms
both RS and mature competitors by orders of magnitude for notoriously hard instances
with clustered roots. For all other instances, we avoid almost all overhead by
integrating additional optimizations and heuristics.
Comment: Accepted for presentation at the 41st International Symposium on
Symbolic and Algebraic Computation (ISSAC), July 19-22, 2016, Waterloo,
Ontario, Canada
Parallel ADMM for robust quadratic optimal resource allocation problems
An alternating direction method of multipliers (ADMM) solver is described for
optimal resource allocation problems with separable convex quadratic costs and
constraints and linear coupling constraints. We describe a parallel
implementation of the solver on a graphics processing unit (GPU) using a
bespoke quartic function minimizer. An application to robust optimal energy
management in hybrid electric vehicles is described, and the results of
numerical simulations comparing the computation times of the parallel GPU
implementation with those of an equivalent serial implementation are presented.
An Incremental Algorithm for Computing Cylindrical Algebraic Decompositions
In this paper, we propose an incremental algorithm for computing cylindrical
algebraic decompositions. The algorithm consists of two parts: computing a
complex cylindrical tree and refining this complex tree into a cylindrical tree
in real space. The incrementality comes from the first part of the algorithm,
where a complex cylindrical tree is constructed by refining a previous complex
cylindrical tree with a polynomial constraint. We have implemented our
algorithm in Maple. Experiments show that the proposed algorithm
outperforms existing ones on many examples taken from the literature.