A rigorous analysis using optimal transport theory for a two-reflector design problem with a point source
We consider the following geometric optics problem: Construct a system of two
reflectors which transforms a spherical wavefront generated by a point source
into a beam of parallel rays. This beam has a prescribed intensity
distribution. We give a rigorous analysis of this problem. The reflectors we
construct are (parts of) the boundaries of convex sets. We prove existence of
solutions for a large class of input data and give a uniqueness result. To the
author's knowledge, this is the first time that a rigorous mathematical
analysis of this problem is given. The approach is based on optimal
transportation theory. It yields a practical algorithm for finding the
reflectors. Namely, the problem is equivalent to a constrained linear
optimization problem. Comment: 5 Figures - pdf files attached to submission, but not shown in
manuscript
DiBELLA: Distributed long read to long read alignment
We present a parallel algorithm and scalable implementation for genome analysis, specifically the problem of finding overlaps and alignments for data from "third generation" long read sequencers [29]. While long sequences of DNA offer enormous advantages for biological analysis and insight, current long read sequencing instruments have high error rates and therefore require different approaches to analysis than their short read counterparts. Our work focuses on an efficient distributed-memory parallelization of an accurate single-node algorithm for overlapping and aligning long reads. We achieve scalability of this irregular algorithm by addressing the competing issues of increasing parallelism, minimizing communication, constraining the memory footprint, and ensuring good load balance. The resulting application, diBELLA, is the first distributed-memory overlapper and aligner specifically designed for long reads and parallel scalability. We describe and analyze high-level design trade-offs and conduct an extensive empirical analysis that compares performance characteristics across state-of-the-art HPC systems as well as a commercial cloud architecture, highlighting the advantages of state-of-the-art network technologies.
Two problems related to prescribed curvature measures
Existence of a convex body with prescribed generalized curvature measures is
discussed; this result is obtained by making use of Guan-Li-Li's innovative
techniques. Surprisingly, these methods have also led us to improve
Ivochkina's estimates for the prescribed curvature equation in \cite{I1, I}. Comment: 12 pages, Corrected typo
Compiler-Directed Transformation for Higher-Order Stencils
As the cost of data movement increasingly dominates performance, developers of finite-volume and finite-difference solutions for partial differential equations (PDEs) are exploring novel higher-order stencils that increase numerical accuracy and computational intensity. This paper describes a new compiler reordering transformation applied to stencil operators that performs partial sums in buffers, and reuses the partial sums in computing multiple results. This optimization improves stencil performance in several ways that are particularly important to higher-order stencils: it exploits data reuse, reduces floating-point operations, and exposes efficient SIMD parallelism to backend compilers. We study the benefit of this optimization in the context of Geometric Multigrid (GMG), a widely used method to solve PDEs, using four different Jacobi smoothers built from 7-, 13-, 27-, and 125-point stencils. We quantify performance, speedup, and numerical accuracy, and use the Roofline model to qualify our results. Ultimately, we obtain over 4× speedup on the smoothers themselves and up to a 3× speedup on the multigrid solver. Finally, we demonstrate that high-order multigrid solvers have the potential of reducing total data movement and energy by several orders of magnitude.
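The partial-sum idea in the abstract above can be illustrated on a much smaller case. The sketch below is not the paper's compiler transformation; it is a hand-written illustration, assuming an unweighted 2D 9-point box stencil, and the function names are hypothetical. Vertical sums are stored once in a buffer and each one is reused by three neighbouring outputs, which cuts the additions per interior point from eight to four.

```python
def box_stencil_naive(grid):
    """Sum the 3x3 neighbourhood of every interior point directly (8 adds per point)."""
    rows, cols = len(grid), len(grid[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            out[i][j] = sum(
                grid[i + di][j + dj] for di in (-1, 0, 1) for dj in (-1, 0, 1)
            )
    return out


def box_stencil_partial_sums(grid):
    """Same result, but vertical partial sums are buffered and reused."""
    rows, cols = len(grid), len(grid[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(1, rows - 1):
        # Buffer of vertical partial sums for this output row (2 adds per column).
        col_sum = [grid[i - 1][j] + grid[i][j] + grid[i + 1][j] for j in range(cols)]
        for j in range(1, cols - 1):
            # Each buffered sum contributes to three adjacent outputs (2 adds per point).
            out[i][j] = col_sum[j - 1] + col_sum[j] + col_sum[j + 1]
    return out


# Small integer grid so both variants can be compared exactly.
grid = [[r * 5 + c for c in range(5)] for r in range(4)]
assert box_stencil_naive(grid) == box_stencil_partial_sums(grid)
```

Integer inputs are used here so the two variants agree exactly; with floating-point data the reordered additions may differ in the last bits, which is presumably one reason the abstract reports numerical accuracy alongside speedup.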
Fuchsian convex bodies: basics of Brunn--Minkowski theory
The hyperbolic space \H^d can be defined as a pseudo-sphere in the
Minkowski space-time. In this paper, a Fuchsian group is a group of
linear isometries of the Minkowski space such that \H^d/\Gamma is a compact
manifold. We introduce Fuchsian convex bodies, which are closed convex sets in
Minkowski space, globally invariant for the action of a Fuchsian group. A
volume can be associated to each Fuchsian convex body, and, if the group is
fixed, Minkowski addition behaves well. Then Fuchsian convex bodies can be
studied in the same manner as convex bodies of Euclidean space in the classical
Brunn--Minkowski theory. For example, support functions can be defined, as
functions on a compact hyperbolic manifold instead of the sphere.
The main result is the convexity of the associated volume (it is log concave
in the classical setting). This implies analogs of Alexandrov--Fenchel and
Brunn--Minkowski inequalities. Here the inequalities are reversed.
Applying the Roofline Performance Model to the Intel Xeon Phi Knights Landing Processor
The Roofline Performance Model is a visually intuitive method used to bound the sustained peak floating-point performance of any given arithmetic kernel on any given processor architecture. In the Roofline, performance is nominally measured in floating-point operations per second as a function of arithmetic intensity (operations per byte of data). In this study we determine the Roofline for the Intel Knights Landing (KNL) processor, determining the sustained peak memory bandwidth and floating-point performance for all levels of the memory hierarchy, in all the different KNL cluster modes. We then determine arithmetic intensity and performance for a suite of application kernels being targeted for the KNL-based supercomputer Cori, and make comparisons to current Intel Xeon processors. Cori is the National Energy Research Scientific Computing Center's (NERSC) next-generation supercomputer. Scheduled for deployment mid-2016, it will be one of the earliest and largest KNL deployments in the world.
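The bound described in the abstract above is commonly written as attainable performance = min(peak compute, arithmetic intensity × peak bandwidth). A minimal sketch of that formula follows; the function name is hypothetical and the machine numbers are placeholders chosen for easy arithmetic, not measured KNL values.

```python
def roofline_bound(arithmetic_intensity, peak_gflops, peak_bandwidth_gbs):
    """Upper bound on attainable GFLOP/s for a kernel.

    arithmetic_intensity: floating-point operations per byte moved
    peak_gflops:          sustained peak compute rate in GFLOP/s
    peak_bandwidth_gbs:   sustained peak memory bandwidth in GB/s
    """
    if arithmetic_intensity < 0:
        raise ValueError("arithmetic intensity must be non-negative")
    # A kernel is limited by whichever ceiling it hits first.
    return min(peak_gflops, arithmetic_intensity * peak_bandwidth_gbs)


# Placeholder machine (assumed values): 1000 GFLOP/s peak, 100 GB/s bandwidth.
PEAK_GFLOPS = 1000.0
PEAK_BANDWIDTH_GBS = 100.0

# Below the ridge point (10 FLOPs/byte here) the kernel is bandwidth-bound.
low = roofline_bound(0.5, PEAK_GFLOPS, PEAK_BANDWIDTH_GBS)
# Above the ridge point the kernel is compute-bound.
high = roofline_bound(20.0, PEAK_GFLOPS, PEAK_BANDWIDTH_GBS)
```

Evaluating the same function with a different bandwidth for each level of the memory hierarchy gives one ceiling per level, which is how a multi-level Roofline like the one the study describes is usually drawn.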