New bounds on the edge-bandwidth of triangular grids
The edge-bandwidth of a graph G is the bandwidth of the line graph of G. Determining the edge-bandwidth B′(Tn) of triangular grids Tn is an open problem posed in 2006. Previously, an upper bound of 3n − 1 and an asymptotic lower bound of 3n − o(n) were known. In this paper we provide a lower bound of 3n − ⌈n/2⌉ and show that it gives the exact values of B′(Tn) for 1 ≤ n ≤ 8 and n = 10. We also show the upper bound 3n − 5 for n ≥ 10.
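The definition above can be illustrated with a brute-force sketch in Python. It is not the method of the paper: the function name `edge_bandwidth` and the two small test graphs are assumptions made for illustration, and the exhaustive search is only feasible for a handful of edges.

```python
from itertools import permutations

def edge_bandwidth(edges):
    """Brute-force edge-bandwidth (hypothetical helper, tiny graphs only).

    Labels the edges with distinct integers and minimises the largest label
    difference between two edges that share a vertex, i.e. the bandwidth of
    the line graph.
    """
    edges = [tuple(e) for e in edges]
    m = len(edges)
    # Pairs of edge indices that are adjacent in the line graph.
    adjacent = [(i, j) for i in range(m) for j in range(i + 1, m)
                if set(edges[i]) & set(edges[j])]
    if not adjacent:
        return 0
    best = m - 1  # no labeling with m distinct labels can be wider than this
    for labels in permutations(range(m)):
        width = max(abs(labels[i] - labels[j]) for i, j in adjacent)
        best = min(best, width)
    return best

triangle = [(0, 1), (1, 2), (0, 2)]   # line graph is again a triangle
path = [(0, 1), (1, 2), (2, 3)]       # line graph is a path on three vertices

print(edge_bandwidth(triangle))  # 2
print(edge_bandwidth(path))      # 1
```

The search visits all m! labelings, which is why closed-form bounds such as those in the abstract matter for grids of any realistic size.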
A 3D Parallel Algorithm for QR Decomposition
Interprocessor communication often dominates the runtime of large matrix
computations. We present a parallel algorithm for computing QR decompositions
whose bandwidth cost (communication volume) can be decreased at the cost of
increasing its latency cost (number of messages). By varying a parameter to
navigate the bandwidth/latency tradeoff, we can tune this algorithm for
machines with different communication costs.
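The bandwidth/latency tradeoff can be made concrete with the common alpha-beta communication model, in which each message costs a fixed latency `alpha` and each word moved costs `beta`. The abstract does not name a cost model, so the model, the function names, and the candidate message and word counts below are all illustrative assumptions, not figures from the paper.

```python
def comm_time(messages, words, alpha, beta):
    """Alpha-beta model (assumed): latency per message plus cost per word."""
    return alpha * messages + beta * words

def best_setting(candidates, alpha, beta):
    """Pick the (label, messages, words) candidate with the lowest modelled time."""
    return min(candidates, key=lambda c: comm_time(c[1], c[2], alpha, beta))

# Hypothetical parameter settings: one sends few large messages,
# the other sends many messages but moves less data overall.
candidates = [
    ("few-messages", 100, 1_000_000),
    ("low-volume", 10_000, 100_000),
]

# A machine with expensive messages favours the first setting ...
print(best_setting(candidates, alpha=1e-5, beta=1e-9)[0])  # few-messages
# ... while a machine with expensive words favours the second.
print(best_setting(candidates, alpha=1e-7, beta=1e-8)[0])  # low-volume
```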
GPU-accelerated discontinuous Galerkin methods on hybrid meshes
We present a time-explicit discontinuous Galerkin (DG) solver for the
time-domain acoustic wave equation on hybrid meshes containing vertex-mapped
hexahedral, wedge, pyramidal and tetrahedral elements. Discretely energy-stable
formulations are presented for both Gauss-Legendre and Gauss-Legendre-Lobatto
(Spectral Element) nodal bases for the hexahedron. Stable timestep restrictions
for hybrid meshes are derived by bounding the spectral radius of the DG
operator using order-dependent constants in trace and Markov inequalities.
Computational efficiency is achieved under a combination of element-specific
kernels (including new quadrature-free operators for the pyramid), multi-rate
timestepping, and acceleration using Graphics Processing Units. Comment: Submitted to CMAM
Some Key Developments in Computational Electromagnetics and their Attribution
Key developments in computational electromagnetics are proposed. Historical highlights are summarized, concentrating on the two main approaches of differential and integral methods. This is seen as timely, as a retrospective analysis is needed to minimize duplication and to help settle questions of attribution.
Unstructured mesh algorithms for aerodynamic calculations
The use of unstructured mesh techniques for solving complex aerodynamic flows is discussed. The principal advantages of unstructured mesh strategies, as they relate to complex geometries, adaptive meshing capabilities, and parallel processing, are emphasized. The various aspects required for the efficient and accurate solution of aerodynamic flows are addressed. These include mesh generation, mesh adaptivity, solution algorithms, convergence acceleration, and turbulence modeling. Computations of viscous turbulent two-dimensional flows and inviscid three-dimensional flows about complex configurations are demonstrated. Remaining obstacles and directions for future research are also outlined.
An Approximately Optimal Algorithm for Scheduling Phasor Data Transmissions in Smart Grid Networks
In this paper, we devise a scheduling algorithm for ordering transmission of
synchrophasor data from the substation to the control center in as short a time
frame as possible, within the real-time hierarchical communications
infrastructure in the electric grid. The problem is cast in the framework of
the classic job scheduling with precedence constraints. The optimization setup
comprises the number of phasor measurement units (PMUs) to be installed on the
grid, a weight associated with each PMU, processing time at the control center
for the PMUs, and precedence constraints between the PMUs. The solution to the
PMU placement problem yields the optimum number of PMUs to be installed on the
grid, while the processing times are picked uniformly at random from a
predefined set. The weight associated with each PMU and the precedence
constraints are both assumed known. The scheduling problem is provably NP-hard,
so we resort to approximation algorithms, which provide suboptimal solutions
in polynomial time. A lower bound on the
optimal schedule is derived using branch-and-bound techniques, and its
performance is evaluated using standard IEEE test bus systems. The scheduling
policy is power grid-centric, since it takes into account the electrical
properties of the network under consideration. Comment: 8 pages, published in IEEE Transactions on Smart Grid, October 201
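The problem setup (weights, processing times, precedence constraints) can be sketched with a simple greedy heuristic in Python. This is not the approximation algorithm of the paper. It assumes a single processing resource at the control center and a weighted sum of completion times as the objective, neither of which the abstract states explicitly. The PMU names and all numbers are hypothetical.

```python
def greedy_schedule(proc, weight, preds):
    """Greedy list scheduling under precedence constraints (illustrative only).

    At each step, among the jobs whose predecessors have all finished, run the
    one with the largest weight / processing-time ratio. Returns the order and
    the weighted sum of completion times.
    """
    done, order = set(), []
    remaining = set(proc)
    clock, cost = 0, 0
    while remaining:
        ready = [j for j in remaining if preds.get(j, set()) <= done]
        if not ready:
            raise ValueError("precedence constraints contain a cycle")
        # Highest ratio first; ties broken by name so the result is deterministic.
        job = min(ready, key=lambda k: (-weight[k] / proc[k], k))
        clock += proc[job]
        cost += weight[job] * clock
        order.append(job)
        done.add(job)
        remaining.remove(job)
    return order, cost

# Hypothetical instance: pmu_b may only be processed after pmu_a.
proc = {"pmu_a": 2, "pmu_b": 1, "pmu_c": 3}
weight = {"pmu_a": 1, "pmu_b": 3, "pmu_c": 6}
preds = {"pmu_b": {"pmu_a"}}

order, cost = greedy_schedule(proc, weight, preds)
print(order, cost)  # ['pmu_c', 'pmu_a', 'pmu_b'] 41
```

A greedy rule like this carries no optimality guarantee under precedence constraints, which is the gap that approximation algorithms and lower bounds are meant to quantify.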
Minimizing Communication in Linear Algebra
In 1981 Hong and Kung proved a lower bound on the amount of communication
needed to perform dense matrix multiplication using the conventional
algorithm, where the input matrices were too large to fit in the small, fast
memory. In 2004 Irony, Toledo and Tiskin gave a new proof of this result and
extended it to the parallel case. In both cases the lower bound may be
expressed as Ω(#arithmetic operations / √M), where M is the size
of the fast memory (or local memory in the parallel case). Here we generalize
these results to a much wider variety of algorithms, including LU
factorization, Cholesky factorization, LDL^T factorization, QR factorization,
algorithms for eigenvalues and singular values, i.e., essentially all direct
methods of linear algebra. The proof works for dense or sparse matrices, and
for sequential or parallel algorithms. In addition to lower bounds on the
amount of data moved (bandwidth) we get lower bounds on the number of messages
required to move it (latency). We illustrate how to extend our lower bound
technique to compositions of linear algebra operations (like computing powers
of a matrix), to decide whether it is enough to call a sequence of simpler
optimal algorithms (like matrix multiplication) to minimize communication, or
if we can do better. We give examples of both. We also show how to extend our
lower bounds to certain graph theoretic problems.
We point out recently designed algorithms for dense LU, Cholesky, QR,
eigenvalue and the SVD problems that attain these lower bounds; implementations
of LU and QR show large speedups over conventional linear algebra algorithms in
standard libraries like LAPACK and ScaLAPACK. Many open problems remain. Comment: 27 pages, 2 table
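As a rough numerical illustration of the bound quoted in the abstract, the Python sketch below evaluates #arithmetic operations / √M with constant factors ignored. The function names are hypothetical. The message-count estimate additionally assumes that no message can carry more than M words, which is an assumption made here rather than a statement taken from the abstract.

```python
import math

def bandwidth_lower_bound(flops, fast_memory_words):
    """Words moved, up to a constant factor: #arithmetic operations / sqrt(M)."""
    return flops / math.sqrt(fast_memory_words)

def latency_lower_bound(flops, fast_memory_words):
    """Messages, up to a constant factor, assuming each message holds at most M words."""
    return bandwidth_lower_bound(flops, fast_memory_words) / fast_memory_words

n = 1024             # matrix dimension (illustrative)
flops = 2 * n ** 3   # conventional dense matrix multiplication
M = 2 ** 20          # fast memory size in words (illustrative)

print(bandwidth_lower_bound(flops, M))  # 2097152.0
print(latency_lower_bound(flops, M))    # 2.0
```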