2,474 research outputs found

New bounds on the edge-bandwidth of triangular grids

Author: Lin Lan
Lin Yixun
Publication date: 01/01/2015

The edge-bandwidth of a graph G is the bandwidth of the line graph of G. Determining the edge-bandwidth B′(Tn) of triangular grids Tn is an open problem posed in 2006. Previously, an upper bound and an asymptotic lower bound were found to be 3n − 1 and 3n − o(n) respectively. In this paper we provide a lower bound 3n − ⌈n/2⌉ and show that it gives the exact values of B′(Tn) for 1 ≤ n ≤ 8 and n = 10. Also, we show the upper bound 3n − 5 for n ≥ 10.
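The definition and bounds quoted in this abstract can be illustrated with a minimal Python sketch. It is not the authors' method: `edge_bandwidth` is a hypothetical brute-force helper that tries every labelling of the edges, so it is only practical for graphs with a handful of edges, and it does not construct the triangular grid Tn (whose definition is not given here). `lower_bound` and `upper_bound_large` simply evaluate the formulas 3n − ⌈n/2⌉ and 3n − 5 stated above.

```python
from itertools import combinations, permutations


def edge_bandwidth(edges):
    """Brute-force edge-bandwidth: the bandwidth of the line graph.

    `edges` is a list of 2-tuples of vertices. Two edges are adjacent in
    the line graph when they share an endpoint. Illustrative only: the
    running time grows factorially with the number of edges.
    """
    edges = [tuple(e) for e in edges]
    m = len(edges)
    adjacent = [
        (i, j)
        for i, j in combinations(range(m), 2)
        if set(edges[i]) & set(edges[j])
    ]
    if not adjacent:
        return 0
    best = m - 1  # no labelling with labels 0..m-1 can be wider than this
    for labels in permutations(range(m)):
        width = max(abs(labels[i] - labels[j]) for i, j in adjacent)
        best = min(best, width)
    return best


def lower_bound(n):
    """Lower bound 3n - ceil(n/2) quoted in the abstract."""
    return 3 * n - (n + 1) // 2


def upper_bound_large(n):
    """Upper bound 3n - 5, stated in the abstract for n >= 10 only."""
    if n < 10:
        raise ValueError("the 3n - 5 bound is stated only for n >= 10")
    return 3 * n - 5


if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (0, 2)]
    print("edge-bandwidth of a triangle:", edge_bandwidth(triangle))
    for n in (10, 11, 12):
        print(n, lower_bound(n), upper_bound_large(n))
```

At n = 10 the two formulas coincide, which is consistent with the abstract's claim that the lower bound is exact there; for larger n they differ by ⌈n/2⌉ − 5.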

EDP Sciences OAI-PMH repository (1.2.0)

Numérisation de Documents Anciens Mathématiques

A 3D Parallel Algorithm for QR Decomposition

Author: Ballard Grey
Demmel James
Grigori Laura
Jacquelin Mathias
Knight Nicholas
Publication date: 14/05/2018

Interprocessor communication often dominates the runtime of large matrix computations. We present a parallel algorithm for computing QR decompositions whose bandwidth cost (communication volume) can be decreased at the cost of increasing its latency cost (number of messages). By varying a parameter to navigate the bandwidth/latency tradeoff, we can tune this algorithm for machines with different communication costs.

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

GPU-accelerated discontinuous Galerkin methods on hybrid meshes

Author: Chan Jesse
Modave Axel
Remacle Jean-Francois
Wang Zheng
Warburton T.
Publication venue: 'Elsevier BV'
Publication date: 09/07/2015

We present a time-explicit discontinuous Galerkin (DG) solver for the time-domain acoustic wave equation on hybrid meshes containing vertex-mapped hexahedral, wedge, pyramidal and tetrahedral elements. Discretely energy-stable formulations are presented for both Gauss-Legendre and Gauss-Legendre-Lobatto (Spectral Element) nodal bases for the hexahedron. Stable timestep restrictions for hybrid meshes are derived by bounding the spectral radius of the DG operator using order-dependent constants in trace and Markov inequalities. Computational efficiency is achieved under a combination of element-specific kernels (including new quadrature-free operators for the pyramid), multi-rate timestepping, and acceleration using Graphics Processing Units. Comment: Submitted to CMAM

arXiv.org e-Print Archive

DIAL UCLouvain

Some Key Developments in Computational Electromagnetics and their Attribution

Author: Sykulski J.K.
Trowbridge C.W.
Publication date: 01/04/2006

Key developments in computational electromagnetics are proposed. Historical highlights are summarized concentrating on the two main approaches of differential and integral methods. This is seen as timely as a retrospective analysis is needed to minimize duplication and to help settle questions of attribution.

Southampton (e-Prints Soton)

Unstructured mesh algorithms for aerodynamic calculations

Author: Mavriplis D. J.

The use of unstructured mesh techniques for solving complex aerodynamic flows is discussed. The principal advantages of unstructured mesh strategies, as they relate to complex geometries, adaptive meshing capabilities, and parallel processing, are emphasized. The various aspects required for the efficient and accurate solution of aerodynamic flows are addressed. These include mesh generation, mesh adaptivity, solution algorithms, convergence acceleration, and turbulence modeling. Computations of viscous turbulent two-dimensional flows and inviscid three-dimensional flows about complex configurations are demonstrated. Remaining obstacles and directions for future research are also outlined.

NASA Technical Reports Server

An Approximately Optimal Algorithm for Scheduling Phasor Data Transmissions in Smart Grid Networks

Author: Khargonekar P. P.
Nagananda K. G.
Publication date: 25/04/2015

In this paper, we devise a scheduling algorithm for ordering transmission of synchrophasor data from the substation to the control center in as short a time frame as possible, within the real-time hierarchical communications infrastructure in the electric grid. The problem is cast in the framework of the classic job scheduling with precedence constraints. The optimization setup comprises the number of phasor measurement units (PMUs) to be installed on the grid, a weight associated with each PMU, processing time at the control center for the PMUs, and precedence constraints between the PMUs. The solution to the PMU placement problem yields the optimum number of PMUs to be installed on the grid, while the processing times are picked uniformly at random from a predefined set. The weight associated with each PMU and the precedence constraints are both assumed known. The scheduling problem is provably NP-hard, so we resort to approximation algorithms which provide solutions that are suboptimal yet have polynomial time complexity. A lower bound on the optimal schedule is derived using branch and bound techniques, and its performance is evaluated using standard IEEE test bus systems. The scheduling policy is power grid-centric, since it takes into account the electrical properties of the network under consideration. Comment: 8 pages, published in IEEE Transactions on Smart Grid, October 201

arXiv.org e-Print Archive

eScholarship - University of California

Minimizing Communication in Linear Algebra

Author: Blackford L. S.
Grey Ballard
James Demmel
Oded Schwartz
Olga Holtz
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 01/01/2009

In 1981 Hong and Kung proved a lower bound on the amount of communication needed to perform dense, matrix-multiplication using the conventional $O(n^3)$ algorithm, where the input matrices were too large to fit in the small, fast memory. In 2004 Irony, Toledo and Tiskin gave a new proof of this result and extended it to the parallel case. In both cases the lower bound may be expressed as $\Omega(\#\text{arithmetic operations} / \sqrt{M})$, where $M$ is the size of the fast memory (or local memory in the parallel case). Here we generalize these results to a much wider variety of algorithms, including LU factorization, Cholesky factorization, $LDL^T$ factorization, QR factorization, algorithms for eigenvalues and singular values, i.e., essentially all direct methods of linear algebra. The proof works for dense or sparse matrices, and for sequential or parallel algorithms. In addition to lower bounds on the amount of data moved (bandwidth) we get lower bounds on the number of messages required to move it (latency). We illustrate how to extend our lower bound technique to compositions of linear algebra operations (like computing powers of a matrix), to decide whether it is enough to call a sequence of simpler optimal algorithms (like matrix multiplication) to minimize communication, or if we can do better. We give examples of both. We also show how to extend our lower bounds to certain graph theoretic problems. We point out recently designed algorithms for dense LU, Cholesky, QR, eigenvalue and the SVD problems that attain these lower bounds; implementations of LU and QR show large speedups over conventional linear algebra algorithms in standard libraries like LAPACK and ScaLAPACK. Many open problems remain. Comment: 27 pages, 2 table
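As a rough worked example of the bandwidth bound quoted in this abstract, the Python sketch below evaluates #arithmetic operations / sqrt(M) for a conventional dense matrix multiplication. The function names, the 2n^3 operation count, and the sample sizes are illustrative assumptions, and the constant hidden by the Omega notation is ignored, so the results show only the order of magnitude. The message-count estimate divides the word bound by M on the assumption that a single message carries at most M words; the abstract states that latency bounds exist but does not give this formula.

```python
from math import sqrt


def words_moved_lower_bound(flops, fast_memory_words):
    """Order-of-magnitude bandwidth bound: flops / sqrt(M).

    The constant factor hidden by the Omega notation is dropped.
    """
    return flops / sqrt(fast_memory_words)


def messages_lower_bound(flops, fast_memory_words):
    """Order-of-magnitude latency bound.

    Assumption (not stated in the abstract): one message carries at most
    M words, so the number of messages is at least words / M.
    """
    return words_moved_lower_bound(flops, fast_memory_words) / fast_memory_words


if __name__ == "__main__":
    n = 4096            # hypothetical matrix dimension
    M = 2 ** 20         # hypothetical fast-memory size, in words
    flops = 2 * n ** 3  # multiply-add count of conventional matmul
    print("words moved  >~", words_moved_lower_bound(flops, M))
    print("messages     >~", messages_lower_bound(flops, M))
```

Under these assumptions, quadrupling the fast memory halves the word bound, which is the sense in which larger local memory permits less communication.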

arXiv.org e-Print Archive

CiteSeerX

Crossref