384 research outputs found
Block-Jacobi sweeping preconditioners for optimized Schwarz methods applied to the Helmholtz equation
The parallel performances of sweeping-type algorithms for high-frequency
time-harmonic wave problems have been recently improved by departing from
standard layer-type domain decomposition and introducing a new sweeping
strategy on a checkerboard-type domain decomposition, where sweeps can be
performed more flexibly. These sweeps can be done by a certain number of steps,
each of which provides the necessary information from subdomains on which
solutions have been obtained to their next neighboring subdomains. Although
subproblems in these subdomains can be solved concurrently at each step, the
sweeping process itself remains sequential, which
limits its parallel performance. Moreover, the sweeping approaches can be
interpreted as a completely approximate LU factorization, which implies a huge
computational cost. We propose block-Jacobi sweeping preconditioners, which are
improved variants of sweeping-type preconditioners. The new feature of these
improved variants can be interpreted as several partial sweeps, which can be
performed in parallel. We present several two- and three-dimensional finite
element results with constant and variable wave speeds to study and compare the
original and block-Jacobi sweeping preconditioners.
The method of polarized traces for the 2D Helmholtz equation
We present a solver for the 2D high-frequency Helmholtz equation in heterogeneous acoustic media, with online parallel complexity that scales optimally as O(N/L), where N is the number of volume unknowns, and L is the number of processors, as long as L grows at most like a small fractional power of N. The solver decomposes the domain into layers, and uses transmission conditions in boundary integral form to explicitly define "polarized traces", i.e., up- and down-going waves sampled at interfaces. Local direct solvers are used in each layer to precompute traces of local Green's functions in an embarrassingly parallel way (the offline part), and incomplete Green's formulas are used to propagate interface data in a sweeping fashion, as a preconditioner inside a GMRES loop (the online part). Adaptive low-rank partitioning of the integral kernels is used to speed up their application to interface data. The method uses second-order finite differences. The complexity scalings are empirical but motivated by an analysis of ranks of off-diagonal blocks of oscillatory integrals. They continue to hold in the context of standard geophysical community models such as BP and Marmousi 2, where convergence occurs in 5 to 10 GMRES iterations. While the parallelism in this paper stems from decomposing the domain, we do not explore the alternative of parallelizing the system solves with distributed linear algebra routines.
Keywords: Domain decomposition; Helmholtz equation; Integral equations; High-frequency; Fast methods
Funding: United States. Air Force Office of Scientific Research (Grant FA9550-15-1-0078); United States. Office of Naval Research (Grant N00014-13-1-0403); National Science Foundation (U.S.) (Grant DMS-1255203)
Efficient DSP and Circuit Architectures for Massive MIMO: State-of-the-Art and Future Directions
Massive MIMO is a compelling wireless access concept that relies on the use
of an excess number of base-station antennas, relative to the number of active
terminals. This technology is a main component of 5G New Radio (NR) and
addresses all important requirements of future wireless standards: a great
capacity increase, the support of many simultaneous users, and improvement in
energy efficiency. Massive MIMO requires the simultaneous processing of signals
from many antenna chains, and computational operations on large matrices. The
complexity of the digital processing has been viewed as a fundamental obstacle
to the feasibility of Massive MIMO in the past. Recent advances in
system-algorithm-hardware co-design have led to extremely energy-efficient
implementations. These exploit opportunities in deeply-scaled silicon
technologies and perform partly distributed processing to cope with the
bottlenecks encountered in the interconnection of many signals. For example,
prototype ASIC implementations have demonstrated zero-forcing precoding in real
time at a 55 mW power consumption (20 MHz bandwidth, 128 antennas, multiplexing
of 8 terminals). Coarse and even error-prone digital processing in the antenna
paths permits a reduction of power consumption by a factor of 2 to 5. This article
summarizes the fundamental technical contributions to efficient digital signal
processing for Massive MIMO. The opportunities and constraints of operating on
low-complexity RF and analog hardware chains are clarified. It illustrates how
terminals can benefit from improved energy efficiency. The status of technology
and real-life prototypes is discussed. Open challenges and directions for future
research are suggested.
Comment: submitted to IEEE Transactions on Signal Processing
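The zero-forcing precoding mentioned in the abstract above can be illustrated with a toy sketch. This is not the ASIC implementation the article refers to. It assumes a hypothetical 2-terminal, 4-antenna channel matrix `H` (the abstract's prototype uses 8 terminals and 128 antennas), uses the textbook form W = H^H (H H^H)^{-1}, and skips transmit-power normalization. The function names are illustrative only.

```python
def zero_forcing_precoder(H):
    """Return W = H^H (H H^H)^{-1} for a 2 x M complex channel matrix H.

    H is a list of two rows (one per terminal), each with M antenna gains.
    Only the 2-terminal case is handled so the inverse has a closed form.
    """
    h1, h2 = H

    def inner(a, b):
        # Entry of the Gram matrix H H^H: sum_m a[m] * conj(b[m])
        return sum(x * y.conjugate() for x, y in zip(a, b))

    g11, g12 = inner(h1, h1), inner(h1, h2)
    g21, g22 = inner(h2, h1), inner(h2, h2)
    det = g11 * g22 - g12 * g21
    # Closed-form inverse of the 2 x 2 Gram matrix
    inv = [[g22 / det, -g12 / det], [-g21 / det, g11 / det]]
    num_antennas = len(h1)
    # W[m][k] = sum_j conj(H[j][m]) * inv[j][k]
    return [[sum(H[j][m].conjugate() * inv[j][k] for j in range(2))
             for k in range(2)]
            for m in range(num_antennas)]


def effective_channel(H, W):
    """Return the 2 x 2 product H W seen by the terminals."""
    return [[sum(H[i][m] * W[m][k] for m in range(len(W)))
             for k in range(2)]
            for i in range(2)]


# Hypothetical channel gains, chosen only for illustration.
H = [[1 + 1j, 2, 0.5j, -1],
     [0.3, -1j, 1, 2 + 0.5j]]
W = zero_forcing_precoder(H)
HW = effective_channel(H, W)
```

With this precoder the product `HW` is, up to rounding, the identity matrix, which is what "zero-forcing" means: each terminal receives its own stream with no inter-terminal interference.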
Fast algorithms for frequency domain wave propagation
High-frequency wave phenomena are observed in many physical settings, most notably in acoustics, electromagnetics, and elasticity. In all of these fields, numerical simulation and modeling of the forward propagation problem is important to the design and analysis of many systems; a few examples that rely on these computations are the development of metamaterial technologies and geophysical prospecting for natural resources. There are two modes of modeling the forward problem: the frequency domain and the time domain. As the title states, this work is concerned with the former regime.
The difficulty of solving the high-frequency wave propagation problem accurately lies in the large number of degrees of freedom required. Conventional wisdom in the computational electromagnetics community suggests that about 10 degrees of freedom per wavelength be used in each coordinate direction to resolve each oscillation. If K is the width of the domain in wavelengths, the number of unknowns N grows at least as O(K^2) for surface discretizations and O(K^3) for volume discretizations in 3D. The memory requirements and asymptotic complexity estimates of direct algorithms such as the multifrontal method are too costly for such problems. Thus, iterative solvers must be used. In this dissertation, I will present fast algorithms which, in conjunction with GMRES, allow the solution of the forward problem in O(N) or O(N log N) time.
Computational Science, Engineering, and Mathematics
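The growth rates quoted in the abstract above can be made concrete with a short arithmetic sketch. It assumes a cubic domain K wavelengths wide, the rule of thumb of 10 points per wavelength, and a "surface" count taken as a single face of the cube, so constant factors are ignored. The function name `unknowns_3d` is illustrative only.

```python
def unknowns_3d(width_in_wavelengths, points_per_wavelength=10):
    """Estimate unknown counts for a cube that is K wavelengths wide.

    Returns (surface, volume), where surface counts the grid points on one
    face, O(K^2), and volume counts the points in the full cube, O(K^3).
    """
    per_direction = width_in_wavelengths * points_per_wavelength
    surface = per_direction ** 2
    volume = per_direction ** 3
    return surface, volume


for k in (10, 100):
    surface, volume = unknowns_3d(k)
    print(f"K={k}: surface ~ {surface:.0e}, volume ~ {volume:.0e}")
```

Under these assumptions, a domain 100 wavelengths wide already needs on the order of a billion volume unknowns, which is why the abstract rules out direct solvers in favor of iterative ones.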