Complex Block Floating-Point Format with Box Encoding For Wordlength Reduction in Communication Systems
We propose a new complex block floating-point format to reduce implementation
complexity. The new format achieves wordlength reduction by sharing an exponent
across the block of samples, and uses box encoding for the shared exponent to
reduce quantization error. Arithmetic operations are performed on blocks of
samples at a time, which can also reduce implementation complexity. For a case
study of a baseband quadrature amplitude modulation (QAM) transmitter and
receiver, we quantify the tradeoffs in signal quality vs. implementation
complexity using the new approach to represent IQ samples. Signal quality is
measured using error vector magnitude (EVM) in the receiver, and implementation
complexity is measured in terms of arithmetic complexity as well as memory
allocation and memory input/output rates. The primary contributions of this
paper are (1) a complex block floating-point format with box encoding of the
shared exponent to reduce quantization error, (2) arithmetic operations using
the new complex block floating-point format, and (3) a QAM transceiver case
study to quantify signal quality vs. implementation complexity tradeoffs using
the new format and arithmetic operations.
Comment: 6 pages, 9 figures, submitted to Asilomar Conference on Signals,
Systems, and Computers 201
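The shared-exponent idea in this abstract can be illustrated with a minimal sketch. It assumes a plain block floating-point scheme in which one exponent is chosen from the largest I or Q component in the block; it does not implement the paper's box encoding, whose details are not given here. The function names and the `mantissa_bits` parameter are hypothetical, and the EVM helper uses the usual RMS error over RMS reference definition.

```python
import math

def bfp_quantize(block, mantissa_bits):
    """Quantize complex samples using one exponent shared by the whole block."""
    # The largest I or Q magnitude in the block determines the shared exponent.
    peak = max(max(abs(z.real), abs(z.imag)) for z in block)
    if peak == 0.0:
        return 0, [(0, 0)] * len(block)
    # frexp gives peak = m * 2**exponent with 0.5 <= m < 1,
    # so every component divided by 2**exponent lies in (-1, 1).
    _, exponent = math.frexp(peak)
    scale = 2 ** (mantissa_bits - 1)   # signed mantissa: one bit holds the sign
    lo, hi = -scale, scale - 1

    def to_mantissa(x):
        # Round to the nearest integer mantissa and clip to the signed range.
        return min(hi, max(lo, round(x / 2.0 ** exponent * scale)))

    return exponent, [(to_mantissa(z.real), to_mantissa(z.imag)) for z in block]

def bfp_dequantize(exponent, mantissas, mantissa_bits):
    """Rebuild complex samples from the shared exponent and integer mantissas."""
    step = 2.0 ** exponent / 2 ** (mantissa_bits - 1)
    return [complex(i * step, q * step) for i, q in mantissas]

def evm_percent(reference, measured):
    """Error vector magnitude: RMS error relative to RMS reference, in percent."""
    err = sum(abs(m - r) ** 2 for r, m in zip(reference, measured))
    ref = sum(abs(r) ** 2 for r in reference)
    return 100.0 * math.sqrt(err / ref)
```

Storing one exponent per block instead of one per sample is what shortens the wordlength; the cost is that small samples in a block with a large peak lose resolution, which shows up as a higher EVM at low mantissa widths.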
On the Scalability of Data Reduction Techniques in Current and Upcoming HPC Systems from an Application Perspective
We implement and benchmark parallel I/O methods for the fully-manycore driven
particle-in-cell code PIConGPU. Identifying throughput and overall I/O size as
a major challenge for applications on today's and future HPC systems, we
present a scaling law characterizing performance bottlenecks in
state-of-the-art approaches for data reduction. Consequently, we propose,
implement and verify multi-threaded data-transformations for the I/O library
ADIOS as a feasible way to trade underutilized host-side compute potential on
heterogeneous systems for reduced I/O latency.
Comment: 15 pages, 5 figures, accepted for DRBSD-1 in conjunction with ISC'1
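The trade described above, spending idle host-side compute to shrink the data before it reaches the file system, can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's standard `zlib` and a thread pool, not ADIOS or PIConGPU, and the function names, chunk size, and worker count are hypothetical.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

def compress_chunks(data, chunk_size, workers=4, level=6):
    """Split a byte buffer into chunks and compress them on several threads."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    # pool.map preserves chunk order, so the output can be concatenated later.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda c: zlib.compress(c, level), chunks))

def decompress_chunks(compressed):
    """Restore the original buffer from the ordered list of compressed chunks."""
    return b"".join(zlib.decompress(c) for c in compressed)
```

Whether such a transformation pays off depends on the compression ratio and on how much spare CPU time is available relative to the I/O bandwidth, which is the kind of balance the scaling law in the abstract is meant to characterize.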
An efficient multi-core implementation of a novel HSS-structured multifrontal solver using randomized sampling
We present a sparse linear system solver that is based on a multifrontal
variant of Gaussian elimination, and exploits low-rank approximation of the
resulting dense frontal matrices. We use hierarchically semiseparable (HSS)
matrices, which have low-rank off-diagonal blocks, to approximate the frontal
matrices. For HSS matrix construction, a randomized sampling algorithm is used
together with interpolative decompositions. The combination of the randomized
compression with a fast ULV HSS factorization leads to a solver with lower
computational complexity than the standard multifrontal method for many
applications, resulting in speedups of up to 7-fold for problems in our test
suite. The implementation targets many-core systems by using task parallelism
with dynamic runtime scheduling. Numerical experiments show performance
improvements over state-of-the-art sparse direct solvers. The implementation
achieves high performance and good scalability on a range of modern shared
memory parallel systems, including the Intel Xeon Phi (MIC). The code is part
of a software package called STRUMPACK -- STRUctured Matrices PACKage, which
also has a distributed memory component for dense rank-structured matrices.
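The randomized sampling step mentioned in this abstract can be illustrated with a small sketch. It assumes a basic randomized range finder (multiply by a Gaussian test matrix, orthonormalize, project) in place of the interpolative decompositions and HSS structure the solver actually uses, and it is not STRUMPACK's API. All function names are hypothetical, and matrices are plain lists of rows.

```python
import random

def matmul(a, b):
    """Dense matrix product for matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def orthonormal_columns(y, rel_tol=1e-10):
    """Modified Gram-Schmidt on the columns of y, dropping dependent columns."""
    basis = []
    for col in transpose(y):
        v = list(col)
        original = sum(x * x for x in v) ** 0.5
        for q in basis:
            dot = sum(a * b for a, b in zip(q, v))
            v = [a - dot * b for a, b in zip(v, q)]
        norm = sum(x * x for x in v) ** 0.5
        # Keep the column only if a meaningful part survives the projection.
        if original > 0.0 and norm > rel_tol * original:
            basis.append([x / norm for x in v])
    return transpose(basis)

def randomized_low_rank(a, samples, seed=0):
    """Return (q, b) with a approximately equal to q * b, using random sampling."""
    rng = random.Random(seed)
    omega = [[rng.gauss(0.0, 1.0) for _ in range(samples)] for _ in range(len(a[0]))]
    y = matmul(a, omega)           # random combinations of the columns of a
    q = orthonormal_columns(y)     # orthonormal basis for the sampled range
    b = matmul(transpose(q), a)    # coordinates of a in that basis
    return q, b
```

The appeal of sampling is that the matrix is touched only through products with a few random vectors, so a block of numerical rank k is compressed at a cost governed by k rather than by the full block size.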
Demonstration of a coupled floating offshore wind turbine analysis with high-fidelity methods
This paper presents results of numerical computations for floating offshore wind turbines using, as an example, a machine of 10-MW rated power. The aerodynamic loads on the rotor are computed using the Helicopter Multi-Block flow solver developed at the University of Liverpool. The method solves the Navier–Stokes equations in integral form using the arbitrary Lagrangian–Eulerian formulation for time-dependent domains with moving boundaries. Hydrodynamic loads on the support platform are computed using the Smoothed Particle Hydrodynamics method, which is mesh-free and represents the water and floating structures by a set of discrete elements, referred to as particles. The motion of the floating offshore wind turbine is computed using a Multi-Body Dynamic Model of rigid bodies and frictionless joints. Mooring cables are modelled as a set of springs and dampers. All solvers were validated separately before coupling, and the results are presented in this paper. The importance of coupling is assessed and the loosely coupled algorithm used is described in detail alongside the obtained results.
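As a rough illustration of modelling a mooring cable with springs and dampers, the sketch below computes the force from a single linear spring-damper element. It is an assumption-laden toy, not the paper's model: the function name, the parameters, and the choice of one linear element acting along the line between anchor and attachment point are all hypothetical.

```python
import math

def spring_damper_force(anchor, attach, velocity, rest_length, stiffness, damping):
    """Force on the attachment point from one linear spring-damper element."""
    delta = [p - a for p, a in zip(attach, anchor)]
    length = math.sqrt(sum(d * d for d in delta))
    if length == 0.0:
        return [0.0] * len(delta)   # direction undefined, so apply no force
    unit = [d / length for d in delta]
    # Rate of change of the element length: velocity projected on the line.
    stretch_rate = sum(v * u for v, u in zip(velocity, unit))
    magnitude = stiffness * (length - rest_length) + damping * stretch_rate
    # The restoring force pulls the attachment point back towards the anchor.
    return [-magnitude * u for u in unit]
```

In a loosely coupled scheme, a force of this kind would be evaluated from the platform position and velocity at each time step and passed to the rigid-body solver together with the aerodynamic and hydrodynamic loads.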
- …