On the synthesis of integral and dynamic recurrences
PhD Thesis
Synthesis techniques for regular arrays provide a disciplined and well-founded approach to the design of classes of parallel algorithms. The design process is guided by a methodology based upon a formal notation and transformations.
The mathematical model underlying synthesis techniques is that of affine Euclidean geometry with embedded lattice spaces. This model provides computationally powerful methods for engineering regular arrays. At present, however, the applicability of such methods is limited to so-called affine problems.
The work presented in this thesis aims at widening the applicability of standard synthesis
methods to more general classes of problems. The major contributions of this thesis are the
characterisation of classes of integral and dynamic problems, and the provision of techniques
for their systematic treatment within the framework of established synthesis methods. The
basic idea is the transformation of the initial algorithm specification into a specification
with data dependencies of increased regularity, so that corresponding regular arrays can be
obtained by a direct application of the standard mapping techniques.
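The core move described above, transforming a specification so that its data dependencies become more regular, can be illustrated with a toy uniformization: an affine (broadcast) dependence is replaced by a chain of uniform, nearest-neighbour dependences. The recurrence and variable names here are illustrative, not taken from the thesis.

```python
# Hypothetical sketch of uniformization: replace an affine (broadcast)
# dependence on x[0] with a pipelined, uniform dependence of distance 1.

N = 6
x = list(range(N))                    # input vector

# Affine form: every y[i] reads the single value x[0] directly (a broadcast).
y_affine = [x[0] + i for i in range(N)]

# Uniformized form: x[0] is propagated through a new variable p, so each
# point reads only its immediate neighbour (dependence vector of length 1).
p = [0] * N
p[0] = x[0]
for i in range(1, N):
    p[i] = p[i - 1]                   # uniform dependence: distance exactly 1
y_uniform = [p[i] + i for i in range(N)]

assert y_affine == y_uniform          # same result, more regular dependencies
```

With only uniform dependences left, the standard mapping techniques for regular arrays apply directly.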
We complement the formal development of the techniques with a number of case studies from the literature.
Beyond shared memory loop parallelism in the polyhedral model
Spring 2013. Includes bibliographical references.

With the introduction of multi-core processors, motivated by power and energy concerns, parallel processing has become mainstream. Parallel programming is much more difficult because of its non-deterministic nature and the bugs that arise from non-determinacy. One solution is automatic parallelization, where it is entirely up to the compiler to efficiently parallelize sequential programs. However, automatic parallelization is very difficult, and only a handful of successful techniques are available, even after decades of research. Automatic parallelization for distributed memory architectures is even more problematic, in that it requires explicit handling of data partitioning and communication. Since data must be partitioned among multiple nodes that do not share memory, the original memory allocation of sequential programs cannot be used directly.

One of the main contributions of this dissertation is the development of techniques for generating distributed memory parallel code with parametric tiling. Our approach builds on important contributions to the polyhedral model, a mathematical framework for reasoning about program transformations. We show that many affine control programs can be uniformized with only simple techniques. Being able to assume uniform dependences significantly simplifies distributed memory code generation, and also enables parametric tiling. Our approach is implemented in the AlphaZ system, a system for prototyping analyses, transformations, and code generators in the polyhedral model. The key features of AlphaZ are memory re-allocation and the explicit representation of reductions. We evaluate our approach on a collection of polyhedral kernels from the PolyBench suite, and show that it scales as well as PLuTo, a state-of-the-art shared memory automatic parallelizer based on the polyhedral model.
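The distinguishing feature of parametric tiling, as described above, is that the tile size is a runtime parameter rather than a compile-time constant, so one generated code version serves all tile sizes. The sketch below uses a deliberately simple kernel (vector scaling) to show the loop structure; the dissertation itself targets full polyhedral kernels such as those in PolyBench.

```python
# Minimal illustration of parametric tiling: T is a runtime parameter.
# The kernel is a toy stand-in for a polyhedral loop nest.

def scale_tiled(a, c, T):
    n = len(a)
    out = a[:]
    for tt in range(0, n, T):                 # loop over tiles
        for i in range(tt, min(tt + T, n)):   # loop within one tile
            out[i] = c * out[i]
        # with uniform dependences, each tile would need values only from
        # neighbouring tiles, which is what enables distributed-memory
        # execution without rewriting the code for each tile size
    return out

print(scale_tiled([1, 2, 3, 4, 5], 2, T=2))   # [2, 4, 6, 8, 10]
```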
Automatic parallelization is only one approach to dealing with the non-deterministic nature of parallel programming, and it leaves the difficulty entirely to the compiler. Another approach is to develop novel parallel programming languages. These languages, such as X10, aim to provide a highly productive parallel programming environment by building parallelism into the language design. However, even in these languages, parallelism bugs remain an important issue that hinders programmer productivity. Another contribution of this dissertation is to extend array dataflow analysis to handle a subset of X10 programs. We apply the results of the dataflow analysis to statically guarantee determinism. Providing static guarantees can significantly increase programmer productivity by catching questionable implementations at compile time, or even while programming.
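The determinism guarantee mentioned above can be reduced to a simple condition: if the write sets of concurrent tasks are pairwise disjoint, and no task reads a cell another task writes, the program's result is independent of the schedule. The set-based check below is a toy version of this idea; the actual analysis reasons symbolically over affine array index expressions rather than enumerated cells.

```python
# Toy determinism check over concurrent tasks, each given as a pair
# (reads, writes) of sets of array cells. Illustrative only.

def deterministic(tasks):
    for a in range(len(tasks)):
        for b in range(a + 1, len(tasks)):
            ra, wa = tasks[a]
            rb, wb = tasks[b]
            # write/write or read/write overlap means the outcome can
            # depend on the execution order: a potential race
            if wa & wb or wa & rb or wb & ra:
                return False
    return True

# two tasks writing disjoint halves of an array: deterministic
print(deterministic([(set(), {0, 1}), (set(), {2, 3})]))   # True
# both tasks write cell 1: a data race
print(deterministic([(set(), {0, 1}), (set(), {1, 2})]))   # False
```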
Synthesis of multi-rate arrays from directional uniform recurrence equations
Advances in VLSI array processing have led to many new parallel structures for real-time Digital Signal Processing (DSP) applications. Among these architectures, systolic arrays have played an important role because of their regular, local interconnections and modular structure. In ordinary systolic arrays, however, all processing operations and data transmissions use the same clock, which degrades performance when different processing operations take different amounts of time to execute. Moreover, the application scope of systolic arrays is restricted to Uniform Recurrence Equations (URE), which cannot express all DSP algorithms.
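The URE restriction mentioned above can be made concrete: in a URE, every computation point depends on other points only through constant offsets. Matrix multiplication is the classic example, written below as a recurrence whose single dependence vector is the constant (0, 0, 1). The code form is illustrative, not taken from the thesis.

```python
# Matrix product as a uniform recurrence equation (URE):
#   C(i,j,k) = C(i,j,k-1) + A(i,k) * B(k,j)
# The only dependence vector is the constant (0, 0, 1).

def matmul_ure(A, B):
    n = len(A)
    C = [[[0] * (n + 1) for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(1, n + 1):
                # uniform dependence: (i,j,k) reads only (i,j,k-1)
                C[i][j][k] = C[i][j][k - 1] + A[i][k - 1] * B[k - 1][j]
    return [[C[i][j][n] for j in range(n)] for i in range(n)]

print(matmul_ure([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```

Because the dependence offsets are constant, such a recurrence maps directly onto a systolic array with fixed nearest-neighbour connections.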
This thesis introduces a new type of array processor architecture, Multi-rate Arrays (MRA). MRAs enhance computation speed by assigning different clocks to different processing operations. They also enlarge the application scope of systolic arrays, because MRAs can be synthesized from Directional Uniform Recurrence Equations (DURE), a more general form than URE.
The thesis first introduces the idea of MRA, demonstrates its speed advantage over systolic arrays, and gives a criterion for choosing among different array structures. It then relates MRAs to DURE by analyzing the characteristics of DURE, and formulates a systematic procedure for the synthesis of an MRA from a DURE. Finally, it demonstrates the synthesis of MRAs with two examples from DSP applications: a decimation filter and Toeplitz matrix factorization …