Finding Legal Reordering Transformations using Mappings
Traditionally, optimizing compilers attempt to improve the performance of
programs by applying source-to-source transformations, such as loop
interchange, loop skewing and loop distribution. Each of these
transformations has its own special legality checks and transformation rules
which make it hard to analyze or predict the effects of compositions of
these transformations. To overcome these problems we have developed a
framework for unifying iteration reordering transformations. The framework
is based on the idea that all reordering transformations can be represented
as a mapping from the original iteration space to a new iteration space.
The framework is designed to provide a uniform way to represent and reason
about transformations. An optimizing compiler would use our framework by
finding a mapping that both corresponds to a legal transformation and
produces efficient code. We present the mapping selection problem as a
search problem by decomposing it into a sequence of smaller choices.
We then characterize the set of all legal mappings by defining an implicit
search tree.
(Also cross-referenced as UMIACS-TR-94-71)
A Unifying Framework for Iteration Reordering Transformations
We present a framework for unifying iteration reordering transformations
such as loop interchange, loop distribution, skewing, tiling, index set
splitting and statement reordering. The framework is based on the idea
that a transformation can be represented as a mapping from the original
iteration space to a new iteration space. The framework is designed to
provide a uniform way to represent and reason about transformations. We
also provide algorithms to test the legality of mappings, and to generate
optimized code for mappings.
(Also cross-referenced as UMIACS-TR-95-30)
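To illustrate the idea of a transformation as a mapping between iteration spaces, here is a minimal sketch in Python. It is not the authors' algorithm: it assumes a rectangular two-deep loop nest, dependences given as constant distance vectors, and a brute-force check over all points. The names `iteration_space`, `interchange`, `skew`, and `is_legal` are hypothetical and chosen only for this example. The legality rule used is the standard one: every dependence source must still execute before its sink after the mapping is applied.

```python
from itertools import product


def iteration_space(n, m):
    """All (i, j) points of an n-by-m loop nest, in original execution order."""
    return list(product(range(n), range(m)))


def interchange(point):
    """Loop interchange: swap the two loop indices."""
    i, j = point
    return (j, i)


def skew(point):
    """Loop skewing: skew the inner loop by the outer index."""
    i, j = point
    return (i, i + j)


def is_legal(mapping, points, dependences):
    """Brute-force legality check (an assumption of this sketch).

    `dependences` is a list of distance vectors (di, dj), meaning that
    iteration (i, j) must execute before iteration (i + di, j + dj).
    The new execution order is the lexicographic order of the mapped
    points, which is how Python compares tuples.
    """
    space = set(points)
    for (i, j) in points:
        for (di, dj) in dependences:
            sink = (i + di, j + dj)
            if sink in space and not mapping((i, j)) < mapping(sink):
                return False
    return True


points = iteration_space(4, 4)
# A dependence with distance (1, -1) forbids interchange but permits skewing.
interchange_ok = is_legal(interchange, points, [(1, -1)])
skew_ok = is_legal(skew, points, [(1, -1)])
```

With a dependence of distance (1, -1), interchange would run the sink before the source, so the check rejects it, while skewing keeps the outer index first and is accepted. A real compiler would reason symbolically about the dependence relation rather than enumerate points.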
Optimization within a Unified Transformation Framework
Programmers typically want to write scientific programs in a high level
language with semantics based on a sequential execution model. To execute
efficiently on a parallel machine, however, a program typically needs to
contain explicit parallelism and possibly explicit communication and
synchronization. So, we need compilers to convert programs from the first
of these forms to the second. There are two basic choices to be made when
parallelizing a program. First, the computations of the program need to be
distributed amongst the set of available processors. Second, the computations
on each processor need to be ordered. My contribution has been the development
of simple mathematical abstractions for representing these choices and the
development of new algorithms for making these choices. I have developed a new
framework that achieves good performance by minimizing communication between
processors, minimizing the time processors spend waiting for messages from
other processors, and ordering data accesses so as to exploit the memory
hierarchy. This framework can be used by optimizing compilers, as well as by
interactive transformation tools. The state of the art for vectorizing
compilers is already quite good, but much work remains to bring parallelizing
compilers up to the same standard. The main contribution of my work can be
summarized as improving this situation by replacing existing ad hoc
parallelization techniques with a sound underlying foundation on which future
work can be built.
(Also cross-referenced as UMIACS-TR-96-93)
Code Generation for Multiple Mappings
There has been a great amount of recent work toward unifying
iteration reordering transformations. Many of these approaches represent
transformations as affine mappings from the original iteration space to a
new iteration space. These approaches show a great deal of promise, but
they all rely on the ability to generate code that iterates over the
points in these new iteration spaces in the appropriate order. This
problem has been fairly well-studied in the case where all statements use
the same mapping. We have developed an algorithm for the less
well-studied case where each statement uses a potentially different
mapping. Unlike many other approaches, our algorithm can also generate
code from mappings corresponding to loop blocking. We address the
important trade-off between reducing control overhead and duplicating
code.
(Also cross-referenced as UMIACS-TR-94-87.1)
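The requirement that generated code visit points "in the appropriate order" when each statement has its own mapping can be made concrete with a small reference enumerator. This is only a sketch under stated assumptions, not the algorithm from the report: it handles one-dimensional loops, collects every statement instance explicitly, and sorts them by their mapped coordinates. The function `reference_order` and the statement names `S1` and `S2` are hypothetical. A real code generator would emit loops and guards that reproduce this order without materializing the points.

```python
def reference_order(statements, n):
    """Return statement instances in the order the transformed code must run them.

    `statements` maps a statement name to its own mapping from the original
    iteration i to a tuple in the new iteration space. Instances are ordered
    lexicographically by their mapped tuples.
    """
    instances = []
    for name, mapping in statements.items():
        for i in range(n):
            instances.append((mapping(i), name, i))
    instances.sort()
    return [(name, i) for _, name, i in instances]


# S1 keeps its iteration; S2 is shifted one iteration later and placed after
# S1 within each new iteration (the second tuple component).
statements = {
    "S1": lambda i: (i, 0),
    "S2": lambda i: (i + 1, 1),
}
order = reference_order(statements, 3)
```

Here the first and last new iterations contain only one of the two statements, which is where the trade-off mentioned in the abstract shows up: the generated code can either guard each statement with a conditional inside one loop (more control overhead) or peel those iterations into separate copies of the body (more duplicated code).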
Ernst Denert Award for Software Engineering 2019
This open access book provides an overview of the dissertations of the five nominees for the Ernst Denert Award for Software Engineering in 2019. The prize, kindly sponsored by the Gerlind & Ernst Denert Stiftung, is awarded for excellent work within the discipline of Software Engineering, which includes methods, tools and procedures for better and more efficient development of high quality software. An essential requirement for the nominated work is its applicability and usability in industrial practice. The book contains five papers describing the works by Sebastian Baltes (U Trier) on Software Developers’ Work Habits and Expertise, Timo Greifenberg’s thesis on Artefaktbasierte Analyse modellgetriebener Softwareentwicklungsprojekte, Marco Konersmann’s (U Duisburg-Essen) work on Explicitly Integrated Architecture, Marija Selakovic’s (TU Darmstadt) research about Actionable Program Analyses for Improving Software Performance, and Johannes Späth’s (Paderborn U) thesis on Synchronized Pushdown Systems for Pointer and Data-Flow Analysis – which actually won the award. The chapters describe key findings of the respective works, show their relevance and applicability to practice and industrial software engineering projects, and provide additional information and findings that have only been discovered afterwards, e.g. when applying the results in industry. This way, the book is not only interesting to other researchers, but also to industrial software professionals who would like to learn about the application of state-of-the-art methods in their daily work.
A theoretical foundation for program transformations to reduce cache thrashing due to true data sharing
Cache thrashing due to true data sharing can degrade the performance of parallel programs significantly. Our previous work showed that parallel task alignment via program transformations can be quite effective for the reduction of such cache thrashing. In this paper, we present a theoretical foundation for such program transformations. Based on linear algebra and the theory of numbers, our work analyzes the data dependences among the tasks created by a fork-join parallel program and determines at compile time how these tasks should be assigned to processors in order to reduce cache thrashing due to true data sharing. Our analysis and program transformations can be easily performed by compilers for parallel computers.
Toward an architecture for quantum programming
It is becoming increasingly clear that, if a useful device for quantum
computation is ever built, it will be embodied by a classical computing
machine with control over a truly quantum subsystem, this apparatus performing
a mixture of classical and quantum computation.
This paper investigates a possible approach to the problem of programming
such machines: a template high level quantum language is presented which
complements a generic general purpose classical language with a set of quantum
primitives. The underlying scheme involves a run-time environment which
calculates the byte-code for the quantum operations and pipes it to a quantum
device controller or to a simulator.
This language can compactly express existing quantum algorithms and reduce
them to sequences of elementary operations; it also easily lends itself to
automatic, hardware independent, circuit simplification. A publicly available
preliminary implementation of the proposed ideas has been realized using the
C++ language.
Comment: 23 pages, 5 figures, A4paper. Final version accepted by EJPD ("swap"
replaced by "invert" for Qops). Preliminary implementation available at:
http://sra.itc.it/people/serafini/quantum-computing/qlang.htm
A Review of Analog Audio Scrambling Methods for Residual Intelligibility
This paper reviews the techniques available in different categories of audio scrambling schemes with respect to residual intelligibility. According to Shannon's secure communication theory, for the residual intelligibility to be zero the scrambled signal must represent a white signal. Thus a scrambling scheme that has zero residual intelligibility is said to be highly secure. Many analog audio scrambling algorithms that aim to achieve lower levels of residual intelligibility are available. The paper reviews the existing analog audio scrambling algorithms proposed so far, along with their properties and limitations. Its aim is to provide insight for evaluating the various analog audio scrambling schemes available to date. The review shows that the algorithms have their strengths and weaknesses and that no algorithm satisfies all the factors to the maximum extent. Keywords: residual intelligibility, audio scrambling, speech scrambling