140 research outputs found
CAREER: Automated software understanding for retargeting embedded image processing software for data parallel execution
Issued as final report. National Science Foundation (U.S.)
Automatically Leveraging MapReduce Frameworks for Data-Intensive Applications
MapReduce is a popular programming paradigm for developing large-scale,
data-intensive computation. Many frameworks that implement this paradigm have
recently been developed. To leverage these frameworks, however, developers must
become familiar with their APIs and rewrite existing code. Casper is a new tool
that automatically translates sequential Java programs into the MapReduce
paradigm. Casper identifies potential code fragments to rewrite and translates
them in two steps: (1) Casper uses program synthesis to search for a program
summary (i.e., a functional specification) of each code fragment. The summary
is expressed using a high-level intermediate language resembling the MapReduce
paradigm and verified to be semantically equivalent to the original using a
theorem prover. (2) Casper generates executable code from the summary, using
either the Hadoop, Spark, or Flink API. We evaluated Casper by automatically
converting real-world, sequential Java benchmarks to MapReduce. The resulting
benchmarks perform up to 48.2x faster than the originals.
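The abstract gives no code, but as a loose sketch of the kind of rewrite Casper automates (the summation loop, class names, and the choice of the Spark Java API here are illustrative assumptions, not taken from the paper), a sequential fragment and a hand-written MapReduce-style equivalent might look like this:

    import java.util.Arrays;
    import java.util.List;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    public class SumExample {
        // Sequential fragment: the kind of loop Casper would summarise.
        static long sumSequential(List<Integer> data) {
            long sum = 0;
            for (int x : data) {
                sum += x;
            }
            return sum;
        }

        // MapReduce-style equivalent using the Spark Java API:
        // map each element, then reduce with an associative, commutative operator.
        static long sumSpark(JavaSparkContext sc, List<Integer> data) {
            JavaRDD<Long> rdd = sc.parallelize(data).map(x -> (long) x);
            return rdd.reduce((a, b) -> a + b);
        }

        public static void main(String[] args) {
            List<Integer> data = Arrays.asList(1, 2, 3, 4, 5);
            SparkConf conf = new SparkConf().setAppName("SumExample").setMaster("local[*]");
            try (JavaSparkContext sc = new JavaSparkContext(conf)) {
                System.out.println(sumSequential(data)); // 15
                System.out.println(sumSpark(sc, data));  // 15
            }
        }
    }

Casper's contribution is performing such a rewrite automatically: it first synthesises and verifies a functional summary of the loop, then emits code along these lines against the Hadoop, Spark, or Flink API.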
Vector Operation Support for Transport Triggered Architectures
High performance and low power consumption requirements usually constrain the design of embedded processors. Traditional design solutions no longer meet today's requirements; instead, varying levels of parallelism must be exploited. To reduce design time and effort, a powerful toolset is needed to design new parallel processors effectively.
The TTA-based Co-design Environment (TCE) is a toolset developed at Tampere University of Technology for designing customized parallel processors. It is based on a modular Transport Triggered Architecture (TTA) processor template, which is easy to customize and allows instruction-level parallelism to be exploited for high-performance execution.
The Single Instruction, Multiple Data (SIMD) paradigm provides data-level parallel vector computation for many embedded applications. It is one of the most common ways to exploit parallelism in today's processor designs, improving execution efficiency and thereby helping to meet performance requirements (a minimal sketch of the idea follows this abstract).
This work describes how data-level parallel SIMD support was introduced and integrated into the TCE design flow to broaden its parallelism support. The support allows designers to customize and program processors with wide vector operations. The work presents the required modification points along with the new tools added to the toolset. Particular attention is given to the retargetable compiler, which must adapt to all resources of a TTA machine. The added tools were designed to behave as automatically as possible in order to keep the design flow effective. In addition, the thesis presents how the modifications and new features were verified.
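As a language-neutral illustration of the SIMD idea above (one instruction applied to whole vectors of data), here is a minimal sketch using Java's incubating Vector API (JDK 16+, run with --add-modules jdk.incubator.vector); it only illustrates the paradigm and is not TCE or TTA code:

    import jdk.incubator.vector.FloatVector;
    import jdk.incubator.vector.VectorSpecies;

    public class SimdAdd {
        static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_PREFERRED;

        // c[i] = a[i] + b[i], processing SPECIES.length() lanes per operation.
        static void add(float[] a, float[] b, float[] c) {
            int i = 0;
            int upper = SPECIES.loopBound(a.length);
            for (; i < upper; i += SPECIES.length()) {
                FloatVector va = FloatVector.fromArray(SPECIES, a, i);
                FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
                va.add(vb).intoArray(c, i);
            }
            for (; i < a.length; i++) {   // scalar tail for leftover elements
                c[i] = a[i] + b[i];
            }
        }
    }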
Analysis and transformation of legacy code
Hardware evolves faster than software. While a hardware system might need replacement every one to five years, the average lifespan of a software system is a decade, with some instances living for several decades. Inevitably, code outlives the platform it was developed for and may become legacy: development of the software stops, but maintenance has to continue to keep up with the evolving ecosystem. No new features are added, but the software is still used to fulfil its original purpose. Even where it is still functional (which discourages its replacement), legacy code is inefficient, costly to maintain, and a security risk.
This thesis proposes methods to leverage the expertise invested in the development of legacy code and to extend its useful lifespan, rather than throwing it away. A novel methodology is proposed for automatically exploiting platform-specific optimisations when retargeting a program to another platform. The key idea is to leverage the optimisation information embedded in vector-processing intrinsic functions. The performance of the resulting code is shown to be close to that of manually retargeted programs, but without the manual labour.
Building on this, the thesis investigates how to discover optimisation information when there are no hints in the form of intrinsics or annotations. It postulates that such information can be extracted by profiling the data flow during executions of the program. A context-aware data dependence profiling system is described, detailing aspects previously overlooked in related research. The system is shown to be essential for going beyond what can be inferred statically, in particular about loop iterators.
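The thesis describes a full instrumentation-based system; purely as a toy sketch of the underlying idea (all names here are invented), a dependence profiler can record which iteration last wrote each memory location and report a loop-carried flow dependence whenever a later iteration reads it:

    import java.util.HashMap;
    import java.util.Map;

    // Toy data dependence profiler: for each array index it remembers the
    // iteration that last wrote it, and reports a loop-carried flow
    // (read-after-write) dependence when a later iteration reads that index.
    public class DepProfiler {
        private final Map<Integer, Integer> lastWriter = new HashMap<>();

        void recordWrite(int index, int iteration) {
            lastWriter.put(index, iteration);
        }

        void recordRead(int index, int iteration) {
            Integer writer = lastWriter.get(index);
            if (writer != null && writer < iteration) {
                System.out.printf("flow dependence: iteration %d -> %d on a[%d]%n",
                                  writer, iteration, index);
            }
        }

        public static void main(String[] args) {
            DepProfiler profiler = new DepProfiler();
            int[] a = new int[8];
            a[0] = 1;
            for (int i = 1; i < a.length; i++) {
                profiler.recordRead(i - 1, i);   // instrumented read of a[i-1]
                a[i] = a[i - 1] + 1;
                profiler.recordWrite(i, i);      // instrumented write of a[i]
            }
        }
    }

The system in the thesis is context-aware, i.e. it also distinguishes the calling context in which each access occurs; the sketch above omits that.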
Loop iterators are the controlling part of a loop. This thesis describes and evaluates a system for extracting the loop iterators of a program. It is found to significantly outperform previously known techniques and to further increase the amount of information about a program's structure that is available to a compiler. Combining this system with data dependence profiling improves its results further. Loop iterator recognition enables other code-modernising techniques, such as source code rejuvenation and commutativity analysis. The former increases the use of idiomatic code and thereby the maintainability of the program; the latter can drive parallelisation and thus dramatically improve runtime performance.
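As a toy illustration of what iterator recognition exposes (the example is invented, not taken from the thesis), consider a loop whose iterator is a pointer-chasing update rather than a simple counter; separating the iterator statements from the payload makes the traversal explicit, which is what rejuvenation into idiomatic iteration and commutativity analysis of the payload build on:

    // A linked-list node, as often found in legacy-style data structures.
    class Node {
        int value;
        Node next;
        Node(int value, Node next) { this.value = value; this.next = next; }
    }

    public class IteratorRecognition {
        static int sum(Node head) {
            int sum = 0;
            Node cur = head;           // iterator: initialisation
            while (cur != null) {      // iterator: loop condition
                sum += cur.value;      // payload: a reduction over the values
                cur = cur.next;        // iterator: update (pointer chasing)
            }
            return sum;
        }

        public static void main(String[] args) {
            Node list = new Node(1, new Node(2, new Node(3, null)));
            System.out.println(sum(list));   // 6
        }
    }

Once the iterator (cur, with its initialisation, test, and update) is recognised, the remaining payload can be analysed separately; here it is a sum, a commutative reduction, so the loop could in principle be parallelised.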
Simplifying Cyber Foraging for Mobile Devices
Cyber foraging is the transient and opportunistic use of compute servers by mobile devices. The short market life of such devices makes rapid modification of applications for remote execution an important problem. We describe a solution that combines a “little language” for cyber foraging with an adaptive runtime system. We report results from a user study showing that even novice developers are able to successfully modify large, unfamiliar applications in just a few hours. We also show that the quality of novice-modified and expert-modified applications is comparable in most cases.