1,875 research outputs found
Proxy compilation for Java via a code migration technique
There is an increasing trend that intermediate representations (IRs) are used to deliver programs in more and more languages, such as Java. Although Java can provide many advantages, including a wider portability and better optimisation opportunities on execution, it introduces extra overhead by requiring an IR translation for the program execution. For maximum execution performance, an optimising compiler is placed in the runtime to selectively optimise code regions regarded as “hotspots”. This common approach has been effectively deployed in many implementation of programming languages. However, the computational resources demanded by this approach made it less efficient, or even difficult to deploy directly in a resourceconstrained environment. One implementation approach is to use a remote compilation technique to support compilation during the execution. The work presented in this dissertation supports the thesis that execution performance can be improved by the use of efficient optimising compilation by using a proxy dynamic optimising compiler. After surveying various approaches to the design and implementation of remote compilation, a proxy compilation system called Apus is defined. To demonstrate the effectiveness of using a dynamic optimising compiler as a proxy compiler, a complete proxy compilation system is written based on a research-oriented Java VirtualMachine (JVM). The proxy compilation system is discussed in detail, showing how to deliver remote binaries and manage a cache of binaries by using a code migration approach. The proxy compilation client shows how the proxy compilation service is integrated with the selective optimisation system to maximise execution performance. The results of empirical measurements of the system are given, showing the efficiency of code optimisation from either the proxy compilation service and a local binary cache. The conclusion of this work is that Java execution performance can be improved by efficient optimising compilation with a proxy compilation service by using a code migration technique
Efficient Customizable Middleware
The rather large feature set of current Distributed Object Computing (DOC) middleware can be a liability for certain applications which have a need for only a certain subset of these features but have to suffer performance degradation and code bloat due to all the present features. To address this concern, a unique approach to building fully customizable middleware was undertaken in FACET, a CORBA event channel written using AspectJ. FACET consists of a small, essential core that represents the basic structure and functionality of an event channel into which additional features are woven using aspects so that the resulting event channel supports all of the features needed by a given embedded application. However, the use of CORBA as the underlying transport mechanism may make FACET unsuitable for use in small-scale embedded systems because of the considerable footprint of many ORBs. In this thesis, we describe how the use of CORBA in the event channel can be made an optional feature in building highly efficient middle-ware. We look at the challenges that arise in abstracting the method invocation layer, document design patterns discovered and present quantitative footprint, throughput performance data and analysis. We also examine the problem of integrating FACET, written in Java, into the Boeing Open Experimental Platform (OEP), written in C++, in order to serve as a replacement for the TAO Real-Time Event Channel (RTEC). We evaluate the available alternatives in building such an implementation for efficiency, describe our use of a native-code compiler for Java, gcj, and present data on the efficacy of this approach. Finally, we take preliminary look into the problem of efficiently testing middleware with a large number of highly granular features. Since the number of possible combinations grow exponentially, building and testing all possible combinations quickly becomes impractical. To address this, we examine the conditions under which features are non-interfering. Non-interfering features will only need to be tested in isolation removing the need to test features in combination thus reducing the intractability of the problem
Micro Virtual Machines: A Solid Foundation for Managed Language Implementation
Today new programming languages proliferate, but many of them
suffer from
poor performance and inscrutable semantics. We assert that the
root of
many of the performance and semantic problems of today's
languages is
that language implementation is extremely difficult. This
thesis
addresses the fundamental challenges of efficiently developing
high-level
managed languages.
Modern high-level languages provide abstractions over execution,
memory
management and concurrency. It requires enormous intellectual
capability
and engineering effort to properly manage these concerns.
Lacking such
resources, developers usually choose naive implementation
approaches
in the early stages of language design, a strategy which too
often has
long-term consequences, hindering the future development of the
language. Existing language development platforms have failed
to
provide the right level of abstraction, and forced implementers
to
reinvent low-level mechanisms in order to obtain performance.
My thesis is that the introduction of micro virtual machines will
allow
the development of higher-quality, high-performance managed
languages.
The first contribution of this thesis is the design of Mu, with
the
specification of Mu as the main outcome. Mu is
the first micro virtual machine, a robust, performant, and
light-weight
abstraction over just three concerns: execution, concurrency and
garbage
collection. Such a foundation attacks three of the most
fundamental and
challenging issues that face existing language designs and
implementations, leaving the language implementers free to focus
on the
higher levels of their language design.
The second contribution is an in-depth analysis of on-stack
replacement
and its efficient implementation. This low-level mechanism
underpins
run-time feedback-directed optimisation, which is key to the
efficient
implementation of dynamic languages.
The third contribution is demonstrating the viability of Mu
through
RPython, a real-world non-trivial language implementation. We
also did
some preliminary research of GHC as a Mu client.
We have created the Mu specification and its reference
implementation,
both of which are open-source. We show that that Mu's on-stack
replacement API can gracefully support dynamic languages such as
JavaScript, and it is implementable on concrete hardware. Our
RPython
client has been able to translate and execute non-trivial
RPython
programs, and can run the RPySOM interpreter and the core of the
PyPy
interpreter.
With micro virtual machines providing a low-level substrate,
language
developers now have the option to build their next language on a
micro
virtual machine. We believe that the quality of programming
languages
will be improved as a result
Effective memory management for mobile environments
Smartphones, tablets, and other mobile devices exhibit vastly different constraints compared to regular or classic computing environments like desktops, laptops, or servers. Mobile devices run dozens of so-called “apps” hosted by independent virtual machines (VM). All these VMs run concurrently and each VM deploys purely local heuristics to organize resources like memory, performance, and power. Such a design causes conflicts across all layers of the software stack, calling for the evaluation of VMs and the optimization techniques specific for mobile frameworks.
In this dissertation, we study the design of managed runtime systems for mobile platforms. More specifically, we deepen the understanding of interactions between garbage collection (GC) and system layers. We develop tools to monitor the memory behavior of Android-based apps and to characterize GC performance, leading to the development of new techniques for memory management that address energy constraints, time performance, and responsiveness.
We implement a GC-aware frequency scaling governor for Android devices. We also explore the tradeoffs of power and performance in vivo for a range of realistic GC variants, with established benchmarks and real applications running on Android virtual machines. We control for variation due to dynamic voltage and frequency scaling (DVFS), Just-in-time (JIT) compilation, and across established dimensions of heap memory size and concurrency. Finally, we provision GC as a global service that collects statistics from all running VMs and then makes an informed decision that optimizes across all them (and not just locally), and across all layers of the stack.
Our evaluation illustrates the power of such a central coordination service and garbage collection mechanism in improving memory utilization, throughput, and adaptability to user activities. In fact, our techniques aim at a sweet spot, where total on-chip energy is reduced (20–30%) with minimal impact on throughput and responsiveness (5–10%). The simplicity and efficacy of our approach reaches well beyond the usual optimization techniques
A Language and Hardware Independent Approach to Quantum-Classical Computing
Heterogeneous high-performance computing (HPC) systems offer novel
architectures which accelerate specific workloads through judicious use of
specialized coprocessors. A promising architectural approach for future
scientific computations is provided by heterogeneous HPC systems integrating
quantum processing units (QPUs). To this end, we present XACC (eXtreme-scale
ACCelerator) --- a programming model and software framework that enables
quantum acceleration within standard or HPC software workflows. XACC follows a
coprocessor machine model that is independent of the underlying quantum
computing hardware, thereby enabling quantum programs to be defined and
executed on a variety of QPUs types through a unified application programming
interface. Moreover, XACC defines a polymorphic low-level intermediate
representation, and an extensible compiler frontend that enables language
independent quantum programming, thus promoting integration and
interoperability across the quantum programming landscape. In this work we
define the software architecture enabling our hardware and language independent
approach, and demonstrate its usefulness across a range of quantum computing
models through illustrative examples involving the compilation and execution of
gate and annealing-based quantum programs
Acceleration and semantic foundations of embedded Java platforms
Tableau d'honneur de la Faculté des études supérieures et postdoctorales, 2006-200
Life of occam-Pi
This paper considers some questions prompted by a brief review of the history of computing. Why is programming so hard? Why is concurrency considered an “advanced” subject? What’s the matter with Objects? Where did all the Maths go? In searching for answers, the paper looks at some concerns over fundamental ideas within object orientation (as represented by modern programming languages), before focussing on the concurrency model of communicating processes and its particular expression in the occam family of languages. In that focus, it looks at the history of occam, its underlying philosophy (Ockham’s Razor), its semantic foundation on Hoare’s CSP, its principles of process oriented design and its development over almost three decades into occam-? (which blends in the concurrency dynamics of Milner’s ?-calculus). Also presented will be an urgent need for rationalisation – occam-? is an experiment that has demonstrated significant results, but now needs time to be spent on careful review and implementing the conclusions of that review. Finally, the future is considered. In particular, is there a future
- …