Simple and Effective Type Check Removal through Lazy Basic Block Versioning
Dynamically typed programming languages such as JavaScript and Python defer
type checking to run time. In order to maximize performance, dynamic language
VM implementations must attempt to eliminate redundant dynamic type checks.
However, type inference analyses are often costly and involve tradeoffs between
compilation time and resulting precision. This has led to the creation of
increasingly complex multi-tiered VM architectures.
This paper introduces lazy basic block versioning, a simple JIT compilation
technique which effectively removes redundant type checks from critical code
paths. This novel approach lazily generates type-specialized versions of basic
blocks on-the-fly while propagating context-dependent type information. This
does not require the use of costly program analyses, is not restricted by the
precision limitations of traditional type analyses and avoids the
implementation complexity of speculative optimization techniques.
We have implemented intraprocedural lazy basic block versioning in a
JavaScript JIT compiler. This approach is compared with a classical flow-based
type analysis. Lazy basic block versioning performs as well or better on all
benchmarks. On average, 71% of type tests are eliminated, yielding speedups of
up to 50%. We also show that our implementation generates more efficient
machine code than TraceMonkey, a tracing JIT compiler for JavaScript, on
several benchmarks. The combination of implementation simplicity, low
algorithmic complexity and good run time performance makes basic block
versioning attractive for baseline JIT compilers.
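The versioning idea above can be illustrated with a small sketch (plain Python, invented names such as `compile_version`; this is not the paper's implementation): each basic block is compiled on demand for a given typing context, versions are memoized, and any type test whose outcome the context already determines is dropped.

```python
# Illustrative sketch of lazy basic block versioning. A "block" is a list
# of (op, var, type) triples; ("check", v, t) is a dynamic type test.
versions = {}  # (block_id, context) -> specialized op list

def compile_version(block_id, ops, context):
    """Lazily compile `ops` under `context`, a frozenset of (var, type)."""
    key = (block_id, context)
    if key in versions:
        return versions[key]              # reuse the specialized version
    known = dict(context)
    out = []
    for op, var, typ in ops:
        if op == "check":
            if known.get(var) == typ:
                continue                  # outcome known: test eliminated
            out.append((op, var, typ))    # emit the test once...
            known[var] = typ              # ...and learn from it on this path
        else:
            out.append((op, var, typ))
    versions[key] = out
    return out

ops = [("check", "x", "int"), ("add", "x", 1), ("check", "x", "int")]
cold = compile_version("b0", ops, frozenset())                     # one test kept
hot = compile_version("b0", ops, frozenset({"x": "int"}.items()))  # all tests gone
```

In the `hot` version, specialized for a context where `x` is already known to be an `int`, both checks disappear; eliminating redundant tests along hot paths is the effect the paper quantifies.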
Automatically Discovering Hidden Transformation Chaining Constraints
Model transformations operate on models conforming to precisely defined
metamodels. Consequently, it often seems relatively easy to chain them: the
output of a transformation may be given as input to a second one if metamodels
match. However, this simple rule has some obvious limitations. For instance, a
transformation may only use a subset of a metamodel. Therefore, chaining
transformations appropriately requires more information. We present here an
approach that automatically discovers more detailed information about actual
chaining constraints by statically analyzing transformations. The objective is
to provide developers who decide to chain transformations with more data on
which to base their choices. This approach has been successfully applied to the
case of a library of endogenous transformations. They all have the same source
and target metamodel but have some hidden chaining constraints. In such a case,
the simple metamodel matching rule given above does not provide any useful
information.
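As an illustration of the kind of constraint such an analysis could expose (hypothetical summaries; the paper's actual analysis works on the transformation code itself): describe each transformation by the element types it reads and writes, and flag a chain as safe only when everything the second transformation reads is produced by the first.

```python
def chainable(producer_writes, consumer_reads):
    """Chaining check stricter than metamodel matching: report the element
    types the consumer needs but the producer never creates."""
    missing = set(consumer_reads) - set(producer_writes)
    return not missing, missing

# Hypothetical summaries that a static analysis might extract:
flatten_writes = {"Class", "Attribute"}
rename_reads = {"Class"}
inline_reads = {"Class", "Operation"}

ok, _ = chainable(flatten_writes, rename_reads)       # safe to chain
bad, missing = chainable(flatten_writes, inline_reads)  # missing "Operation"
```

Both pairs "match" at the metamodel level; only the read/write summaries reveal that the second chain is unsafe.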
Energy-efficient and high-performance lock speculation hardware for embedded multicore systems
Embedded systems are becoming increasingly common in everyday life and like their general-purpose counterparts, they have shifted towards shared memory multicore architectures. However, they are much more resource constrained, and as they often run on batteries, energy efficiency becomes critically important. In such systems, achieving high concurrency is a key demand for delivering satisfactory performance at low energy cost. In order to achieve this high concurrency, consistency across the shared memory hierarchy must be accomplished in a cost-effective manner in terms of performance, energy, and implementation complexity. In this article, we propose Embedded-Spec, a hardware solution for supporting transparent lock speculation, without the requirement for special supporting instructions. Using this approach, we evaluate the energy consumption and performance of a suite of benchmarks, exploring a range of contention management and retry policies. We conclude that for resource-constrained platforms, lock speculation can provide real benefits in terms of improved concurrency and energy efficiency, as long as the underlying hardware support is carefully configured. This work is supported in part by NSF under Grants CCF-0903384, CCF-0903295, CNS-1319495, and CNS-1319095, as well as by the Semiconductor Research Corporation under grant number 1983.001.
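The retry-policy dimension explored above can be sketched as a software analogy (invented names; Embedded-Spec itself is a hardware mechanism): a critical section is first attempted speculatively without taking the lock, and after a bounded number of detected conflicts the thread falls back to a real lock acquisition.

```python
def run_critical(section, conflict_detected, acquire_lock, max_retries=2):
    """One possible retry policy: speculate up to `max_retries` times,
    then fall back to acquiring the lock for real."""
    for _ in range(max_retries):
        result = section()                 # execute without taking the lock
        if not conflict_detected():
            return result, "speculative"   # commit: no memory conflict seen
    acquire_lock()                         # give up on speculation
    return section(), "locked"
```

Tuning `max_retries` (and the conflict-detection policy behind `conflict_detected`) is exactly the kind of configuration choice the article evaluates for its energy/performance trade-off.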
The resource-constrained shortest path problem implemented in a lazy functional language
The resource-constrained shortest path problem is an NP-hard problem for which many ingenious algorithms have been developed. These algorithms are usually implemented in FORTRAN or another imperative programming language. We have implemented some of the simpler algorithms in a lazy functional language. Benefits accrue in the software engineering of the implementations. Our implementations have been applied to a standard benchmark of data files, which is available from the Operational Research Library of Imperial College, London. The performance of the lazy functional implementations, even with the comparatively simple algorithms that we have used, is competitive with a reference FORTRAN implementation.
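One of the simpler algorithm families for this problem is label setting with dominance pruning, sketched below in Python rather than a lazy functional language to keep the example compact (the abstract does not say which algorithms were implemented, so this is only representative).

```python
from heapq import heappush, heappop

def rcsp(graph, src, dst, budget):
    """Cheapest src->dst path whose total resource use stays within `budget`.
    graph: {u: [(v, cost, resource), ...]}. A label (cost, resource) at a
    node is kept only if no existing label there is at least as good in
    both components (dominance pruning)."""
    labels = {src: [(0, 0)]}
    heap = [(0, 0, src)]
    best = None
    while heap:
        cost, res, u = heappop(heap)
        if u == dst:
            best = cost if best is None else min(best, cost)
            continue
        for v, c, r in graph.get(u, []):
            nc, nr = cost + c, res + r
            if nr > budget:
                continue                      # violates the resource constraint
            if any(oc <= nc and orr <= nr for oc, orr in labels.get(v, [])):
                continue                      # dominated by an existing label
            labels.setdefault(v, []).append((nc, nr))
            heappush(heap, (nc, nr, v))
    return best
```

With a tight budget the cheap-but-resource-hungry path is excluded and a costlier feasible path wins, which is the essential behaviour the benchmark instances exercise.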
The second law and beyond in microscopic quantum setups
The Clausius inequality (CI) is one of the most versatile forms of the second
law. Although it was originally conceived for macroscopic steam engines, it is
also applicable to quantum single particle machines. Moreover, the CI is the
main connecting thread between classical microscopic thermodynamics and
nanoscopic quantum thermodynamics. In this chapter, we study three different
approaches for obtaining the CI. Each approach shows different aspects of the
CI. The goals of this chapter are: (i) To show the exact assumptions made in
various derivations of the CI. (ii) To elucidate the structure of the second
law and its origin. (iii) To discuss the possibilities each approach offers for
finding additional second-law-like inequalities. (iv) To pose challenges
related to the second law in nanoscopic setups. In particular, we introduce and
briefly discuss the notions of exotic heat machines (X machines), and "lazy
demons".
Comment: As a chapter of: F. Binder, L. A. Correa, C. Gogolin, J. Anders, and
G. Adesso (eds.), "Thermodynamics in the quantum regime - Recent Progress and
Outlook" (Springer International Publishing). v1 does not include references
to other book chapters.
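For reference, the standard textbook statement of the Clausius inequality discussed above, for a cyclic or non-cyclic process in which the system absorbs heat $\delta Q$ from surroundings at temperature $T$, is:

```latex
\Delta S \;\ge\; \int \frac{\delta Q}{T},
```

with equality in the reversible limit; the chapter's three derivations recover this form under different sets of assumptions.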
Structural Optimisation: Biomechanics of the Femur
A preliminary iterative 3D meso-scale structural model of the femur was
developed, in which bar and shell elements were used to represent trabecular
and cortical bone respectively. The cross-sectional areas of the bar elements
and the thickness values of the shell elements were adjusted over successive
iterations of the model based on a target strain stimulus, resulting in an
optimised construct. The predicted trabecular architecture, and cortical
thickness distribution showed good agreement with clinical observations, based
on the application of a single leg stance load case during gait. The benefit of
using a meso-scale structural approach in comparison to micro or macro-scale
continuum approaches to predictive bone modelling was achievement of the
symbiotic goals of computational efficiency and structural description of the
femur.
Comment: Accepted by Engineering and Computational Mechanics (Proceedings of
the ICE).
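The iterative sizing loop described above can be sketched as follows (hypothetical names; the linear strain surrogate `load / area` stands in for the paper's finite-element solve): each element's cross-sectional area is scaled until its strain matches the target stimulus.

```python
def optimise(areas, loads, target_strain, iters=50, tol=1e-6):
    """Scale each bar's area toward the target strain stimulus, iterating
    until the areas stop changing (a toy stand-in for the meso-scale
    model's update of bar areas and shell thicknesses)."""
    for _ in range(iters):
        new = []
        for a, f in zip(areas, loads):
            strain = f / a                        # surrogate for FE strain
            new.append(max(a * strain / target_strain, 1e-9))
        if all(abs(n - a) <= tol * a for n, a in zip(new, areas)):
            break                                 # converged construct
        areas = new
    return areas
```

Under this surrogate, over-strained elements grow and under-strained elements shrink, which is the qualitative mechanism that produces the trabecular and cortical distributions reported.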
Managing Communication Latency-Hiding at Runtime for Parallel Programming Languages and Libraries
This work introduces a runtime model for managing communication with support
for latency-hiding. The model enables non-computer science researchers to
exploit communication latency-hiding techniques seamlessly. For compiled
languages, it is often possible to create efficient schedules for
communication, but this is not the case for interpreted languages. By
maintaining data dependencies between scheduled operations, it is possible to
aggressively initiate communication and lazily evaluate tasks to allow maximal
time for the communication to finish before entering a wait state. We implement
a heuristic of this model in DistNumPy, an auto-parallelizing version of
numerical Python that allows sequential NumPy programs to run on distributed
memory architectures. Furthermore, we present performance comparisons for eight
benchmarks with and without automatic latency-hiding. The results show that
our model reduces the time spent on waiting for communication as much as 27
times, from a maximum of 54% to only 2% of the total execution time, in a
stencil application.
Comment: Preprint.
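The scheduling idea above can be illustrated with a toy model (invented names; this is not DistNumPy's API): communication is initiated eagerly and non-blocking, independent tasks are evaluated while it is in flight, and a wait is issued only when a task actually needs the communicated data.

```python
log = []  # records the order in which communication and compute happen

class Comm:
    """Non-blocking communication handle: started at creation, waited on
    only when a dependent task needs its data."""
    def __init__(self, name):
        self.name, self.done = name, False
        log.append(f"start {self.name}")     # initiate immediately
    def wait(self):
        if not self.done:
            log.append(f"wait {self.name}")  # block only here, if at all
            self.done = True

def run(tasks):
    # Start every communication up front, then run compute tasks lazily,
    # blocking only on the communications each task depends on.
    comms = {t["name"]: Comm(t["name"]) for t in tasks if t["kind"] == "comm"}
    for t in tasks:
        if t["kind"] == "compute":
            for dep in t["deps"]:
                comms[dep].wait()
            log.append(f"run {t['name']}")

tasks = [
    {"kind": "comm", "name": "halo", "deps": []},
    {"kind": "compute", "name": "interior", "deps": []},     # overlaps the comm
    {"kind": "compute", "name": "boundary", "deps": ["halo"]},
]
run(tasks)
```

The interior computation runs between the start of the halo exchange and the wait on it, so communication latency is hidden behind useful work, which is the effect measured on the stencil benchmark.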
The significance of the reflective practitioner in blended learning
This is a case study paper concerned with the introduction of blended learning on a part-time higher education programme for mature students. The interpretive work draws on four action research cycles conducted over two years with two student cohorts. Discussion is based on observations, staff and student focus groups and interviews examining the students' expectations and experiences.
The initial focus of the action research was on the introduction of technology into the teaching and learning experience. The great advantage of an interpretive approach, however, is to allow the findings to determine the course of the research. During the first action research cycles, the focus of the research soon changed from the use of technology in blended learning to the role of the practitioners involved. We advocate the key role of reflective practitioners in facilitating blended learning and suggest that action research is a useful framework to develop this.
Keywords: Educational Technology Implementation, Electronic Learning (E-Learning), Teacher Improvement, Teacher Preparation, Action Research, Asynchronous Communications, Blended Learning, Part-Time Student, Reflective Practice