24,682 research outputs found
VXA: A Virtual Architecture for Durable Compressed Archives
Data compression algorithms change frequently, and obsolete decoders do not
always run on new hardware and operating systems, threatening the long-term
usability of content archived using those algorithms. Re-encoding content into
new formats is cumbersome, and highly undesirable when lossy compression is
involved. Processor architectures, in contrast, have remained comparatively
stable over recent decades. VXA, an archival storage system designed around
this observation, archives executable decoders along with the encoded content
it stores. VXA decoders run in a specialized virtual machine that implements an
OS-independent execution environment based on the standard x86 architecture.
The VXA virtual machine strictly limits access to host system services, making
decoders safe to run even if an archive contains malicious code. VXA's adoption
of a "native" processor architecture instead of type-safe language technology
allows reuse of existing "hand-optimized" decoders in C and assembly language,
and permits decoders access to performance-enhancing architecture features such
as vector processing instructions. The performance cost of VXA's virtualization
is typically less than 15% compared with the same decoders running natively.
The storage cost of archived decoders, typically 30-130KB each, can be
amortized across many archived files sharing the same compression method.Comment: 14 pages, 7 figures, 2 table
Comparing Tag Scheme Variations Using an Abstract Machine Generator
In this paper we study, in the context of a WAM-based abstract machine for Prolog, how variations in the encoding of type information in tagged words and in their associated basic operations impact performance and memory usage. We use a high-level language to specify encodings and the associated operations. An automatic generator constructs both the abstract machine using this encoding and the associated Prolog-to-byte code compiler. Annotations in this language make it possible to impose constraints on the final representation of tagged words, such as the effectively addressable space (fixing, for example, the word size of the target processor /architecture), the layout of the tag and value bits inside the tagged word, and how the basic operations are implemented. We evaluate large number of combinations of the different parameters in two scenarios: a) trying to obtain an optimal general-purpose abstract machine and b) automatically generating a specially-tuned abstract machine for a particular program. We conclude that we are able to automatically generate code featuring all the optimizations present in a hand-written, highly-optimized abstract machine and we canal so obtain emulators with larger addressable space and better performance
Fast and Lean Immutable Multi-Maps on the JVM based on Heterogeneous Hash-Array Mapped Tries
An immutable multi-map is a many-to-many thread-friendly map data structure
with expected fast insert and lookup operations. This data structure is used
for applications processing graphs or many-to-many relations as applied in
static analysis of object-oriented systems. When processing such big data sets
the memory overhead of the data structure encoding itself is a memory usage
bottleneck. Motivated by reuse and type-safety, libraries for Java, Scala and
Clojure typically implement immutable multi-maps by nesting sets as the values
with the keys of a trie map. Like this, based on our measurements the expected
byte overhead for a sparse multi-map per stored entry adds up to around 65B,
which renders it unfeasible to compute with effectively on the JVM.
In this paper we propose a general framework for Hash-Array Mapped Tries on
the JVM which can store type-heterogeneous keys and values: a Heterogeneous
Hash-Array Mapped Trie (HHAMT). Among other applications, this allows for a
highly efficient multi-map encoding by (a) not reserving space for empty value
sets and (b) inlining the values of singleton sets while maintaining a (c)
type-safe API.
We detail the necessary encoding and optimizations to mitigate the overhead
of storing and retrieving heterogeneous data in a hash-trie. Furthermore, we
evaluate HHAMT specifically for the application to multi-maps, comparing them
to state-of-the-art encodings of multi-maps in Java, Scala and Clojure. We
isolate key differences using microbenchmarks and validate the resulting
conclusions on a real world case in static analysis. The new encoding brings
the per key-value storage overhead down to 30B: a 2x improvement. With
additional inlining of primitive values it reaches a 4x improvement
The future of computing beyond Moore's Law.
Moore's Law is a techno-economic model that has enabled the information technology industry to double the performance and functionality of digital electronics roughly every 2 years within a fixed cost, power and area. Advances in silicon lithography have enabled this exponential miniaturization of electronics, but, as transistors reach atomic scale and fabrication costs continue to rise, the classical technological driver that has underpinned Moore's Law for 50 years is failing and is anticipated to flatten by 2025. This article provides an updated view of what a post-exascale system will look like and the challenges ahead, based on our most recent understanding of technology roadmaps. It also discusses the tapering of historical improvements, and how it affects options available to continue scaling of successors to the first exascale machine. Lastly, this article covers the many different opportunities and strategies available to continue computing performance improvements in the absence of historical technology drivers. This article is part of a discussion meeting issue 'Numerical algorithms for high-performance computational science'
In defense of the progressive stack: A strategy for prioritizing marginalized voices during in-class discussion
Progressive stacking is a strategy for prioritizing in-class contributions that allows marginalized students to speak before non-marginalized students. I argue that this strategy is both pedagogically and ethically defensible. Pedagogically, it provides benefits to all students (e.g., expanded in-class discourse) while providing special benefits (e.g., increased self-efficacy) to marginalized students, helping to address historic educational inequalities. Ethically, I argue that neither marginalized nor non-marginalized students are wronged by such a policy. First, I present a strategy for self-disclosure that reduces the risk of inadvertent, unwanted disclosure while respecting marginalized student autonomy in a manner analogous to accommodations provided under the Americans with Disabilities Act. Second, I argue that non-marginalized students are not wronged because such students are not silenced during discussion and because non-marginalized students benefit from the prioritization of marginalized voices
Simple optimizing JIT compilation of higher-order dynamic programming languages
ImplĂ©menter efficacement les langages de programmation dynamiques demande beaucoup dâeffort de dĂ©veloppement.
Les compilateurs ne cessent de devenir de plus en plus complexes.
Aujourdâhui, ils incluent souvent une phase dâinterprĂ©tation, plusieurs phases de compilation, plusieurs reprĂ©sentations intermĂ©diaires et des analyses de code. Toutes ces techniques permettent dâimplĂ©menter efficacement un langage de programmation dynamique, mais leur mise en oeuvre est difficile dans un contexte oĂč les ressources de dĂ©veloppement sont limitĂ©es.
Nous proposons une nouvelle approche et de nouvelles techniques dynamiques permettant de développer des compilateurs performants pour les langages dynamiques avec de relativement bonnes performances et un faible effort de développement.
Nous prĂ©sentons une approche simple de compilation Ă la volĂ©e qui permet dâimplĂ©menter un langage en une seule phase de compilation, sans transformation vers des reprĂ©sentations intermĂ©diaires.
Nous expliquons comment le versionnement de blocs de base, une technique de compilation existante, peut ĂȘtre Ă©tendue, sans effort de dĂ©veloppement significatif, pour fonctionner interprocĂ©duralement avec les langages de programmation dâordre supĂ©rieur, permettant dâappliquer des optimisations interprocĂ©durales sur ces langages.
Nous expliquons également comment le versionnement de blocs de base permet de supprimer certaines opérations utilisées pour implémenter les langages dynamiques et qui impactent les performances comme les vérifications de type.
Nous expliquons aussi comment les compilateurs peuvent exploiter les reprĂ©sentations dynamiques des valeurs par Tagging et NaN-boxing pour optimiser le code gĂ©nĂ©rĂ© avec peu dâeffort de dĂ©veloppement.
Nous prĂ©sentons Ă©galement notre expĂ©rience de dĂ©veloppement dâun compilateur Ă la volĂ©e pour le langage de programmation Scheme, pour montrer que ces techniques permettent effectivement de construire un compilateur avec un effort moins important que les compilateurs actuels et quâelles permettent de gĂ©nĂ©rer du code efficace, qui rivalise avec les meilleures implĂ©mentations du langage Scheme.Efficiently implementing dynamic programming languages requires a significant development
effort. Over the years, compilers have become more complex. Today, they typically include
an interpretation phase, several compilation phases, several intermediate representations and
code analyses. These techniques allow efficiently implementing these programming languages
but are difficult to implement in contexts in which development resources are limited. We
propose a new approach and new techniques to build optimizing just-in-time compilers for
dynamic languages with relatively good performance and low development effort.
We present a simple just-in-time compilation approach to implement a language with
a single compilation phase, without the need to use code transformations to intermediate
representations. We explain how basic block versioning, an existing compilation technique,
can be extended without significant development effort, to work interprocedurally with higherorder
programming languages allowing interprocedural optimizations on these languages. We
also explain how basic block versioning allows removing operations used to implement dynamic
languages that degrade performance, such as type checks, and how compilers can use Tagging
and NaN-boxing to optimize the generated code with low development effort. We present our
experience of building a JIT compiler using these techniques for the Scheme programming
language to show that they indeed allow building compilers with less development effort
than other implementations and that they allow generating efficient code that competes with
current mature implementations of the Scheme language
- âŠ