24,682 research outputs found

    VXA: A Virtual Architecture for Durable Compressed Archives

    Full text link
    Data compression algorithms change frequently, and obsolete decoders do not always run on new hardware and operating systems, threatening the long-term usability of content archived using those algorithms. Re-encoding content into new formats is cumbersome, and highly undesirable when lossy compression is involved. Processor architectures, in contrast, have remained comparatively stable over recent decades. VXA, an archival storage system designed around this observation, archives executable decoders along with the encoded content it stores. VXA decoders run in a specialized virtual machine that implements an OS-independent execution environment based on the standard x86 architecture. The VXA virtual machine strictly limits access to host system services, making decoders safe to run even if an archive contains malicious code. VXA's adoption of a "native" processor architecture instead of type-safe language technology allows reuse of existing "hand-optimized" decoders in C and assembly language, and permits decoders access to performance-enhancing architecture features such as vector processing instructions. The performance cost of VXA's virtualization is typically less than 15% compared with the same decoders running natively. The storage cost of archived decoders, typically 30-130KB each, can be amortized across many archived files sharing the same compression method.Comment: 14 pages, 7 figures, 2 table

    Comparing Tag Scheme Variations Using an Abstract Machine Generator

    Get PDF
    In this paper we study, in the context of a WAM-based abstract machine for Prolog, how variations in the encoding of type information in tagged words and in their associated basic operations impact performance and memory usage. We use a high-level language to specify encodings and the associated operations. An automatic generator constructs both the abstract machine using this encoding and the associated Prolog-to-byte code compiler. Annotations in this language make it possible to impose constraints on the final representation of tagged words, such as the effectively addressable space (fixing, for example, the word size of the target processor /architecture), the layout of the tag and value bits inside the tagged word, and how the basic operations are implemented. We evaluate large number of combinations of the different parameters in two scenarios: a) trying to obtain an optimal general-purpose abstract machine and b) automatically generating a specially-tuned abstract machine for a particular program. We conclude that we are able to automatically generate code featuring all the optimizations present in a hand-written, highly-optimized abstract machine and we canal so obtain emulators with larger addressable space and better performance

    Fast and Lean Immutable Multi-Maps on the JVM based on Heterogeneous Hash-Array Mapped Tries

    Get PDF
    An immutable multi-map is a many-to-many thread-friendly map data structure with expected fast insert and lookup operations. This data structure is used for applications processing graphs or many-to-many relations as applied in static analysis of object-oriented systems. When processing such big data sets the memory overhead of the data structure encoding itself is a memory usage bottleneck. Motivated by reuse and type-safety, libraries for Java, Scala and Clojure typically implement immutable multi-maps by nesting sets as the values with the keys of a trie map. Like this, based on our measurements the expected byte overhead for a sparse multi-map per stored entry adds up to around 65B, which renders it unfeasible to compute with effectively on the JVM. In this paper we propose a general framework for Hash-Array Mapped Tries on the JVM which can store type-heterogeneous keys and values: a Heterogeneous Hash-Array Mapped Trie (HHAMT). Among other applications, this allows for a highly efficient multi-map encoding by (a) not reserving space for empty value sets and (b) inlining the values of singleton sets while maintaining a (c) type-safe API. We detail the necessary encoding and optimizations to mitigate the overhead of storing and retrieving heterogeneous data in a hash-trie. Furthermore, we evaluate HHAMT specifically for the application to multi-maps, comparing them to state-of-the-art encodings of multi-maps in Java, Scala and Clojure. We isolate key differences using microbenchmarks and validate the resulting conclusions on a real world case in static analysis. The new encoding brings the per key-value storage overhead down to 30B: a 2x improvement. With additional inlining of primitive values it reaches a 4x improvement

    The future of computing beyond Moore's Law.

    Get PDF
    Moore's Law is a techno-economic model that has enabled the information technology industry to double the performance and functionality of digital electronics roughly every 2 years within a fixed cost, power and area. Advances in silicon lithography have enabled this exponential miniaturization of electronics, but, as transistors reach atomic scale and fabrication costs continue to rise, the classical technological driver that has underpinned Moore's Law for 50 years is failing and is anticipated to flatten by 2025. This article provides an updated view of what a post-exascale system will look like and the challenges ahead, based on our most recent understanding of technology roadmaps. It also discusses the tapering of historical improvements, and how it affects options available to continue scaling of successors to the first exascale machine. Lastly, this article covers the many different opportunities and strategies available to continue computing performance improvements in the absence of historical technology drivers. This article is part of a discussion meeting issue 'Numerical algorithms for high-performance computational science'

    In defense of the progressive stack: A strategy for prioritizing marginalized voices during in-class discussion

    Get PDF
    Progressive stacking is a strategy for prioritizing in-class contributions that allows marginalized students to speak before non-marginalized students. I argue that this strategy is both pedagogically and ethically defensible. Pedagogically, it provides benefits to all students (e.g., expanded in-class discourse) while providing special benefits (e.g., increased self-efficacy) to marginalized students, helping to address historic educational inequalities. Ethically, I argue that neither marginalized nor non-marginalized students are wronged by such a policy. First, I present a strategy for self-disclosure that reduces the risk of inadvertent, unwanted disclosure while respecting marginalized student autonomy in a manner analogous to accommodations provided under the Americans with Disabilities Act. Second, I argue that non-marginalized students are not wronged because such students are not silenced during discussion and because non-marginalized students benefit from the prioritization of marginalized voices

    Simple optimizing JIT compilation of higher-order dynamic programming languages

    Get PDF
    ImplĂ©menter efficacement les langages de programmation dynamiques demande beaucoup d’effort de dĂ©veloppement. Les compilateurs ne cessent de devenir de plus en plus complexes. Aujourd’hui, ils incluent souvent une phase d’interprĂ©tation, plusieurs phases de compilation, plusieurs reprĂ©sentations intermĂ©diaires et des analyses de code. Toutes ces techniques permettent d’implĂ©menter efficacement un langage de programmation dynamique, mais leur mise en oeuvre est difficile dans un contexte oĂč les ressources de dĂ©veloppement sont limitĂ©es. Nous proposons une nouvelle approche et de nouvelles techniques dynamiques permettant de dĂ©velopper des compilateurs performants pour les langages dynamiques avec de relativement bonnes performances et un faible effort de dĂ©veloppement. Nous prĂ©sentons une approche simple de compilation Ă  la volĂ©e qui permet d’implĂ©menter un langage en une seule phase de compilation, sans transformation vers des reprĂ©sentations intermĂ©diaires. Nous expliquons comment le versionnement de blocs de base, une technique de compilation existante, peut ĂȘtre Ă©tendue, sans effort de dĂ©veloppement significatif, pour fonctionner interprocĂ©duralement avec les langages de programmation d’ordre supĂ©rieur, permettant d’appliquer des optimisations interprocĂ©durales sur ces langages. Nous expliquons Ă©galement comment le versionnement de blocs de base permet de supprimer certaines opĂ©rations utilisĂ©es pour implĂ©menter les langages dynamiques et qui impactent les performances comme les vĂ©rifications de type. Nous expliquons aussi comment les compilateurs peuvent exploiter les reprĂ©sentations dynamiques des valeurs par Tagging et NaN-boxing pour optimiser le code gĂ©nĂ©rĂ© avec peu d’effort de dĂ©veloppement. Nous prĂ©sentons Ă©galement notre expĂ©rience de dĂ©veloppement d’un compilateur Ă  la volĂ©e pour le langage de programmation Scheme, pour montrer que ces techniques permettent effectivement de construire un compilateur avec un effort moins important que les compilateurs actuels et qu’elles permettent de gĂ©nĂ©rer du code efficace, qui rivalise avec les meilleures implĂ©mentations du langage Scheme.Efficiently implementing dynamic programming languages requires a significant development effort. Over the years, compilers have become more complex. Today, they typically include an interpretation phase, several compilation phases, several intermediate representations and code analyses. These techniques allow efficiently implementing these programming languages but are difficult to implement in contexts in which development resources are limited. We propose a new approach and new techniques to build optimizing just-in-time compilers for dynamic languages with relatively good performance and low development effort. We present a simple just-in-time compilation approach to implement a language with a single compilation phase, without the need to use code transformations to intermediate representations. We explain how basic block versioning, an existing compilation technique, can be extended without significant development effort, to work interprocedurally with higherorder programming languages allowing interprocedural optimizations on these languages. We also explain how basic block versioning allows removing operations used to implement dynamic languages that degrade performance, such as type checks, and how compilers can use Tagging and NaN-boxing to optimize the generated code with low development effort. We present our experience of building a JIT compiler using these techniques for the Scheme programming language to show that they indeed allow building compilers with less development effort than other implementations and that they allow generating efficient code that competes with current mature implementations of the Scheme language
    • 

    corecore