    Fast and Lean Immutable Multi-Maps on the JVM based on Heterogeneous Hash-Array Mapped Tries

    An immutable multi-map is a many-to-many, thread-friendly map data structure with expected fast insert and lookup operations. This data structure is used for applications processing graphs or many-to-many relations, as applied in static analysis of object-oriented systems. When processing such big data sets, the memory overhead of the data structure encoding itself is a memory usage bottleneck. Motivated by reuse and type-safety, libraries for Java, Scala and Clojure typically implement immutable multi-maps by nesting sets as the values with the keys of a trie map. With this encoding, our measurements show that the expected byte overhead per stored entry of a sparse multi-map adds up to around 65B, which makes it infeasible to compute with effectively on the JVM. In this paper we propose a general framework for Hash-Array Mapped Tries on the JVM which can store type-heterogeneous keys and values: a Heterogeneous Hash-Array Mapped Trie (HHAMT). Among other applications, this allows for a highly efficient multi-map encoding by (a) not reserving space for empty value sets and (b) inlining the values of singleton sets, while (c) maintaining a type-safe API. We detail the necessary encoding and optimizations to mitigate the overhead of storing and retrieving heterogeneous data in a hash-trie. Furthermore, we evaluate HHAMT specifically for the application to multi-maps, comparing them to state-of-the-art encodings of multi-maps in Java, Scala and Clojure. We isolate key differences using microbenchmarks and validate the resulting conclusions on a real-world case in static analysis. The new encoding brings the per key-value storage overhead down to 30B: a 2x improvement. With additional inlining of primitive values it reaches a 4x improvement.
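
    The inlining idea in points (a) to (c) can be illustrated with a small sketch. This is not the HHAMT encoding from the paper: it is a mutable, HashMap-backed stand-in, and the names InliningMultiMap, Many and isInlined are hypothetical. It only shows how a slot can hold either one inlined value or a nested set, while callers still see a typed Set<V>.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

/** Illustrative only: mutable and HashMap-backed, not a persistent hash trie. */
final class InliningMultiMap<K, V> {
    /** Private wrapper so a nested set is never confused with a user value. */
    private static final class Many {
        final Set<Object> values = new LinkedHashSet<>();
    }

    // Each slot is either a single V (inlined singleton) or a Many with >= 2 values.
    // Keys without values have no slot at all, so empty sets cost nothing.
    private final Map<K, Object> slots = new HashMap<>();

    void put(K key, V value) {
        Objects.requireNonNull(value, "null values are not supported in this sketch");
        Object slot = slots.get(key);
        if (slot == null) {
            slots.put(key, value);                 // first value: store it inline
        } else if (slot instanceof Many) {
            ((Many) slot).values.add(value);       // already a nested set
        } else if (!slot.equals(value)) {
            Many many = new Many();                // second distinct value: promote
            many.values.add(slot);
            many.values.add(value);
            slots.put(key, many);
        }
    }

    @SuppressWarnings("unchecked")
    Set<V> get(K key) {
        Object slot = slots.get(key);
        if (slot == null) {
            return Collections.emptySet();
        }
        if (slot instanceof Many) {
            return (Set<V>) (Set<?>) Collections.unmodifiableSet(((Many) slot).values);
        }
        return Collections.singleton((V) slot);
    }

    void remove(K key, V value) {
        Object slot = slots.get(key);
        if (slot == null) {
            return;
        }
        if (slot instanceof Many) {
            Set<Object> values = ((Many) slot).values;
            values.remove(value);
            if (values.size() == 1) {
                slots.put(key, values.iterator().next()); // demote back to inline
            }
        } else if (slot.equals(value)) {
            slots.remove(key);                     // last value gone: drop the slot
        }
    }

    /** True if the key currently maps to exactly one inlined value. */
    boolean isInlined(K key) {
        Object slot = slots.get(key);
        return slot != null && !(slot instanceof Many);
    }
}
```

    The untyped Object slot is hidden behind the typed put/get/remove methods, which is one simple way to keep heterogeneous storage out of the public API.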

    Prioritized Garbage Collection: Explicit GC Support for Software Caches

    Programmers routinely trade space for time to increase performance, often in the form of caching or memoization. In managed languages like Java or JavaScript, however, this space-time tradeoff is complex. Using more space translates into higher garbage collection costs, especially at the limit of available memory. Existing runtime systems provide limited support for space-sensitive algorithms, forcing programmers into difficult and often brittle choices about provisioning. This paper presents prioritized garbage collection, a cooperative programming language and runtime solution to this problem. Prioritized GC provides an interface similar to soft references, called priority references, which identify objects that the collector can reclaim eagerly if necessary. The key difference is an API for defining the policy that governs when priority references are cleared and in what order. Application code specifies a priority value for each reference and a target memory bound. The collector reclaims references, lowest priority first, until the total memory footprint of the cache fits within the bound. We use this API to implement a space-aware least-recently-used (LRU) cache, called a Sache, that is a drop-in replacement for existing caches, such as Google's Guava library. The garbage collector automatically grows and shrinks the Sache in response to available memory and workload with minimal provisioning information from the programmer. Using a Sache, it is almost impossible for an application to experience a memory leak, memory pressure, or an out-of-memory crash caused by software caching. Comment: to appear in OOPSLA 201
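
    The reclamation policy described above (lowest priority first, until the footprint fits the bound) can be simulated in a few lines. This is a plain-Java sketch under stated assumptions, not the paper's priority-reference API: Entry, evictToBound and the explicit byte sizes are hypothetical, and a real collector would measure object sizes itself.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

final class PriorityEvictionSketch {
    /** Hypothetical stand-in for a priority reference: a name, a priority, a size. */
    static final class Entry {
        final String name;
        final int priority;
        final long bytes;

        Entry(String name, int priority, long bytes) {
            this.name = name;
            this.priority = priority;
            this.bytes = bytes;
        }
    }

    /**
     * Returns the names of the entries that would be reclaimed, lowest priority
     * first, until the remaining total size is at most boundBytes.
     */
    static List<String> evictToBound(List<Entry> entries, long boundBytes) {
        PriorityQueue<Entry> byPriority =
                new PriorityQueue<>(Comparator.comparingInt((Entry e) -> e.priority));
        byPriority.addAll(entries);

        long total = 0;
        for (Entry e : entries) {
            total += e.bytes;
        }

        List<String> evicted = new ArrayList<>();
        while (total > boundBytes && !byPriority.isEmpty()) {
            Entry victim = byPriority.poll();   // lowest remaining priority
            total -= victim.bytes;
            evicted.add(victim.name);
        }
        return evicted;
    }
}
```

    For an LRU policy, the priority could be a last-access timestamp, so that the least recently used entries are reclaimed first.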

    Kevoree Modeling Framework (KMF): Efficient modeling techniques for runtime use

    The creation of Domain Specific Languages (DSLs) counts as one of the main goals in the field of Model-Driven Software Engineering (MDSE). The main purpose of these DSLs is to facilitate the manipulation of domain-specific concepts by providing developers with specific tools for their domain of expertise. A natural approach to creating DSLs is to reuse existing modeling standards and tools. In this area, the Eclipse Modeling Framework (EMF) has rapidly become the de facto standard in MDSE for building DSLs and tools based on generative techniques. However, the use of EMF-generated tools in domains like the Internet of Things (IoT), Cloud Computing or Models@Runtime runs into several limitations. In this paper, we identify several properties the generated tools must comply with to be usable in domains other than desktop-based software systems. We then challenge EMF on these properties and describe our approach to overcoming the limitations. Our approach, implemented in the Kevoree Modeling Framework (KMF), is finally evaluated according to the identified properties and compared to EMF. Comment: ISBN 978-2-87971-131-7; N° TR-SnT-2014-11 (2014

    JavaScript: Bringing Object-Level Security to the Browser

    JavaScript has evolved from a simple language intended to give web browsers basic interaction into a fully featured dynamic language that allows the browser to become an application delivery platform. With innovations such as asynchronous JavaScript and XML (AJAX) and JavaScript Object Notation (JSON), JavaScript has become the de facto standard for creating interactive web applications. With its newfound power and popularity, JavaScript has been the target of many attacks. In this paper, we present a framework that allows programmers to define secure properties of JavaScript objects such that they are more immune to malicious activity and require a smaller footprint than existing solutions. We then apply our framework to an already built JavaScript system to analyze its properties and effectiveness. Unpublished; not peer reviewed

    Implementing a map based simulator for the location API for J2ME

    The Java Location API for J2ME™ integrates generic positioning and orientation data with persistent storage of landmark objects. It can be used to develop location based service applications for small mobile devices, and these applications can be tested using simulation environments. Currently the only simulation tools in the public domain are proprietary mobile device simulators that are driven by GPS data log files, but it is sometimes useful to be able to test location based services using interactive map-based tools. In addition, we may need to experiment with extensions and changes to the standard API to support additional services, requiring an open source environment. In this paper we describe the implementation of an open source map-based simulation tool compatible with other commonly used development and deployment tools.

    Code Generation for Efficient Query Processing in Managed Runtimes

    In this paper we examine opportunities arising from the convergence of two trends in data management: in-memory database systems (IMDBs), which have received renewed attention following the availability of affordable, very large main memory systems; and language-integrated query, which transparently integrates database queries with programming languages (thus addressing the famous ‘impedance mismatch’ problem). Language-integrated query not only gives application developers a more convenient way to query external data sources like IMDBs, but also lets them use the same query language to query an application’s in-memory collections. The latter offers further transparency to developers, as the query language and all data are represented in the data model of the host programming language. However, compared to IMDBs, this additional freedom comes at a higher cost for query evaluation. Our vision is to improve in-memory query processing of application objects by introducing database technologies to managed runtimes. We focus on querying and we leverage query compilation to improve query processing on application objects. We explore different query compilation strategies and study how they improve the performance of query processing over application data. We take C# as the host programming language as it supports language-integrated query through the LINQ framework. Our techniques deliver significant performance improvements over the default LINQ implementation. Our work makes important first steps towards a future where data processing applications will commonly run on machines that can store their entire datasets in-memory, and will be written in a single programming language employing language-integrated query and IMDB-inspired runtimes to provide transparent and highly efficient querying.
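
    To make the idea of query compilation over in-memory collections concrete, here is a rough analogue in Java rather than C#/LINQ. It is only a sketch: the Order type and both method names are hypothetical, and the hand-fused loop merely stands in for the kind of specialized code a query compiler might generate from the declarative form; it is not taken from the paper.

```java
import java.util.List;

final class QueryCompilationSketch {
    /** Hypothetical application object held in an in-memory collection. */
    static final class Order {
        final String customer;
        final long amountCents;

        Order(String customer, long amountCents) {
            this.customer = customer;
            this.amountCents = amountCents;
        }
    }

    /** Declarative, language-integrated form: a filter/sum pipeline over the collection. */
    static long totalDeclarative(List<Order> orders, String customer) {
        return orders.stream()
                .filter(o -> o.customer.equals(customer))
                .mapToLong(o -> o.amountCents)
                .sum();
    }

    /** The same query as a single fused loop with no intermediate pipeline objects. */
    static long totalFused(List<Order> orders, String customer) {
        long sum = 0;
        for (Order o : orders) {
            if (o.customer.equals(customer)) {
                sum += o.amountCents;
            }
        }
        return sum;
    }
}
```

    Both methods compute the same result; the difference a compiler exploits is how much per-element overhead the evaluation strategy carries.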

    Efficient execution of ATL model transformations using static analysis and parallelism

    Although model transformations are considered to be the heart and soul of Model Driven Engineering (MDE), there are still several challenges that need to be addressed to unleash their full potential in industrial settings. Among other shortcomings, their performance and scalability remain unsatisfactory for dealing with large models, making their wide adoption difficult in practice. This paper presents A2L, a compiler for the parallel execution of ATL model transformations, which produces efficient code that can use existing multicore computer architectures, and applies effective optimizations at the transformation level using static analysis. We have evaluated its performance in both sequential and multi-threaded modes, obtaining significant speedups with respect to current ATL implementations. In particular, we obtain speedups between 2.32x and 38.28x for the A2L sequential version, and between 2.40x and 245.83x when A2L is executed in parallel, with expected average speedups of 8.59x and 22.42x, respectively. Spanish Research Projects PGC2018-094905-B-I00, TIN2015-73968-JIN (AEI/FEDER/UE), Ramón y Cajal 2017 research grant, TIN2016-75944-R. Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and by the FWF under the Grant Numbers P28519-N31 and P30525-N31.

    Instance Space Analysis of Search-Based Software Testing

    Search-based software testing (SBST) is now a mature area, with numerous techniques developed to tackle the challenging task of software testing. SBST techniques have shown promising results and have been successfully applied in industry to automatically generate test cases for large and complex software systems. Their effectiveness, however, is problem-dependent. In this paper, we revisit the problem of objective performance evaluation of SBST techniques considering recent methodological advances -- in the form of Instance Space Analysis (ISA) -- enabling the strengths and weaknesses of SBST techniques to be visualized and assessed across the broadest possible space of problem instances (software classes) from common benchmark datasets. We identify features of SBST problems that explain why a particular instance is hard for an SBST technique, reveal areas of hard and easy problems in the instance space of existing benchmark datasets, and identify the strengths and weaknesses of state-of-the-art SBST techniques. In addition, we examine the diversity and quality of common benchmark datasets used in experimental evaluations.

    Code Specialization for Memory Efficient Hash Tries

    The hash trie data structure is a common part of the standard collection libraries of JVM programming languages such as Clojure and Scala. It enables fast immutable implementations of maps, sets, and vectors, but it requires considerably more memory than an equivalent array-based data structure. This hinders the scalability of functional programs and the further adoption of this otherwise attractive style of programming. In this paper we present a product family of hash tries. We generate Java source code to specialize them using knowledge of JVM object memory layout. The number of possible specializations is exponential. The optimization challenge is thus to find a minimal set of variants which leads to a maximal reduction in memory footprint on any given data. Using a set of experiments we measured the distribution of internal tree node sizes in hash tries. We used the results as guidance to decide which variants of the family to generate and which variants should be left to the generic implementation. A preliminary validating experiment on the implementation of sets and maps shows that this technique leads to a median decrease of 55% in memory footprint for maps (and 78% for sets), while still maintaining comparable performance. Our combination of data analysis and code specialization proved to be effective.
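
    For readers unfamiliar with why node sizes vary in a hash trie, the sketch below shows the standard bitmap-compressed node indexing used by hash-array mapped tries in general. It is not the specialized code generated in the paper; the class and method names, and the choice of a 32-way branching factor (5 hash bits per level), are assumptions for illustration.

```java
final class HamtIndexSketch {
    static final int BITS_PER_LEVEL = 5;                    // assumed 32-way branching
    static final int MASK = (1 << BITS_PER_LEVEL) - 1;      // 0b11111

    /** Extracts the 5-bit hash fragment that selects a branch at the given trie level. */
    static int fragment(int hash, int level) {
        return (hash >>> (level * BITS_PER_LEVEL)) & MASK;
    }

    /** True if the node's bitmap marks the branch for this fragment as occupied. */
    static boolean isPresent(int bitmap, int fragment) {
        return (bitmap & (1 << fragment)) != 0;
    }

    /**
     * Position of the branch inside the node's compact array: the number of
     * occupied branches with a smaller fragment value.
     */
    static int arrayIndex(int bitmap, int fragment) {
        return Integer.bitCount(bitmap & ((1 << fragment) - 1));
    }
}
```

    Because only occupied branches take up array slots, a node can hold anywhere from 1 to 32 entries, which is why the distribution of node sizes matters when choosing which variants to specialize.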