16 research outputs found

    Additional Material for "Unifying Data Representation Transformations"

    Get PDF
    This report presents an attempt to formalize the data representation transformation mechanism in the "Unifying Data Representation Transformations" paper. Since the mechanism described in the paper targets the Scala programming language, and the specification is written against System F-sub with local colored type inference, formally reasoning about the full calculus is a major undertaking. Instead, in this report we start from the simply typed lambda calculus with subtyping, natural numbers and unit. We add rewriting and adapt the calculus to propagate expected type information, in a mechanism inspired by local colored type inference. Finally, we show how the representation transformation mechanism (the convert phase) rewrites terms. We show that, given a series of assumptions about the inject phase, type-checking a term against the updated rules produces a correct and operationally equivalent term, with a minimum number of runtime coercions introduced for the given annotations. We finish the report with a series of examples showing how code is transformed.

    ScalaDyno: Making Name Resolution and Type Checking Fault-tolerant

    Get PDF
    The ScalaDyno compiler plugin allows fast prototyping with the Scala programming language, in a way that combines the benefits of both statically and dynamically typed languages. Static name resolution and type checking prevent partially correct code from being compiled and executed. Yet, allowing programmers to test critical paths in a program without worrying about the consistency of the entire code base is crucial to fast prototyping and agile development. This is where ScalaDyno comes in: it allows partially correct programs to be compiled and executed, while shifting compile-time errors to program runtime. The key insight in ScalaDyno is that name and type errors affect limited areas of the code, which can be replaced by instructions to output the respective errors at runtime. This allows bytecode generation and execution for partially correct programs, thus enabling Python- or JavaScript-like fast prototyping in Scala. All this is done without sacrificing name resolution, full type checking and optimizations for the correct parts of the code -- they are still performed, but without getting in the way of agile development. Finally, for release code or sensitive refactoring, runtime errors can be disabled, restoring the full static name resolution and type checking typical of the Scala compiler.
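    The rewrite described above can be sketched as follows. This is an illustrative stand-in, not ScalaDyno's actual output: the exception class and message wording are assumptions, though the plugin does emit the original compiler diagnostic at the error site.

```scala
// Hypothetical sketch of the ScalaDyno rewrite. The programmer wrote
//   def broken(s: String): Int = s.lenght   // typo: does not type-check
// Instead of aborting compilation, the erroneous expression is replaced
// by code that raises the compiler's error message at run time:
object ScalaDynoSketch {
  def broken(s: String): Int =
    throw new RuntimeException("value lenght is not a member of String")

  // Correct code in the same program is still fully type-checked,
  // optimized and executable:
  def works(s: String): Int = s.length
}
```

    Calling `works` behaves normally, while calling `broken` reports the original compile-time error, but only if that path is actually executed.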

    Late Data Layout: Unifying Data Representation Transformations

    Get PDF
    Values need to be represented differently when interacting with certain language features. For example, an integer has to take an object-based representation when interacting with erased generics, although, for performance reasons, the stack-based value representation is better. To abstract over these implementation details, some programming languages choose to expose a unified high-level concept (the integer) and let the compiler choose its exact representation and insert coercions where necessary. This pattern appears in multiple language features such as value classes, specialization and multi-stage programming: they all expose a unified concept which they later refine into multiple representations. Yet, the underlying compiler implementations typically entangle the core mechanism with assumptions about the alternative representations and their interaction with other language features. In this paper we present the Late Data Layout mechanism, a simple but versatile type-driven generalization that subsumes and improves the state-of-the-art representation transformations. In doing so, we make two key observations: (1) annotated types conveniently capture the semantics of using multiple representations and (2) local type inference can be used to consistently and optimally introduce coercions. We validated our approach by implementing three language features as Scala compiler extensions: value classes, specialization (using the miniboxing representation) and a simplified multi-stage programming mechanism.
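    The two observations can be made concrete in a minimal sketch with hypothetical names: two representations of the "integer" concept, and coercions placed exactly at the representation boundaries (written out by hand here, where Late Data Layout would introduce them automatically from the annotated types).

```scala
// Sketch of type-driven coercion insertion (all names are illustrative).
object LdlSketch {
  // Two representations of the same high-level concept "integer":
  final case class BoxedInt(value: Int)    // object-based (e.g. for erased generics)
  def box(i: Int): BoxedInt = BoxedInt(i)  // coercion: value -> object
  def unbox(b: BoxedInt): Int = b.value    // coercion: object -> value

  // Code that requires the object-based representation:
  def erasedId(x: BoxedInt): BoxedInt = x

  // The programmer conceptually writes `erasedId(1) + 2`; guided by the
  // expected types, the compiler inserts the two coercions exactly at
  // the representation boundary, and nowhere else:
  def example: Int = unbox(erasedId(box(1))) + 2
}
```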

    Improving the Performance of Scala Collections with Miniboxing

    Get PDF
    Using generics, Scala collections can be used to store different types of data in a type-safe manner. Unfortunately, due to the erasure transformation, the performance of generics is degraded when storing primitive types, such as integers and floating-point numbers. Miniboxing is a novel translation for generics that restores primitive type performance. Naturally, a good choice would be to use miniboxing to translate the Scala collections. In this paper we explore the patterns used to implement the Scala collections, describe how they are transformed by miniboxing and, finally, compare the performance of the two transformations on a mockup of the Scala collection library. The benchmarks show that our prototype implementation (http://scala-miniboxing.org) can speed up collection operations by 45%, without any need for programmer intervention.
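    The core of the miniboxing translation mentioned above is encoding every primitive in a long integer, so a single specialized variant of a class can serve all primitive type arguments. A simplified sketch of that encoding (the actual miniboxing runtime also carries a type byte and further machinery):

```scala
// Simplified sketch of the miniboxed encoding: primitives travel as Long,
// avoiding heap-allocated boxes.
object MiniboxSketch {
  def int2mb(i: Int): Long       = i.toLong
  def mb2int(l: Long): Int       = l.toInt
  def bool2mb(b: Boolean): Long  = if (b) 1L else 0L
  def mb2bool(l: Long): Boolean  = l != 0L
  def double2mb(d: Double): Long = java.lang.Double.doubleToLongBits(d)
  def mb2double(l: Long): Double = java.lang.Double.longBitsToDouble(l)

  // A single miniboxed cell stores any primitive without boxing:
  final class MbCell(private var repr: Long) {
    def get: Long = repr
    def set(l: Long): Unit = repr = l
  }
}
```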

    Parallel symbolic execution for automated real-world software testing

    Get PDF
    This paper introduces Cloud9, a platform for automated testing of real-world software. Our main contribution is the scalable parallelization of symbolic execution on clusters of commodity hardware, to help cope with path explosion. Cloud9 provides a systematic interface for writing "symbolic tests" that concisely specify entire families of inputs and behaviors to be tested, thus improving testing productivity. Cloud9 can handle not only single-threaded programs but also multi-threaded and distributed systems. It includes a new symbolic environment model that is the first to support all major aspects of the POSIX interface, such as processes, threads, synchronization, networking, IPC, and file I/O. We show that Cloud9 can automatically test real systems, like memcached, Apache httpd, lighttpd, the Python interpreter, rsync, and curl. We show how Cloud9 can use existing test suites to generate new test cases that capture untested corner cases (e.g., network stream fragmentation). Cloud9 can also diagnose incomplete bug fixes by analyzing the difference between buggy paths before and after a patch.

    Improving the Interoperation between Generics Translations

    Get PDF
    Generics on the Java platform are compiled using the erasure transformation, which only supports by-reference values. This causes slowdowns when generics operate on primitive types, such as integers, as they have to be transformed into reference-based objects. Project Valhalla is an effort to remedy this problem by specializing classes at load time so they can efficiently handle primitive values. In its current early prototype, the Valhalla compilation scheme limits the interaction between specialized and erased generics, thus preventing certain useful code patterns from being expressed. Scala has been using compile-time specialization for 6 years and has three generics compilation schemes working side by side. In Scala, programmers are allowed to write code that freely exercises the interaction between the different compilation schemes, at the expense of introducing subtle performance issues. Similar performance issues can affect Valhalla-enabled bytecode, whether the code was written in Java or translated from other JVM languages. In this context, we explain how we help programmers avoid these performance regressions in the miniboxing transformation: (1) by issuing actionable performance advisories that steer programmers away from performance regressions and (2) by providing alternatives to the standard library constructs that use the miniboxing encoding, thus avoiding the conversion overhead.
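    The conversion overhead at the boundary between compilation schemes can be made concrete. In this sketch (names are hypothetical), a counter exposes the boxing operations forced when primitive-typed code calls through an erased generic interface; the advisories and miniboxed library alternatives described above exist precisely to keep hot code away from this boundary.

```scala
// Sketch: each round trip through erased generic code costs a box on the
// way in and an unbox on the way out.
object InteropSketch {
  var conversions = 0
  def box(i: Int): Integer   = { conversions += 1; Integer.valueOf(i) }
  def unbox(o: Integer): Int = { conversions += 1; o.intValue }

  def erasedId(x: AnyRef): AnyRef = x   // erased generics see only references

  def fromSpecialized(i: Int): Int =    // a primitive-typed caller
    unbox(erasedId(box(i)).asInstanceOf[Integer])
}
```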

    Call Graphs for Languages with Parametric Polymorphism

    Get PDF
    The performance of contemporary object-oriented languages depends on optimizations such as devirtualization, inlining, and specialization, and these in turn depend on precise call graph analysis. Existing call graph analyses do not take advantage of the information provided by the rich type systems of contemporary languages, in particular generic type arguments. Many existing approaches analyze Java bytecode, in which generic types have been erased. This paper shows that this discarded information is actually very useful as the context in a context-sensitive analysis, where it significantly improves precision and keeps the running time small. Specifically, we propose and evaluate call graph construction algorithms in which the contexts of a method are (i) the type arguments passed to its type parameters, and (ii) the static types of the arguments passed to its term parameters. The use of static types from the caller as context is effective because it allows more precise dispatch of call sites inside the callee. Our evaluation indicates that the average number of contexts required per method is small. We implement the analysis in the Dotty compiler for Scala, and evaluate it on programs that use the type-parametric Scala collections library and on the Dotty compiler itself. The context-sensitive analysis runs 1.4x faster than a context-insensitive one and discovers 20% more monomorphic call sites at the same time. When applied to method specialization, the imprecision in a context-insensitive call graph would require the average method to be cloned 22 times, whereas the context-sensitive call graph indicates a much more practical 1.00 to 1.50 clones per method.
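    A small illustrative example (not taken from the paper's benchmarks) of why type arguments make good contexts: analyzed context-insensitively, the `compare` call site inside `min` below has two possible targets; with the type argument as context, each analyzed copy of `min` has exactly one, making the site monomorphic.

```scala
// The dynamic target of `ord.compare` is determined by `min`'s type
// argument, so analyzing `min` once per type-argument context
// ([T = Int], [T = String]) resolves the call site precisely.
trait Ord[T] { def compare(a: T, b: T): Int }
object IntOrd extends Ord[Int]    { def compare(a: Int, b: Int) = a - b }
object StrOrd extends Ord[String] { def compare(a: String, b: String) = a.length - b.length }

object CgSketch {
  def min[T](a: T, b: T, ord: Ord[T]): T =
    if (ord.compare(a, b) <= 0) a else b  // 2 targets without context, 1 with

  def demo: (Int, String) =
    (min(3, 5, IntOrd), min("abc", "z", StrOrd))
}
```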

    Bridging Islands of Specialized Code using Macros and Reified Types

    Get PDF
    Parametric polymorphism in Scala suffers from the usual drawback of erasure on the Java Virtual Machine: primitive values are boxed, leading to indirect access, wasteful use of heap memory and lack of cache locality. For performance-critical parts of the code, the Scala compiler introduces specialization, a transformation that duplicates and adapts the bodies of classes and methods for primitive types. Specializing code can speed up execution by an order of magnitude, but only if the code is called from monomorphic sites or from other specialized code. Still, if these "islands" of specialized code are called from generic code, their performance becomes similar to that of generic code, losing optimality. To address this, our project builds high-performance "bridges" between "islands" of specialized code, removing the requirement that full traces need to be specialized: we use macros to delimit performance-critical "gaps" between specialized code, which we also specialize. We then use reified types to dispatch the correct specialized variant, thus recovering performance across the "islands". Our transformation obtains speedups of up to 30x and around 12x on average compared to generic-only code, by enabling specialization to completely remove boxing and reach its full potential.
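    The reified-type dispatch can be sketched as follows; this is a hand-written stand-in for what the paper generates with macros, with `ClassTag` playing the role of the reified type and all other names illustrative.

```scala
import scala.reflect.ClassTag

object BridgeSketch {
  // Specialized "island": primitive arithmetic, no boxing.
  def sumInt(xs: Array[Int]): Int = {
    var s = 0; var i = 0
    while (i < xs.length) { s += xs(i); i += 1 }
    s
  }

  // Generic code: every element is boxed.
  def sumGeneric[T](xs: Array[T])(implicit num: Numeric[T]): T =
    xs.foldLeft(num.zero)(num.plus)

  // The "bridge": inspect the reified type and route calls coming from
  // generic code back to the specialized variant when one exists.
  def sum[T](xs: Array[T])(implicit num: Numeric[T], ct: ClassTag[T]): T =
    if (ct.runtimeClass == java.lang.Integer.TYPE)
      sumInt(xs.asInstanceOf[Array[Int]]).asInstanceOf[T]
    else
      sumGeneric(xs)
}
```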

    Photochemical Reactions of Naproxen, Ibuprofen and Tylosin

    Get PDF
    Pharmaceuticals and personal care products (PPCPs) include a wide range of compounds that are used extensively, and sometimes daily, by people. Some PPCPs have been detected in surface water (streams, rivers, lakes) due to incomplete removal in wastewater treatment plants. Water contaminated by PPCPs is harmful to aquatic organisms and humans. Naproxen (NXP), ibuprofen (IBP) and tylosin (TYL) were chosen as representative PPCPs in the current research, because they are consumed in large quantities throughout the world and there is limited data about photodegradation of these compounds in aqueous solution at a wavelength of 254 nm. The combination of ultraviolet light at 254 nm (UV254) and hydrogen peroxide (H2O2), i.e. UV254/H2O2, degraded greater than 90% of the initial concentration of NXP and IBP within 3 min (k = 0.018 sec^-1 and k = 0.023 sec^-1 for NXP and IBP, respectively). Under direct photolysis (UV254) at pH 7, 20 min of treatment was required to obtain 90% degradation (k = 0.0028 sec^-1 for NXP, k = 0.0023 sec^-1 for IBP). Under the same conditions, the molar absorptivity and quantum yield of each compound were determined (for NXP, ε = 4240 M^-1 cm^-1 and Φ = 0.008; for IBP, ε = 299 M^-1 cm^-1 and Φ = 0.098). Overall, degradation rate constants increased with increasing initial H2O2 level (0 mM, 1 mM and 3 mM) and increasing pH (at pH 3, k = 0.0016 sec^-1 for NXP and k = 0.0015 sec^-1 for IBP; at pH 9, k = 0.0036 sec^-1 for NXP and k = 0.0029 sec^-1 for IBP). The presence of nitrate slightly increased the photolysis rate constants of both NXP and IBP, due to hydroxyl radical formation from irradiation of nitrate. The rate constants decreased because of a light-screening effect upon addition of natural organic matter (NOM): they were reduced by 18% and 36% for NXP, and by 30% and 46% for IBP, with fulvic acid (FA) and humic acid (HA), respectively.
    To understand the mechanism of degradation under UV254/H2O2 with NOM, a model was constructed to predict the phototransformation rate constants of NXP and IBP. The model results show that there is a concentration of H2O2 corresponding to the maximum enhancement of photolysis of the selected PPCPs. The mineralization of NXP and IBP was 30% and 32%, respectively. The degradation behavior of TYL under UV254 and UV254/H2O2 was quite different from that of NXP and IBP. TYL was present as a mixture of two compounds, TYL A and TYL B. Photoisomerization and photodegradation proceeded at the same time, with the photoisomerization reactions predominating. A kinetic model was constructed to determine the kinetic data. Under UV254 at pH 7, for TYL A the forward rate constant was kf = 0.066 sec^-1, the backward rate constant kr = 0.016 sec^-1 and the degradation rate constant kd = 0.00057 sec^-1; for TYL B, kf = 0.067 sec^-1, kr = 0.022 sec^-1 and kd = 0.00040 sec^-1. Solution pH and the presence of nitrate and NOM did not have any significant influence on the direct photolysis (UV254) of TYL. Also, at pH 7, the addition of H2O2 did not dramatically affect the photoisomerization reaction, but it accelerated the photodegradation of TYL. Selected major photochemical reaction by-products were identified by gas chromatography/mass spectrometry (GC/MS) and liquid chromatography/mass spectrometry (LC/MS). For both UV254 and UV254/H2O2 conditions, the first step of NXP and IBP photodegradation is decarboxylation; the intermediates are then oxidized to ketones and other products. Possible pathways of NXP and IBP degradation are proposed. For TYL, photoisomerization results from rotation of a bond of the ketodiene on the TYL ring.
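    The rate constants reported above are pseudo-first-order, so the quoted treatment times follow directly from the kinetics; the time to reach 90% degradation is

```latex
C(t) = C_0 \, e^{-kt}
\quad\Rightarrow\quad
t_{90} = \frac{\ln 10}{k}
```

    With k = 0.018 sec^-1 for NXP under UV254/H2O2, t90 = 2.303/0.018 ≈ 128 sec ≈ 2.1 min, consistent with the observed >90% degradation within 3 min.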

    Automating Ad hoc Data Representation Transformations

    No full text
    To maximize run-time performance, programmers often specialize their code by hand, replacing library collections and containers by custom objects in which data is restructured for efficient access. However, changing the data representation is a tedious and error-prone process that makes it hard to test, maintain and evolve the source code. We present an automated and composable mechanism that allows programmers to safely change the data representation in delimited scopes containing anything from expressions to entire class definitions. To achieve this, programmers define a transformation and our mechanism automatically and transparently applies it during compilation, eliminating the need to manually change the source code. Our technique leverages the type system in order to offer correctness guarantees on the transformation and its interaction with object-oriented language features, such as dynamic dispatch, inheritance and generics. We have embedded this technique in a Scala compiler plugin and used it in four very different transformations, ranging from improving the data layout and encoding, to retrofitting specialization and value class status, and all the way to collection deforestation. On our benchmarks, the technique obtained speedups between 1.8x and 24.5x.
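    The kind of representation change the mechanism automates can be sketched by hand (the names below are illustrative, not the plugin's API): a high-level pair of integers is re-encoded as a single long, and within a transformed scope operations on the pair would be rewritten to operate on the encoding directly, with coercions inserted only where values cross the scope boundary.

```scala
// Hand-written version of an ad hoc representation transformation:
// (Int, Int) pairs re-encoded as one Long, removing the Tuple2 allocation.
object AdrtSketch {
  type PackedPair = Long
  def encode(p: (Int, Int)): PackedPair =
    (p._1.toLong << 32) | (p._2.toLong & 0xffffffffL)
  def fst(r: PackedPair): Int = (r >>> 32).toInt
  def snd(r: PackedPair): Int = r.toInt

  // Code written against (Int, Int) inside a transformed scope would be
  // compiled to this encoded form:
  def swap(r: PackedPair): PackedPair = encode((snd(r), fst(r)))
}
```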