
    Tupleware: Redefining Modern Analytics

    There is a fundamental discrepancy between the targeted and actual users of current analytics frameworks. Most systems are designed for the data and infrastructure of the Googles and Facebooks of the world---petabytes of data distributed across large cloud deployments consisting of thousands of cheap commodity machines. Yet, the vast majority of users operate clusters ranging from a few to a few dozen nodes, analyze relatively small datasets of up to a few terabytes, and perform primarily compute-intensive operations. Targeting these users fundamentally changes the way we should build analytics systems. This paper describes the design of Tupleware, a new system specifically aimed at the challenges faced by the typical user. Tupleware's architecture brings together ideas from the database, compiler, and programming languages communities to create a powerful end-to-end solution for data analysis. We propose novel techniques that consider the data, computations, and hardware together to achieve maximum performance on a case-by-case basis. Our experimental evaluation quantifies the impact of our novel techniques and shows orders of magnitude performance improvement over alternative systems.

    Understanding Optimization Phase Interactions to Reduce the Phase Order Search Space

    Compiler optimization phase ordering is a longstanding problem, and is of particular relevance to the performance-oriented and cost-constrained domain of embedded systems applications. Optimization phases are known to interact with each other, enabling and disabling opportunities for successive phases. Therefore, varying the order of applying these phases often generates distinct output codes, with different speed, code-size and power consumption characteristics. Most current approaches to address this issue focus on developing innovative methods to selectively evaluate the vast phase order search space to produce a good (but potentially suboptimal) representation for each program. In contrast, the goal of this thesis is to study and reduce the phase order search space by: (1) identifying common causes of optimization phase interactions across all phases, and then devising techniques to eliminate them, and (2) exploiting natural phase independence to prune the phase order search space. We observe that several phase interactions are caused by false register dependence during many optimization phases. We explore the potential of cleanup phases, such as register remapping and copy propagation, at reducing false dependences. We show that innovative implementation and application of these phases not only reduces the size of the phase order search space substantially, but can also improve the quality of code generated by optimizing compilers. We examine the effect of removing cleanup phases, such as dead assignment elimination, which should not interact with other compiler phases, from the phase order search space. Finally, we show that reorganization of the phase order search into a multi-staged approach employing sets of mutually independent optimizations can reduce the search space to a fraction of its original size without sacrificing performance.
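
    To make the independence idea concrete, here is a minimal Python sketch of how grouping mutually independent phases into stages shrinks the search from all permutations of phases to permutations of stages. The phase names and the independence relation below are hypothetical placeholders, not the phase set or interactions studied in the thesis, and the counting is a deliberately simplified toy model.

        from math import factorial

        PHASES = ["copy_prop", "reg_remap", "dead_assign_elim", "cse", "loop_inv", "strength_red"]

        # Pairs assumed (purely for illustration) to be mutually independent:
        # applying them in either order yields the same code.
        INDEPENDENT = {("copy_prop", "reg_remap"), ("dead_assign_elim", "cse")}

        def independent(a, b):
            return (a, b) in INDEPENDENT or (b, a) in INDEPENDENT

        def stages(phases):
            # Greedily group phases so that every pair inside a stage is independent.
            result = []
            for p in phases:
                for stage in result:
                    if all(independent(p, q) for q in stage):
                        stage.append(p)
                        break
                else:
                    result.append([p])
            return result

        staged = stages(PHASES)
        # Phases inside a stage can be applied in one fixed order, so only the
        # order of the stages themselves still needs to be searched.
        print(f"exhaustive orderings: {factorial(len(PHASES))}")
        print(f"staged orderings:     {factorial(len(staged))}")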

    Formal Compiler Implementation in a Logical Framework

    The task of designing and implementing a compiler can be a difficult and error-prone process. In this paper, we present a new approach based on the use of higher-order abstract syntax and term rewriting in a logical framework. All program transformations, from parsing to code generation, are cleanly isolated and specified as term rewrites. This has several advantages. The correctness of the compiler depends solely on a small set of rewrite rules that are written in the language of formal mathematics. In addition, the logical framework guarantees the preservation of scoping, and it automates many frequently-occurring tasks including substitution and rewriting strategies. As we show, compiler development in a logical framework can be easier than in a general-purpose language like ML, in part because of automation, and also because the framework provides extensive support for examination, validation, and debugging of the compiler transformations. The paper is organized around a case study, using the MetaPRL logical framework to compile an ML-like language to Intel x86 assembly. We also present a scoped formalization of x86 assembly in which all registers are immutable.
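
    As a rough illustration of the compile-by-rewriting idea (in plain Python rather than the MetaPRL logical framework), the sketch below applies rewrite rules bottom-up to a term until none match. The two rules shown, constant folding and an additive identity, are invented examples, not rules from the paper.

        def rewrite(term, rules):
            # Rewrite subterms first, then try each rule at the root; repeat until
            # no rule applies anywhere in the term.
            if isinstance(term, tuple):
                term = tuple(rewrite(t, rules) for t in term)
            for rule in rules:
                new = rule(term)
                if new is not None:
                    return rewrite(new, rules)
            return term

        def fold_add(term):
            # ('add', 1, 2)  ~~>  3
            if isinstance(term, tuple) and len(term) == 3 and term[0] == "add" \
                    and isinstance(term[1], int) and isinstance(term[2], int):
                return term[1] + term[2]

        def add_zero(term):
            # ('add', e, 0)  ~~>  e
            if isinstance(term, tuple) and len(term) == 3 and term[0] == "add" and term[2] == 0:
                return term[1]

        prog = ("add", ("add", 1, 2), 0)
        print(rewrite(prog, [fold_add, add_zero]))   # prints 3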

    Data optimizations for constraint automata

    Constraint automata (CA) constitute a coordination model based on finite automata on infinite words. Originally introduced for modeling of coordinators, an interesting new application of CAs is implementing coordinators (i.e., compiling CAs into executable code). Such an approach guarantees correctness-by-construction and can even yield code that outperforms hand-crafted code. The extent to which these two potential advantages materialize depends on the smartness of CA-compilers and the existence of proofs of their correctness. Every transition in a CA is labeled by a "data constraint" that specifies an atomic data-flow between coordinated processes as a first-order formula. At run-time, compiler-generated code must handle data constraints as efficiently as possible. In this paper, we present, and prove the correctness of, two optimization techniques for CA-compilers related to handling of data constraints: a reduction to eliminate redundant variables and a translation from (declarative) data constraints to (imperative) data commands expressed in a small sequential language. Through experiments, we show that these optimization techniques can have a positive impact on the performance of generated executable code.
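
    The following Python sketch gives a flavor of the two optimizations under simplified assumptions: a data constraint is modeled as a set of equalities over port variables, redundant variables are collapsed with a union-find, and the result is a straight-line imperative data command. The port names and the exact reduction are illustrative only; the paper's data constraints are general first-order formulas and its target is a small sequential language, not Python strings.

        def compile_constraint(equalities, inputs):
            # Union-find over ports: ports forced to carry equal data form one
            # class, so all but one variable per class is redundant.
            parent = {}
            def find(x):
                parent.setdefault(x, x)
                while parent[x] != x:
                    x = parent[x]
                return x
            for a, b in equalities:
                parent[find(a)] = find(b)

            ports = {p for pair in equalities for p in pair}
            # Prefer an input port as the representative of each class.
            rep = {}
            for p in sorted(ports):
                root = find(p)
                if root not in rep or p in inputs:
                    rep[root] = p

            # Imperative data command: one read per class, then plain copies.
            cmds = [f"{rep[r]} = read()" for r in sorted({find(p) for p in ports})
                    if rep[r] in inputs]
            cmds += [f"write({p}, {rep[find(p)]})" for p in sorted(ports)
                     if p not in inputs and p != rep[find(p)]]
            return cmds

        # Constraint d_A = d_B and d_B = d_C, with A the only input port:
        print(compile_constraint([("A", "B"), ("B", "C")], inputs={"A"}))
        # ['A = read()', 'write(B, A)', 'write(C, A)']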

    C์˜ ์ €์ˆ˜์ค€ ๊ธฐ๋Šฅ๊ณผ ์ปดํŒŒ์ผ๋Ÿฌ ์ตœ์ ํ™” ์กฐํ™”์‹œํ‚ค๊ธฐ

    ํ•™์œ„๋…ผ๋ฌธ (๋ฐ•์‚ฌ)-- ์„œ์šธ๋Œ€ํ•™๊ต ๋Œ€ํ•™์› : ๊ณต๊ณผ๋Œ€ํ•™ ์ปดํ“จํ„ฐ๊ณตํ•™๋ถ€, 2019. 2. ํ—ˆ์ถฉ๊ธธ.์ฃผ๋ฅ˜ C ์ปดํŒŒ์ผ๋Ÿฌ๋“ค์€ ํ”„๋กœ๊ทธ๋žจ์˜ ์„ฑ๋Šฅ์„ ๋†’์ด๊ธฐ ์œ„ํ•ด ๊ณต๊ฒฉ์ ์ธ ์ตœ์ ํ™”๋ฅผ ์ˆ˜ํ–‰ํ•˜๋Š”๋ฐ, ๊ทธ๋Ÿฐ ์ตœ์ ํ™”๋Š” ์ €์ˆ˜์ค€ ๊ธฐ๋Šฅ์„ ์‚ฌ์šฉํ•˜๋Š” ํ”„๋กœ๊ทธ๋žจ์˜ ํ–‰๋™์„ ๋ฐ”๊พธ๊ธฐ๋„ ํ•œ๋‹ค. ๋ถˆํ–‰ํžˆ๋„ C ์–ธ์–ด๋ฅผ ๋””์ž์ธํ•  ๋•Œ ์ €์ˆ˜์ค€ ๊ธฐ๋Šฅ๊ณผ ์ปดํŒŒ์ผ๋Ÿฌ ์ตœ์ ํ™”๋ฅผ ์ ์ ˆํ•˜๊ฒŒ ์กฐํ™”์‹œํ‚ค๊ฐ€ ๊ต‰์žฅํžˆ ์–ด๋ ต๋‹ค๋Š” ๊ฒƒ์ด ํ•™๊ณ„์™€ ์—…๊ณ„์˜ ์ค‘๋ก ์ด๋‹ค. ์ €์ˆ˜์ค€ ๊ธฐ๋Šฅ์„ ์œ„ํ•ด์„œ๋Š”, ๊ทธ๋Ÿฌํ•œ ๊ธฐ๋Šฅ์ด ์‹œ์Šคํ…œ ํ”„๋กœ๊ทธ๋ž˜๋ฐ์— ์‚ฌ์šฉ๋˜๋Š” ํŒจํ„ด์„ ์ž˜ ์ง€์›ํ•ด์•ผ ํ•œ๋‹ค. ์ปดํŒŒ์ผ๋Ÿฌ ์ตœ์ ํ™”๋ฅผ ์œ„ํ•ด์„œ๋Š”, ์ฃผ๋ฅ˜ ์ปดํŒŒ์ผ๋Ÿฌ๊ฐ€ ์ˆ˜ํ–‰ํ•˜๋Š” ๋ณต์žกํ•˜๊ณ ๋„ ํšจ๊ณผ์ ์ธ ์ตœ์ ํ™”๋ฅผ ์ž˜ ์ง€์›ํ•ด์•ผ ํ•œ๋‹ค. ๊ทธ๋Ÿฌ๋‚˜ ์ €์ˆ˜์ค€ ๊ธฐ๋Šฅ๊ณผ ์ปดํŒŒ์ผ๋Ÿฌ ์ตœ์ ํ™”๋ฅผ ๋™์‹œ์— ์ž˜ ์ง€์›ํ•˜๋Š” ์‹คํ–‰์˜๋ฏธ๋Š” ์˜ค๋Š˜๋‚ ๊นŒ์ง€ ์ œ์•ˆ๋œ ๋ฐ”๊ฐ€ ์—†๋‹ค. ๋ณธ ๋ฐ•์‚ฌํ•™์œ„ ๋…ผ๋ฌธ์€ ์‹œ์Šคํ…œ ํ”„๋กœ๊ทธ๋ž˜๋ฐ์—์„œ ์š”๊ธดํ•˜๊ฒŒ ์‚ฌ์šฉ๋˜๋Š” ์ €์ˆ˜์ค€ ๊ธฐ๋Šฅ๊ณผ ์ฃผ์š”ํ•œ ์ปดํŒŒ์ผ๋Ÿฌ ์ตœ์ ํ™”๋ฅผ ์กฐํ™”์‹œํ‚จ๋‹ค. ๊ตฌ์ฒด์ ์œผ๋กœ, ์šฐ๋ฆฐ ๋‹ค์Œ ์„ฑ์งˆ์„ ๋งŒ์กฑํ•˜๋Š” ๋Š์Šจํ•œ ๋™์‹œ์„ฑ, ๋ถ„ํ•  ์ปดํŒŒ์ผ, ์ •์ˆ˜-ํฌ์ธํ„ฐ ๋ณ€ํ™˜์˜ ์‹คํ–‰์˜๋ฏธ๋ฅผ ์ฒ˜์Œ์œผ๋กœ ์ œ์•ˆํ•œ๋‹ค. ์ฒซ์งธ, ๊ธฐ๋Šฅ์ด ์‹œ์Šคํ…œ ํ”„๋กœ๊ทธ๋ž˜๋ฐ์—์„œ ์‚ฌ์šฉ๋˜๋Š” ํŒจํ„ด๊ณผ, ๊ทธ๋Ÿฌํ•œ ํŒจํ„ด์„ ๋…ผ์ฆํ•  ์ˆ˜ ์žˆ๋Š” ๊ธฐ๋ฒ•์„ ์ง€์›ํ•œ๋‹ค. ๋‘˜์งธ, ์ฃผ์š”ํ•œ ์ปดํŒŒ์ผ๋Ÿฌ ์ตœ์ ํ™”๋“ค์„ ์ง€์›ํ•œ๋‹ค. ์šฐ๋ฆฌ๊ฐ€ ์ œ์•ˆํ•œ ์‹คํ–‰์˜๋ฏธ์— ์ž์‹ ๊ฐ์„ ์–ป๊ธฐ ์œ„ํ•ด ์šฐ๋ฆฌ๋Š” ๋…ผ๋ฌธ์˜ ์ฃผ์š” ๊ฒฐ๊ณผ๋ฅผ ๋Œ€๋ถ€๋ถ„ Coq ์ฆ๋ช…๊ธฐ ์œ„์—์„œ ์ฆ๋ช…ํ•˜๊ณ , ๊ทธ ์ฆ๋ช…์„ ๊ธฐ๊ณ„์ ์ด๊ณ  ์—„๋ฐ€ํ•˜๊ฒŒ ํ™•์ธํ–ˆ๋‹ค.To improve the performance of C programs, mainstream compilers perform aggressive optimizations that may change the behaviors of programs that use low-level features in unidiomatic ways. Unfortunately, despite many years of research and industrial efforts, it has proven very difficult to adequately balance the conflicting criteria for low-level features and compiler optimizations in the design of the C programming language. On the one hand, C should support the common usage patterns of the low-level features in systems programming. On the other hand, C should also support the sophisticated and yet effective optimizations performed by mainstream compilers. None of the existing proposals for C semantics, however, sufficiently support low-level features and compiler optimizations at the same time. In this dissertation, we resolve the conflict between some of the low-level features crucially used in systems programming and major compiler optimizations. Specifically, we develop the first formal semantics of relaxed-memory concurrency, separate compilation, and cast between integers and pointers that (1) supports their common usage patterns and reasoning principles for programmers, and (2) provably validates major compiler optimizations at the same time. To establish confidence in our formal semantics, we have formalized most of our key results in the Coq theorem prover, which automatically and rigorously checks the validity of the results.Abstract Acknowledgements Chapter I Prologue Chapter II Relaxed-Memory Concurrency Chapter III Separate Compilation and Linking Chapter IV Cast between Integers and Pointers Chapter V Epilogue ์ดˆ๋กDocto

    SmartTrack: Efficient Predictive Race Detection

    Widely used data race detectors, including the state-of-the-art FastTrack algorithm, incur performance costs that are acceptable for regular in-house testing, but they miss races detectable from the analyzed execution. Predictive analyses detect more data races in an analyzed execution than FastTrack detects, but at significantly higher performance cost. This paper presents SmartTrack, an algorithm that optimizes predictive race detection analyses, including two analyses from prior work and a new analysis introduced in this paper. SmartTrack's algorithm incorporates two main optimizations: (1) epoch and ownership optimizations from prior work, applied to predictive analysis for the first time; and (2) novel conflicting critical section optimizations introduced by this paper. Our evaluation shows that SmartTrack achieves performance competitive with FastTrack, a qualitative improvement in the state of the art for data race detection. Comment: extended arXiv version of the PLDI 2020 paper (adds Appendices A-E).
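
    For readers unfamiliar with the epoch optimization that SmartTrack builds on, the Python sketch below shows the textbook idea in miniature: the last write to a variable is stored as a single (thread, clock) epoch and compared against the current thread's vector clock. This illustrates the underlying FastTrack-style check only; it is not SmartTrack's predictive algorithm, and the class and function names are invented for the example.

        class ThreadState:
            def __init__(self, tid, nthreads):
                self.tid = tid
                self.vc = [0] * nthreads      # this thread's vector clock
                self.vc[tid] = 1

        class VarState:
            def __init__(self):
                self.write_epoch = None       # (tid, clock) of the last write

        def on_write(thread, var):
            # Report a race if the previous write is not ordered before this one.
            if var.write_epoch is not None:
                tid, clk = var.write_epoch
                if clk > thread.vc[tid]:
                    print(f"race: thread {tid}@{clk} vs thread {thread.tid}")
            var.write_epoch = (thread.tid, thread.vc[thread.tid])

        # Two threads write the same variable with no synchronization between them.
        t0, t1 = ThreadState(0, 2), ThreadState(1, 2)
        x = VarState()
        on_write(t0, x)
        on_write(t1, x)   # t0's write is not in t1's vector clock -> race reported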
    • โ€ฆ
    corecore