86 research outputs found

    Doctor of Philosophy

    Dissertation. Aggressive random testing tools, or fuzzers, are impressively effective at finding bugs in compilers and programming language runtimes. For example, a single test-case generator has resulted in more than 460 bugs reported for a number of production-quality C compilers. However, fuzzers can be hard to use. The first problem is that failures triggered by random test cases can be difficult to debug because these tests are often large. To report a compiler bug, one must often construct a small test case that triggers the bug. The existing automated test-case reduction technique, delta debugging, is not sufficient to produce small, reportable test cases. A second problem is that fuzzers are indiscriminate: they repeatedly find bugs that may not be severe enough to fix right away. Third, fuzzers tend to generate a large number of test cases that only trigger a few bugs. Some bugs are triggered much more frequently than others, creating needle-in-the-haystack problems. Currently, users rule out undesirable test cases using ad hoc methods such as disallowing problematic features in tests and filtering test results. This dissertation investigates approaches to improving the utility of compiler fuzzers. Two components, an aggressive test-case reducer and a tamer, are added to the fuzzing workflow to make the fuzzer more user friendly. We introduce C-Reduce, an aggressive test-case reducer for C/C++ programs, which exploits rich domain-specific knowledge to output test cases nearly as good as those produced by skilled humans. This reducer produces outputs that are, on average, more than 30 times smaller than those produced by the existing reducer that is most commonly used by compiler engineers. Second, this dissertation formulates and addresses the fuzzer taming problem: given a potentially large number of random test cases that trigger failures, order them such that diverse, interesting test cases are highly ranked. 
Bug triage can be effectively automated, relying on techniques from machine learning to suppress duplicate bug-triggering test cases and test cases triggering known bugs. An evaluation shows the ability of this tool to solve the fuzzer taming problem for 3,799 test cases triggering 46 bugs in a C compiler.
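To make the idea of automated test-case reduction concrete, here is a minimal sketch of a greedy chunk-removal reducer in the spirit of delta debugging. It is not C-Reduce and not the exact delta debugging algorithm; the function name `reduce_lines` and the oracle `still_fails` are hypothetical names introduced only for illustration. The oracle is assumed to return True when a candidate input still triggers the failure of interest.

```python
from typing import Callable, List


def reduce_lines(lines: List[str],
                 still_fails: Callable[[List[str]], bool]) -> List[str]:
    """Greedy line-based reducer (a simplification, not C-Reduce itself).

    Tries to delete contiguous chunks of lines and keeps a deletion whenever
    the hypothetical oracle `still_fails` says the smaller input still
    triggers the failure. The chunk size is halved once a full pass removes
    nothing at the current size.
    """
    assert still_fails(lines), "the starting input must trigger the failure"
    chunk = max(len(lines) // 2, 1)
    while chunk >= 1:
        i = 0
        removed_any = False
        while i < len(lines):
            candidate = lines[:i] + lines[i + chunk:]
            # Never accept an empty input, even if the oracle would.
            if candidate and still_fails(candidate):
                lines = candidate      # keep the deletion, retry same index
                removed_any = True
            else:
                i += chunk             # this chunk is needed, move on
        if not removed_any:
            chunk //= 2                # refine granularity
    return lines
```

In practice the oracle would typically invoke a compiler on the candidate and check for the same crash or miscompilation. A line-based reducer like this one knows nothing about C syntax, which is one reason a domain-specific reducer can produce much smaller outputs.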

    Secure Compilation (Dagstuhl Seminar 18201)

    Secure compilation is an emerging field that puts together advances in security, programming languages, verification, systems, and hardware architectures in order to devise secure compilation chains that eliminate many of today's vulnerabilities. Secure compilation aims to protect a source language's abstractions in compiled code, even against low-level attacks. For a concrete example, all modern languages provide a notion of structured control flow and an invoked procedure is expected to return to the right place. However, today's compilation chains (compilers, linkers, loaders, runtime systems, hardware) cannot efficiently enforce this abstraction: linked low-level code can call and return to arbitrary instructions or smash the stack, blatantly violating the high-level abstraction. The emerging secure compilation community aims to address such problems by devising formal security criteria, efficient enforcement mechanisms, and effective proof techniques. This seminar strove to take a broad and inclusive view of secure compilation and to provide a forum for discussion on the topic. The goal was to identify interesting research directions and open challenges by bringing together people working on building secure compilation chains, on developing proof techniques and verification tools, and on designing security mechanisms.

    K-LLVM: A Relatively Complete Semantics of LLVM IR


    Fast and Precise Symbolic Analysis of Concurrency Bugs in Device Drivers

    © 2015 IEEE. Concurrency errors, such as data races, make device drivers notoriously hard to develop and debug without automated tool support. We present Whoop, a new automated approach that statically analyzes drivers for data races. Whoop is empowered by symbolic pairwise lockset analysis, a novel analysis that can soundly detect all potential races in a driver. Our analysis avoids reasoning about thread interleavings and thus scales well. Exploiting the race-freedom guarantees provided by Whoop, we achieve a sound partial-order reduction that significantly accelerates Corral, an industrial-strength bug-finder for concurrent programs. Using the combination of Whoop and Corral, we analyzed 16 drivers from the Linux 4.0 kernel, achieving 1.5-20× speedups over standalone Corral.
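The core idea behind a lockset check can be sketched in a few lines. The example below is a concrete, non-symbolic simplification and is not Whoop's analysis: it assumes the accesses and the locks held at each access have already been collected, and it flags two accesses as a potential race when they touch the same location, at least one is a write, and they hold no lock in common. The `Access` record and the function `potential_races` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Access:
    entry_point: str            # hypothetical driver entry point name
    location: str               # shared memory location being accessed
    is_write: bool
    locks_held: FrozenSet[str]  # locks held when the access happens


def potential_races(accesses: List[Access]) -> List[Tuple[Access, Access]]:
    """Return pairs of accesses that a simple lockset check cannot rule out.

    Each unordered pair of distinct accesses is examined once. An access is
    never paired with itself, so re-entrant invocations of one entry point
    are not modeled in this sketch.
    """
    races = []
    for a, b in combinations(accesses, 2):
        same_location = a.location == b.location
        some_write = a.is_write or b.is_write
        no_common_lock = not (a.locks_held & b.locks_held)
        if same_location and some_write and no_common_lock:
            races.append((a, b))
    return races
```

Because the check compares locksets pairwise rather than enumerating thread interleavings, its cost grows with the number of access pairs and not with the number of possible schedules.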

    Doctor of Philosophy

    Dissertation. Compilers are indispensable tools for developers. We expect them to be correct. However, compiler correctness is very hard to reason about. This can be partly explained by the daunting complexity of compilers. In this dissertation, I will explain how we constructed a random program generator, Csmith, and used it to find hundreds of bugs in strong open source compilers such as the GNU Compiler Collection (GCC) and the LLVM Compiler Infrastructure (LLVM). The success of Csmith depends on its ability to be expressive and unambiguous at the same time. Csmith is composed of a code generator and a GTAV (Generation-Time Analysis and Validation) engine. They work interactively to produce expressive yet unambiguous random programs. The expressiveness of Csmith is attributed to the code generator, while the unambiguity is assured by GTAV. GTAV efficiently performs program analyses, such as points-to analysis and effect analysis, to avoid ambiguities caused by undefined or unspecified behaviors. During our 4.25 years of testing, Csmith has found over 450 bugs in GCC and LLVM. We analyzed the bugs by putting them into different categories, studying the root causes, finding their locations in the compilers' source code, and evaluating their importance. We believe the analysis results are useful to future random testers, as well as to compiler writers and users.

    C의 μ €μˆ˜μ€€ κΈ°λŠ₯κ³Ό 컴파일러 μ΅œμ ν™” μ‘°ν™”μ‹œν‚€κΈ°

    Thesis (Ph.D.), Seoul National University Graduate School, College of Engineering, Department of Computer Science and Engineering, February 2019. Chung-Kil Hur. To improve the performance of C programs, mainstream compilers perform aggressive optimizations that may change the behaviors of programs that use low-level features in unidiomatic ways. Unfortunately, despite many years of research and industrial efforts, it has proven very difficult to adequately balance the conflicting criteria for low-level features and compiler optimizations in the design of the C programming language. On the one hand, C should support the common usage patterns of the low-level features in systems programming. 
On the other hand, C should also support the sophisticated yet effective optimizations performed by mainstream compilers. None of the existing proposals for C semantics, however, sufficiently support low-level features and compiler optimizations at the same time. In this dissertation, we resolve the conflict between some of the low-level features crucially used in systems programming and major compiler optimizations. Specifically, we develop the first formal semantics of relaxed-memory concurrency, separate compilation, and casts between integers and pointers that (1) supports their common usage patterns and reasoning principles for programmers, and (2) provably validates major compiler optimizations at the same time. To establish confidence in our formal semantics, we have formalized most of our key results in the Coq theorem prover, which automatically and rigorously checks the validity of the results. Contents: Abstract; Acknowledgements; Chapter I Prologue; Chapter II Relaxed-Memory Concurrency; Chapter III Separate Compilation and Linking; Chapter IV Cast between Integers and Pointers; Chapter V Epilogue; Abstract (in Korean).