32 research outputs found

    A Formal C Memory Model for Separation Logic

    Get PDF
    The core of a formal semantics of an imperative programming language is a memory model that describes the behavior of operations on the memory. Defining a memory model that matches the description of C in the C11 standard is challenging because C allows both high-level (by means of typed expressions) and low-level (by means of bit manipulation) memory accesses. The C11 standard has restricted the interaction between these two levels to make more effective compiler optimizations possible, on the expense of making the memory model complicated. We describe a formal memory model of the (non-concurrent part of the) C11 standard that incorporates these restrictions, and at the same time describes low-level memory operations. This formal memory model includes a rich permission model to make it usable in separation logic and supports reasoning about program transformations. The memory model and essential properties of it have been fully formalized using the Coq proof assistant

    Into the depths of C: Elaborating the de facto standards

    Get PDF
    C remains central to our computing infrastructure. It is notionally defined by ISO standards, but in reality the properties of C assumed by systems code and those implemented by compilers have diverged, both from the ISO standards and from each other, and none of these are clearly understood. We make two contributions to help improve this error-prone situation. First, we describe an in-depth analysis of the design space for the semantics of pointers and memory in C as it is used in practice. We articulate many specific questions, build a suite of semantic test cases, gather experimental data from multiple implementations, and survey what C experts believe about the de facto standards. We identify questions where there is a consensus (either following ISO or differing) and where there are conflicts. We apply all this to an experimental C implemented above capability hardware. Second, we describe a formal model, Cerberus, for large parts of C. Cerberus is parameterised on its memory model; it is linkable either with a candidate de facto memory object model, under construction, or with an operational C11 concurrency model; it is defined by elaboration to a much simpler Core language for accessibility, and it is executable as a test oracle on small examples. This should provide a solid basis for discussion of what mainstream C is now: what programmers and analysis tools can assume and what compilers aim to implement. Ultimately we hope it will be a step towards clear, consistent, and accepted semantics for the various use-cases of C.We acknowledge funding from EPSRC grants EP/H005633 (Leadership Fellowship, Sewell) and EP/K008528 (REMS Programme Grant), and a Gates Cambridge Scholarship (Nienhuis). This work is also part of the CTSRD projects sponsored by the Defense Advanced Research Projects Agency (DARPA) and the Air Force Research Laboratory (AFRL), under contract FA8750-10-C-0237.This is the author accepted manuscript. The final version is available from the Association for Computing Machinery via http://dx.doi.org/10.1145/2908080.290808

    Stacked Borrows: {A}n Aliasing Model for {Rust}

    Get PDF

    Understanding and evolving the Rust programming language

    Get PDF
    Rust is a young systems programming language that aims to fill the gap between high-level languages—which provide strong static guarantees like memory and thread safety—and low-level languages—which give the programmer fine-grained control over data layout and memory management. This dissertation presents two projects establishing the first formal foundations for Rust, enabling us to better understand and evolve this important language: RustBelt and Stacked Borrows. RustBelt is a formal model of Rust’s type system, together with a soundness proof establishing memory and thread safety. The model is designed to verify the safety of a number of intricate APIs from the Rust standard library, despite the fact that the implementations of these APIs use unsafe language features. Stacked Borrows is a proposed extension of the Rust specification, which enables the compiler to use the strong aliasing information in Rust’s types to better analyze and optimize the code it is compiling. The adequacy of this specification is evaluated not only formally, but also by running real Rust code in an instrumented version of Rust’s Miri interpreter that implements the Stacked Borrows semantics. RustBelt is built on top of Iris, a language-agnostic framework, implemented in the Coq proof assistant, for building higher-order concurrent separation logics. This dissertation begins by giving an introduction to Iris, and explaining how Iris enables the derivation of complex high-level reasoning principles from a few simple ingredients. In RustBelt, this technique is exploited crucially to introduce the lifetime logic, which provides a novel separation-logic account of borrowing, a key distinguishing feature of the Rust type system.Rust ist eine junge systemnahe Programmiersprache, die es sich zum Ziel gesetzt hat, die Lücke zu schließen zwischen Sprachen mit hohem Abstraktionsniveau, die vor Speicher- und Nebenläufigkeitsfehlern schützen, und Sprachen mit niedrigem Abstraktionsniveau, welche dem Programmierer detaillierte Kontrolle über die Repräsentation von Daten und die Verwaltung des Speichers ermöglichen. Diese Dissertation stellt zwei Projekte vor, welche die ersten formalen Grundlagen für Rust zum Zwecke des besseren Verständnisses und der weiteren Entwicklung dieser wichtigen Sprache legen: RustBelt und Stacked Borrows. RustBelt ist ein formales Modell des Typsystems von Rust einschließlich eines Korrektheitsbeweises, welcher die Sicherheit von Speicherzugriffen und Nebenläufigkeit zeigt. Das Modell ist darauf ausgerichtet, einige komplexe Komponenten der Standardbibliothek von Rust zu verifizieren, obwohl die Implementierung dieser Komponenten unsichere Sprachkonstrukte verwendet. Stacked Borrows ist eine Erweiterung der Spezifikation von Rust, die es dem Compiler ermöglicht, den Quelltext mit Hilfe der im Typsystem kodierten Alias-Informationen besser zu analysieren und zu optimieren. Die Tauglichkeit dieser Spezifikation wird nicht nur formal belegt, sondern auch an echten Programmen getestet, und zwar mit Hilfe einer um Stacked Borrows erweiterten Version des Interpreters Miri. RustBelt basiert auf Iris, welches die Konstruktion von Separationslogiken für beliebige Programmiersprachen im Beweisassistenten Coq ermöglicht. Diese Dissertation beginnt mit einer Einführung in Iris und erklärt, wie komplexe Beweismethoden mit Hilfe weniger einfacher Bausteine hergeleitet werden können. In RustBelt wird diese Technik für die Umsetzung der „Lebenszeitlogik“ verwendet, einer Erweiterung der Separationslogik mit dem Konzept von „Leihgaben“ (borrows), welche eine wichtige Rolle im Typsystem von Rust spielen.This research was supported in part by a European Research Council (ERC) Consolidator Grant for the project "RustBelt", funded under the European Union’s Horizon 2020 Framework Programme (grant agreement no. 683289)

    Exploring C semantics and pointer provenance

    Get PDF
    The semantics of pointers and memory objects in C has been a vexed question for many years. C values cannot be treated as either purely abstract or purely concrete entities: the language exposes their representations, but compiler optimisations rely on analyses that reason about provenance and initialisation status, not just runtime representations. The ISO WG14 standard leaves much of this unclear, and in some respects differs with de facto standard usage - which itself is difficult to investigate. In this paper we explore the possible source-language semantics for memory objects and pointers, in ISO C and in C as it is used and implemented in practice, focussing especially on pointer provenance. We aim to, as far as possible, reconcile the ISO C standard, mainstream compiler behaviour, and the semantics relied on by the corpus of existing C code. We present two coherent proposals, tracking provenance via integers and not; both address many design questions. We highlight some pros and cons and open questions, and illustrate the discussion with a library of test cases. We make our semantics executable as a test oracle, integrating it with the Cerberus semantics for much of the rest of C, which we have made substantially more complete and robust, and equipped with a web-interface GUI. This allows us to experimentally assess our proposals on those test cases. To assess their viability with respect to larger bodies of C code, we analyse the changes required and the resulting behaviour for a port of FreeBSD to CHERI, a research architecture supporting hardware capabilities, which (roughly speaking) traps on the memory safety violations which our proposals deem undefined behaviour. We also develop a new runtime instrumentation tool to detect possible provenance violations in normal C code, and apply it to some of the SPEC benchmarks. We compare our proposal with a source-language variant of the twin-allocation LLVM semantics proposal of Lee et al. Finally, we describe ongoing interactions with WG14, exploring how our proposals could be incorporated into the ISO standard

    C의 저수준 기능과 컴파일러 최적화 조화시키기

    Get PDF
    학위논문 (박사)-- 서울대학교 대학원 : 공과대학 컴퓨터공학부, 2019. 2. 허충길.주류 C 컴파일러들은 프로그램의 성능을 높이기 위해 공격적인 최적화를 수행하는데, 그런 최적화는 저수준 기능을 사용하는 프로그램의 행동을 바꾸기도 한다. 불행히도 C 언어를 디자인할 때 저수준 기능과 컴파일러 최적화를 적절하게 조화시키가 굉장히 어렵다는 것이 학계와 업계의 중론이다. 저수준 기능을 위해서는, 그러한 기능이 시스템 프로그래밍에 사용되는 패턴을 잘 지원해야 한다. 컴파일러 최적화를 위해서는, 주류 컴파일러가 수행하는 복잡하고도 효과적인 최적화를 잘 지원해야 한다. 그러나 저수준 기능과 컴파일러 최적화를 동시에 잘 지원하는 실행의미는 오늘날까지 제안된 바가 없다. 본 박사학위 논문은 시스템 프로그래밍에서 요긴하게 사용되는 저수준 기능과 주요한 컴파일러 최적화를 조화시킨다. 구체적으로, 우린 다음 성질을 만족하는 느슨한 동시성, 분할 컴파일, 정수-포인터 변환의 실행의미를 처음으로 제안한다. 첫째, 기능이 시스템 프로그래밍에서 사용되는 패턴과, 그러한 패턴을 논증할 수 있는 기법을 지원한다. 둘째, 주요한 컴파일러 최적화들을 지원한다. 우리가 제안한 실행의미에 자신감을 얻기 위해 우리는 논문의 주요 결과를 대부분 Coq 증명기 위에서 증명하고, 그 증명을 기계적이고 엄밀하게 확인했다.To improve the performance of C programs, mainstream compilers perform aggressive optimizations that may change the behaviors of programs that use low-level features in unidiomatic ways. Unfortunately, despite many years of research and industrial efforts, it has proven very difficult to adequately balance the conflicting criteria for low-level features and compiler optimizations in the design of the C programming language. On the one hand, C should support the common usage patterns of the low-level features in systems programming. On the other hand, C should also support the sophisticated and yet effective optimizations performed by mainstream compilers. None of the existing proposals for C semantics, however, sufficiently support low-level features and compiler optimizations at the same time. In this dissertation, we resolve the conflict between some of the low-level features crucially used in systems programming and major compiler optimizations. Specifically, we develop the first formal semantics of relaxed-memory concurrency, separate compilation, and cast between integers and pointers that (1) supports their common usage patterns and reasoning principles for programmers, and (2) provably validates major compiler optimizations at the same time. To establish confidence in our formal semantics, we have formalized most of our key results in the Coq theorem prover, which automatically and rigorously checks the validity of the results.Abstract Acknowledgements Chapter I Prologue Chapter II Relaxed-Memory Concurrency Chapter III Separate Compilation and Linking Chapter IV Cast between Integers and Pointers Chapter V Epilogue 초록Docto

    Programming Languages and Systems

    Get PDF
    This open access book constitutes the proceedings of the 29th European Symposium on Programming, ESOP 2020, which was planned to take place in Dublin, Ireland, in April 2020, as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2020. The actual ETAPS 2020 meeting was postponed due to the Corona pandemic. The papers deal with fundamental issues in the specification, design, analysis, and implementation of programming languages and systems

    A Verified CompCert Front-End for a Memory Model Supporting Pointer Arithmetic and Uninitialised Data

    Get PDF
    International audienceThe CompCert C compiler guarantees that the target program behaves as the source program. Yet, source programs without a defined semantics do not benefit from this guarantee and could therefore be miscompiled. To reduce the possibility of a miscompilation, we propose a novel memory model for CompCert which gives a defined semantics to challenging features such as bitwise pointer arithmetics and access to uninitialised data. We evaluate our memory model both theoretically and experimentally. In our experiments, we identify pervasive low-level C idioms that require the additional expressiveness provided by our memory model. We also show that our memory model provably subsumes the existing CompCert memory model thus cross-validating both semantics. Our memory model relies on the core concepts of symbolic value and normalisa-tion. A symbolic value models a delayed computation and the normalisation turns, when possible, a symbolic value into a genuine value. We show how to tame the expressive power of the normalisation so that the memory model fits the proof framework of CompCert. We also adapt the proofs of correctness of the compiler passes performed by CompCert's front-end, thus demonstrating that our model is well-suited for proving compiler transformations
    corecore