37 research outputs found
Jitsu: Just-in-time summoning of unikernel
Network latency is a problem for all cloud services. It can be mitigated by moving computation out of remote datacenters by rapidly instantiating local services near the user. This requires an embedded cloud platform on which to deploy multiple applications securely and quickly. We present Jitsu, a new Xen toolstack that satisfies the demands of secure multi-tenant isolation on resource-constrained embedded ARM devices. It does this by using unikernels: lightweight, compact, single address space, memory-safe virtual machines (VMs) written in a high-level language. Using fast shared memory channels, Jitsu provides a directory service that launches unikernels in response to network traffic and masks boot latency. Our evaluation shows Jitsu to be a power-efficient and responsive platform for hosting cloud services in the edge network while preserving the strong isolation guarantees of a type-1 hypervisor.The research leading to these results received funding from the European Unionās Seventh Framework Programme FP7/2007ā2013 under the Trilogy 2 project (grant agreement no. 317756), and the User Centric Networking project, (grant agreement no. 611001), and the Defense Advanced Research Projects Agency (DARPA) and the Air Force Research Laboratory (AFRL), under contract FA8750-11-C-0249.This is the author accepted manuscript. The final version is available from USENIX via https://www.usenix.org/conference/nsdi15/technical-sessions/presentation/madhavapedd
Recommended from our members
CheriOS: Designing an untrusted single-address-space capability operating system utilising capability hardware and a minimal hypervisor
This thesis presents the design, implementation, and evaluation of a novel capability operating system: CheriOS. The guiding motivation behind CheriOS is to provide strong security guarantees to programmers, even allowing them to continue to program in fast, but typically unsafe, languages such as C. Furthermore, it does this in the presence of an extremely strong adversarial model: in CheriOS, every compartment -- and even the operating system itself -- is considered actively malicious. Building on top of the architecturally enforced capabilities offered by the CHERI microprocessor, I show that only a few more capability types and enforcement checks are required to provide a strong compartmentalisation model that can facilitate mutual distrust. I implement these new primitives in software, in a new abstraction layer I dub the nanokernel. Among the new OS primitives I introduce are one for integrity and confidentiality called a Reservation (which allows allocating private memory without trusting the allocator), as well as another that can provide attestation about the state of the system, a Foundation (which provides a key to sign and protect capabilities based on a signature of the starting state of a program). I show that, using these new facilities, it is possible to design an operating system without having to trust the implementation is correct.
CheriOS is fundamentally fail-safe; there are no assumptions about the behaviour of the system, apart from the CHERI processor and the nanokernel, to be broken. Using CHERI and the new nanokernel primitives, programmers can expect full isolation at scopes ranging from a whole program to a single function, and not just with respect to other programs but the system itself. Programs compiled for and run on CheriOS offer full memory safety, both spatial and temporal, enforced control flow integrity between compartments and protection against common vulnerabilities such as buffer overflows, code injection and Return-Oriented-Programming attacks. I achieve this by designing a new CHERI-based ABI (Application Binary Interface) which includes a novel stack structure that offers temporal safety. I evaluate how practical the new designs are by prototyping them and offering a detailed performance evaluation. I also contrast with existing offerings from both industry and academia.
CHERI capabilities can be used to restrict access to system resources, such as memory, with the required dynamic checks being performed by hardware in parallel with normal operation. Using the accelerating features of CHERI, I show that many of the security guarantees that CheriOS offers can come at little to no cost. I present a novel and secure IO/IPC layer that allows secure marshalling of multiple data streams through mutually distrusting compartments, with fine-grained authenticated access control for end-points, and without either copying or encryption. For example, CheriOS can restrict its TCP stack from having access to packet contents, or restrict an open socket to ensure data sent on it to arrives at an endpoint signed as a TLS implementation. Even with added security requirements, CheriOS can perform well on real workloads. I showcase this by running a state-of-the-art webserver, NGINX, atop both CheriOS and FreeBSD and show improvements in performance ranging from 3x to 6x when running on a small-scale low-power FPGA implementation of CHERI-MIPS
Exploring C semantics and pointer provenance
The semantics of pointers and memory objects in C has been a vexed question for many years. C values cannot be treated as either purely abstract or purely concrete entities: the language exposes their representations, but compiler optimisations rely on analyses that reason about provenance and initialisation status, not just runtime representations. The ISO WG14 standard leaves much of this unclear, and in some respects differs with de facto standard usage - which itself is difficult to investigate.
In this paper we explore the possible source-language semantics for memory objects and pointers, in ISO C and in C as it is used and implemented in practice, focussing especially on pointer provenance. We aim to, as far as possible, reconcile the ISO C standard, mainstream compiler behaviour, and the semantics relied on by the corpus of existing C code. We present two coherent proposals, tracking provenance via integers and not; both address many design questions. We highlight some pros and cons and open questions, and illustrate the discussion with a library of test cases. We make our semantics executable as a test oracle, integrating it with the Cerberus semantics for much of the rest of C, which we have made substantially more complete and robust, and equipped with a web-interface GUI. This allows us to experimentally assess our proposals on those test cases. To assess their viability with respect to larger bodies of C code, we analyse the changes required and the resulting behaviour for a port of FreeBSD to CHERI, a research architecture supporting hardware capabilities, which (roughly speaking) traps on the memory safety violations which our proposals deem undefined behaviour. We also develop a new runtime instrumentation tool to detect possible provenance violations in normal C code, and apply it to some of the SPEC benchmarks. We compare our proposal with a source-language variant of the twin-allocation LLVM semantics proposal of Lee et al. Finally, we describe ongoing interactions with WG14, exploring how our proposals could be incorporated into the ISO standard
Hydra: loosely coupling the graphics pipeline to facilitate digital preservation.
It can be argued that software can be seen as a form of art and digital heritage and yet it rarely enjoys the same efforts afforded to it compared to physical counterparts. There are many reasons for this, such as the increasing costs of maintenance or the reducing amount of expertise in the specific aging technology. Maintaining software and ensuring that it continues to work on current hardware and operating systems is known as digital preservation. There are many ways in which we can attempt to preserve digital software and one of the most effective ones is by using emulation to simulate the obsolete hardware. However, for games and other entertainment media, this technique is not always effective due to a requirement on specific hardware, such as an accelerated GPU in order to reach an acceptable performance for the user. It is often difficult to emulate a GPU and, as such, a different approach often needs to be taken, which reduces the flexibility and portability of the emulation software. Hydra is a new approach to accessing the native hardware from within an emulated environment which allows for a much simpler emulator to be developed and maintained and yet also offers the potential of accessing other types of hardware without needing to modify the emulation software itself. Hydra is designed to be platform agnostic in that not only is it possible to integrate with existing emulators but also be immediately usable from within guest operating systems, ranging from legacy platforms such as MS-DOS, through to modern platforms such as the PlayStation 4 (Orbis OS, a FreeBSD derivative), through to more exotic platforms such as Plan 9 from Bell Laboratories. It can do this because it does not rely on a complex emulator-specific virtual driver stack. This PhD thesis provides the research undertaken for Hydra, including the motivation behind it, the specific problems it was designed to solve and how it can be implemented in a platform agnostic manner. Hydraās performance is analysed to ascertain the suitability of the output to cater for, specifically, a wide variety of platforms that it can run on in a satisfactory manner within less powerful or emulated environments. A performance analysis study is conducted to ensure that the technology provides an acceptable solution to accessing preserved titles. This study concluded with results showing that Hydra offers a greater performance than software rendering, especially within emulated environments. A bandwidth comparison between Hydra and VNC was undertaken to ascertain the use of the technology as a streaming medium. The results concluded that under specific conditions, Hydra performed better than VNC by streaming at a higher resolution and consuming less bandwidth. Hydra is also utilised in a number of engineering tasks relating to preservation of software. The experiences of using Hydra in this way are discussed, including any difficulties encountered. Lastly, a conclusion is made and any future work is identified
Efficient and predictable high-speed storage access for real-time embedded systems
As the speed, size, reliability and power efficiency of non-volatile storage media increases, and the data demands of many application domains grow, operating systems are being put under escalating pressure to provide high-speed access to storage. Traditional models of storage access assume devices to be slow, expecting plenty of slack time in which to process data between requests being serviced, and that all significant variations in timing will be down to the storage device itself. Modern high-speed storage devices break this assumption, causing storage applications to become processor-bound, rather than I/O-bound, in an increasing number of situations. This is especially an issue in real-time embedded systems, where limited processing resources and strict timing and predictability requirements amplify any issues caused by the complexity of the software storage stack.
This thesis explores the issues related to accessing high-speed storage from real-time embedded systems, providing a thorough analysis of storage operations based on metrics relevant to the area. From this analysis, a number of alternative storage architectures are proposed and explored, showing that a simpler, more direct path from applications to storage can have a positive impact on efficiency and predictability in such systems
Recommended from our members
Engineering with logic: Rigorous test-oracle specification and validation for TCP/IP and the Sockets API
Conventional computer engineering relies on test-and-debug development processes, with the behavior of common interfaces described (at best) with prose specification documents. But prose specifications cannot be used in test-and-debug development in any automated way, and prose is a poor medium for expressing complex (and loose) specifications.
The TCP/IP protocols and Sockets API are a good example of this: they play a vital role in modern communication and computation, and interoperability between implementations is essential. But what exactly they are is surprisingly obscure: their original development focused on ārough consensus and running code,ā augmented by prose RFC specifications that do not precisely define what it means for an implementation to be correct. Ultimately, the actual standard is the de facto one of the common implementations, including, for example, the 15Ā 000 to 20Ā 000 lines of the BSD implementationāoptimized and multithreaded C code, time dependent, with asynchronous event handlers, intertwined with the operating system, and security critical.
This article reports on work done in the
Netsem
project to develop lightweight mathematically rigorous techniques that can be applied to such systems: to specify their behavior precisely (but loosely enough to permit the required implementation variation) and to test whether these specifications and the implementations correspond with specifications that are
executable as test oracles
. We developed post hoc specifications of TCP, UDP, and the Sockets API, both of the service that they provide to applications (in terms of TCP bidirectional stream connections) and of the internal operation of the protocol (in terms of TCP segments and UDP datagrams), together with a testable abstraction function relating the two. These specifications are rigorous, detailed, readable, with broad coverage, and rather accurate. Working within a general-purpose proof assistant (HOL4), we developed
language idioms
(within higher-order logic) in which to write the specifications: operational semantics with nondeterminism, time, system calls, monadic relational programming, and so forth. We followed an
experimental semantics
approach, validating the specifications against several thousand traces captured from three implementations (FreeBSD, Linux, and WinXP). Many differences between these were identified, as were a number of bugs. Validation was done using a special-purpose
symbolic model checker
programmed above HOL4.
Having demonstrated that our logic-based engineering techniques suffice for handling real-world protocols, we argue that similar techniques could be applied to future critical software infrastructure at design time, leading to cleaner designs and (via specification-based testing) more robust and predictable implementations. In cases where specification looseness can be controlled, this should be possible with lightweight techniques, without the need for a general-purpose proof assistant, at relatively little cost.EPSRC Programme Grant EP/K008528/1 REMS: Rigorous Engineering for Mainstream Systems
EPSRC Leadership Fellowship EP/H005633 (Sewell)
Royal Society University Research Fellowship (Sewell)
St Catharine's College Heller Research Fellowship (Wansbrough),
EPSRC grant GR/N24872 Wide-area programming: Language, Semantics and Infrastructure Design
EPSRC grant EP/C510712 NETSEM: Rigorous Semantics for Real
Systems
EC FET-GC project IST-2001-33234 PEPITO Peer-to-Peer Computing: Implementation and Theory
CMI UROP internship support (Smith)
EC Thematic Network IST-2001-38957 APPSEM 2
NICTA was funded by the Australian Government's Backing Australia's Ability initiative, in part through the Australian Research Council
Analyzing Metadata Performance in Distributed File Systems
Distributed file systems are important building blocks in modern computing environments. The challenge of increasing I/O bandwidth to files has been largely resolved by the use of parallel file systems and sufficient hardware. However, determining the best means by which to manage large amounts of metadata, which contains information about files and directories stored in a distributed file system, has proved a more difficult challenge. The objective of this thesis is to analyze the role of metadata and present past and current implementations and access semantics. Understanding the development of the current file system interfaces and functionality is a key to understanding their performance limitations. Based on this analysis, a distributed metadata benchmark termed DMetabench is presented. DMetabench significantly improves on existing benchmarks and allows stress on metadata operations in a distributed file system in a parallelized manner. Both intranode and inter-node parallelity, current trends in computer architecture, can be explicitly tested with DMetabench. This is due to the fact that a distributed file system can have different semantics inside a client node rather than semantics between multiple nodes. As measurements in larger distributed environments may exhibit performance artifacts difficult to explain by reference to average numbers, DMetabench uses a time-logging technique to record time-related changes in the performance of metadata operations and also protocols additional details of the runtime environment for post-benchmark analysis. Using the large production file systems at the Leibniz Supercomputing Center (LRZ) in Munich, the functionality of DMetabench is evaluated by means of measurements on different distributed file systems. The results not only demonstrate the effectiveness of the methods proposed but also provide unique insight into the current state of metadata performance in modern file systems
Proxy compilation for Java via a code migration technique
There is an increasing trend that intermediate representations (IRs) are used to deliver programs in more and more languages, such as Java. Although Java can provide many advantages, including a wider portability and better optimisation opportunities on execution, it introduces extra overhead by requiring an IR translation for the program execution. For maximum execution performance, an optimising compiler is placed in the runtime to selectively optimise code regions regarded as āhotspotsā. This common approach has been effectively deployed in many implementation of programming languages. However, the computational resources demanded by this approach made it less efficient, or even difficult to deploy directly in a resourceconstrained environment. One implementation approach is to use a remote compilation technique to support compilation during the execution. The work presented in this dissertation supports the thesis that execution performance can be improved by the use of efficient optimising compilation by using a proxy dynamic optimising compiler. After surveying various approaches to the design and implementation of remote compilation, a proxy compilation system called Apus is defined. To demonstrate the effectiveness of using a dynamic optimising compiler as a proxy compiler, a complete proxy compilation system is written based on a research-oriented Java VirtualMachine (JVM). The proxy compilation system is discussed in detail, showing how to deliver remote binaries and manage a cache of binaries by using a code migration approach. The proxy compilation client shows how the proxy compilation service is integrated with the selective optimisation system to maximise execution performance. The results of empirical measurements of the system are given, showing the efficiency of code optimisation from either the proxy compilation service and a local binary cache. The conclusion of this work is that Java execution performance can be improved by efficient optimising compilation with a proxy compilation service by using a code migration technique
Remote fidelity of Container-Based Network Emulators
This thesis examines if Container-Based Network Emulators (CBNEs) are able to instantiate emulated nodes that provide sufficient realism to be used in information security experiments. The realism measure used is based on the information available from the point of view of a remote attacker. During the evaluation of a Container-Based Network Emulator (CBNE) as a platform to replicate production networks for information security experiments, it was observed that nmap fingerprinting returned Operating System (OS) family and version results inconsistent with that of the host Operating System (OS). CBNEs utilise Linux namespaces, the technology used for containerisation, to instantiate \emulated" hosts for experimental networks. Linux containers partition resources of the host OS to create lightweight virtual machines that share a single OS kernel. As all emulated hosts share the same kernel in a CBNE network, there is a reasonable expectation that the fingerprints of the host OS and emulated hosts should be the same. Based on how CBNEs instantiate emulated networks and that fingerprinting returned inconsistent results, it was hypothesised that the technologies used to construct CBNEs are capable of influencing fingerprints generated by utilities such as nmap. It was predicted that hosts emulated using different CBNEs would show deviations in remotely generated fingerprints when compared to fingerprints generated for the host OS. An experimental network consisting of two emulated hosts and a Layer 2 switch was instantiated on multiple CBNEs using the same host OS. Active and passive fingerprinting was conducted between the emulated hosts to generate fingerprints and OS family and version matches. Passive fingerprinting failed to produce OS family and version matches as the fingerprint databases for these utilities are no longer maintained. For active fingerprinting the OS family results were consistent between tested systems and the host OS, though OS version results reported was inconsistent. A comparison of the generated fingerprints revealed that for certain CBNEs fingerprint features related to network stack optimisations of the host OS deviated from other CBNEs and the host OS. The hypothesis that CBNEs can influence remotely generated fingerprints was partially confirmed. One CBNE system modified Linux kernel networking options, causing a deviation from fingerprints generated for other tested systems and the host OS. The hypothesis was also partially rejected as the technologies used by CBNEs do not influence the remote fidelity of emulated hosts.Thesis (MSc) -- Faculty of Science, Computer Science, 202