Operating System Kernels on Multi-core Architectures
Operating System (OS) kernels have been under research and development for decades, mainly assuming single processor and distributed hardware systems.
With the recent rise of multi-core chips that may incorporate a network on chip (NoC), new challenges have appeared that were not considered before.
Given that a complete multi-core system that works on a single system on chip (SoC) is now the normal case, different cores on a single SoC may
share other physical resources and data. This new sharing scheme on a SoC affects crucial aspects of an overall system like correctness, performance,
predictability, scalability and security. Both hardware and OSs need to cooperate flexibly in order to provide
solutions for such challenges.
An SoC now resembles a small internet, with different cores acting as computer nodes and the network medium provided by advanced digital fabrics such as buses or NoCs, which are
a current research area. However, OSs still assume some hardware features, such as a single physical memory, memory sharing for inter-process communication, page-based protection, and cache operations, even when evolving from uniprocessor to multi-core processors.
Such features may not only degrade performance and other system aspects;
some of them also make no sense for a multi-core SoC and introduce barriers and limitations. While new OS research is considering different kernel designs
to cope with multi-core systems, these designs are still limited by current commercial hardware architectures.
The objective of this thesis is to assess different kernel designs and implementations on multi-core hardware architectures.
Part of the contribution of the thesis is porting
the RTEMS real-time operating system (RTOS) and the seL4 microkernel to the Epiphany and RISC-V hardware architectures, respectively, and weighing the trade-offs among design and implementation decisions. This hands-on experience gave a better understanding of the real-world challenges regarding kernel designs and implementations.
Secure Virtualization of Latency-Constrained Systems
Virtualization is a mature technology in server and desktop environments, where multiple systems are consolidated onto a single physical hardware platform, increasing the utilization of today's multi-core systems and saving resources such as energy, space and cost compared to multiple separate systems. Embedded environments, in turn, often contain multiple separate computing systems with real-time and isolation requirements. For example, modern high-comfort cars use up to a hundred embedded computing systems. Consolidating such diverse configurations promises to save resources such as energy and weight.
In my work I propose a secure software architecture that allows consolidating multiple embedded software systems with timing constraints. The architecture is built on a microkernel-based operating system that supports a variety of virtualization approaches through a generic interface, covering hardware-assisted virtualization and paravirtualization as well as multiple architectures. Studying guest systems with latency constraints with regard to virtualization showed that standard techniques such as high-frequency time-slicing are not a viable approach.
Generally, guest systems are a combination of best-effort and real-time work and thus form a mixed-criticality system. Further analysis showed that such systems need to export relevant internal scheduling information to the hypervisor to support multiple guests with latency constraints. I propose a mechanism to export those relevant events that is secure, flexible, performs well and is easy to use. The thesis concludes with an evaluation covering the virtualization approach on the ARM and x86 architectures and two guest operating systems, Linux and FreeRTOS, as well as an evaluation of the export mechanism.
System Support for Distributed Energy Management in Modular Operating Systems
This thesis proposes a novel approach for managing energy in modular operating systems. Our approach enables energy awareness when the resource-management subsystem is distributed among multiple operating-system modules. There are four key achievements: a model for modularization-aware energy management; support for exposed and distributed energy accounting and allocation; the use of different energy-management interaction protocols; and, finally, support for the virtualization of energy effects.
Operating System Support for Redundant Multithreading
Failing hardware is a fact and trends in microprocessor design indicate that the fraction of hardware suffering from permanent and transient faults will continue to increase in future chip generations. Researchers proposed various solutions to this issue with different downsides: Specialized hardware components make hardware more expensive in production and consume additional energy at runtime. Fault-tolerant algorithms and libraries enforce specific programming models on the developer. Compiler-based fault tolerance requires the source code for all applications to be available for recompilation. In this thesis I present ASTEROID, an operating system architecture that integrates applications with different reliability needs.
ASTEROID is built on top of the L4/Fiasco.OC microkernel and extends the system with Romain, an operating system service that transparently replicates user applications. Romain supports single- and multi-threaded applications without requiring access to the application's source code. Romain replicates applications and their resources completely and thereby does not rely on hardware extensions, such as ECC-protected memory. In my thesis I describe how to efficiently implement replication as a form of redundant multithreading in software. I develop mechanisms to manage replica resources and to make multi-threaded programs behave deterministically for replication.
I furthermore present an approach to handle applications that use shared-memory channels with other programs. My evaluation shows that Romain provides 100% error detection and more than 99.6% error correction for single-bit flips in memory and general-purpose registers. At the same time, Romain's execution time overhead is below 14% for single-threaded applications running in triple-modular redundant mode. The last part of my thesis acknowledges that software-implemented fault tolerance methods often rely on the correct functioning of a certain set of hardware and software components, the Reliable Computing Base (RCB).
I introduce the concept of the RCB and discuss what constitutes the RCB of the ASTEROID system and other fault tolerance mechanisms. Thereafter I show three case studies that evaluate approaches to protecting RCB components and thereby aim to achieve a software stack that is fully protected against hardware errors.
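The error detection and correction described for triple-modular redundant mode can be illustrated with a majority vote over replica states. The following Python sketch is an illustration only and not Romain's implementation: the names `vote` and `flip_bit` are hypothetical, and a replica's state is reduced to a single integer.

```python
from collections import Counter


def flip_bit(value: int, bit: int) -> int:
    """Model a single-bit flip (hypothetical fault-injection helper)."""
    return value ^ (1 << bit)


def vote(replica_states):
    """Majority vote over replica states.

    Returns (agreed_state, indices_of_faulty_replicas).
    Raises RuntimeError if the replicas disagree and no strict majority
    exists: the error is detected but cannot be corrected.
    """
    counts = Counter(replica_states)
    winner, votes = counts.most_common(1)[0]
    if votes * 2 <= len(replica_states):
        raise RuntimeError("replicas diverged and no majority exists")
    faulty = [i for i, s in enumerate(replica_states) if s != winner]
    return winner, faulty


# Three replicas; the second one suffers a single-bit flip.
states = [42, flip_bit(42, 3), 42]
agreed, faulty = vote(states)
print(agreed, faulty)  # prints: 42 [1]
```

With three replicas, a single faulty replica is outvoted and can be corrected; with only two replicas, a mismatch can be detected but not attributed to either side.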
The Design, Implementation, and Evaluation of Software and Architectural Support for ARM Virtualization
The ARM architecture is dominating in the mobile and embedded markets and is making an upwards push into the server and networking markets where virtualization is a key technology. Similar to x86, ARM has added hardware support for virtualization, but there are important differences between the ARM and x86 architectural designs. Given two widely deployed computer architectures with different approaches to hardware virtualization support, we can evaluate, in practice, benefits and drawbacks of different approaches to architectural support for virtualization.
This dissertation explores new approaches to combining software and architectural support for virtualization with a focus on the ARM architecture and shows that it is possible to provide virtualization services an order of magnitude more efficiently than traditional implementations.
First, we investigate why the ARM architecture does not meet the classical requirements for virtualizable architectures and present an early prototype of KVM for ARM, a hypervisor using lightweight paravirtualization to run VMs on ARM systems without hardware virtualization support. Lightweight paravirtualization is a fully automated approach which replaces sensitive instructions with privileged instructions and requires no understanding of the guest OS code.
Second, we introduce split-mode virtualization to support hosted hypervisor designs using ARM's architectural support for virtualization. Different from x86, the ARM virtualization extensions are based on a new hypervisor CPU mode, separate from existing CPU modes. This separate hypervisor CPU mode does not support running existing unmodified OSes, and therefore hosted hypervisor designs, in which the hypervisor runs as part of a host OS, do not work on ARM. Split-mode virtualization splits the execution of the hypervisor such that the host OS with core hypervisor functionality runs in the existing kernel CPU mode, but a small runtime runs in the hypervisor CPU mode and supports switching between the VM and the host OS. Split-mode virtualization was used in KVM/ARM, which was designed from the ground up as an open source project and merged in the mainline Linux kernel, resulting in interesting lessons about translating research ideas into practice.
Third, we present an in-depth performance study of 64-bit ARMv8 virtualization using server hardware and compare against x86. We measure the performance of both standalone and hosted hypervisors on both ARM and x86 and compare their results. We find that ARM hardware support for virtualization can enable faster transitions between the VM and the hypervisor for standalone hypervisors compared to x86, but results in high switching overheads for hosted hypervisors compared to both x86 and to standalone hypervisors on ARM. We identify a key reason for the high switching overhead of hosted hypervisors: the need to save and restore kernel mode state between the host OS kernel and the VM kernel. However, standalone hypervisors, such as Xen, cannot leverage their performance benefit in practice for real application workloads. Other factors related to hypervisor software design and I/O emulation play a larger role in overall hypervisor performance than low-level interactions between the hypervisor and the hardware.
Fourth, realizing that modern hypervisors rely on running a full OS kernel, the hypervisor OS kernel, to support their hypervisor functionality, we present a new hypervisor design which runs the hypervisor and its hypervisor OS kernel in ARM's separate hypervisor CPU mode and avoids the need to multiplex kernel mode CPU state between the VM and the hypervisor. Our design benefits from new architectural features, the virtualization host extensions (VHE), in ARMv8.1 to avoid modifying the hypervisor OS kernel to run in the hypervisor CPU mode. We show that the hypervisor must be co-designed with the hardware features to take advantage of running in a separate CPU mode and implement our changes to KVM/ARM. We show that running the hypervisor OS kernel in a separate CPU mode from the VM and taking advantage of ARM's ability to quickly switch between the VM and hypervisor results in an order of magnitude reduction in overhead for important virtualization microbenchmarks and reduces the overhead of real application workloads by more than 50%.
Microkernel mechanisms for improving the trustworthiness of commodity hardware
The thesis presents microkernel-based software-implemented mechanisms for improving the trustworthiness of computer systems based on commercial off-the-shelf (COTS) hardware that can malfunction when the hardware is impacted by transient hardware faults. The hardware anomalies, if undetected, can cause data corruptions, system crashes, and security vulnerabilities, significantly undermining system dependability. Specifically, we adopt the single event upset (SEU) fault model and address transient CPU or memory faults.
We take advantage of the functional correctness and isolation guarantee provided by the formally verified seL4 microkernel and hardware redundancy provided by multicore processors, design the redundant co-execution (RCoE) architecture that replicates a whole software system (including the microkernel) onto different CPU cores, and implement two variants, loosely-coupled redundant co-execution (LC-RCoE) and closely-coupled redundant co-execution (CC-RCoE), for the ARM and x86 architectures. RCoE treats each replica of the software system as a state machine and ensures that
the replicas start from the same initial state, observe consistent inputs, perform equivalent state transitions, and thus produce consistent outputs during error-free executions. Compared with other software-based error detection approaches, the distinguishing feature of RCoE is that the microkernel and device drivers are also included in redundant co-execution, significantly extending the sphere of replication (SoR).
Based on RCoE, we introduce two kernel mechanisms, fingerprint validation and kernel barrier timeout, which detect fault-induced execution divergences between the replicated systems, with the flexibility of tuning the error detection latency and coverage. The kernel error-masking mechanisms built on RCoE enable downgrading from triple modular redundancy (TMR) to dual modular redundancy (DMR) without service interruption. We run synthetic benchmarks and system benchmarks to evaluate the performance overhead of the approach, observe that the overhead varies with the characteristics of the workloads and the variant (LC-RCoE or CC-RCoE), and conclude that the approach is applicable to real-world applications. The effectiveness of the error detection mechanisms is assessed by conducting fault injection campaigns on real hardware, and the results demonstrate compelling improvement.
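The idea of treating each replica as a state machine and comparing fingerprints to detect divergence can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the class `Replica`, the toy state transition, the choice of CRC-32 as the fingerprint function, and the helper `diverged` are hypothetical and do not describe the thesis's kernel implementation.

```python
import zlib


class Replica:
    """One replicated system, modelled as a deterministic state machine."""

    def __init__(self) -> None:
        self.state = 0
        self.fingerprint = 0

    def step(self, event: int) -> None:
        # Hypothetical state transition: a 32-bit running sum of inputs.
        self.state = (self.state + event) & 0xFFFFFFFF
        # Fold the new state into the running fingerprint (CRC-32 is an
        # assumed choice, used here only because it is in the stdlib).
        self.fingerprint = zlib.crc32(
            self.state.to_bytes(4, "little"), self.fingerprint
        )


def diverged(replicas) -> bool:
    """Compare fingerprints at a synchronisation point."""
    return len({r.fingerprint for r in replicas}) > 1


a, b = Replica(), Replica()
for event in (3, 5, 7):          # same initial state, consistent inputs
    a.step(event)
    b.step(event)
print(diverged([a, b]))          # prints: False

b.state ^= 1 << 4                # inject a single-bit upset into one replica
a.step(1)
b.step(1)
print(diverged([a, b]))          # prints: True
```

How often fingerprints are compared is the tuning knob in this sketch: comparing after every step detects the divergence sooner, while comparing less often costs less synchronisation.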
CheriOS: Designing an untrusted single-address-space capability operating system utilising capability hardware and a minimal hypervisor
This thesis presents the design, implementation, and evaluation of a novel capability operating system: CheriOS. The guiding motivation behind CheriOS is to provide strong security guarantees to programmers, even allowing them to continue to program in fast, but typically unsafe, languages such as C. Furthermore, it does this in the presence of an extremely strong adversarial model: in CheriOS, every compartment -- and even the operating system itself -- is considered actively malicious. Building on top of the architecturally enforced capabilities offered by the CHERI microprocessor, I show that only a few more capability types and enforcement checks are required to provide a strong compartmentalisation model that can facilitate mutual distrust. I implement these new primitives in software, in a new abstraction layer I dub the nanokernel. Among the new OS primitives I introduce are one for integrity and confidentiality called a Reservation (which allows allocating private memory without trusting the allocator), as well as another that can provide attestation about the state of the system, a Foundation (which provides a key to sign and protect capabilities based on a signature of the starting state of a program). I show that, using these new facilities, it is possible to design an operating system without having to trust that the implementation is correct.
CheriOS is fundamentally fail-safe; there are no assumptions about the behaviour of the system, apart from the CHERI processor and the nanokernel, to be broken. Using CHERI and the new nanokernel primitives, programmers can expect full isolation at scopes ranging from a whole program to a single function, and not just with respect to other programs but the system itself. Programs compiled for and run on CheriOS offer full memory safety, both spatial and temporal, enforced control flow integrity between compartments and protection against common vulnerabilities such as buffer overflows, code injection and Return-Oriented-Programming attacks. I achieve this by designing a new CHERI-based ABI (Application Binary Interface) which includes a novel stack structure that offers temporal safety. I evaluate how practical the new designs are by prototyping them and offering a detailed performance evaluation. I also contrast with existing offerings from both industry and academia.
CHERI capabilities can be used to restrict access to system resources, such as memory, with the required dynamic checks being performed by hardware in parallel with normal operation. Using the accelerating features of CHERI, I show that many of the security guarantees that CheriOS offers can come at little to no cost. I present a novel and secure IO/IPC layer that allows secure marshalling of multiple data streams through mutually distrusting compartments, with fine-grained authenticated access control for end-points, and without either copying or encryption. For example, CheriOS can restrict its TCP stack from having access to packet contents, or restrict an open socket to ensure that data sent on it arrives at an endpoint signed as a TLS implementation. Even with added security requirements, CheriOS can perform well on real workloads. I showcase this by running a state-of-the-art webserver, NGINX, atop both CheriOS and FreeBSD and show improvements in performance ranging from 3x to 6x when running on a small-scale low-power FPGA implementation of CHERI-MIPS.
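The way a capability restricts access to memory can be modelled in software as a bounded, permission-carrying reference that can only be narrowed, never widened. The Python sketch below is a simplified model under stated assumptions: real CHERI capabilities are hardware-enforced and carry more than is shown here, and the names `Capability`, `restrict` and `check` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    """Simplified software model of a memory capability (illustrative only)."""

    base: int
    length: int
    perms: frozenset

    def restrict(self, offset: int, length: int, perms) -> "Capability":
        """Derive a narrower capability; bounds and permissions never grow."""
        perms = frozenset(perms)
        if offset < 0 or length < 0 or offset + length > self.length:
            raise PermissionError("derived bounds exceed parent bounds")
        if not perms <= self.perms:
            raise PermissionError("derived permissions exceed parent permissions")
        return Capability(self.base + offset, length, perms)

    def check(self, addr: int, size: int, perm: str) -> None:
        """Dynamic check applied on every access made through the capability."""
        if perm not in self.perms:
            raise PermissionError(f"missing permission: {perm}")
        if addr < self.base or addr + size > self.base + self.length:
            raise PermissionError("access out of bounds")


# A 64-byte read/write region, narrowed to an 8-byte read-only window.
root = Capability(base=0, length=64, perms=frozenset({"load", "store"}))
window = root.restrict(offset=16, length=8, perms={"load"})
window.check(addr=16, size=8, perm="load")      # allowed: inside the window
try:
    window.check(addr=16, size=1, perm="store")  # denied: store was dropped
except PermissionError as err:
    print("denied:", err)
```

In this model, handing `window` to an untrusted component gives it read access to exactly eight bytes and nothing else, which mirrors the kind of restriction the abstract describes for packet contents.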
A Multi-Layered Secure Framework for Vehicular Systems
In recent years, significant developments have been introduced in the vehicular domain, turning vehicles into a network of many embedded systems distributed throughout the car, known as Electronic Control Units (ECUs). Each of these ECUs runs a number of software components that collaborate with each other to perform various vehicle functions. Modern vehicles are also equipped with wireless communication technologies, such as WiFi and Bluetooth, giving them the capability to interact with other vehicles and roadside infrastructure. While these improvements have increased the safety of the automotive system, they have vastly expanded the attack surface of the vehicle and opened the door to new potential security risks. The situation is made worse by a lack of security mechanisms in the vehicular system, which allows a compromise in one of the non-critical sub-systems to escalate and threaten the safety of the entire vehicle and its passengers. This dissertation focuses on providing a comprehensive framework that ensures the security of the vehicular system during its whole life-cycle. This framework aims to prevent cyber-attacks against different components by ensuring secure communications among them. Furthermore, it aims to detect attacks that were not successfully prevented and, finally, to respond to these attacks properly to ensure a high degree of safety and stability of the system.