Behind the Last Line of Defense -- Surviving SoC Faults and Intrusions
Today, leveraging the enormous modular power, diversity and flexibility of manycore systems-on-a-chip (SoCs) requires careful orchestration of complex resources, a task left to low-level software, e.g. hypervisors. In current architectures, this software forms a single point of failure and worthwhile target for attacks: once compromised, adversaries gain access to all information and full control over the platform and the environment it controls. This paper proposes Midir, an enhanced manycore architecture, effecting a paradigm shift from SoCs to distributed SoCs. Midir changes the way platform resources are controlled, by retrofitting tile-based fault containment through well-known mechanisms, while securing low-overhead quorum-based consensus on all critical operations, in particular privilege management and, thus, management of containment domains. Allowing versatile redundancy management, Midir promotes resilience for all software levels, including at low level. We explain this architecture, its associated algorithms and hardware mechanisms and show, for the example of a Byzantine fault-tolerant microhypervisor, that it outperforms the highly efficient MinBFT by one order of magnitude.
Behind the last line of defense: Surviving SoC faults and intrusions
Today, leveraging the enormous modular power, diversity and flexibility of manycore systems-on-a-chip (SoCs) requires careful orchestration of complex and heterogeneous resources, a task left to low-level software, e.g., hypervisors. In current architectures, this software forms a single point of failure and worthwhile target for attacks: once compromised, adversaries can gain access to all information and full control over the platform and the environment it controls. This article proposes Midir, an enhanced manycore architecture, effecting a paradigm shift from SoCs to distributed SoCs. Midir changes the way platform resources are controlled, by retrofitting tile-based fault containment through well-known mechanisms, while securing low-overhead quorum-based consensus on all critical operations, in particular privilege management and, thus, management of containment domains. Allowing versatile redundancy management, Midir promotes resilience for all software levels, including at low level. We explain this architecture, its associated algorithms and hardware mechanisms and show, for the example of a Byzantine fault-tolerant microhypervisor, that it outperforms the highly efficient MinBFT by one order of magnitude.
Reliability issues in the design of distributed object-based architectures
PhD Thesis
This thesis is aimed at enhancing the existing set of techniques for building distributed systems, specifically from the point of view of fault-tolerant computing.
Reliability is of fundamental importance in the design and operation of distributed systems, as an increasing number of computers are employed in the automation of various essential services. In the past decade, much research effort has been concerned with the object-based methodology for the design and implementation of reliable distributed systems.
This thesis describes three contributions to this effort. First, it is shown that object-based programming features can in fact be introduced into procedural languages provided that these languages are endowed with certain facilities. Then, work is discussed which illustrates the relationship between distributed object-based architectures and an apparently different form of distributed architectures based on processes. This work puts the notion of object-based architectures into a new perspective, which shows that the object-based philosophy and the process-based philosophy are the dual of each other.
Finally, an important aspect of the design of an object-based distributed architecture is investigated, that of automatic garbage collection. A distributed garbage collection scheme is described that handles fault tolerance by an extension of the technique commonly employed to detect unwanted computations in distributed architectures. The scheme proposed can also be seen as yet a further illustration of the link between object-based and process-based architectures.
Royal Signals and Radar Establishment of the U.K. Ministry of Defence.
Italian Consiglio Nazionale delle Ricerche
Architectural Support for Hypervisor-Level Intrusion Tolerance in MPSoCs
Increasingly, more aspects of our lives rely on the correctness and safety of computing systems, namely in the embedded and cyber-physical (CPS) domains, which directly affect the physical world. While systems have been pushed to their limits of functionality and efficiency, security threats and generic hardware quality have challenged their safety.
Leveraging the enormous modular power, diversity and flexibility of these systems, often deployed in multi-processor systems-on-chip (MPSoC), requires careful orchestration of complex and heterogeneous resources, a task left to low-level software, e.g., hypervisors. In current architectures, this software forms a single point of failure (SPoF) and a worthwhile target for attacks: once compromised, adversaries can gain access to all information and full control over the platform and the environment it controls, for instance by means of privilege escalation and resource allocation. Currently, solutions to protect low-level software often rely on a simpler, underlying trusted layer which is often a SPoF itself and/or exhibits downgraded performance.
Architectural hybridization allows for the introduction of trusted-trustworthy components, which, combined with fault and intrusion tolerance (FIT) techniques leveraging replication, are capable of safely handling critical operations, thus eliminating SPoFs. Performing quorum-based consensus on all critical operations, in particular privilege management, ensures no compromised low-level software can single-handedly manipulate privilege escalation or resource allocation to negatively affect other system resources by propagating faults or further extend an adversary’s control. However, the performance impact of traditional Byzantine fault-tolerant state-machine replication (BFT-SMR) protocols is prohibitive in the context of MPSoCs due to the high costs of cryptographic operations and the quantity of messages exchanged. Furthermore, fault isolation, one of the key prerequisites in FIT, presents a complicated challenge to tackle, given the whole system resides within one chip in such platforms.
There is so far no solution completely and efficiently addressing the SPoF issue in critical low-level management software. It is our aim, then, to devise such a solution that, additionally, reaps the benefits of the tightly coupled nature of such manycore systems. In this thesis we present two architectures, using trusted-trustworthy mechanisms and consensus protocols, capable of protecting all software layers, specifically at low level, by performing critical operations only when a majority of correct replicas agree to their execution: iBFT and Midir. Moreover, we discuss ways in which these can be used at application level on the example of replicated applications sharing critical data structures. It then becomes possible to confine software-level faults and some hardware faults to the individual tiles of an MPSoC, converting tiles into fault containment domains, thus enabling fault isolation and, consequently, making way for high-performance FIT at the lowest level.
Modular redundancy in a message passing system
SIGLE. TIB: RN 4237 (197) / FIZ - Fachinformationszentrum Karlsruhe / TIB - Technische Informationsbibliothek. DE. German