    Invalidation-based protocols for replicated datastores

    Distributed in-memory datastores underpin cloud applications that run within a datacenter and demand high performance, strong consistency, and availability. A key feature of datastores is data replication. Data are replicated across servers because a single server often cannot handle the request load; replication is also necessary to guarantee that a server or link failure does not render a portion of the dataset inaccessible. A replication protocol is responsible for ensuring strong consistency between the replicas of a datastore, even when faults occur, by determining the actions necessary to access and manipulate the data. Consequently, a replication protocol also drives the datastore's performance. Existing strongly consistent replication protocols deliver fault tolerance but fall short in terms of performance. Meanwhile, the opposite occurs in the world of multiprocessors, where data are replicated across the private caches of different cores: the multiprocessor regime uses invalidations to afford strongly consistent replication with high performance but neglects fault tolerance. Although handling failures in the datacenter is critical for data availability, we observe that fault-free operation is the common case and far exceeds operation during faults. In other words, the common operating environment inside a datacenter closely resembles that of a multiprocessor. Based on this insight, we draw inspiration from the multiprocessor for high-performance, strongly consistent replication in the datacenter. The primary contribution of this thesis is in adapting invalidation-based protocols to the nuances of replicated datastores, which include skewed data accesses, fault tolerance, and distributed transactions.
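
    To make the invalidation idea concrete, the sketch below models a write that first invalidates every remote copy and only then installs the new value, so no replica can ever serve a stale read. It is a toy, single-threaded illustration: acknowledgements, failures, and the thesis's actual protocol machinery are elided, and every name in it is invented for illustration.

```python
# Minimal sketch of invalidation-based replication (multiprocessor-style),
# not the thesis's protocol. Acks, concurrency, and fault handling elided.

class Replica:
    def __init__(self, rid):
        self.rid = rid
        self.store = {}      # key -> value
        self.valid = set()   # keys whose local copy may serve reads

    def read(self, key):
        # Reads are served only from valid (non-invalidated) copies,
        # which is what makes the scheme strongly consistent.
        if key in self.valid:
            return self.store[key]
        raise KeyError(f"{key!r} invalidated on replica {self.rid}")

    def invalidate(self, key):
        self.valid.discard(key)

    def apply(self, key, value):
        self.store[key] = value
        self.valid.add(key)

def write(replicas, coordinator, key, value):
    """Phase 1: invalidate all other copies so no stale read is possible.
    Phase 2: install the new value and revalidate everywhere."""
    for r in replicas:
        if r is not coordinator:
            r.invalidate(key)
    for r in replicas:
        r.apply(key, value)

replicas = [Replica(i) for i in range(3)]
write(replicas, replicas[0], "x", 42)
assert all(r.read("x") == 42 for r in replicas)
```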

    Techniques for Identifying Elusive Corner-Case Bugs in Systems Software

    Modern software is plagued by elusive corner-case bugs (e.g., security bugs). Because there are no scalable, automated ways of finding them, such bugs can remain hidden until the software is deployed in production. This thesis proposes approaches to solve this problem. First, we present black-box and white-box fault injection mechanisms, which allow developers to test the behavior of their own code in the presence of failures in external components, e.g., in libraries, in the kernel, or in remote nodes of a distributed system. We describe how to make black-box fault injection more efficient by prioritizing tests based on their estimated impact. For white-box testing, we propose and implement a technique to find Trojan messages in distributed systems, i.e., messages that are accepted as valid by receiver nodes yet cannot be sent by any correct sender node. We show that Trojan messages can lead to subtle semantic bugs. We used fault injection techniques to find new bugs in systems such as the MySQL database, the Apache HTTP server, the FSP file service protocol suite, and the PBFT Byzantine-fault-tolerant replication library. Testing can find bugs and build confidence in the correctness of a system. However, exhaustive testing is often infeasible, and therefore testing may not discover all bugs before a system is deployed. In the second part of this thesis, we describe how to automatically harden production systems, reducing the impact of any corner-case bugs missed by testing. We present a framework that reduces the overhead of instrumentation tools such as memory error detectors; lowering this cost enables system developers to use such tools in production to harden their systems. We used our framework to generate a version of the Linux kernel hardened with AddressSanitizer. Our hardened kernel retains most of the benefit of full instrumentation: it detects the same vulnerabilities as full instrumentation (7 out of 11 privilege escalation exploits from 2013-2014 can be detected using instrumentation tools), yet at only a quarter of the overhead.
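
    Black-box fault injection of the kind described above can be pictured as a thin wrapper around an external dependency that forces its error paths. The sketch below is a hypothetical, minimal Python analogue, not the thesis's tool (which targets C/C++ systems software): it wraps a file-like object and injects the I/O errors the kernel could legitimately return.

```python
# Hypothetical fault-injection wrapper: force a dependency's failure modes
# so the caller's error-handling code actually gets exercised by tests.
import errno
import io
import random

class FaultyFile:
    """Wraps a file-like object; reads fail with a given probability."""
    def __init__(self, f, fail_prob=0.3, seed=None):
        self.f = f
        self.fail_prob = fail_prob
        self.rng = random.Random(seed)

    def read(self, n=-1):
        if self.rng.random() < self.fail_prob:
            # Inject a failure the external component (here, kernel file
            # I/O) is allowed to return under its documented contract.
            raise OSError(errno.EIO, "injected I/O error")
        return self.f.read(n)

# Drive the rarely-tested error path deterministically.
f = FaultyFile(io.StringIO("payload"), fail_prob=1.0)
try:
    f.read()
except OSError as e:
    print("caller saw:", e)
```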

    Architectural Support for Hypervisor-Level Intrusion Tolerance in MPSoCs

    Increasingly, more aspects of our lives rely on the correctness and safety of computing systems, namely in the embedded and cyber-physical (CPS) domains, which directly affect the physical world. While systems have been pushed to their limits of functionality and efficiency, security threats and generic hardware quality have challenged their safety. Leveraging the enormous modular power, diversity and flexibility of these systems, often deployed as multi-processor systems-on-chip (MPSoC), requires careful orchestration of complex and heterogeneous resources, a task left to low-level software, e.g., hypervisors. In current architectures, this software forms a single point of failure (SPoF) and a worthwhile target for attacks: once compromised, adversaries can gain access to all information and full control over the platform and the environment it controls, for instance by means of privilege escalation and resource allocation. Current solutions for protecting low-level software often rely on a simpler, underlying trusted layer, which is often a SPoF itself and/or exhibits degraded performance. Architectural hybridization allows for the introduction of trusted-trustworthy components which, combined with fault and intrusion tolerance (FIT) techniques leveraging replication, are capable of safely handling critical operations, thus eliminating SPoFs. Performing quorum-based consensus on all critical operations, in particular privilege management, ensures that no compromised low-level software can single-handedly manipulate privilege escalation or resource allocation to negatively affect other system resources, by propagating faults or further extending an adversary's control. However, the performance impact of traditional Byzantine fault-tolerant state-machine replication (BFT-SMR) protocols is prohibitive in the context of MPSoCs, due to the high cost of cryptographic operations and the quantity of messages exchanged. Furthermore, fault isolation, one of the key prerequisites of FIT, is a complicated challenge in such platforms, given that the whole system resides within one chip. So far, no solution completely and efficiently addresses the SPoF issue in critical low-level management software. Our aim, then, is to devise such a solution, one that additionally reaps the benefits of the tightly coupled nature of such manycore systems. In this thesis we present two architectures, iBFT and Midir, which use trusted-trustworthy mechanisms and consensus protocols to protect all software layers, particularly at the lowest level, by performing critical operations only when a majority of correct replicas agree to their execution. Moreover, we discuss how these can be used at the application level, using the example of replicated applications sharing critical data structures. It then becomes possible to confine software-level faults and some hardware faults to the individual tiles of an MPSoC, converting tiles into fault containment domains, thus enabling fault isolation and, consequently, paving the way for high-performance FIT at the lowest level.
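
    The quorum-gating idea at the heart of this approach fits in a few lines: a critical operation, such as a privilege change, executes only if a strict majority of replicas proposed the identical request. The sketch below is an illustrative abstraction only; iBFT and Midir realize this with trusted-trustworthy hardware mechanisms rather than application code, and all names here are invented.

```python
# Toy model of quorum-gated privileged operations: a request executes only
# when a strict majority of replicas voted for the same request, so one
# compromised replica cannot single-handedly escalate privileges.
from collections import Counter

def quorum_execute(votes, n_replicas, action):
    """votes: list of (replica_id, proposed_request) pairs."""
    tally = Counter(req for _, req in votes)
    request, count = tally.most_common(1)[0]
    if count > n_replicas // 2:
        return action(request)            # majority agrees: perform it
    raise PermissionError("no quorum; request dropped")

# 3 replicas; one compromised replica proposes a rogue page mapping.
votes = [(0, ("map_page", 0x1000)),
         (1, ("map_page", 0x1000)),
         (2, ("map_page", 0xdead))]
print(quorum_execute(votes, 3, lambda req: f"executed {req}"))
```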

    Algorithmic Regulation using AI and Blockchain Technology

    This thesis investigates the application of AI and blockchain technology to the domain of Algorithmic Regulation. Algorithmic Regulation refers to the use of intelligent systems for the enabling and enforcement of regulation (often referred to as RegTech in financial services). The research work focuses on three problems: a) machine interpretability of regulation; b) regulatory reporting of data; and c) federated analytics with data compliance. Uniquely, this research was designed, implemented, tested and deployed in collaboration with the Financial Conduct Authority (FCA), Santander, and RegulAItion, and part-funded by the InnovateUK RegNet project. I am a co-founder of RegulAItion.

    Using AI to Automate the Regulatory Handbook: In this investigation we propose the use of reasoning systems for encoding financial regulation as machine-readable and executable rules. We argue that our rules-based "white-box" approach is needed, as opposed to a "black-box" machine learning approach, because regulators need explainability, and we outline the theoretical foundation needed to encode regulation from the FCA Handbook into machine-readable semantics. We then present the design and implementation of a production-grade regulatory reasoning system built on top of the Java Expert System Shell (JESS) and use it to encode a subset of regulation (consumer credit regulation) from the FCA Handbook. We then perform an empirical evaluation, with the regulator, of the system's performance and accuracy in handling 600 "real-world" queries, and compare it with its human equivalent. The findings suggest that the proposed approach of using reasoning systems not only provides quicker responses, but also more accurate and explainable answers to queries.

    SmartReg: Using Blockchain for Regulatory Reporting: In this investigation we explore the use of distributed ledgers for real-time reporting of data for compliance between firms and regulators. Regulators and firms recognise the growing burden and complexity of regulatory reporting resulting from the lack of data standardisation, the increasing complexity of regulation, and the lack of machine-executable rules. The investigation presents a) the design and implementation of a permissioned Quorum-Ethereum-based regulatory reporting network that makes use of an off-chain reporting service to execute machine-readable rules on banks' data through smart contracts; b) a means for cross-border regulators to share reporting data with each other, giving them a true global view of systemic risk; and c) a means to carry out regulatory reporting using a novel pull-based approach, where the regulator directly "pulls" relevant data out of the banks' environments on an ad-hoc basis, enabling regulators to become more active when addressing risk. We validate the approach and implementation of our system through a pilot use case with a bank and regulator. The outputs of this investigation have informed the Digital Regulatory Reporting initiative, an FCA- and UK-Government-led project to improve regulatory reporting in financial services.
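
    A minimal illustration of the "machine-executable rule" idea underlying the two investigations above: a regulatory rule encoded as data plus a predicate, evaluated directly against a bank's reported record. The thesis's production system is built on JESS (a Java rule engine) and Quorum smart contracts; this standalone Python sketch with invented field names only mirrors the concept.

```python
# Hypothetical machine-readable rule evaluated against a reported record.
# The rule, its identifier, and all field names are invented for
# illustration; the thesis encodes real FCA Handbook rules in JESS.

RULE = {
    "id": "CONC-hypothetical-1",
    "description": "Total charge must not exceed 100% of the amount lent",
    "check": lambda r: r["total_charge"] <= r["loan_amount"],
}

def report_compliance(record: dict, rule: dict) -> dict:
    """Pull-style check: the regulator evaluates the rule against data
    fetched from the bank's environment and gets an explainable verdict."""
    return {
        "rule": rule["id"],
        "compliant": rule["check"](record),
        "explanation": rule["description"],
    }

record = {"loan_amount": 1000, "total_charge": 1250}  # sample bank record
print(report_compliance(record, RULE))  # -> non-compliant, with reason
```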
    RegNet: Using Federated Learning and Blockchain for Privacy-Preserving Data Access: In this investigation we explore the use of Federated Learning and trusted data access for analytics. With the development of stricter data regulation (e.g. GDPR), it is increasingly difficult to share data for collective analytics in a compliant manner. We argue that for data compliance, data does not need to be shared; rather, trusted data access is needed. The investigation presents a) the design and implementation of RegNet, an infrastructure for trusted data access in a secure and privacy-preserving manner for a singular algorithmic purpose, where the algorithms (such as Federated Learning) are orchestrated to run within the infrastructure of the data owners; b) a taxonomy for Federated Learning; and c) the tokenization and orchestration of Federated Learning through smart contracts for auditable governance. We validate our approach and the infrastructure (RegNet) through a real-world use case, involving a number of banks, that makes use of Federated Learning with epsilon-differential privacy to improve the performance of an anti-money-laundering classification model.
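
    The federated pattern described above can be sketched briefly: each bank computes a local update, clips and noises it for epsilon-differential privacy, and only the noised updates leave the bank for aggregation, so raw data never moves. The clipping rule and Laplace noise scale below are simplified assumptions for illustration, not RegNet's actual mechanism.

```python
# Simplified federated step with epsilon-DP noise; parameters are
# illustrative assumptions, not RegNet's configuration.
import numpy as np

def dp_local_update(weights, grad, lr=0.1, clip=1.0, epsilon=1.0):
    # Clip the gradient to bound sensitivity, then add Laplace noise
    # calibrated to sensitivity/epsilon before the update leaves the bank.
    g = grad / max(1.0, np.linalg.norm(grad) / clip)
    noise = np.random.laplace(scale=clip / epsilon, size=g.shape)
    return weights - lr * (g + noise)

def federated_average(updates):
    """Server-side aggregation: average of the banks' noised models."""
    return np.mean(updates, axis=0)

w = np.zeros(4)
local = [dp_local_update(w, np.random.randn(4)) for _ in range(3)]  # 3 banks
w = federated_average(local)   # only noised updates were shared
```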

    Bringing Order into Things: Decentralized and Scalable Ledgering for the Internet-of-Things

    The Internet-of-Things (IoT) is simultaneously the largest and the fastest-growing distributed system known to date. With the expectation of 50 billion devices coming online by 2020, far surpassing the size of the human population, problems related to scale, trustability and security are anticipated. Current IoT architectures are inherently flawed: they are centralized on the cloud and rely on fragile trust-based relationships over a plethora of loosely integrated devices, leaving IoT platforms non-robust for every party involved and unable to scale properly in the near future. The need for a new architecture that addresses these concerns is urgent, as the IoT becomes progressively more ubiquitous, pervasive and demanding regarding the integration of devices and the processing of data, and increasingly susceptible to reliability and security issues. In this thesis, we propose a decentralized ledgering solution for the IoT, leveraging a recent concept: blockchains. Rather than replacing the cloud, our solution presents a scalable and fault-tolerant middleware for recording transactions between peers, under verifiable and decentralized trustability assumptions, with authentication guarantees for IoT devices, cloud services and users. Following the emergent trend in modern IoT architectures, we leverage smart hubs as blockchain gateways, aggregating, pre-processing and forwarding small amounts of data in proximity conditions, to be verified and processed as transactions in the blockchain. The proposed middleware acts as a secure ledger and establishes private channels between peers, requiring transactions in the blockchain to be signed using threshold signature schemes with group-oriented verification properties. The approach improves the decentralization and robustness characteristics under Byzantine fault-tolerance settings, while preserving the blockchain's distributed nature.
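
    The threshold-signature requirement can be illustrated by its quorum logic: a transaction enters the ledger only when at least t distinct hubs endorse the same transaction digest. The sketch below models only that k-of-n gating with plain hashes; a real threshold scheme combines cryptographic signature shares instead, and all names here are invented.

```python
# Toy k-of-n endorsement gate standing in for a threshold signature scheme:
# set-counting over digests replaces cryptographic share combination.
import hashlib

def digest(tx: bytes) -> str:
    return hashlib.sha256(tx).hexdigest()

def accept(tx: bytes, endorsements: dict, t: int) -> bool:
    """endorsements: hub_id -> digest that hub claims to have signed.
    Commit only if at least t distinct hubs endorsed this exact tx."""
    d = digest(tx)
    signers = {hub for hub, sig in endorsements.items() if sig == d}
    return len(signers) >= t

tx = b"sensor-42: temp=21.3"
endorsed = {h: digest(tx) for h in ("hub-a", "hub-b", "hub-c")}
assert accept(tx, endorsed, t=2)   # quorum of hubs met: ledger the tx
```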