
    TANDEM: taming failures in next-generation datacenters with emerging memory

    The explosive growth of online services to unforeseen scales has made modern datacenters highly prone to failures. Taming these failures hinges on fast and correct recovery that minimizes service interruptions. To be recoverable, applications must take additional measures during failure-free execution to maintain a recoverable state of data and computation logic. However, these precautionary measures have severe implications for performance, correctness, and programmability, making recovery challenging to realize in practice. Emerging memory, particularly non-volatile memory (NVM) and disaggregated memory (DM), offers a promising opportunity to achieve fast recovery with maximum performance. However, incorporating these technologies into datacenter architecture presents significant challenges: their architectural attributes, which differ significantly from those of traditional memory devices, introduce new semantic challenges for implementing recovery, complicating correctness and programmability. Can emerging memory enable fast, performant, and correct recovery in the datacenter? This thesis aims to answer this question while addressing the associated challenges. When architecting datacenters with emerging memory, system architects face four key challenges: (1) how to guarantee correct semantics; (2) how to efficiently enforce correctness with optimal performance; (3) how to validate end-to-end correctness, including recovery; and (4) how to preserve programmer productivity (programmability). This thesis addresses these challenges through the following approaches: (a) defining precise consistency models that formally specify correct end-to-end semantics in the presence of failures (consistency models also play a crucial role in programmability); (b) developing new low-level mechanisms to efficiently enforce the prescribed models given the capabilities of emerging memory; and (c) creating robust testing frameworks to validate end-to-end correctness and recovery. We start our exploration with non-volatile memory (NVM), which offers fast persistence capabilities directly accessible through the processor’s load-store (memory) interface. Notably, these capabilities can be leveraged to enable fast recovery for Log-Free Data Structures (LFDs) while maximizing performance. However, due to the complexity of modern cache hierarchies, data rarely persists in any specific order, jeopardizing recovery and correctness. Therefore, recovery needs primitives that explicitly control the order of updates to NVM (known as persistency models). We outline the precise specification of a novel persistency model – Release Persistency (RP) – that provides a consistency guarantee for LFDs on what remains in non-volatile memory upon failure. To efficiently enforce RP, we propose a novel microarchitectural mechanism, lazy release persistence (LRP). Using standard LFD benchmarks, we show that LRP achieves fast recovery while incurring minimal performance overhead. We continue our discussion with memory disaggregation, which decouples memory from traditional monolithic servers, offering a promising pathway to very high availability in replicated in-memory data stores. Achieving such availability hinges on transaction protocols that can efficiently handle recovery in this setting, where compute and memory are independent. However, there is a challenge: disaggregated memory (DM) cannot run RPC-style protocols, mandating one-sided transaction protocols.
    Exacerbating the problem, one-sided transactions expose critical low-level ordering to architects, posing a threat to correctness. We present a highly available transaction protocol, Pandora, that is specifically designed to achieve fast recovery in disaggregated key-value stores (DKVSes). Pandora is the first one-sided transactional protocol that ensures correct, non-blocking, and fast recovery in DKVSes. Experiments with our implementation artifacts demonstrate that Pandora achieves fast recovery and high availability while causing minimal disruption to services. Finally, we introduce a novel targeted litmus-testing framework – DART – to validate the end-to-end correctness of transactional protocols with recovery. Using DART’s targeted testing capabilities, we found several critical bugs in Pandora, highlighting the need for robust end-to-end testing methods in the design loop to iteratively fix correctness bugs. Crucially, DART is lightweight and black-box, requiring no intervention from programmers.
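
    The ordering problem behind persistency models can be made concrete with a small sketch. The following C++ fragment is a generic illustration, not the thesis's Release Persistency model or LRP mechanism; it assumes x86 cache-line flush intrinsics (_mm_clflush, _mm_sfence) and a hypothetical NVM-resident Node type, and shows why publishing a node of a log-free data structure must flush and fence the payload before persisting the flag that recovery later inspects.

        #include <immintrin.h>   // _mm_clflush, _mm_sfence (x86)
        #include <cstdint>

        struct Node {                      // hypothetical NVM-resident node
            uint64_t payload;
            uint64_t valid;                // set to 1 once payload is durable
        };

        // Write back one cache line and order it before subsequent stores.
        static inline void persist(const void* p) {
            _mm_clflush(p);
            _mm_sfence();
        }

        void publish(Node* n, uint64_t value) {
            n->payload = value;
            persist(&n->payload);          // payload must reach NVM first...
            n->valid = 1;
            persist(&n->valid);            // ...then the flag that recovery reads
        }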

    GAC-MAC-SGA 2023 Sudbury Meeting: Abstracts, Volume 46


    Distributed Transactions using the SAGA pattern

    Distributed transactions are a crucial component of modern distributed systems, as they ensure data consistency and integrity across multiple databases, servers, or services. They are essential in modern distributed systems, which are often complex and dynamic environments with many interconnected components, and in systems that involve sensitive or critical data. Without distributed transactions, ensuring that data is consistently and correctly updated across multiple systems would be challenging, which could lead to inconsistencies and data loss. The goal of a distributed transaction is to ensure that all operations either commit and take effect or roll back and have no effect, even in the case of failures or errors. However, traditional approaches to implementing distributed transactions, such as the Two-Phase Commit (2PC) protocol, can be inflexible and prone to failure in complex and dynamic environments. The SAGA pattern is a promising alternative approach to implementing distributed transactions, as it allows for more flexibility and resilience in distributed systems.
    In a SAGA, each step in a long-running business process is treated as a separate transaction, and if any step fails, the process is rolled back to a previously known good state and an error-handling process is triggered. This allows for partial failures and the ability to recover from them, making the SAGA pattern particularly well-suited for use in modern distributed systems. In this thesis, we will explore the use of the SAGA pattern for implementing distributed transactions in distributed systems. We will begin by providing an overview of distributed transactions and the challenges they pose in modern distributed systems. We will then introduce the SAGA pattern and discuss its benefits and challenges. Next, we will present a case study that implements our orchestrated approach – piSaga – for distributed transactions in Spring Boot microservices, and we will conduct experiments to measure its performance against a choreographed solution. Finally, we will summarize the key insights and contributions of our analysis.
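
    To make the orchestration idea concrete, the following minimal C++ sketch (a hypothetical illustration, not piSaga or Spring Boot code) pairs each saga step with a compensating action and, when a step fails, compensates the already-completed steps in reverse order.

        #include <functional>
        #include <iostream>
        #include <vector>

        // One saga step: a local transaction plus its compensating action.
        struct SagaStep {
            std::function<bool()> action;      // returns false on failure
            std::function<void()> compensate;  // undoes the action
        };

        // Minimal orchestrator: run steps in order; on failure, compensate
        // the already-completed steps in reverse order.
        bool runSaga(const std::vector<SagaStep>& steps) {
            std::vector<const SagaStep*> done;
            for (const auto& step : steps) {
                if (!step.action()) {
                    for (auto it = done.rbegin(); it != done.rend(); ++it)
                        (*it)->compensate();
                    return false;              // saga aborted, effects undone
                }
                done.push_back(&step);
            }
            return true;                       // saga completed
        }

        int main() {
            std::vector<SagaStep> bookTrip = {
                { [] { std::cout << "reserve flight\n"; return true;  },
                  [] { std::cout << "cancel flight\n"; } },
                { [] { std::cout << "reserve hotel (fails)\n"; return false; },
                  [] { std::cout << "cancel hotel\n"; } },
            };
            runSaga(bookTrip);  // the failed hotel step triggers "cancel flight"
        }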

    Scalable and fault-tolerant data stream processing on multi-core architectures

    With increasing data volumes and velocity, many applications are shifting from the classical “process-after-store” paradigm to a stream processing model: data is produced and consumed as continuous streams. Stream processing captures latency-sensitive applications as diverse as credit card fraud detection and high-frequency trading. These applications are expressed as queries of algebraic operations (e.g., aggregation) over the most recent data using windows, i.e., finite evolving views over the input streams. To guarantee correct results, streaming applications require precise window semantics (e.g., temporal ordering) for operations that maintain state. While high processing throughput and low latency are performance desiderata for stateful streaming applications, achieving both poses challenges. Computing the state of overlapping windows causes redundant aggregation operations: incremental execution (i.e., reusing previous results) reduces latency but prevents parallelization; at the same time, parallelizing window execution for stateful operations with precise semantics demands ordering guarantees and state access coordination. Finally, streams and state must be recovered to produce consistent and repeatable results in the event of failures. Given the rise of shared-memory multi-core CPU architectures and high-speed networking, we argue that it is possible to address these challenges in a single node without compromising window semantics, performance, or fault-tolerance. In this thesis, we analyze, design, and implement stream processing engines (SPEs) that achieve high performance on multi-core architectures. To this end, we introduce new approaches for in-memory processing that address the previous challenges: (i) for overlapping windows, we provide a family of window aggregation techniques that enable computation sharing based on the algebraic properties of aggregation functions; (ii) for parallel window execution, we balance parallelism and incremental execution by developing abstractions for both and combining them into a novel design; and (iii) for reliable single-node execution, we enable strong fault-tolerance guarantees without sacrificing performance by reducing the required disk I/O bandwidth using a novel persistence model. We combine the above to implement an SPE that processes hundreds of millions of tuples per second with sub-second latencies. These results reveal the opportunity to reduce resource and maintenance footprint by replacing cluster-based SPEs with single-node deployments.
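
    As a rough illustration of computation sharing between overlapping windows, the following C++ sketch (a generic pane-based aggregation, not one of the techniques developed in the thesis) splits the stream into non-overlapping panes, keeps one partial sum per pane, and assembles each sliding-window result from the partials instead of re-scanning raw tuples; it assumes an associative aggregation function such as sum.

        #include <cstddef>
        #include <cstdint>
        #include <deque>
        #include <iostream>

        // Sliding sums over a stream: the stream is cut into panes of
        // 'paneSize' tuples, each pane keeps one partial sum, and every
        // window of 'panesPerWindow' panes is produced by combining the
        // partials rather than re-aggregating the raw tuples.
        class SlidingSum {
            std::size_t paneSize, panesPerWindow;
            std::size_t inPane = 0;
            int64_t paneSum = 0;
            std::deque<int64_t> panes;       // ring of completed pane partials
        public:
            SlidingSum(std::size_t paneSize, std::size_t panesPerWindow)
                : paneSize(paneSize), panesPerWindow(panesPerWindow) {}

            void insert(int64_t v) {
                paneSum += v;
                if (++inPane == paneSize) {  // pane complete: publish partial
                    panes.push_back(paneSum);
                    paneSum = 0; inPane = 0;
                    if (panes.size() > panesPerWindow) panes.pop_front();
                    if (panes.size() == panesPerWindow) emit();
                }
            }
        private:
            void emit() {                    // combine partials: O(panes), not O(tuples)
                int64_t total = 0;
                for (int64_t p : panes) total += p;
                std::cout << "window sum = " << total << '\n';
            }
        };

        int main() {
            SlidingSum agg(/*paneSize=*/2, /*panesPerWindow=*/3);  // window = 6 tuples, slide = 2
            for (int64_t v = 1; v <= 10; ++v) agg.insert(v);       // prints 21, 33, 45
        }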

    How can we ground ourselves in care and dance our revolution?

    "How Can We Ground Ourselves in Care and Dance Our Revolution?" contains interviews with 141 activists in 63 countries speaking on how they integrate care into their activism and work. The report also calls on funders to provide more resources to fund collective care practices among feminist movements, directly providing recommendations, and next steps from activists on the frontlines. The perspectives shared in our report offer a unique and diverse understanding of collective care, highlighting the experiences of women, trans, and non-binary feminist activists who have been historically marginalized within social justice movements worldwide

    Optimistic Adaptation of Decentralised Role-based Software Systems

    The complexity of computer networks has been rising over the last decades. Increasing interconnectivity between multiple devices, growing complexity of performed tasks, and strong collaboration between nodes are drivers of this phenomenon. Internet-of-Things devices, whose relevance has risen in recent years, are one example. The increasing number of devices requiring updates and supervision makes maintenance more difficult; human intervention is costly and time-consuming. To overcome this, self-adaptive software systems (SAS) can be used. SAS are a subset of autonomous systems that can monitor themselves and their environment to adapt to changes without human interaction. In the literature, different approaches for engineering SAS have been proposed, including techniques for executing adaptations on multiple devices based on generated plans for reacting to changes. Among these solutions are also decentralised approaches. To the best of our knowledge, however, no approach for engineering a SAS exists that tolerates errors during the execution of adaptations in a decentralised setting. While some approaches for role-based execution reset the application in case of a single failure during the adaptation process, others make no assumptions about errors or do not consider an erroneous environment. In a real-world environment, errors are likely to occur at run-time, and the adaptation process can be disturbed. This work aims to perform adaptations in a decentralised way on role-based systems with a relaxed consistency constraint, i.e., errors during the adaptation phase are tolerated. This increases the availability of nodes, since no rollbacks are required in case of a failure. Moreover, a subset of applications, such as drone swarms, would benefit from an approach with a relaxed consistency model, since parts of the system that adapted successfully can already operate in the adapted configuration instead of waiting for other peers to apply the changes in a later iteration. Furthermore, eliminating the need for atomic adaptation execution makes asynchronous execution of adaptations possible; in that case, we can supervise the adaptation process over a long period and ensure that every peer takes the planned actions as soon as its internal task execution allows. To allow for this relaxed-consistency style of adaptation execution, we develop a decentralised adaptation execution protocol that supports the notion of eventual consistency. As soon as devices reconnect after network congestion or restore their internal state after local failures, our protocol can coordinate the recovery process among multiple devices to attempt recovery of a globally consistent state after errors occur. By superseding the need for a central instance, every peer that receives information about failing peers can start the recovery process. The developed approach can restore a consistent global configuration even if almost all peers fail. Moreover, the approach supports asynchronous adaptations, i.e., peers can execute planned adaptations as soon as they are ready, which increases overall availability in case of delayed adaptation of single nodes. The developed protocol is evaluated with the help of a proof-of-concept implementation. The approach was run in five different experiments with thousands of iterations to show its applicability and reliability.
    The execution time of the protocol and the number of exchanged messages have been measured to compare the protocol across different error cases and system sizes, as well as to show the scalability of the approach. The developed solution has been compared to a blocking approach to demonstrate its feasibility relative to an atomic alternative. The applicability in a real-world scenario is described in an empirical study using the example of a fire-extinguishing drone swarm. The results show that an optimistic approach to adaptation is suitable and that specific scenarios benefit from the improved availability, since no rollbacks are required. Systems can continue their work regardless of the failures of participating nodes in large-scale systems.
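
    For intuition only, the following C++ sketch shows one way optimistic, eventually consistent adaptation can look. It is a hypothetical last-writer-wins reconciliation keyed by an adaptation epoch, not the Decentralised Consistency Compensation Protocol described above: peers apply the newest plan they know of, keep operating, and adopt a newer configuration whenever they reconcile with another peer after reconnecting.

        #include <cstdint>
        #include <initializer_list>
        #include <iostream>
        #include <string>

        // Each peer optimistically applies the newest adaptation plan it knows
        // of and keeps working; consistency is restored lazily by exchanging
        // (epoch, roles) pairs and adopting the highest epoch.
        struct Peer {
            std::string name;
            uint64_t epoch = 0;      // version of the last applied adaptation plan
            std::string roles;       // role configuration currently active

            void applyPlan(uint64_t planEpoch, const std::string& planRoles) {
                if (planEpoch > epoch) { epoch = planEpoch; roles = planRoles; }
            }
            // Anti-entropy step: after reconnecting, adopt the newer configuration.
            void reconcile(const Peer& other) { applyPlan(other.epoch, other.roles); }
        };

        int main() {
            Peer a{"a"}, b{"b"}, c{"c"};
            a.applyPlan(1, "scout");  b.applyPlan(1, "scout");   // c was disconnected
            a.applyPlan(2, "extinguisher");                      // b and c missed this plan
            c.reconcile(a); b.reconcile(a);                      // recovery after reconnect
            for (const Peer* p : {&a, &b, &c})
                std::cout << p->name << ": epoch " << p->epoch
                          << ", role " << p->roles << '\n';
        }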

    Merging Queries in OLTP Workloads

    OLTP applications are usually executed by a high number of clients in parallel and typically face high throughput demands as well as strict latency requirements for individual statements. In enterprise scenarios, they often have to deal with overload spikes resulting from events such as Cyber Monday or Black Friday. The traditional solution to avoid running out of resources and thus cope with such spikes is significant over-provisioning of the underlying infrastructure. In this thesis, we analyze real enterprise OLTP workloads with respect to statement types, complexity, and hot-spot statements. Interestingly, our findings reveal that workloads are often read-heavy and comprise similar query patterns, which provides the potential to share work across statements belonging to different transactions. In the past, resource sharing has been extensively studied for OLAP workloads. Naturally, the question arises: why do studies mainly focus on OLAP and not on OLTP workloads? At first sight, OLTP queries often consist of simple calculations, such as index look-ups, with little sharing potential. Consequently, such queries – due to their short execution time – may not offer enough sharing potential to justify the additional overhead. In addition, OLTP workloads do not only execute read operations but also updates. Therefore, sharing work needs to obey transactional semantics, such as the given isolation level and read-your-own-writes. This thesis presents THE LEVIATHAN, a novel batching scheme for OLTP workloads that merges read statements within interactively submitted multi-statement transactions consisting of reads and updates. Our main idea is to merge the execution of statements by merging their plans, making it possible to share the execution not only of complex operations but also of simple calculations, such as the aforementioned index look-ups. We identify mergeable statements by pattern matching on prepared statement plans, which comes with low overhead. To obey the isolation-level properties and provide read-your-own-writes, we first define a formal framework for merging transactions running under a given isolation level and then provide insights into a prototypical implementation of merging within a commercial database system. Our experimental evaluation shows that, depending on the isolation level, the load in the system, and the read share of the workload, an improvement of the transaction throughput by up to a factor of 2.5x is possible without compromising transactional semantics.
    Another interesting effect we show is that, with our strategy, we can increase the throughput of a real enterprise workload by 20%.
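
    As a toy illustration of the merging idea (not THE LEVIATHAN's plan-merging mechanism), the following C++ sketch rewrites identical prepared point lookups issued by different transactions into a single IN-list probe so that the engine is entered once for the whole batch; the table and column names are invented for the example, and routing result rows back to the issuing transactions is omitted.

        #include <cstddef>
        #include <iostream>
        #include <string>
        #include <vector>

        // Point lookups "SELECT ... WHERE id = ?" issued by different
        // transactions are merged into one "... WHERE id IN (...)" probe.
        struct Lookup { int txnId; long key; };

        std::string mergeLookups(const std::vector<Lookup>& batch) {
            std::string sql = "SELECT id, balance FROM accounts WHERE id IN (";
            for (std::size_t i = 0; i < batch.size(); ++i) {
                if (i) sql += ", ";
                sql += std::to_string(batch[i].key);
            }
            return sql + ")";
        }

        int main() {
            std::vector<Lookup> batch = { {7, 42}, {9, 17}, {12, 42} };
            std::cout << mergeLookups(batch) << '\n';
            // -> SELECT id, balance FROM accounts WHERE id IN (42, 17, 42)
        }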

    EEMARQ: Efficient Lock-Free Range Queries with Memory Reclamation

    Multi-Version Concurrency Control (MVCC) is a common mechanism for achieving linearizable range queries in database systems and concurrent data structures. The core idea is to keep previous versions of nodes to serve range queries while still providing atomic reads and updates. Existing concurrent data-structure implementations that support linearizable range queries are either slow, use locks, or rely on blocking reclamation schemes. We present EEMARQ, the first scheme that uses MVCC with lock-free memory reclamation to obtain a fully lock-free data structure supporting linearizable inserts, deletes, contains, and range queries. Evaluation shows that EEMARQ outperforms existing solutions across most workloads, with lower space overhead and while providing full lock freedom.
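
    For intuition, the following single-threaded C++ sketch shows the MVCC idea the abstract builds on: version chains per key and snapshot-based visibility for range queries. It is a hypothetical illustration, not EEMARQ's lock-free design, and it deliberately omits concurrency control and memory reclamation.

        #include <atomic>
        #include <cstdint>
        #include <map>
        #include <optional>
        #include <utility>
        #include <vector>

        // Every update installs a new (timestamp, value) version instead of
        // overwriting, so a range query can read a consistent snapshot "as of"
        // the timestamp drawn at its start, ignoring later concurrent updates.
        struct Version { uint64_t ts; std::optional<int> value; };  // nullopt = deleted

        class VersionedMap {
            std::atomic<uint64_t> clock{0};
            std::map<int, std::vector<Version>> versions;           // key -> version chain
        public:
            void put(int key, int value) { versions[key].push_back({++clock, value}); }
            void erase(int key)          { versions[key].push_back({++clock, std::nullopt}); }

            // Range query over [lo, hi] at a fixed snapshot timestamp.
            std::vector<std::pair<int, int>> range(int lo, int hi) {
                const uint64_t snap = clock.load();
                std::vector<std::pair<int, int>> out;
                for (auto it = versions.lower_bound(lo);
                     it != versions.end() && it->first <= hi; ++it) {
                    const Version* visible = nullptr;
                    for (const Version& v : it->second)
                        if (v.ts <= snap) visible = &v;              // newest version <= snapshot
                    if (visible && visible->value)
                        out.push_back({it->first, *visible->value});
                }
                return out;
            }
        };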

    Patient-Specific Implants in Musculoskeletal (Orthopedic) Surgery

    Most treatments in medicine are patient-specific, aren’t they? So why should we bother with individualizing implants if we adapt our therapy to patients anyway? Looking at the neighboring field of oncologic treatment, you would not question the fact that the individualization of tumor therapy with personalized antibodies has allowed this field to thrive, in terms of both patient survival and positive responses to alternatives to conventional treatments. Regarding the latest cutting-edge developments in orthopedic surgery and biotechnology, including new imaging techniques and 3D-printing of bone substitutes as well as implants, we do have an armamentarium available to stimulate the race for innovation in medicine. This Special Issue of the Journal of Personalized Medicine will gather relevant new and developed techniques already in clinical practice. Examples include developments in revision arthroplasty and tumor (pelvic replacement) surgery to recreate individual defects, individualized implants for primary arthroplasty to establish physiological joint kinematics, and personalized implants in fracture treatment, to name but a few.

    Trusted Artificial Intelligence in Manufacturing

    The successful deployment of AI solutions in manufacturing environments hinges on their security, safety, and reliability, which becomes more challenging in settings where multiple AI systems (e.g., industrial robots, robotic cells, Deep Neural Networks (DNNs)) interact as atomic systems and with humans. To guarantee the safe and reliable operation of AI systems on the shopfloor, many challenges must be addressed in the scope of complex, heterogeneous, dynamic, and unpredictable environments. Specifically, data reliability, human-machine interaction, security, transparency, and explainability challenges need to be addressed at the same time. Recent advances in AI research (e.g., in deep neural network security and explainable AI (XAI) systems), coupled with novel research outcomes in the formal specification and verification of AI systems, provide a sound basis for safe and reliable AI deployments in production lines. Moreover, the legal and regulatory dimension of safe and reliable AI solutions in production lines must be considered as well. To address some of the challenges listed above, fifteen European organizations collaborate in the scope of the STAR project, a research initiative funded by the European Commission under its H2020 program (Grant Agreement Number: 956573). STAR researches, develops, and validates novel technologies that enable AI systems to acquire knowledge in order to take timely and safe decisions in dynamic and unpredictable environments. Moreover, the project researches and delivers approaches that enable AI systems to confront sophisticated adversaries and to remain robust against security attacks. This book is co-authored by the STAR consortium members and provides a review of technologies, techniques, and systems for trusted, ethical, and secure AI in manufacturing. The chapters of the book cover systems and technologies for industrial data reliability, responsible and transparent artificial intelligence systems, human-centred manufacturing systems such as human-centred digital twins, cyber-defence in AI systems, simulated reality systems, human-robot collaboration systems, as well as automated mobile robots for manufacturing environments. A variety of cutting-edge AI technologies are employed by these systems, including deep neural networks, reinforcement learning systems, and explainable artificial intelligence systems. Furthermore, relevant standards and applicable regulations are discussed. Beyond reviewing state-of-the-art standards and technologies, the book illustrates how the STAR research goes beyond the state of the art towards enabling and showcasing human-centred technologies in production lines. Emphasis is put on dynamic human-in-the-loop scenarios, where ethical, transparent, and trusted AI systems co-exist with human workers. The book is made available as an open-access publication, making it broadly and freely available to the AI and smart manufacturing communities.