145 research outputs found

Towards Middleware for Fault-tolerance in Distributed Real-time and Embedded Systems

Authors: Aniruddha Gokhale; Douglas C Schmidt; Jaiganesh Balasubramanian; Nanbor Wang
Publication date: 05/03/2020

Abstract. Distributed real-time and embedded (DRE) systems often require support for multiple simultaneous quality of service (QoS) properties, such as real-timeliness and fault tolerance, that operate within resource-constrained environments. These resource constraints motivate the need for a lightweight middleware infrastructure, while the need for simultaneous QoS properties requires the middleware to provide fault tolerance capabilities that respect the time-critical needs of DRE systems. Conventional middleware solutions, such as Fault-tolerant CORBA (FT-CORBA) and Continuous Availability API for J2EE, have limited utility for DRE systems because they are heavyweight (e.g., the complexity of their feature-rich fault tolerance capabilities consumes excessive runtime resources), yet incomplete (e.g., they lack mechanisms that enable fault tolerance while maintaining real-time predictability). This paper provides three contributions to the development and standardization of lightweight real-time and fault-tolerant middleware for DRE systems. First, we discuss the challenges in realizing real-time fault-tolerant solutions for DRE systems using contemporary middleware. Second, we describe recent progress towards standardizing a CORBA lightweight fault-tolerance specification for DRE systems. Third, we present the architecture of FLARe, which is a prototype based on the OMG real-time fault-tolerant CORBA middleware standardization efforts that is lightweight (e.g., leverages only those server- and client-side mechanisms required for real-time systems) and predictable (e.g., provides fault-tolerant mechanisms that respect time-critical performance needs of DRE systems).

CiteSeerX

Resource-Aware Deployment, Configuration, and Adaptation for Fault-tolerant Distributed Real-time Embedded Systems

Author: Balasubramanian Jaiganesh
Publication venue: VANDERBILT

MDDPro: Model-Driven Dependability Provisioning in Enterprise Distributed Real-Time and Embedded Systems

Authors: Aniruddha Gokhale; Jaiganesh Balasubramanian; Sumant Tambe; Thomas Damiano
Publication venue: Springer Fachmedien Wiesbaden GmbH
Publication date: 01/01/2007

Abstract. Service-oriented architecture (SOA) design principles are increasingly being adopted to develop distributed real-time and embedded (DRE) systems, such as avionics mission computing, due to the availability of real-time component middleware platforms. Traditional approaches to fault tolerance that rely on replication and recovery of a single server or a single host do not work in this paradigm, since the fault management schemes must now account for the timely and simultaneous failover of groups of entities while improving system availability by minimizing the risk of simultaneous failures of replicated entities. This paper describes MDDPro, a model-driven dependability provisioning tool for DRE systems. MDDPro provides intuitive modeling abstractions to specify failover requirements of DRE systems at different granularities. MDDPro enables plugging in different replica placement algorithms to improve system availability. Finally, its generative capabilities automate the deployment and configuration of the DRE system on the underlying platforms.

CiteSeerX

Component-based Fault Tolerance for Distributed Real-Time and Embedded Systems

Author: Wolf Friedhelm
Publication venue: VANDERBILT

Model-driven Fault-Tolerance Provisioning for Component-based Distributed Real-time Embedded Systems

Author: Tambe Sumant
Publication venue: VANDERBILT

Principles for Safe and Automated Middleware Specializations for Distributed, Real-time and Embedded Systems

Author: Dabholkar Akshay Vishwas
Publication venue: VANDERBILT

Replicated execution of workflows

Author: Schäfer David Richard
Publication date: 01/01/2018

Workflows are the de facto standard for managing and optimizing business processes. Workflows allow businesses to automate interactions between business locations and partners residing anywhere on the planet. This, however, requires the workflows to be executed in a distributed and dynamic environment, where device and communication failures occur quite frequently. If a workflow execution becomes unavailable because of such failures, the business operations that rely on the workflow might be hindered or even stopped, resulting in financial loss. Consequently, availability is a key concern when using workflows in dynamic environments. In this thesis, we propose replication schemes for workflow engines to ensure the availability of the workflows that are executed by these engines. Of course, a workflow that is executed by a replicated workflow engine has to yield the same result as a non-replicated execution of that workflow. To this end, we formally define the equivalence of a replicated and a non-replicated execution, called Single-Execution-Equivalence. Subsequently, we present replication schemes for both imperative and declarative workflow languages. Imperative workflow languages, such as the Web Service Business Process Execution Language (WS-BPEL), specify the execution order of activities through an ordering relation and are the predominant way of specifying workflow models. We implement a proof-of-concept to demonstrate the compatibility of our replication schemes with current (imperative) workflow technology. Declarative workflow languages provide greater flexibility by allowing the reordering of the activities within a workflow at run-time. We exploit this by executing differently ordered replicas on several nodes in the network to further improve availability.

Passive Fault-Tolerance Management in Component-Based Embedded Systems

Authors: Coelho Jorge; Nogueira Luís
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 01/01/2015

It is imperative to accept that failures can and will occur even in meticulously designed distributed systems and to design proper measures to counter those failures. Passive replication minimizes resource consumption by only activating redundant replicas in case of failures, as typically, providing and applying state updates is less resource demanding than requesting execution. However, most existing solutions for passive fault tolerance are designed and configured at design time, explicitly and statically identifying the most critical components and their number of replicas, and so lack the flexibility needed to handle the runtime dynamics of distributed component-based embedded systems. This paper proposes a cost-effective adaptive fault tolerance solution with a significantly lower overhead compared to a strict active redundancy-based approach, achieving high error coverage with a minimum amount of redundancy. The activation of passive replicas is coordinated through a feedback-based coordination model that reduces the complexity of the needed interactions among components until a new collective global service solution is determined, hence improving the overall maintainability and robustness of the system.

Repositório Científico do Instituto Politécnico do Porto

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

Real-Time Reliable Middleware for Industrial Internet-of-Things

Author: Wang Chao
Publication venue: Washington University Open Scholarship
Publication date: 15/05/2019

This dissertation contributes to the area of adaptive real-time and fault-tolerant systems research, applied to Industrial Internet-of-Things (IIoT) systems. Heterogeneous timing and reliability requirements arising from IIoT applications have posed challenges for IIoT services to efficiently differentiate and meet such requirements. Specifically, IIoT services must both differentiate processing according to applications' timing requirements (including latency, event freshness, and relative consistency of each other) and enforce the needed levels of assurance for data delivery (even as far as ensuring zero data loss). It is nontrivial for an IIoT service to efficiently differentiate such heterogeneous IIoT timing/reliability requirements to fit each application, especially when facing increasingly large data traffic and when common fault-tolerant mechanisms tend to introduce latency and latency jitter. This dissertation presents a new adaptive real-time fault-tolerant framework for IIoT systems, along with efficient and adaptive strategies to meet each IIoT application's timing/reliability requirements. The contributions of the framework are demonstrated by three new IIoT middleware services: (1) Cyber-Physical Event Processing (CPEP), which both differentiates application-specific latency requirements and enforces cyber-physical timing constraints, by prioritizing, sharing, and shedding event processing. (2) Fault-Tolerant Real-Time Messaging (FRAME), which integrates real-time capabilities with a primary-backup replication system, to fit each application's unique timing and loss-tolerance requirements. (3) Adaptive Real-Time Reliable Edge Computing (ARREC), which leverages heterogeneous loss-tolerance requirements and their different temporal laxities, to perform selective and lazy (yet timely) data replication, thus allowing the system to meet needed levels of loss-tolerance while reducing both the latency and bandwidth penalties that are typical of fault-tolerant sub-systems.

Washington University St. Louis: Open Scholarship