
    MCFlow: Middleware for Mixed-Criticality Distributed Real-Time Systems

    Traditional fixed-priority scheduling analysis for periodic/sporadic task sets is based on the assumption that all tasks are equally critical to the correct operation of the system. Therefore, every task has to be schedulable under the scheduling policy, and estimates of tasks' worst-case execution times must be conservative in case a task runs longer than is usual. To address the significant under-utilization of a system's resources under normal operating conditions that can arise from these assumptions, several mixed-criticality scheduling approaches have been proposed. However, to date there has been no quantitative comparison of system schedulability or run-time overhead for the different approaches. In this dissertation, we present what is to our knowledge the first side-by-side implementation and evaluation of those approaches, for periodic and sporadic mixed-criticality tasks on uniprocessor or distributed systems, under a mixed-criticality scheduling model that is common to all these approaches. To make a fair evaluation of mixed-criticality scheduling, we also address some previously open issues and propose modifications to improve schedulability and correctness of particular approaches. To facilitate the development and evaluation of mixed-criticality applications, we have designed and developed a distributed real-time middleware, called MCFlow, for mixed-criticality end-to-end tasks running on multi-core platforms.
The research presented in this dissertation provides the following contributions to the state of the art in real-time middleware: (1) an efficient component model through which dependent subtask graphs can be configured flexibly for execution within a single core, across cores of a common host, or spanning multiple hosts; (2) support for optimizations to inter-component communication to reduce data copying without sacrificing the ability to execute subtasks in parallel; (3) a strict separation of timing and functional concerns so that they can be configured independently; (4) an event dispatching architecture that uses lock-free algorithms where possible to reduce memory contention, CPU context switching, and priority inversion; and (5) empirical evaluations of MCFlow itself and of different mixed-criticality scheduling approaches, both within a single host and end-to-end across multiple hosts. The results of our evaluation show that, in terms of basic distributed real-time behavior, MCFlow performs comparably to the state-of-the-art TAO real-time object request broker when only one core is used and outperforms TAO when multiple cores are involved. We also identify and categorize different use cases under which different mixed-criticality scheduling approaches are preferable.

    Trust enforcement through self-adapting cloud workflow orchestration

    Providing runtime intelligence for a workflow in a highly dynamic cloud execution environment is a challenging task due to the continuously changing cloud resources. Guaranteeing a certain level of workflow Quality of Service (QoS) during execution requires continuous monitoring to detect any performance violation caused by resource shortage or even cloud service interruption. Most orchestration schemes are either configuration- or deployment-dependent and do not cope with dynamically changing environment resources. In this paper, we propose a workflow orchestration, monitoring, and adaptation model that relies on trust evaluation to detect QoS performance degradation and perform an automatic reconfiguration to guarantee the QoS of the workflow. The monitoring and adaptation schemes are able to detect and repair different types of real-time errors and trigger different adaptation actions, including workflow reconfiguration, migration, and resource scaling. We formalize the cloud resource orchestration using a state machine that efficiently captures different dynamic properties of the cloud execution environment. In addition, we use a model checker to validate our model in terms of reachability, liveness, and safety properties. Extensive experimentation is performed using a health monitoring workflow that we developed to handle a dataset from Intelligent Monitoring in Intensive Care III (MIMIC-III) and deployed over a Docker Swarm cluster. A set of scenarios was carefully chosen to evaluate workflow monitoring and the different adaptation schemes we have implemented. The results show that our automated workflow orchestration model is self-adapting and self-configuring, reacts efficiently to changes, and adapts accordingly while supporting a high level of workflow QoS.

    Per-Priority Flow Control (PPFC) Framework for Enhancing QoS in Metro Ethernet

    Internet communication and services are steadily growing in variety, capacity, and demand. This makes traffic management and quality of service (QoS) approaches for optimizing the Internet a challenging area of research, and flow control and congestion control are fundamental to traffic control, especially on high-speed Metro Ethernet. The IEEE has standardized a method (the IEEE 802.3x standard) that provides Ethernet Flow Control (EFC) using PAUSE frames, which are MAC control frames in the data link layer, to enable or disable data frame transmission. With the introduction of Metro Carrier Ethernet, the conventional ON/OFF IEEE 802.3x approach may no longer be sufficient. Therefore, a new architecture and mechanism that offers more flexible and efficient flow and congestion control, as well as better QoS provisioning, is now necessary.

    Adaptive Quality of Service Control in Distributed Real-Time Embedded Systems

    An increasing number of distributed real-time embedded systems face the critical challenge of providing Quality of Service (QoS) guarantees in open and unpredictable environments. For example, such systems often need to enforce CPU utilization bounds on multiple processors in order to avoid overload and meet end-to-end deadlines, even when task execution times deviate significantly from their estimated values or change dynamically at run-time. This dissertation presents an adaptive QoS control framework which includes a set of control design methodologies to provide robust QoS assurance for systems at different scales. To demonstrate its effectiveness, we have applied the framework to the end-to-end CPU utilization control problem for a common class of distributed real-time embedded systems with end-to-end tasks. We formulate the utilization control problem as a constrained multi-input-multi-output control model. We then present a centralized control algorithm for small or medium-size systems, and a decentralized control algorithm for large-scale systems. Both algorithms are designed systematically based on model predictive control theory to dynamically enforce desired utilizations. We also introduce novel task allocation algorithms to ensure that the system is controllable and feasible for utilization control. Furthermore, we integrate our control algorithms with fault-tolerance mechanisms as an effective way to develop robust middleware systems, which maintain both system reliability and real-time performance even when the system faces malicious external resource contention and permanent processor failures. Both control analysis and extensive experiments demonstrate that our control algorithms and middleware systems can achieve robust utilization guarantees. The control framework has also been successfully applied to other distributed real-time applications such as end-to-end delay control in real-time image transmission.
Our results show that adaptive QoS control middleware is a step towards self-managing, self-healing, and self-tuning distributed computing platforms.

    Multi-resource management in embedded real-time systems

    This thesis addresses the problem of online multi-resource management in embedded real-time systems. It focuses on three research questions. The first question concentrates on how to design an efficient hierarchical scheduling framework that supports independent development and analysis of component-based systems and provides temporal isolation between components. The second question investigates how to change the mapping of resources to tasks and components during run-time efficiently and predictably, and how to analyze the latency of such a system mode change in systems comprised of several scalable components. The third question deals with the scheduling and analysis of a set of parallel tasks with real-time constraints which require simultaneous access to several different resources. For providing temporal isolation we chose a reservation-based approach. We first focused on processor reservations, where timed events play an important role. Common examples are task deadlines, periodic release of tasks, budget replenishment, and budget depletion. Efficient timer management is therefore essential. We investigated the overheads in traditional timer management techniques and presented a mechanism called Relative Timed Event Queues (RELTEQ), which provides an expressive set of primitives at a low processor and memory overhead. We then leveraged RELTEQ to create an efficient, modular, and extensible design for enhancing a real-time operating system with periodic tasks, polling, idling periodic and deferrable servers, and a two-level fixed-priority Hierarchical Scheduling Framework (HSF). The HSF design provides temporal isolation and supports independent development of components by separating the global and local scheduling, and allowing each server to define a dedicated scheduler. Furthermore, the design addresses the system overheads inherent to an HSF and prevents undesirable interference between components.
It limits the interference of inactive servers on the system level by means of wakeup events and a combination of inactive server queues with a stopwatch queue. Our implementation is modular and requires only a few modifications of the underlying operating system. We then investigated scalable components operating in a memory-constrained system. We first showed how to reduce the memory requirements in a streaming multimedia application, based on a particular priority assignment of the different components along the processing chain. Then we investigated adapting the resource provisions to tasks during runtime, referred to as mode changes. We presented a novel mode change protocol called Swift Mode Changes, which relies on Fixed Priority with Deferred preemption Scheduling to reduce the mode change latency bound compared to existing protocols based on Fixed Priority Preemptive Scheduling. We then presented a new partitioned parallel-task scheduling algorithm called Parallel-SRP (PSRP), which generalizes MSRP for multiprocessors, and the corresponding schedulability analysis for the problem of multi-resource scheduling of parallel tasks with real-time constraints. We showed that the algorithm is deadlock-free, derived a maximum bound on blocking, and used this bound as a basis for a schedulability test. We then demonstrated how PSRP can exploit the inherent parallelism of a platform comprised of multiple heterogeneous resources. Finally, we presented Grasp, which is a visualization toolset aiming to provide insight into the behavior of complex real-time systems. Its flexible plugin infrastructure allows for easy extension with custom visualization and analysis techniques for automatic trace verification. 
Its capabilities include the visualization of hierarchical multiprocessor systems, including partitioned and global multiprocessor scheduling with migrating tasks and jobs, communication between jobs via shared memory and message passing, and hierarchical scheduling in combination with multiprocessor scheduling. For tracing distributed systems with asynchronous local clocks, Grasp also supports the synchronization of traces from different processors during the visualization and analysis.
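
The abstract names Relative Timed Event Queues (RELTEQ) but does not describe their layout. As a rough illustration only, the sketch below assumes the usual "delta queue" idea that the name suggests: each event stores its expiry relative to the event before it, so advancing time touches only the head of the queue. The class `DeltaQueue` and its methods `insert` and `tick` are hypothetical names chosen for this example and are not taken from the thesis.

```python
class DeltaQueue:
    """Generic delta queue: each entry holds a delay relative to its predecessor.

    This is an illustrative assumption about relative timed event queues,
    not the RELTEQ design from the thesis.
    """

    def __init__(self):
        self._events = []  # list of [relative_delay, name], ordered by expiry

    def insert(self, delay, name):
        """Schedule `name` to expire `delay` ticks from now."""
        i = 0
        # Walk past events that expire no later than the new one,
        # converting the absolute delay into a relative one on the way.
        while i < len(self._events) and self._events[i][0] <= delay:
            delay -= self._events[i][0]
            i += 1
        self._events.insert(i, [delay, name])
        # The successor's delay is now relative to the new event.
        if i + 1 < len(self._events):
            self._events[i + 1][0] -= delay

    def tick(self, elapsed=1):
        """Advance time by `elapsed` ticks and return the names that expired."""
        fired = []
        while self._events and self._events[0][0] <= elapsed:
            elapsed -= self._events[0][0]
            fired.append(self._events.pop(0)[1])
        # Only the head needs updating; later entries are relative to it.
        if self._events:
            self._events[0][0] -= elapsed
        return fired


queue = DeltaQueue()
queue.insert(5, "deadline")
queue.insert(3, "replenish")
queue.insert(8, "release")
print(queue.tick(3))
print(queue.tick(4))
print(queue.tick(1))
```

A list keeps the sketch short; an embedded implementation would more plausibly use a linked list so that insertion does not shift elements.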

    MACHS: Mitigating the Achilles Heel of the Cloud through High Availability and Performance-aware Solutions

    Cloud computing is continuously growing as a business model for hosting information and communication technology applications. However, many concerns arise regarding the quality of service (QoS) offered by the cloud. One major challenge is the high availability (HA) of cloud-based applications. The key to achieving availability requirements is to develop an approach that is immune to cloud failures while minimizing service level agreement (SLA) violations. To this end, this thesis addresses the HA of cloud-based applications from different perspectives. First, the thesis proposes a component's HA-aware scheduler (CHASE) to manage the deployments of carrier-grade cloud applications while maximizing their HA and satisfying the QoS requirements. Second, a Stochastic Petri Net (SPN) model is proposed to capture the stochastic characteristics of cloud services and quantify the expected availability offered by an application deployment. The SPN model is then associated with an extensible policy-driven cloud scoring system that integrates other cloud challenges (i.e., green and cost concerns) with HA objectives. The proposed HA-aware solutions are extended to include a live virtual machine migration model that provides a trade-off between the migration time and the downtime while maintaining the HA objective. Furthermore, the thesis proposes a generic input template for cloud simulators, GITS, to facilitate the creation of cloud scenarios while ensuring reusability, simplicity, and portability. Finally, an availability-aware CloudSim extension, ACE, is proposed. ACE extends the CloudSim simulator with failure injection, computational paths, repair, failover, load balancing, and other availability-based modules.

    Enabling Artificial Intelligence Analytics on The Edge

    This thesis introduces a novel distributed model for handling edge-based video analytics in real time. The novelty of the model lies in decoupling and distributing the services into several decomposed functions, creating virtual function chains (the VFC model). The model considers both computational and communication constraints. Theoretical, simulation, and experimental results have shown that the VFC model can enable the support of heavy-load services in an edge environment while improving the footprint of the service compared to state-of-the-art frameworks. In detail, results on the VFC model have shown that it can reduce the total edge cost compared with monolithic and simple frame-distribution models. For experimenting on a real-case scenario, a testbed edge environment has been developed, where the aforementioned models, as well as a general distribution framework (Apache Spark), have been deployed. A cloud service has also been considered. Experiments have shown that VFC can outperform all alternative approaches by reducing operational cost and improving the QoS. Finally, a migration model, a caching model, and a QoS monitoring service based on Long-Term-Short-Term models are introduced.

    P4-PSFP: P4-Based Per-Stream Filtering and Policing for Time-Sensitive Networking

    Time-Sensitive Networking (TSN) extends Ethernet to enable real-time communication, including the Credit-Based Shaper (CBS) for prioritized scheduling and the Time-Aware Shaper (TAS) for scheduled traffic. Generally, TSN requires streams to be explicitly admitted before being transmitted. To ensure that admitted traffic conforms to the traffic descriptors indicated for admission control, Per-Stream Filtering and Policing (PSFP) has been defined. For credit-based metering, well-known token bucket policers are applied. However, time-based metering requires time-dependent switch behavior and time synchronization with sub-microsecond precision. While TSN-capable switches support various TSN traffic shaping mechanisms, a full implementation of PSFP is still not available. To bridge this gap, we present a P4-based implementation of PSFP on a 100 Gb/s per port hardware switch. We explain the most interesting aspects of the PSFP implementation, whose code is available on GitHub. We demonstrate credit-based and time-based policing and synchronization capabilities to validate the functionality and effectiveness of P4-PSFP. The implementation scales up to 35840 streams depending on the stream identification method. P4-PSFP can be used in practice as long as appropriate TSN switches lack this function. Moreover, its implementation may be helpful for other P4-based hardware implementations that require time synchronization.
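
The abstract refers to token bucket policers for credit-based metering without spelling out how they work. The following is a minimal sketch of a generic single-rate token bucket in Python, not the P4-PSFP code: the class name `TokenBucketPolicer`, its parameter names, and the byte-based accounting are all assumptions made for illustration. Tokens accumulate at a fixed rate up to a burst limit, and a frame conforms only if enough tokens are available when it arrives.

```python
from dataclasses import dataclass, field


@dataclass
class TokenBucketPolicer:
    """Generic single-rate token bucket (illustrative, hypothetical names)."""

    rate_bytes_per_s: float            # refill rate of the bucket
    burst_bytes: float                 # bucket depth, i.e. largest tolerated burst
    tokens: float = field(init=False)  # current fill level in bytes
    last_time: float = 0.0             # arrival time of the previous frame

    def __post_init__(self):
        # Start with a full bucket so an initial burst is allowed.
        self.tokens = self.burst_bytes

    def conforms(self, now: float, frame_bytes: int) -> bool:
        """Return True if a frame arriving at `now` (seconds) is within profile."""
        elapsed = now - self.last_time
        self.last_time = now
        # Refill for the elapsed time, capped at the bucket depth.
        self.tokens = min(self.burst_bytes,
                          self.tokens + elapsed * self.rate_bytes_per_s)
        if frame_bytes <= self.tokens:
            self.tokens -= frame_bytes
            return True
        # Non-conforming frames consume no tokens; a policer would drop or mark them.
        return False


policer = TokenBucketPolicer(rate_bytes_per_s=1000.0, burst_bytes=1500.0)
for arrival, size in [(0.0, 1500), (0.0, 100), (1.0, 1000), (1.5, 600)]:
    print(arrival, size, policer.conforms(arrival, size))
```

A hardware data plane would typically use integer arithmetic and per-stream state tables instead of floating-point objects; the sketch only shows the metering logic.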