348 research outputs found

    Value-based scheduling in real-time systems

    Get PDF
    A real-time system must execute functionally correct computations in a timely manner. Most of the current real-time systems are static in nature. However in recent years, the growing need for building complex real-time applications coupled with advancements in information technology drives the need for dynamic real-time systems. Dynamic real-time systems need to be designed not only to deal with expected load scenarios, but also to handle overloads by allowing graceful degradation in system performance. Value-based scheduling is a means by which graceful degradation can be achieved by executing critical tasks that offer high values/benefits/rewards to the functioning of the system. This thesis identifies the following two issues in dynamic real-time scheduling: (i) maintaining high system reliability without affecting its schedulability and (ii) providing graceful degradation to the system during overload and maintaining high schedulability during underloads or near full loads. Further, we use value-based scheduling techniques to address these issues. The first contribution of this thesis is a reliability-aware value-based scheduler capable of maintaining high system reliability and schedulability. We use a performance index (PI) based value function for scheduling, which can capture the tradeoff between schedulability and reliability. The proposed scheduler selects a suitable redundancy level for each task so as to increase the performance index of the system. We show through our simulation studies that proposed scheduler maintains a high system value (PI). The second contribution of this thesis is an adaptive value-based scheduler that can change its scheduling behavior from deadline-based scheduling to value-based scheduling based on the system workload, so that it can maintain a high system value with fewer deadline misses. Further, the scheduler is extended to heterogeneous computing (HC) systems, wherein the computing capabilities of processors/machines are different, and propose two adaptive schedulers (Basic and Integrated) for HC systems. The performance of the proposed scheduling algorithms is studied through extensive simulation studies for both homogeneous and heterogeneous computing systems. We have concluded that the proposed adaptive scheduling scheme maintains a high system value with fewer deadlines misses for all range workloads. Amongst the schedulers for HC systems, we conclude that the Basic scheduler, which has a lesser run-time complexity, performs better for most of the workloads. The last contribution of this thesis is the design and implementation of the proposed adaptive value-based scheduler for homogeneous computing systems in a real-time Linux operating system, RT-Linux. We compare the performance of the implementation with EDF and Highest Value-Density First (HVDF) schedulers for various ranges of workloads and show that the proposed scheduler performs better in maintaining a high system value with fewer deadline misses

    DeSyRe: on-Demand System Reliability

    No full text
    The DeSyRe project builds on-demand adaptive and reliable Systems-on-Chips (SoCs). As fabrication technology scales down, chips are becoming less reliable, thereby incurring increased power and performance costs for fault tolerance. To make matters worse, power density is becoming a significant limiting factor in SoC design, in general. In the face of such changes in the technological landscape, current solutions for fault tolerance are expected to introduce excessive overheads in future systems. Moreover, attempting to design and manufacture a totally defect and fault-free system, would impact heavily, even prohibitively, the design, manufacturing, and testing costs, as well as the system performance and power consumption. In this context, DeSyRe delivers a new generation of systems that are reliable by design at well-balanced power, performance, and design costs. In our attempt to reduce the overheads of fault-tolerance, only a small fraction of the chip is built to be fault-free. This fault-free part is then employed to manage the remaining fault-prone resources of the SoC. The DeSyRe framework is applied to two medical systems with high safety requirements (measured using the IEC 61508 functional safety standard) and tight power and performance constraints

    Design and implementation of secured agent based NoC using shortest path routing algorithm

    Get PDF
    Network on chip (NoC) is a scalable interconnection architecture for every increasing communication demand between many processing cores in system on chip design. Reliability aspects are becoming an important issue in fault tolerant architecture. Hence there is a demand for fault tolerant Agent architecture with suitable routing algorithm which plays a vital role in order to enhance the NoC performance. The proposed fault tolerant Agent based NoC method is used to enhance the reliability and performance of the Multiprocessor System on Chip (MPSoC) design against faulty links and nodes. These agents are placed in hierarchical manner to collect, process, classify and distribute different fault information related to the faulty links and nodes of the network. This fault information is used for further packet routing in the network with the help of shortest path routing algorithm. In addition to this the agent will provide the security for the node by setting firewall, which then decides whether the packet has to be processed or not. This intern provides high performance, low latency NoC by avoiding deadlock and live lock with low area overhead

    Design of a fault tolerant airborne digital computer. Volume 1: Architecture

    Get PDF
    This volume is concerned with the architecture of a fault tolerant digital computer for an advanced commercial aircraft. All of the computations of the aircraft, including those presently carried out by analogue techniques, are to be carried out in this digital computer. Among the important qualities of the computer are the following: (1) The capacity is to be matched to the aircraft environment. (2) The reliability is to be selectively matched to the criticality and deadline requirements of each of the computations. (3) The system is to be readily expandable. contractible, and (4) The design is to appropriate to post 1975 technology. Three candidate architectures are discussed and assessed in terms of the above qualities. Of the three candidates, a newly conceived architecture, Software Implemented Fault Tolerance (SIFT), provides the best match to the above qualities. In addition SIFT is particularly simple and believable. The other candidates, Bus Checker System (BUCS), also newly conceived in this project, and the Hopkins multiprocessor are potentially more efficient than SIFT in the use of redundancy, but otherwise are not as attractive

    Microprocessor fault-tolerance via on-the-fly partial reconfiguration

    Get PDF
    This paper presents a novel approach to exploit FPGA dynamic partial reconfiguration to improve the fault tolerance of complex microprocessor-based systems, with no need to statically reserve area to host redundant components. The proposed method not only improves the survivability of the system by allowing the online replacement of defective key parts of the processor, but also provides performance graceful degradation by executing in software the tasks that were executed in hardware before a fault and the subsequent reconfiguration happened. The advantage of the proposed approach is that thanks to a hardware hypervisor, the CPU is totally unaware of the reconfiguration happening in real-time, and there's no dependency on the CPU to perform it. As proof of concept a design using this idea has been developed, using the LEON3 open-source processor, synthesized on a Virtex 4 FPG

    Problems related to the integration of fault tolerant aircraft electronic systems

    Get PDF
    Problems related to the design of the hardware for an integrated aircraft electronic system are considered. Taxonomies of concurrent systems are reviewed and a new taxonomy is proposed. An informal methodology intended to identify feasible regions of the taxonomic design space is described. Specific tools are recommended for use in the methodology. Based on the methodology, a preliminary strawman integrated fault tolerant aircraft electronic system is proposed. Next, problems related to the programming and control of inegrated aircraft electronic systems are discussed. Issues of system resource management, including the scheduling and allocation of real time periodic tasks in a multiprocessor environment, are treated in detail. The role of software design in integrated fault tolerant aircraft electronic systems is discussed. Conclusions and recommendations for further work are included

    Checkpoint-based forward recovery using lookahead execution and rollback validation in parallel and distributed systems

    Get PDF
    This thesis studies a forward recovery strategy using checkpointing and optimistic execution in parallel and distributed systems. The approach uses replicated tasks executing on different processors for forwared recovery and checkpoint comparison for error detection. To reduce overall redundancy, this approach employs a lower static redundancy in the common error-free situation to detect error than the standard N Module Redundancy scheme (NMR) does to mask off errors. For the rare occurrence of an error, this approach uses some extra redundancy for recovery. To reduce the run-time recovery overhead, look-ahead processes are used to advance computation speculatively and a rollback process is used to produce a diagnosis for correct look-ahead processes without rollback of the whole system. Both analytical and experimental evaluation have shown that this strategy can provide a nearly error-free execution time even under faults with a lower average redundancy than NMR

    Adaptive Fault Tolerance and Graceful Degradation Under Dynamic Hard Real-time Scheduling

    Get PDF
    Static redundancy allocation is inappropriate in hard realtime systems that operate in variable and dynamic environments, (e.g., radar tracking, avionics). Adaptive Fault Tolerance (AFT) can assure adequate reliability of critical modules, under temporal and resources constraints, by allocating just as much redundancy to less critical modules as can be afforded, thus gracefully reducing their resource requirement. In this paper, we propose a mechanism for supporting adaptive fault tolerance in a real-time system. Adaptation is achieved by choosing a suitable redundancy strategy for a dynamically arriving computation to assure required reliability and to maximize the potential for fault tolerance while ensuring that deadlines are met. The proposed approach is evaluated using a real-life workload simulating radar tracking software in AWACS early warning aircraft. The results demonstrate that our technique outperforms static fault tolerance strategies in terms of tasks meeting their timing constraints. Further, we show that the gain in this timing-centric performance metric does not reduce the fault tolerance of the executing tasks below a predefined minimum level. Overall, the evaluation indicates that the proposed ideas result in a system that dynamically provides QOS guarantees along the fault-tolerance dimension
    corecore