228 research outputs found

    Doctor of Philosophy

    Get PDF
    dissertationOver the last decade, cyber-physical systems (CPSs) have seen significant applications in many safety-critical areas, such as autonomous automotive systems, automatic pilot avionics, wireless sensor networks, etc. A Cps uses networked embedded computers to monitor and control physical processes. The motivating example for this dissertation is the use of fault- tolerant routing protocol for a Network-on-Chip (NoC) architecture that connects electronic control units (Ecus) to regulate sensors and actuators in a vehicle. With a network allowing Ecus to communicate with each other, it is possible for them to share processing power to improve performance. In addition, networked Ecus enable flexible mapping to physical processes (e.g., sensors, actuators), which increases resilience to Ecu failures by reassigning physical processes to spare Ecus. For the on-chip routing protocol, the ability to tolerate network faults is important for hardware reconfiguration to maintain the normal operation of a system. Adding a fault-tolerance feature in a routing protocol, however, increases its design complexity, making it prone to many functional problems. Formal verification techniques are therefore needed to verify its correctness. This dissertation proposes a link-fault-tolerant, multiflit wormhole routing algorithm, and its formal modeling and verification using two different methodologies. An improvement upon the previously published fault-tolerant routing algorithm, a link-fault routing algorithm is proposed to relax the unrealistic node-fault assumptions of these algorithms, while avoiding deadlock conservatively by appropriately dropping network packets. This routing algorithm, together with its routing architecture, is then modeled in a process-algebra language LNT, and compositional verification techniques are used to verify its key functional properties. As a comparison, it is modeled using channel-level VHDL which is compiled to labeled Petri-nets (LPNs). Algorithms for a partial order reduction method on LPNs are given. An optimal result is obtained from heuristics that trace back on LPNs to find causally related enabled predecessor transitions. Key observations are made from the comparison between these two verification methodologies

    Formal Analysis of a Fault-Tolerant Routing Algorithm for a Network-on-Chip

    Get PDF
    International audienceA fault-tolerant routing algorithm in Network-on-Chip architectures provides adaptivity for on-chip communications. Adding fault-tolerance adaptivity to a routing algorithm increases its design complexity and makes it prone to deadlock and other problems if improperly implemented. Formal verification techniques are needed to check the correctness of the design. This paper performs formal analysis on an extension of the link-fault tolerant Network-on-Chip architecture introduced by Wu et al. that supports multiflit wormhole routing. This paper describes several lessons learned during the process of constructing a formal model of this routing architecture. Finally, this paper presents how the deadlock freedom and tolerance to a single-link fault is verified for a two-by-two mesh version of this routing architecture

    Composability and Predictability for Independent Application Development, Verification and Execution

    Get PDF
    System-on-chip (SOC) design gets increasingly complex, as a growing number of applications are integrated in modern systems. Some of these applications have real-time requirements, such as a minimum throughput or a maximum latency. To reduce cost, system resources are shared between applications, making their timing behavior inter-dependent. Real-time requirements must hence be verified for all possible combinations of concurrently executing applications, which is not feasible with commonly used simulation-based techniques. This chapter addresses this problem using two complexity-reducing concepts: composability and predictability. Applications in a composable system are completely isolated and cannot affect each other’s behaviors, enabling them to be independently verified. Predictable systems, on the other hand, provide lower bounds on performance, allowing applications to be verified using formal performance analysis. Five techniques to achieve composability and/or predictability in SOC resources are presented and we explain their implementation for processors, interconnect, and memories in our platform

    Exploration and Design of Power-Efficient Networked Many-Core Systems

    Get PDF
    Multiprocessing is a promising solution to meet the requirements of near future applications. To get full benefit from parallel processing, a manycore system needs efficient, on-chip communication architecture. Networkon- Chip (NoC) is a general purpose communication concept that offers highthroughput, reduced power consumption, and keeps complexity in check by a regular composition of basic building blocks. This thesis presents power efficient communication approaches for networked many-core systems. We address a range of issues being important for designing power-efficient manycore systems at two different levels: the network-level and the router-level. From the network-level point of view, exploiting state-of-the-art concepts such as Globally Asynchronous Locally Synchronous (GALS), Voltage/ Frequency Island (VFI), and 3D Networks-on-Chip approaches may be a solution to the excessive power consumption demanded by today’s and future many-core systems. To this end, a low-cost 3D NoC architecture, based on high-speed GALS-based vertical channels, is proposed to mitigate high peak temperatures, power densities, and area footprints of vertical interconnects in 3D ICs. To further exploit the beneficial feature of a negligible inter-layer distance of 3D ICs, we propose a novel hybridization scheme for inter-layer communication. In addition, an efficient adaptive routing algorithm is presented which enables congestion-aware and reliable communication for the hybridized NoC architecture. An integrated monitoring and management platform on top of this architecture is also developed in order to implement more scalable power optimization techniques. From the router-level perspective, four design styles for implementing power-efficient reconfigurable interfaces in VFI-based NoC systems are proposed. To enhance the utilization of virtual channel buffers and to manage their power consumption, a partial virtual channel sharing method for NoC routers is devised and implemented. Extensive experiments with synthetic and real benchmarks show significant power savings and mitigated hotspots with similar performance compared to latest NoC architectures. The thesis concludes that careful codesigned elements from different network levels enable considerable power savings for many-core systems.Siirretty Doriast

    Spatial parallelism in the routers of asynchronous on-chip networks

    Get PDF
    State-of-the-art multi-processor systems-on-chip use on-chip networks as their communication fabric. Although most on-chip networks are implemented synchronously, asynchronous on-chip networks have several advantages over their synchronous counterparts. Timing division multiplexing (TDM) flow control methods have been utilized in asynchronous on-chip networks extensively. The synchronization required by TDM leads to significant speed penalties. Compared with using TDM methods, spatial parallelism methods, such as the spatial division multiplexing (SDM) flow control method, achieve better network throughput with less area overhead.This thesis proposes several techniques to increase spatial parallelism in the routers of asynchronous on-chip networks.Channel slicing is a new pipeline structure that alleviates the speed penalty by removing the synchronization among bit-level data pipelines. It is also found out that the lookahead pipeline using early evaluated acknowledgement can be used in routers to further improve speed.SDM is a new flow control method proposed for asynchronous on-chip networks. It improves network throughput without introducing synchronization among buffers of different frames, which is required by TDM methods. It is also found that the area overhead of SDM is smaller than the virtual channel (VC) flow control method -- the most used TDM method. The major design problem of SDM is the area consuming crossbars. A novel 2-stage Clos switch structure is proposed to replace the crossbar in SDM routers, which significantly reduces the area overhead. This Clos switch is dynamically reconfigured by a new asynchronous Clos scheduler.Several asynchronous SDM routers are implemented using these new techniques. An asynchronous VC router is also reproduced for comparison. Performance analyses show that the SDM routers outperform the VC router in throughput, area overhead and energy efficiency.EThOS - Electronic Theses Online ServiceGBUnited Kingdo

    Spatial parallelism in the routers of asynchronous on-chip networks

    Get PDF
    State-of-the-art multi-processor systems-on-chip use on-chip networks as their communication fabric. Although most on-chip networks are implemented synchronously, asynchronous on-chip networks have several advantages over their synchronous counterparts. Timing division multiplexing (TDM) flow control methods have been utilized in asynchronous on-chip networks extensively. The synchronization required by TDM leads to significant speed penalties. Compared with using TDM methods, spatial parallelism methods, such as the spatial division multiplexing (SDM) flow control method, achieve better network throughput with less area overhead.This thesis proposes several techniques to increase spatial parallelism in the routers of asynchronous on-chip networks.Channel slicing is a new pipeline structure that alleviates the speed penalty by removing the synchronization among bit-level data pipelines. It is also found out that the lookahead pipeline using early evaluated acknowledgement can be used in routers to further improve speed.SDM is a new flow control method proposed for asynchronous on-chip networks. It improves network throughput without introducing synchronization among buffers of different frames, which is required by TDM methods. It is also found that the area overhead of SDM is smaller than the virtual channel (VC) flow control method -- the most used TDM method. The major design problem of SDM is the area consuming crossbars. A novel 2-stage Clos switch structure is proposed to replace the crossbar in SDM routers, which significantly reduces the area overhead. This Clos switch is dynamically reconfigured by a new asynchronous Clos scheduler.Several asynchronous SDM routers are implemented using these new techniques. An asynchronous VC router is also reproduced for comparison. Performance analyses show that the SDM routers outperform the VC router in throughput, area overhead and energy efficiency.EThOS - Electronic Theses Online ServiceGBUnited Kingdo

    Design of variation-tolerant synchronizers for multiple clock and voltage domains

    Get PDF
    PhD ThesisParametric variability increasingly affects the performance of electronic circuits as the fabrication technology has reached the level of 32nm and beyond. These parameters may include transistor Process parameters (such as threshold voltage), supply Voltage and Temperature (PVT), all of which could have a significant impact on the speed and power consumption of the circuit, particularly if the variations exceed the design margins. As systems are designed with more asynchronous protocols, there is a need for highly robust synchronizers and arbiters. These components are often used as interfaces between communication links of different timing domains as well as sampling devices for asynchronous inputs coming from external components. These applications have created a need for new robust designs of synchronizers and arbiters that can tolerate process, voltage and temperature variations. The aim of this study was to investigate how synchronizers and arbiters should be designed to tolerate parametric variations. All investigations focused mainly on circuit-level and transistor level designs and were modeled and simulated in the UMC90nm CMOS technology process. Analog simulations were used to measure timing parameters and power consumption along with a “Monte Carlo” statistical analysis to account for process variations. Two main components of synchronizers and arbiters were primarily investigated: flip-flop and mutual-exclusion element (MUTEX). Both components can violate the input timing conditions, setup and hold window times, which could cause metastability inside their bistable elements and possibly end in failures. The mean-time between failures is an important reliability feature of any synchronizer delay through the synchronizer. The MUTEX study focused on the classical circuit, in addition to a number of tolerance, based on increasing internal gain by adding current sources, reducing the capacitive loading, boosting the transconductance of the latch, compensating the existing Miller capacitance, and adding asymmetry to maneuver the metastable point. The results showed that some circuits had little or almost no improvements, while five techniques showed significant improvements by reducing τ and maintaining high tolerance. Three design approaches are proposed to provide variation-tolerant synchronizers. wagging synchronizer proposed to First, the is significantly increase reliability over that of the conventional two flip-flop synchronizer. The robustness of the wagging technique can be enhanced by using robust τ latches or adding one more cycle of synchronization. The second approach is the Metastability Auto-Detection and Correction (MADAC) latch which relies on swiftly detecting a metastable event and correcting it by enforcing the previously stored logic value. This technique significantly reduces the resolution time down from uncertain synchronization technique is proposed to transfer signals between Multiple- Voltage Multiple-Clock Domains (MVD/MCD) that do not require conventional level-shifters between the domains or multiple power supplies within each domain. This interface circuit uses a synchronous set and feedback reset protocol which provides level-shifting and synchronization of all signals between the domains, from a wide range of voltage-supplies and clock frequencies. Overall, synchronizer circuits can tolerate variations to a greater extent by employing the wagging technique or using a MADAC latch, while MUTEX tolerance can suffice with small circuit modifications. Communication between MVD/MCD can be achieved by an asynchronous handshake without a need for adding level-shifters.The Saudi Arabian Embassy in London, Umm Al-Qura University, Saudi Arabi
    corecore