11 research outputs found

    Developing Globally-Asynchronous Locally- Synchronous Systems through the IOPT-Flow Framework

    Get PDF
    Throughout the years, synchronous circuits have increased in size and com-plexity, consequently, distributing a global clock signal has become a laborious task. Globally-Asynchronous Locally-Synchronous (GALS) systems emerge as a possible solution; however, these new systems require new tools. The DS-Pnet language formalism and the IOPT-Flow framework aim to support and accelerate the development of cyber-physical systems. To do so it offers a tool chain that comprises a graphical editor, a simulator and code gener-ation tools capable of generating C, JavaScript and VHDL code. However, DS-Pnets and IOPT-Flow are not yet tuned to handle GALS systems, allowing for partial specification, but not a complete one. This dissertation proposes extensions to the DS-Pnet language and the IOPT-Flow framework in order to allow development of GALS systems. Addi-tionally, some asynchronous components were created, these form interfaces that allow synchronous blocks within a GALS system to communicate with each other

    Design of robust asynchronous reconfigurable controllers for parallel synchronization using embedded graphs

    Get PDF
    PhD Thesis: This is a revised version received 24/5/16. The definitive version is the print copy in the Research Reserve Collection of the University LibrarySynchronization is a key System-on-Chip (SoC) design issue in modern technologies. As the number of operating points under consideration increases, specifications which are capable of altering key parameters such as the time available for synchronization and Mean Time Between Failures (MTBF) in response to input from the user/system become desirable. This thesis explores how a combination of parallelism and scheduling, referred to as wagging, can be utilized to construct schedulers for synchronizer designs which are capable of pooling the gain-bandwidth products of their composite devices, in order to satisfy this requirement. In this work, we explore the ways in which the areas of graph theory and reconfigurable hardware design can be applied to generate both combinational and sequential scheduler designs, which satisfy the behavior requirement above. Further to this point, this work illustrates that such a scheduler is primarily comprised of an interrupt subsystem, and a reconfigurable token ring. This thesis explores how both of these components can be controlled in absence of a clock signal, as well as the design challenges inherent to each part. The final noteworthy issue in this study is with regard to the flow control of data in a parallel synchronizer that incorporates a First-In First-Out (FIFO) buffer to decouple the reading and writing operations from each other. Such a structure incurs penalties if the data rates on both sides are not well matched. This work presents a method by which combinations of serial and parallel reading operations are used to minimize this mismatch

    Hazard-free clock synchronization

    Get PDF
    The growing complexity of microprocessors makes it infeasible to distribute a single clock source over the whole processor with a small clock skew. Hence, chips are split into multiple clock regions, each covered by a single clock source. This poses a problem for communication between these clock regions. Clock synchronization algorithms promise an advantage over state-of-the-art solutions, such as GALS systems. When clock regions are synchronous the communication latency improves significantly over handshake-based solutions. We focus on the implementation of clock synchronization algorithms. A major obstacle when implementing circuits on clock domain crossings are hazardous signals. We can formally define hazards by extending the Boolean logic by a third value u. In this thesis, we describe a theory for designing and analyzing hazard-free circuits. We develop strategies for hazard-free encoding and construction of hazard-free circuits from finite state machines. Furthermore, we discuss clock synchronization algorithms and a possible combination of them. In the end, we present two implementations of the GCS algorithm by Lenzen, Locher, and Wattenhofer (JACM 2010). We prove by rigorous analysis that the systems implement the algorithm. The theory described above is used to prove that our clock synchronization circuits are hazard-free (in the sense that they compute the most precise output possible). Simulation of our GCS system shows that it achieves a skew between neighboring clock regions that is smaller than a few inverter delays.Aufgrund der zunehmenden Komplexität von Mikroprozessoren ist es unmöglich, mit einer einzigen Taktquelle den gesamten Prozessor ohne großen Versatz zu takten. Daher werden Chips in mehrere Regionen aufgeteilt, die jeweils von einer einzelnen Taktquelle abgedeckt werden. Dies stellt ein Problem für die Kommunikation zwischen diesen Taktregionen dar. Algorithmen zur Taktsynchronisation bieten einen Vorteil gegenüber aktuellen Lösungen, wie z.B. GALS-Systemen. Synchronisiert man die Taktregionen, so verbessert sich die Latenz der Kommunikation erheblich. In Schaltkreisen zwischen zwei Taktregionen können undefinierte Signale, sogenannte Hazards auftreten. Indem wir die boolesche Algebra um einen dritten Wert u erweitern, können wir diese Hazards formal definieren. In dieser Arbeit zeigen wir eine Methode zum Entwurf und zur Analyse von hazard-freien Schaltungen. Wir entwickeln Strategien für Kodierungen die Hazards vermeiden und zur Konstruktion von hazard-freien Schaltungen. Darüber hinaus stellen wir Algorithmen Taktsynchronisation vor und wie diese kombiniert werden können. Zum Schluss stellen wir zwei Implementierungen des GCS-Algorithmus von Lenzen, Locher und Wattenhofer (JACM 2010) vor. Oben genannte Mechanismen werden verwendet, um formal zu beweisen, dass diese Implementierungen korrekt sind. Die Implementierung hat keine Hazards, das heißt sie berechnet die bestmo ̈gliche Ausgabe. Anschließende Simulation der GCS Implementierung erzielt einen Versatz zwischen benachbarten Taktregionen, der kleiner als ein paar Gatter-Laufzeiten ist

    Spatial parallelism in the routers of asynchronous on-chip networks

    Get PDF
    State-of-the-art multi-processor systems-on-chip use on-chip networks as their communication fabric. Although most on-chip networks are implemented synchronously, asynchronous on-chip networks have several advantages over their synchronous counterparts. Timing division multiplexing (TDM) flow control methods have been utilized in asynchronous on-chip networks extensively. The synchronization required by TDM leads to significant speed penalties. Compared with using TDM methods, spatial parallelism methods, such as the spatial division multiplexing (SDM) flow control method, achieve better network throughput with less area overhead.This thesis proposes several techniques to increase spatial parallelism in the routers of asynchronous on-chip networks.Channel slicing is a new pipeline structure that alleviates the speed penalty by removing the synchronization among bit-level data pipelines. It is also found out that the lookahead pipeline using early evaluated acknowledgement can be used in routers to further improve speed.SDM is a new flow control method proposed for asynchronous on-chip networks. It improves network throughput without introducing synchronization among buffers of different frames, which is required by TDM methods. It is also found that the area overhead of SDM is smaller than the virtual channel (VC) flow control method -- the most used TDM method. The major design problem of SDM is the area consuming crossbars. A novel 2-stage Clos switch structure is proposed to replace the crossbar in SDM routers, which significantly reduces the area overhead. This Clos switch is dynamically reconfigured by a new asynchronous Clos scheduler.Several asynchronous SDM routers are implemented using these new techniques. An asynchronous VC router is also reproduced for comparison. Performance analyses show that the SDM routers outperform the VC router in throughput, area overhead and energy efficiency.EThOS - Electronic Theses Online ServiceGBUnited Kingdo

    Deterministic execution of synchronous programs in an asynchronous environment. A compositional necessary and sufficient condition

    Get PDF
    Synchronous reactive formalisms form an appealing programming model for embedded system and Systems-on-Chip (SoC) design. Deploying synchronous programs onto asynchronous distributed execution platforms is an important issue, and has been the topic of substantial research in the past. The point is that signal/event absence in a reaction cannot be taken as granted because of communication latencies. A simple solution consists in systematically sending signal absence notifications, but it is unduly expensive at run-time. Sufficient properties have been proposed defining subsets of synchronous programs where asynchronous evaluation is faithful to their original specification. In essence they aim at preserving stream computation {\em monotonicity}, in the original formulation of Kahn Network principles, or {\em confluence}, as coined by R. Milner. Some of these criteria may become quite involved. In the current paper we show a precise technical result: If equivalence between the synchronous and the asynchronous semantics is congruence with respect to parallel constructors, then the "good" criterion amounts to a single step "diamond closure" property, with independent behaviors converging to the union of their effects. It should be remembered here that the {\em local} individual behaviors of components may themselves contain simultaneous events, thereby allowing complex synchronous modeling on this lower layer

    Design and Validation of Network-on-Chip Architectures for the Next Generation of Multi-synchronous, Reliable, and Reconfigurable Embedded Systems

    Get PDF
    NETWORK-ON-CHIP (NoC) design is today at a crossroad. On one hand, the design principles to efficiently implement interconnection networks in the resource-constrained on-chip setting have stabilized. On the other hand, the requirements on embedded system design are far from stabilizing. Embedded systems are composed by assembling together heterogeneous components featuring differentiated operating speeds and ad-hoc counter measures must be adopted to bridge frequency domains. Moreover, an unmistakable trend toward enhanced reconfigurability is clearly underway due to the increasing complexity of applications. At the same time, the technology effect is manyfold since it provides unprecedented levels of system integration but it also brings new severe constraints to the forefront: power budget restrictions, overheating concerns, circuit delay and power variability, permanent fault, increased probability of transient faults. Supporting different degrees of reconfigurability and flexibility in the parallel hardware platform cannot be however achieved with the incremental evolution of current design techniques, but requires a disruptive approach and a major increase in complexity. In addition, new reliability challenges cannot be solved by using traditional fault tolerance techniques alone but the reliability approach must be also part of the overall reconfiguration methodology. In this thesis we take on the challenge of engineering a NoC architectures for the next generation systems and we provide design methods able to overcome the conventional way of implementing multi-synchronous, reliable and reconfigurable NoC. Our analysis is not only limited to research novel approaches to the specific challenges of the NoC architecture but we also co-design the solutions in a single integrated framework. Interdependencies between different NoC features are detected ahead of time and we finally avoid the engineering of highly optimized solutions to specific problems that however coexist inefficiently together in the final NoC architecture. To conclude, a silicon implementation by means of a testchip tape-out and a prototype on a FPGA board validate the feasibility and effectivenes
    corecore