
    Dynamic Variable Stage Pipeline: an Implementation of its Control

    RR-6918. Energy-efficient computing is a major concern throughout the EDA industry, mainly because of cost, reliability and feasibility: temperature and power are the main performance limiters. Dynamic Variable Stage Pipelines improve the throughput of processors and data-paths while reducing their energy consumption. When the clock frequency is lowered, more computation can be performed combinationally. In that case, signals can travel beyond the fixed pipeline boundaries that are needed to ensure correct behaviour at a high clock frequency. Dynamic Variable Stage Pipelines save dynamic power by stalling the clock on bypassed pipeline buffers. This paper addresses the control of such Dynamic Variable Stage Pipelines. The control must ensure correct mode switches between Long Pipeline High Frequency and Short Pipeline Low Frequency. We provide one implementation with distributed control and another with centralized control, allowing very long pipelines, where physical latencies can cause feasibility issues, to be handled.

    Another glance at Relay Stations in Latency-Insensitive Designs

    We revisit the formal modeling of relay stations, which are specific connection elements used in the theory of Latency-Insensitive Design of Globally-Asynchronous/Locally-Synchronous systems. Relay stations account for the mandatory physical latencies while regulating signal/data traffic so as to avoid starvation, deadlock and congestion of the local synchronous IP computation blocks. Since they were proposed by Carloni et al., the structure and behavior of these relay stations have been amply characterized and analysed. However, previous works never provided a fully formal and cycle-accurate description of these mechanisms that would be amenable, for instance, to formal verification (mainly simulation models were developed instead). Given the precision required by the whole scheme, we feel that such a formal description is needed, and we describe an attempt at one here. Along the way, this work also led us to a number of (hopefully insightful) remarks on favorable and unfavorable graph topologies and initialization features, which are also reported here.
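
To make the regulation mechanism concrete, here is a minimal cycle-level sketch of a relay station with one main and one auxiliary register and a registered stop signal. It is an illustration under simplifying assumptions, not the formal model of the paper: the names `RelayStation`, `step`, `stop_out` and `simulate` are hypothetical, tokens are plain integers, and `None` stands for a void (invalid) datum.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RelayStation:
    main: Optional[int] = None  # register driving the output
    aux: Optional[int] = None   # spare register filled while downstream stalls

    def stop_out(self) -> bool:
        # Back-pressure seen by the upstream block, derived from the state
        # at the start of the cycle: stall when both registers are occupied.
        return self.aux is not None

    def step(self, data_in: Optional[int], stop_in: bool) -> Optional[int]:
        """Advance one clock cycle and return the token presented downstream."""
        out = self.main
        # Upstream holds its token while our stop signal is raised.
        accepted = data_in if not self.stop_out() else None
        if out is not None and stop_in:
            # Downstream did not consume: keep main, park any newcomer in aux.
            if accepted is not None:
                self.aux = accepted
            return out
        # Downstream consumed the output (or there was nothing to consume).
        if self.aux is not None:
            self.main, self.aux = self.aux, None
        else:
            self.main = accepted
        return out


def simulate(stall_pattern: List[bool], n_tokens: int) -> Tuple[List[int], int]:
    """Push tokens 0..n_tokens-1 through one relay station.

    stall_pattern is repeated cyclically and must contain at least one False.
    Returns the tokens received downstream and the number of cycles used.
    """
    rs = RelayStation()
    next_token, received, cycle = 0, [], 0
    while len(received) < n_tokens:
        stop_in = stall_pattern[cycle % len(stall_pattern)]
        upstream_stalled = rs.stop_out()
        data_in = next_token if next_token < n_tokens else None
        out = rs.step(data_in, stop_in)
        if data_in is not None and not upstream_stalled:
            next_token += 1          # the producer's token was accepted
        if out is not None and not stop_in:
            received.append(out)     # the consumer took the token
        cycle += 1
    return received, cycle
```

Running `simulate([False, True, True], 8)` exercises repeated stalls; the point of the sketch is that tokens are neither lost, duplicated nor reordered, whatever the stall pattern.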

    Throughput and FIFO Sizing: an Application to Latency-Insensitive Design

    RR-6919. On-chip communications are a key concern for high-end designs. Since latency issues cannot be avoided in deep-submicron technologies, design methodologies need to cope with them. In such a case, precise FIFO sizing is of high interest for finding the right trade-off between area, power and throughput. This paper provides means to size FIFOs optimally while reaching the maximum achievable throughput. We apply our algorithms to Latency-Insensitive Designs. Such algorithms can also be used to size FIFOs in other application fields, for instance Networks-on-Chip. We also revisit the equalization process, which introduces as many latencies as possible into the system while preserving global system throughput. This algorithm points out where more pipeline stages can be introduced while ensuring the maximum throughput of the system. It makes it possible, for instance, to postpone the execution of IP(s) to limit the dynamic power peak. We provide a modified algorithm that globally minimizes the number of such introduced latencies.
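
As background for the notion of maximum achievable throughput, the sketch below computes the classical critical-cycle bound of a marked graph: the minimum, over all cycles, of the number of tokens divided by the total latency, capped at 1. This is a standard textbook bound and not the sizing algorithm of the report. The edge format `(src, dst, tokens, latency)`, the function names and the example graph are assumptions made for illustration, and back-pressure edges are ignored.

```python
from fractions import Fraction


def simple_cycles(edges):
    """Enumerate the simple cycles of a small directed graph.

    edges: list of (src, dst, tokens, latency). Each cycle is returned once,
    as a list of edge indices, rooted at its smallest node.
    """
    outgoing = {}
    for i, (u, _v, _t, _l) in enumerate(edges):
        outgoing.setdefault(u, []).append(i)
    nodes = sorted({e[0] for e in edges} | {e[1] for e in edges})
    cycles = []

    def dfs(start, node, path, visited):
        for i in outgoing.get(node, []):
            nxt = edges[i][1]
            if nxt == start:
                cycles.append(path + [i])
            elif nxt not in visited and nxt > start:
                dfs(start, nxt, path + [i], visited | {nxt})

    for s in nodes:
        dfs(s, s, [], {s})
    return cycles


def throughput(edges):
    """Critical-cycle bound: min over cycles of tokens / latency, at most 1."""
    ratios = [
        Fraction(sum(edges[i][2] for i in c), sum(edges[i][3] for i in c))
        for c in simple_cycles(edges)
    ]
    return min(ratios + [Fraction(1)])


# Hypothetical example: the long wire from C back to A needs 3 register stages.
EXAMPLE = [
    ("A", "B", 1, 1),
    ("B", "C", 1, 1),
    ("C", "A", 1, 3),
    ("B", "A", 1, 1),
]
```

In this example the cycle A, B, C carries 3 tokens over a latency of 5, so it is critical, while the short cycle A, B has slack. Slack of this kind is what an equalization pass can fill with extra latencies without lowering the overall throughput.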

    LID: Retry Relay Station and Fusion Shell

    This paper is electronically published in Electronic Notes in Theoretical Computer Science, http://dx.doi.org/10.1016/j.entcs.2009.07.026. It introduces a new variant implementation of Latency-Insensitive Design elements. It optimizes the area footprint of so-called Shell-Wrappers by partially fusing them with their input Relay-Stations. The modified Relay-Station is called a Retry Relay-Station. We show the correctness of this implementation and provide comparative results between a regular implementation and our new one on both FPGA and ASIC.

    Formal Methods for Schedulings of Latency-Insensitive Designs

    Latency-Insensitive Design (LID) theory was invented to deal with SoC timing closure issues by allowing arbitrary fixed integer latencies on long global wires. Latencies are handled by a resynchronization protocol that performs dynamic scheduling of data transportation, and functional behaviour is preserved. This dynamic scheduling is implemented using specific synchronous hardware elements: Relay-Stations (RS) and Shell-Wrappers (SW). Our first goal is to provide a formal model of RS and SW that can then be formally verified. As it turns out, the resulting behaviour is k-periodic, and thus amenable to static scheduling. Our second goal is to provide a formal hardware model for this static scheduling as well. It initially performs Throughput Equalization, adding integer latencies wherever possible; residual cases require the introduction of Fractional Registers (FRs) at specific locations. Benchmark results are presented, run on our KPassa tool implementation.
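
The k-periodic behaviour mentioned above can be observed on a toy example. The following sketch runs a marked graph under a synchronous ASAP firing rule and stops when a marking repeats, which gives the length of the transient and the period. It is only an illustration and has no connection to the KPassa tool; the function `asap_run`, the place format `(src, dst, initial_tokens)` and the 5-node ring `RING` are hypothetical.

```python
def asap_run(n_nodes, places, max_steps=1000):
    """Simulate a marked graph with nodes 0..n_nodes-1 under ASAP firing.

    places: list of (src, dst, initial_tokens).
    At each step every node whose input places all hold a token fires,
    consuming one token per input place and producing one per output place.
    Returns (transient_length, period, firings), where firings[t][n] tells
    whether node n fired at step t.
    """
    marking = tuple(tokens for _s, _d, tokens in places)
    seen = {marking: 0}
    firings = []
    for step in range(1, max_steps + 1):
        enabled = [
            all(marking[i] > 0 for i, (_s, d, _t) in enumerate(places) if d == n)
            for n in range(n_nodes)
        ]
        new_marking = list(marking)
        for i, (s, d, _t) in enumerate(places):
            new_marking[i] += enabled[s] - enabled[d]
        marking = tuple(new_marking)
        firings.append(enabled)
        if marking in seen:
            return seen[marking], step - seen[marking], firings
        seen[marking] = step
    raise RuntimeError("no repeated marking within max_steps")


# Hypothetical ring of 5 nodes holding 3 tokens in total.
RING = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 4, 0), (4, 0, 0)]
```

On `RING` the marking returns to its initial value after 5 steps, and each node fires 3 times within that window, so its activity can be written as a binary word of length 5 with three 1s. A word of this kind is the static schedule that can replace the dynamic protocol.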

    Statically scheduled Process Networks

    Event/Marked Graphs (EG) form a strict subset of Petri Nets. They are fundamental models in Scheduling Theory, mostly because of their absence of alternative behaviors (or conflict-freeness). It was established in the past that, under broad structural conditions, the behavior of Timed Event Graphs (TEG) becomes utterly regular (technically speaking: “ultimately k-periodic”). More recently it has been proposed to use this kind of regular schedule as syntactic types for so-called N-synchronous processes. These types remained essentially user-provided. Elsewhere there have been proposals for adding control in a “light fashion” to TEGs, not as general Petri Nets, but with the addition of Merge/Select nodes switching the data flows. This was much in the spirit of Kahn process networks [8, 9]. But usually the streams of test values governing the switches are left unspecified, which may introduce congestion or starvation in the system, as token flow preservation becomes an issue. In the present paper we suggest restricting the Merge/Select condition streams to (binary) k-periodic patterns as well, and studying their relations with the schedules constructed as before for TEGs, but on the extended model. We call this model Kahn-extended Event Graphs (KEG). The main result is that flow preservation is now checkable (by abstraction into another model of Weighted Marked Graphs, called SDF in the literature). There are many potential applications of KEG models, for instance in modern Systems-on-Chip (SoC) comprising on-chip networks. Communication links can then be shared, and the model can represent the (regular) activity schedules of the computing as well as the communicating components, after a strict schedule has been found. They can also be used as a support to help find the solution.

    Kahn-extended Event Graphs

    Process Networks have long been used as formal Models of Computation in the design of dedicated hardware and software embedded systems and Systems-on-Chip. Choice-less models such as Marked/Event Graphs and their Synchronous Data Flow extensions have been considered to support periodic scheduling analysis. Unlike regular sequential languages, those models do not hide dependency information: they capture the communication topology through point-to-point channels. Those models are concurrent, formally defined and have clear semantics, but they are limited by their static point-to-point channels. Further extensions such as Cyclo-Static Data Flow or Boolean-controlled Dataflow (BDF) graphs then introduced routing switches, allowing internal choices while preserving conflict-freeness, in the tradition of Kahn Process Networks. We introduce a new model, which we term Kahn-extended Event Graphs (KEG). It can be seen as a specialization of both Cyclo-Static and BDF processes. It consists merely of the addition of Merge/Select routing nodes to former Marked/Event Graphs; but, most importantly, these new nodes are governed by explicit (ultimately periodic) binary-word switching patterns for routing directions. We introduce identities on Merge/Select expressions, and show how they form a full axiomatization of flow-equivalence between the computation nodes. The transformations carry a strong intuitive meaning, as they correspond to sharing/unsharing the interconnect links. Such an interconnect defines each time a precise Network-on-Chip topology, and the switching patterns drive the traffic. One can also compute the buffering space actually required at the various FIFO locations. The example of a Sobel edge filter is discussed to illustrate the importance of this model.
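
The role of the switching patterns can be illustrated with a small sketch of Select and Merge nodes driven by an ultimately periodic binary word, written as a finite prefix followed by a repeated period. The helper names `pattern`, `select` and `merge` are hypothetical, streams are finite Python lists, and the round-trip property shown is only the simplest sanity check; the identities and axiomatization of the paper are not reproduced here.

```python
from itertools import chain, cycle


def pattern(prefix, period):
    """Infinite iterator over the binary word prefix.(period)^omega."""
    return chain(prefix, cycle(period))


def select(word, stream):
    """Route each token to the '1' output or the '0' output according to word."""
    on_one, on_zero = [], []
    for bit, token in zip(word, stream):
        (on_one if bit else on_zero).append(token)
    return on_one, on_zero


def merge(word, on_one, on_zero):
    """Interleave two streams: at each position word says which input to read."""
    ones, zeros = iter(on_one), iter(on_zero)
    merged = []
    for bit in word:
        source = ones if bit else zeros
        try:
            merged.append(next(source))
        except StopIteration:
            break  # the input requested by the pattern has run dry
    return merged


# Hypothetical pattern 1.(011)^omega applied to ten tokens.
tokens = list(range(10))
on_one, on_zero = select(pattern([1], [0, 1, 1]), tokens)
round_trip = merge(pattern([1], [0, 1, 1]), on_one, on_zero)
```

Because the same word drives both nodes, `round_trip` equals `tokens`: the Select/Merge pair behaves like a plain channel, which is the intuition behind sharing and unsharing a link. With mismatched words the merge would stall on an empty input, which is the starvation issue that explicit patterns make checkable.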

    Modélisation formelle de systèmes Insensibles à la Latence et ordonnancement.

    This PhD thesis introduces new results linking the theory of Latency-Insensitive systems to a well-known subclass of Petri Nets called Marked Event Graphs and to its extension called Synchronous Data Flow. This work is tightly linked to a well-known general scheduling problem, the Central Repetitive Problem (workshop scheduling...). We introduce the Synchronous Models, Marked Event Graphs, Synchronous Data Flow (SDF) and Latency Insensitive. We then discuss existing links between these models and show that the Latency Insensitive model is a special case of the Marked Event Graph model. Next, we present a formally verified implementation of Latency Insensitive. We then recall a well-known result: any Marked Event Graph with at least one strongly connected component, evaluated with the As Soon As Possible (ASAP) firing rule, has an ultimately repetitive behaviour; that is to say, a static schedule exists. Starting from this result, we build a specific scheduling scheme called Equalization, which virtually alters the communication topology in order to slow down paths that are too fast by adding "registers", while preserving the throughput of the original system. Finally, we introduce some limited control into the Latency Insensitive model, with nodes called select and merge whose conditions are known and independent of the data flows; more precisely, the routing of data is directed by ultimately periodic binary words (just as in static scheduling). We then build an abstraction over the SDF model to determine whether the model instance accepts a schedule in which the size of each place is bounded. We can then verify the liveness of the system through simulation, provided the original system had at least one strongly connected component. Finally, we conclude and discuss possibilities for future work.

    Modélisation formelle de systèmes insensibles à la latence et ordonnancement

    This PhD thesis links the theory of Latency-Insensitive Design (LID) of Systems-on-Chip with Marked Event Graphs (MEG) and their extension, Synchronous Data Flow (SDF). This work allows the use of a general regular scheduling known as the central repetitive problem (workshop scheduling). We discuss existing links between the MEG, SDF and LID models. We then present a formally verified implementation of LID using synchronous reactive components. Next, we use a classical result on the ultimately repetitive behaviour and static scheduling of MEGs to define Equalization techniques, which adjust the communication topology of the system in order to slow down paths that are too fast by adding latencies, while preserving the throughput of the original system. Finally, we introduce some limited control into LID, with routing nodes called Select and Merge whose routing conditions are known and independent of the data flows. These routing conditions are also directed by ultimately periodic binary words (just as in the preceding static scheduling). Through an abstraction over the SDF model, we show the decidability of safety and liveness properties on the extended model. Lastly, we produce a complete axiomatization for the transitivity/commutativity of the Select/Merge operators.
    NICE-BU Sciences (060882101) / Sudoc, France

    FMGALS 2007: Compositionality of Statically Scheduled IP

    Timing Closure in the presence of long global wire interconnects is one of the main current issues in System-on-Chip design. One proposed solution to the Timing Closure problem is Latency-Insensitive Design (LID) [5,7]. It was noticed in [7] that, in many cases, the dynamically scheduled synchronisations introduced by latency-insensitive protocols could be computed off-line as a static periodic schedule. We showed in [2,3] how this schedule could then be used to further optimize the protocol resources when they are found to be redundant. The purpose of the present paper is to study how the larger blocks, obtained as synchronous components interconnected by LID protocols optimized using static schedule information, can in turn be made to operate with an environment that also provides I/O connections at its own (synchronous or GALS) rate. We also consider the case of multirate SoC, using results from SDF (Synchronous DataFlow) theory [12].
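
For the multirate case, the basic SDF result is the set of balance equations: for every edge on which the producer writes `prod` tokens per firing and the consumer reads `cons` tokens per firing, the firing counts must satisfy q[src] * prod = q[dst] * cons. The sketch below solves them for a connected graph and returns the smallest integer repetition vector, or `None` when the rates are inconsistent. It illustrates the standard SDF consistency check only and is not the construction of the paper; the function name and the edge format `(src, dst, prod, cons)` are assumptions.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def repetition_vector(actors, edges):
    """Solve the SDF balance equations q[src] * prod == q[dst] * cons.

    actors: list of actor names; edges: list of (src, dst, prod, cons).
    Returns the smallest positive integer solution as a dict, or None if the
    rates are inconsistent. Raises ValueError if the graph is not connected.
    """
    q = {actors[0]: Fraction(1)}
    changed = True
    while changed:  # propagate rates along edges until nothing new is learned
        changed = False
        for src, dst, prod, cons in edges:
            if src in q and dst not in q:
                q[dst] = q[src] * prod / cons
                changed = True
            elif dst in q and src not in q:
                q[src] = q[dst] * cons / prod
                changed = True
    if len(q) != len(actors):
        raise ValueError("graph is not connected")
    for src, dst, prod, cons in edges:
        if q[src] * prod != q[dst] * cons:
            return None  # no periodic schedule with bounded buffers exists
    # Scale the rational solution to the smallest integer one.
    scale = reduce(lambda a, b: a * b // gcd(a, b),
                   (f.denominator for f in q.values()), 1)
    ints = {a: int(f * scale) for a, f in q.items()}
    g = reduce(gcd, ints.values())
    return {a: v // g for a, v in ints.items()}


# Hypothetical three-actor chain with different production/consumption rates.
CHAIN = [("A", "B", 2, 3), ("B", "C", 1, 2)]
```

For `CHAIN` the solution is 3 firings of A, 2 of B and 1 of C per period. Adding an edge from A to C with rates 1 and 1 makes the system inconsistent, so the function returns `None` for it.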