504 research outputs found

    Analysis and optimization of VLSI Clock Distribution Networks for skew variability reduction

    Get PDF
    As VLSI technology moves into the Ultra-Deep Sub-Micron (UDSM) era, manufacturing variations, power supply noise and temperature variations greatly affect the performance and yield of VLSI circuits. Clock Distribution Network (CDN), which is one of the biggest and most important nets in any synchronous VLSI chip, is especially sensitive to these variations. To address this problem variability-aware analysis and optimization techniques for VLSI circuits are needed. In the first part of this thesis an analytical bound for the unwanted skew due to interconnect variation is established. Experimental results show that this bound is safer, tighter and computationally faster than existing approaches. This bound could be used in variation-aware clock tree synthesis.The second part of the thesis deals with optimizing a given clock tree to minimize the unwanted skew variations. Non-tree CDNs have been recognized as a promising approach to overcome the variation problem. We propose a novel non-tree CDN obtained by adding cross links in an existing clock tree. We analyze the effect of the link insertion on clock skew variability and propose link insertion schemes. The non-tree CDNs so obtained are shown to be highly tolerant to skew variability with very little increase in total wire-length. This can be used in applications such as ASIC design where a significant increase in the total wire-length is unacceptable

    CROSS-LAYER DESIGN, OPTIMIZATION AND PROTOTYPING OF NoCs FOR THE NEXT GENERATION OF HOMOGENEOUS MANY-CORE SYSTEMS

    Get PDF
    This thesis provides a whole set of design methods to enable and manage the runtime heterogeneity of features-rich industry-ready Tile-Based Networkon- Chips at different abstraction layers (Architecture Design, Network Assembling, Testing of NoC, Runtime Operation). The key idea is to maintain the functionalities of the original layers, and to improve the performance of architectures by allowing, joint optimization and layer coordinations. In general purpose systems, we address the microarchitectural challenges by codesigning and co-optimizing feature-rich architectures. In application-specific NoCs, we emphasize the event notification, so that the platform is continuously under control. At the network assembly level, this thesis proposes a Hold Time Robustness technique, to tackle the hold time issue in synchronous NoCs. At the network architectural level, the choice of a suitable synchronization paradigm requires a boost of synthesis flow as well as the coexistence with the DVFS. On one hand this implies the coexistence of mesochronous synchronizers in the network with dual-clock FIFOs at network boundaries. On the other hand, dual-clock FIFOs may be placed across inter-switch links hence removing the need for mesochronous synchronizers. This thesis will study the implications of the above approaches both on the design flow and on the performance and power quality metrics of the network. Once the manycore system is composed together, the issue of testing it arises. This thesis takes on this challenge and engineers various testing infrastructures. At the upper abstraction layer, the thesis addresses the issue of managing the fully operational system and proposes a congestion management technique named HACS. Moreover, some of the ideas of this thesis will undergo an FPGA prototyping. Finally, we provide some features for emerging technology by characterizing the power consumption of Optical NoC Interfaces

    Addressing Manufacturing Challenges in NoC-based ULSI Designs

    Full text link
    Hernández Luz, C. (2012). Addressing Manufacturing Challenges in NoC-based ULSI Designs [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/1669

    Driving the Network-on-Chip Revolution to Remove the Interconnect Bottleneck in Nanoscale Multi-Processor Systems-on-Chip

    Get PDF
    The sustained demand for faster, more powerful chips has been met by the availability of chip manufacturing processes allowing for the integration of increasing numbers of computation units onto a single die. The resulting outcome, especially in the embedded domain, has often been called SYSTEM-ON-CHIP (SoC) or MULTI-PROCESSOR SYSTEM-ON-CHIP (MP-SoC). MPSoC design brings to the foreground a large number of challenges, one of the most prominent of which is the design of the chip interconnection. With a number of on-chip blocks presently ranging in the tens, and quickly approaching the hundreds, the novel issue of how to best provide on-chip communication resources is clearly felt. NETWORKS-ON-CHIPS (NoCs) are the most comprehensive and scalable answer to this design concern. By bringing large-scale networking concepts to the on-chip domain, they guarantee a structured answer to present and future communication requirements. The point-to-point connection and packet switching paradigms they involve are also of great help in minimizing wiring overhead and physical routing issues. However, as with any technology of recent inception, NoC design is still an evolving discipline. Several main areas of interest require deep investigation for NoCs to become viable solutions: • The design of the NoC architecture needs to strike the best tradeoff among performance, features and the tight area and power constraints of the onchip domain. • Simulation and verification infrastructure must be put in place to explore, validate and optimize the NoC performance. • NoCs offer a huge design space, thanks to their extreme customizability in terms of topology and architectural parameters. Design tools are needed to prune this space and pick the best solutions. • Even more so given their global, distributed nature, it is essential to evaluate the physical implementation of NoCs to evaluate their suitability for next-generation designs and their area and power costs. This dissertation performs a design space exploration of network-on-chip architectures, in order to point-out the trade-offs associated with the design of each individual network building blocks and with the design of network topology overall. The design space exploration is preceded by a comparative analysis of state-of-the-art interconnect fabrics with themselves and with early networkon- chip prototypes. The ultimate objective is to point out the key advantages that NoC realizations provide with respect to state-of-the-art communication infrastructures and to point out the challenges that lie ahead in order to make this new interconnect technology come true. Among these latter, technologyrelated challenges are emerging that call for dedicated design techniques at all levels of the design hierarchy. In particular, leakage power dissipation, containment of process variations and of their effects. The achievement of the above objectives was enabled by means of a NoC simulation environment for cycleaccurate modelling and simulation and by means of a back-end facility for the study of NoC physical implementation effects. Overall, all the results provided by this work have been validated on actual silicon layout

    Sincronização em sistemas integrados a alta velocidade

    Get PDF
    Doutoramento em Engenharia ElectrotécnicaA distribui ção de um sinal relógio, com elevada precisão espacial (baixo skew) e temporal (baixo jitter ), em sistemas sí ncronos de alta velocidade tem-se revelado uma tarefa cada vez mais demorada e complexa devido ao escalonamento da tecnologia. Com a diminuição das dimensões dos dispositivos e a integração crescente de mais funcionalidades nos Circuitos Integrados (CIs), a precisão associada as transições do sinal de relógio tem sido cada vez mais afectada por varia ções de processo, tensão e temperatura. Esta tese aborda o problema da incerteza de rel ogio em CIs de alta velocidade, com o objetivo de determinar os limites do paradigma de desenho sí ncrono. Na prossecu ção deste objectivo principal, esta tese propõe quatro novos modelos de incerteza com âmbitos de aplicação diferentes. O primeiro modelo permite estimar a incerteza introduzida por um inversor est atico CMOS, com base em parâmetros simples e su cientemente gen éricos para que possa ser usado na previsão das limitações temporais de circuitos mais complexos, mesmo na fase inicial do projeto. O segundo modelo, permite estimar a incerteza em repetidores com liga ções RC e assim otimizar o dimensionamento da rede de distribui ção de relógio, com baixo esfor ço computacional. O terceiro modelo permite estimar a acumula ção de incerteza em cascatas de repetidores. Uma vez que este modelo tem em considera ção a correla ção entre fontes de ruí do, e especialmente util para promover t ecnicas de distribui ção de rel ogio e de alimentação que possam minimizar a acumulação de incerteza. O quarto modelo permite estimar a incerteza temporal em sistemas com m ultiplos dom ínios de sincronismo. Este modelo pode ser facilmente incorporado numa ferramenta autom atica para determinar a melhor topologia para uma determinada aplicação ou para avaliar a tolerância do sistema ao ru ído de alimentação. Finalmente, usando os modelos propostos, são discutidas as tendências da precisão de rel ogio. Conclui-se que os limites da precisão do rel ogio são, em ultima an alise, impostos por fontes de varia ção dinâmica que se preveem crescentes na actual l ogica de escalonamento dos dispositivos. Assim sendo, esta tese defende a procura de solu ções em outros ní veis de abstração, que não apenas o ní vel f sico, que possam contribuir para o aumento de desempenho dos CIs e que tenham um menor impacto nos pressupostos do paradigma de desenho sí ncrono.Distributing a the clock simultaneously everywhere (low skew) and periodically everywhere (low jitter) in high-performance Integrated Circuits (ICs) has become an increasingly di cult and time-consuming task, due to technology scaling. As transistor dimensions shrink and more functionality is packed into an IC, clock precision becomes increasingly a ected by Process, Voltage and Temperature (PVT) variations. This thesis addresses the problem of clock uncertainty in high-performance ICs, in order to determine the limits of the synchronous design paradigm. In pursuit of this main goal, this thesis proposes four new uncertainty models, with di erent underlying principles and scopes. The rst model targets uncertainty in static CMOS inverters. The main advantage of this model is that it depends only on parameters that can easily be obtained. Thus, it can provide information on upcoming constraints very early in the design stage. The second model addresses uncertainty in repeaters with RC interconnects, allowing the designer to optimise the repeater's size and spacing, for a given uncertainty budget, with low computational e ort. The third model, can be used to predict jitter accumulation in cascaded repeaters, like clock trees or delay lines. Because it takes into consideration correlations among variability sources, it can also be useful to promote oorplan-based power and clock distribution design in order to minimise jitter accumulation. A fourth model is proposed to analyse uncertainty in systems with multiple synchronous domains. It can be easily incorporated in an automatic tool to determine the best topology for a given application or to evaluate the system's tolerance to power-supply noise. Finally, using the proposed models, this thesis discusses clock precision trends. Results show that limits in clock precision are ultimately imposed by dynamic uncertainty, which is expected to continue increasing with technology scaling. Therefore, it advocates the search for solutions at other abstraction levels, and not only at the physical level, that may increase system performance with a smaller impact on the assumptions behind the synchronous design paradigm

    Design and Validation of Network-on-Chip Architectures for the Next Generation of Multi-synchronous, Reliable, and Reconfigurable Embedded Systems

    Get PDF
    NETWORK-ON-CHIP (NoC) design is today at a crossroad. On one hand, the design principles to efficiently implement interconnection networks in the resource-constrained on-chip setting have stabilized. On the other hand, the requirements on embedded system design are far from stabilizing. Embedded systems are composed by assembling together heterogeneous components featuring differentiated operating speeds and ad-hoc counter measures must be adopted to bridge frequency domains. Moreover, an unmistakable trend toward enhanced reconfigurability is clearly underway due to the increasing complexity of applications. At the same time, the technology effect is manyfold since it provides unprecedented levels of system integration but it also brings new severe constraints to the forefront: power budget restrictions, overheating concerns, circuit delay and power variability, permanent fault, increased probability of transient faults. Supporting different degrees of reconfigurability and flexibility in the parallel hardware platform cannot be however achieved with the incremental evolution of current design techniques, but requires a disruptive approach and a major increase in complexity. In addition, new reliability challenges cannot be solved by using traditional fault tolerance techniques alone but the reliability approach must be also part of the overall reconfiguration methodology. In this thesis we take on the challenge of engineering a NoC architectures for the next generation systems and we provide design methods able to overcome the conventional way of implementing multi-synchronous, reliable and reconfigurable NoC. Our analysis is not only limited to research novel approaches to the specific challenges of the NoC architecture but we also co-design the solutions in a single integrated framework. Interdependencies between different NoC features are detected ahead of time and we finally avoid the engineering of highly optimized solutions to specific problems that however coexist inefficiently together in the final NoC architecture. To conclude, a silicon implementation by means of a testchip tape-out and a prototype on a FPGA board validate the feasibility and effectivenes

    Méthodologies de conception ASIC pour des systèmes sur puce 3D hétérogènes à base de réseaux sur puce 3D

    Get PDF
    Dans cette thèse, nous étudions les architectures 3D NoC grâce à des implémentations de conception physiques en utilisant la technologie 3D réel mis en oeuvre dans l'industrie. Sur la base des listes d'interconnexions en déroute, nous procédons à l'analyse des performances d'évaluer le bénéfice de l'architecture 3D par rapport à sa mise en oeuvre 2D. Sur la base du flot de conception 3D proposé en se concentrant sur la vérification temporelle tirant parti de l'avantage du retard négligeable de la structure de microbilles pour les connexions verticales, nous avons mené techniques de partitionnement de NoC 3D basé sur l'architecture MPSoC y compris empilement homogène et hétérogène en utilisant Tezzaron 3D IC technlogy. Conception et mise en oeuvre de compromis dans les deux méthodes de partitionnement est étudiée pour avoir un meilleur aperçu sur l'architecture 3D de sorte qu'il peut être exploitée pour des performances optimales. En utilisant l'approche 3D homogène empilage, NoC topologies est explorée afin d'identifier la meilleure topologie entre la topologie 2D et 3D pour la mise en œuvre MPSoC 3D sous l'hypothèse que les chemins critiques est fondée sur les liens inter-routeur. Les explorations architecturales ont également examiné les différentes technologies de traitement. mettant en évidence l'effet de la technologie des procédés à la performance d'architecture 3D en particulier pour l'interconnexion dominant du design. En outre, nous avons effectué hétérogène 3D d'empilage pour la mise en oeuvre MPSoC avec l'approche GALS de style et présenté plusieurs analyses de conception physiques connexes concernant la conception 3D et la mise en œuvre MPSoC utilisant des outils de CAO 2D. Une analyse plus approfondie de l'effet microbilles pas à la performance de l'architecture 3D à l'aide face-à-face d'empilement est également signalé l'identification des problèmes et des limitations à prendre en considération pendant le processus de conception.In this thesis, we study the exploration 3D NoC architectures through physical design implementations using real 3D technology used in the industry. Based on the proposed 3D design flow focusing on timing verification by leveraging the benefit of negligible delay of microbumps structure for vertical connections, we have conducted partitioning techniques for 3D NoC-based MPSoC architecture including homogeneous and heterogeneous stacking using Tezzaron 3D IC technlogy. Design and implementation trade-off in both partitioning methods is investigated to have better insight about 3D architecture so that it can be exploited for optimal performance. Using homogeneous 3D stacking approach, NoC architectures are explored to identify the best topology between 2D and 3D topology for 3D MPSoC implementation. The architectural explorations have also considered different process technologies highlighting the wire delay effect to the 3D architecture performance especially for interconnect-dominated design. Additionally, we performed heterogeneous 3D stacking of NoC-based MPSoC implementation with GALS style approach and presented several physical designs related analyses regarding 3D MPSoC design and implementation using 2D EDA tools. Finally we conducted an exploration of 2D EDA tool on different 3D architecture to evaluate the impact of 2D EDA tools on the 3D architecture performance. Since there is no commercialize 3D design tool until now, the experiment is important on the basis that designing 3D architecture using 2D EDA tools does not have a strong and direct impact to the 3D architecture performance mainly because the tools is dedicated for 2D architecture design.SAVOIE-SCD - Bib.électronique (730659901) / SudocGRENOBLE1/INP-Bib.électronique (384210012) / SudocGRENOBLE2/3-Bib.électronique (384219901) / SudocSudocFranceF

    Circuit design and analysis for on-FPGA communication systems

    No full text
    On-chip communication system has emerged as a prominently important subject in Very-Large- Scale-Integration (VLSI) design, as the trend of technology scaling favours logics more than interconnects. Interconnects often dictates the system performance, and, therefore, research for new methodologies and system architectures that deliver high-performance communication services across the chip is mandatory. The interconnect challenge is exacerbated in Field-Programmable Gate Array (FPGA), as a type of ASIC where the hardware can be programmed post-fabrication. Communication across an FPGA will be deteriorating as a result of interconnect scaling. The programmable fabrics, switches and the specific routing architecture also introduce additional latency and bandwidth degradation further hindering intra-chip communication performance. Past research efforts mainly focused on optimizing logic elements and functional units in FPGAs. Communication with programmable interconnect received little attention and is inadequately understood. This thesis is among the first to research on-chip communication systems that are built on top of programmable fabrics and proposes methodologies to maximize the interconnect throughput performance. There are three major contributions in this thesis: (i) an analysis of on-chip interconnect fringing, which degrades the bandwidth of communication channels due to routing congestions in reconfigurable architectures; (ii) a new analogue wave signalling scheme that significantly improves the interconnect throughput by exploiting the fundamental electrical characteristics of the reconfigurable interconnect structures. This new scheme can potentially mitigate the interconnect scaling challenges. (iii) a novel Dynamic Programming (DP)-network to provide adaptive routing in network-on-chip (NoC) systems. The DP-network architecture performs runtime optimization for route planning and dynamic routing which, effectively utilizes the in-silicon bandwidth. This thesis explores a new horizon in reconfigurable system design, in which new methodologies and concepts are proposed to enhance the on-FPGA communication throughput performance that is of vital importance in new technology processes

    Exploration and Design of Power-Efficient Networked Many-Core Systems

    Get PDF
    Multiprocessing is a promising solution to meet the requirements of near future applications. To get full benefit from parallel processing, a manycore system needs efficient, on-chip communication architecture. Networkon- Chip (NoC) is a general purpose communication concept that offers highthroughput, reduced power consumption, and keeps complexity in check by a regular composition of basic building blocks. This thesis presents power efficient communication approaches for networked many-core systems. We address a range of issues being important for designing power-efficient manycore systems at two different levels: the network-level and the router-level. From the network-level point of view, exploiting state-of-the-art concepts such as Globally Asynchronous Locally Synchronous (GALS), Voltage/ Frequency Island (VFI), and 3D Networks-on-Chip approaches may be a solution to the excessive power consumption demanded by today’s and future many-core systems. To this end, a low-cost 3D NoC architecture, based on high-speed GALS-based vertical channels, is proposed to mitigate high peak temperatures, power densities, and area footprints of vertical interconnects in 3D ICs. To further exploit the beneficial feature of a negligible inter-layer distance of 3D ICs, we propose a novel hybridization scheme for inter-layer communication. In addition, an efficient adaptive routing algorithm is presented which enables congestion-aware and reliable communication for the hybridized NoC architecture. An integrated monitoring and management platform on top of this architecture is also developed in order to implement more scalable power optimization techniques. From the router-level perspective, four design styles for implementing power-efficient reconfigurable interfaces in VFI-based NoC systems are proposed. To enhance the utilization of virtual channel buffers and to manage their power consumption, a partial virtual channel sharing method for NoC routers is devised and implemented. Extensive experiments with synthetic and real benchmarks show significant power savings and mitigated hotspots with similar performance compared to latest NoC architectures. The thesis concludes that careful codesigned elements from different network levels enable considerable power savings for many-core systems.Siirretty Doriast

    Design methodologies for variation-aware integrated circuits

    Get PDF
    The scaling of VLSI technology has spurred a rapid growth in the semiconductor industry. With the CMOS device dimension scaling to and beyond 90nm technology, it is possible to achieve higher performance and to pack more complex functionalities on a single chip. However, the scaling trend has introduced drastic variation of process and design parameters, leading to severe variability of chip performance in nanometer regime. Also, the manufacturing community projects CMOS will scale for three to four more generations. Since the uncertainties due to variations are expected to increase in each generation, it will significantly impact the performance of design and consequently the yield. Another challenging issue in the nanometer IC design is the high power consumption due to the greater packing density, higher frequency of operation and excessive leakage power. Moreover, the circuits are usually over-designed to compensate for uncertainties due to variations. The over-designed circuits not only make timing closure difficult but also cause excessive power consumption. For portable electronics, excessive power consumption may reduce battery life; for non-portable systems it may impose great difficulties in cooling and packaging. The objective of my research has been to develop design methodologies to address variations and power dissipation for reliable circuit operation. The proposed work has been divided into three parts: the first part addresses the issues related with power/ground noise induced by clock distribution network and proposes techniques to reduce power/ground noise considering the effects of process variations. The second part proposes an elastic pipeline scheme for random circuits with feedback loops. The proposed scheme provides a low-power solution that has the same variation tolerance as the conventional approaches. The third section deals with discrete buffer and wire sizing for link-based non-tree clock network, which is an energy efficient structure for skew tolerance to variations. For the power/ground noise problem, our approach could reduce the peak current and the delay variations by 50% and 51% respectively. Compared to conventional approach, the elastic timing scheme reduces power dissipation by 20% − 27%. The sizing method achieves clock skew reduction of 45% with a small increase in power dissipation
    corecore