201 research outputs found

    Study Of The Relationship Between Delta Delay And Adjacent Parallel Wire Length In 45 Nanometer Process Technology

    Get PDF
    Hierarchical design spans the complete framework of a design flow from Register Transfer Level (RTL), synthesis, place and route, timing closure and various other analyses before sign-off. Finer geometries and increasing interconnect density however have resulted signal integrity becoming the key issue for Deep Sub-Micron design. Post silicon bug due to noise and signal integrity can be prevented and fixed at early stage of the IC design cycle. The purpose of this research is to establish a preventive measurement for adjacent wire that can travel in parallel for 45nm technology. The intention is to ensure that a complex design can be delivered to the market with accurate, fast and trusted analysis and provide sign-off solution. Main approach is to conduct the relationship study between delta delay and adjacent parallel wire in 45 nanometer (nm) process technology and provide a preventive measurement to limit the adjacent wire can travel in parallel. The design is explored thoroughly to study the relationship between delay noise and adjacent parallel wire. The correlation is translated into an equation to estimate the delay noise produced with a certain length of adjacent parallel wire

    SIGNAL PROCESSING TECHNIQUES AND APPLICATIONS

    Get PDF
    As the technologies scaling down, more transistors can be fabricated into the same area, which enables the integration of many components into the same substrate, referred to as system-on-chip (SoC). The components on SoC are connected by on-chip global interconnects. It has been shown in the recent International Technology Roadmap of Semiconductors (ITRS) that when scaling down, gate delay decreases, but global interconnect delay increases due to crosstalk. The interconnect delay has become a bottleneck of the overall system performance. Many techniques have been proposed to address crosstalk, such as shielding, buffer insertion, and crosstalk avoidance codes (CACs). The CAC is a promising technique due to its good crosstalk reduction, less power consumption and lower area. In this dissertation, I will present analytical delay models for on-chip interconnects with improved accuracy. This enables us to have a more accurate control of delays for transition patterns and lead to a more efficient CAC, whose worst-case delay is 30-40% smaller than the best of previously proposed CACs. As the clock frequency approaches multi-gigahertz, the parasitic inductance of on-chip interconnects has become significant and its detrimental effects, including increased delay, voltage overshoots and undershoots, and increased crosstalk noise, cannot be ignored. We introduce new CACs to address both capacitive and inductive couplings simultaneously.Quantum computers are more powerful in solving some NP problems than the classical computers. However, quantum computers suffer greatly from unwanted interactions with environment. Quantum error correction codes (QECCs) are needed to protect quantum information against noise and decoherence. Given their good error-correcting performance, it is desirable to adapt existing iterative decoding algorithms of LDPC codes to obtain LDPC-based QECCs. Several QECCs based on nonbinary LDPC codes have been proposed with a much better error-correcting performance than existing quantum codes over a qubit channel. In this dissertation, I will present stabilizer codes based on nonbinary QC-LDPC codes for qubit channels. The results will confirm the observation that QECCs based on nonbinary LDPC codes appear to achieve better performance than QECCs based on binary LDPC codes.As the technologies scaling down further to nanoscale, CMOS devices suffer greatly from the quantum mechanical effects. Some emerging nano devices, such as resonant tunneling diodes (RTDs), quantum cellular automata (QCA), and single electron transistors (SETs), have no such issues and are promising candidates to replace the traditional CMOS devices. Threshold gate, which can implement complex Boolean functions within a single gate, can be easily realized with these devices. Several applications dealing with real-valued signals have already been realized using nanotechnology based threshold gates. Unfortunately, the applications using finite fields, such as error correcting coding and cryptography, have not been realized using nanotechnology. The main obstacle is that they require a great number of exclusive-ORs (XORs), which cannot be realized in a single threshold gate. Besides, the fan-in of a threshold gate in RTD nanotechnology needs to be bounded for both reliability and performance purpose. In this dissertation, I will present a majority-class threshold architecture of XORs with bounded fan-in, and compare it with a Boolean-class architecture. I will show an application of the proposed XORs for the finite field multiplications. The analysis results will show that the majority class outperforms the Boolean class architectures in terms of hardware complexity and latency. I will also introduce a sort-and-search algorithm, which can be used for implementations of any symmetric functions. Since XOR is a special symmetric function, it can be implemented via the sort-and-search algorithm. To leverage the power of multi-input threshold functions, I generalize the previously proposed sort-and-search algorithm from a fan-in of two to arbitrary fan-ins, and propose an architecture of multi-input XORs with bounded fan-ins

    Design and modelling of variability tolerant on-chip communication structures for future high performance system on chip designs

    Get PDF
    The incessant technology scaling has enabled the integration of functionally complex System-on-Chip (SoC) designs with a large number of heterogeneous systems on a single chip. The processing elements on these chips are integrated through on-chip communication structures which provide the infrastructure necessary for the exchange of data and control signals, while meeting the strenuous physical and design constraints. The use of vast amounts of on chip communications will be central to future designs where variability is an inherent characteristic. For this reason, in this thesis we investigate the performance and variability tolerance of typical on-chip communication structures. Understanding of the relationship between variability and communication is paramount for the designers; i.e. to devise new methods and techniques for designing performance and power efficient communication circuits in the forefront of challenges presented by deep sub-micron (DSM) technologies. The initial part of this work investigates the impact of device variability due to Random Dopant Fluctuations (RDF) on the timing characteristics of basic communication elements. The characterization data so obtained can be used to estimate the performance and failure probability of simple links through the methodology proposed in this work. For the Statistical Static Timing Analysis (SSTA) of larger circuits, a method for accurate estimation of the probability density functions of different circuit parameters is proposed. Moreover, its significance on pipelined circuits is highlighted. Power and area are one of the most important design metrics for any integrated circuit (IC) design. This thesis emphasises the consideration of communication reliability while optimizing for power and area. A methodology has been proposed for the simultaneous optimization of performance, area, power and delay variability for a repeater inserted interconnect. Similarly for multi-bit parallel links, bandwidth driven optimizations have also been performed. Power and area efficient semi-serial links, less vulnerable to delay variations than the corresponding fully parallel links are introduced. Furthermore, due to technology scaling, the coupling noise between the link lines has become an important issue. With ever decreasing supply voltages, and the corresponding reduction in noise margins, severe challenges are introduced for performing timing verification in the presence of variability. For this reason an accurate model for crosstalk noise in an interconnection as a function of time and skew is introduced in this work. This model can be used for the identification of skew condition that gives maximum delay noise, and also for efficient design verification

    Network-on-Chip

    Get PDF
    Addresses the Challenges Associated with System-on-Chip Integration Network-on-Chip: The Next Generation of System-on-Chip Integration examines the current issues restricting chip-on-chip communication efficiency, and explores Network-on-chip (NoC), a promising alternative that equips designers with the capability to produce a scalable, reusable, and high-performance communication backbone by allowing for the integration of a large number of cores on a single system-on-chip (SoC). This book provides a basic overview of topics associated with NoC-based design: communication infrastructure design, communication methodology, evaluation framework, and mapping of applications onto NoC. It details the design and evaluation of different proposed NoC structures, low-power techniques, signal integrity and reliability issues, application mapping, testing, and future trends. Utilizing examples of chips that have been implemented in industry and academia, this text presents the full architectural design of components verified through implementation in industrial CAD tools. It describes NoC research and developments, incorporates theoretical proofs strengthening the analysis procedures, and includes algorithms used in NoC design and synthesis. In addition, it considers other upcoming NoC issues, such as low-power NoC design, signal integrity issues, NoC testing, reconfiguration, synthesis, and 3-D NoC design. This text comprises 12 chapters and covers: The evolution of NoC from SoC—its research and developmental challenges NoC protocols, elaborating flow control, available network topologies, routing mechanisms, fault tolerance, quality-of-service support, and the design of network interfaces The router design strategies followed in NoCs The evaluation mechanism of NoC architectures The application mapping strategies followed in NoCs Low-power design techniques specifically followed in NoCs The signal integrity and reliability issues of NoC The details of NoC testing strategies reported so far The problem of synthesizing application-specific NoCs Reconfigurable NoC design issues Direction of future research and development in the field of NoC Network-on-Chip: The Next Generation of System-on-Chip Integration covers the basic topics, technology, and future trends relevant to NoC-based design, and can be used by engineers, students, and researchers and other industry professionals interested in computer architecture, embedded systems, and parallel/distributed systems

    Early Wire Characterization for Predictable Network-on-Chip Global Interconnects

    Get PDF
    This work envisions a common design methodology, applicable for every interconnect level and based on early wire characterization, to provide a faster convergence to a feasible and robust design. We claim that such a novel design methodology is vital for upcoming nanometer technologies, where increased variations in both device characteristics and interconnect parameters introduce tedious design closure problems. The proposed methodology has been successfully applied to the wire synthesis of a Network-on-Chip interconnect to: (i) achieve a given delay and noise goals, and (ii) attain a more power-efficient design with respect to existing techniques

    Early Estimation Of The Impact Of Delay Due To Coupling Capacitance In VSLI Circuits

    Get PDF
    University of Minnesota M.S.E.E. thesis.May 2019. Major: Electrical/Computer Engineering. Advisor: Sachin Sapatnekar. 1 computer file (PDF); vii, 54 pages.Coupling capacitance is becoming increasingly problematic at the more advanced technology nodes and affects the timing and sign-off timeline of integrated circuits (ICs). As the coupling capacitance information is only available after the detailed routing phase, it can be a difficult task to make any major changes post detailed routing towards fixing issues caused by coupling effects that were unaccounted for. The goal of the project is to come up with an estimate of coupling capacitance for a given net before the detailed routing phase with the help of congestion maps. This information can be fed back to the detailed router which can help avoid routes that are susceptible to heavy coupling effects. The first part of this thesis explains why beforehand knowledge of a net’s coupling capacitance is crucial for a timely tape-out. This thesis revisits the Elmore delay model and extends the analysis to coupled RC structures. The notion of considering the coupling capacitance as a random variable is described to model the uncertainties that are introduced into the delay analysis which is performed ahead in time. The second part of this thesis illustrates how congestion analysis can provide valuable information about the severity of coupling effects. A method for the expedited extraction of estimated parasitics using congestion maps and global router solutions is presented. Modification to existing driving-point analysis techniques is suggested to accommodate coupled RC structures with probabilistic coupling capacitance. The last part of this thesis compares the delay metrics obtained from an open-source timing analyzer with the delay metrics obtained through methods described in this thesis for a given net

    High-Performance and Wavelength-Reused Optical Network on Chip (ONoC) Architectures and Communication Schemes for Manycore Processor

    Get PDF
    Optical Network on Chip (ONoC) is an emerging chip-scale optical interconnection technology to realize the high-performance and power-efficient inter-core communication for many-core processors. By utilizing the silicon photonic interconnects to transmit data packets with optical signals, it can achieve ultra low communication delay, high bandwidth capacity, and low power dissipation. With the benefits of Wavelength Division Multiplexing (WDM), multiple optical signals can simultaneously be transmitted in the same optical interconnect through different wavelengths. Thus, the WDM-based ONoC is becoming a hot research topic recently. However, the maximal number of available wavelengths is restricted for the reliable and power-efficient optical communication in ONoC. Hence, with a limited number of wavelengths, the design of high-performance and power-efficient ONoC architecture is an important and challenging problem. In this thesis, the design methodology of wavelength-reused ONoC architecture is explored. With the wavelength reuse scheme in optical routing paths, high-performance and power-efficient communication is realized for many-core processors only using a small number of available wavelengths. Three wavelength-reused ONoC architectures and communication schemes are proposed to fulfil different communication requirements, i.e., network scalability, multicast communication, and dark silicon. Firstly, WRH-ONoC, a wavelength-reused hierarchical Optical Network on Chip architecture, is proposed to achieve high network scalability, namely obtaining low communication delay and high throughput capacity for hundreds of thousands of cores by reusing the limited number of available wavelengths with the modest hardware cost and energy overhead. WRH-ONoC combines the advantages of non-blocking communication in each lambda-router and wavelength reuse in all lambda-routers through the hierarchical networking. Both theoretical analysis and simulation results indicate that WRH-ONoC can achieve prominent improvement on the communication performance and scalability (e.g., 46.0% of reduction on the zero-load packet delay and 72.7% of improvement on the network throughput for 400 cores with small hardware cost and energy overhead) in comparison with existing schemes. Secondly, DWRMR, a dynamical wavelength-reused multicast scheme based on the optical multicast ring, is proposed for widely existing multicast communications in many-core processors. In DWRMR, an optical multicast ring is dynamically constructed for each multicast group and the multicast packets are transmitted in a single-send-multi-receive manner requiring only one wavelength. All the cores in the same multicast group can reuse the established multicast ring through an optical token arbitration scheme for the interactive multicast communications, thereby avoiding the frequent construction of multicast routing paths dedicatedly for each core. Simulation results indicate that DWRMR can reduce more than 50% of end-to-end packet delay with slight hardware cost, or require only half number of wavelengths to achieve the same performance compared with existing schemes. Thirdly, Dark-ONoC, a dynamically configurable ONoC architecture, is proposed for the many-core processor with dark silicon. Dark silicon is an inevitable phenomenon that only a small number of cores can be activated simultaneously while the other cores must stay in dark state (power-gated) due to the restricted power budget. Dark-ONoC periodically allocates non-blocking optical routing paths only between the active cores with as less wavelengths as possible. Thus, it can obtain high-performance communication and low power consumption at the same time. Extensive simulations are conducted with the dark silicon patterns from both synthetic distribution and real data traces. The simulation results indicate that the number of wavelengths is reduced by around 15% and the overall power consumption is reduced by 23.4% compared to existing schemes. Finally, this thesis concludes several important principles on the design of wavelength-reused ONoC architecture, and summarizes some perspective issues for the future research

    RFiof: An RF Approach to I/O-pin and Memory Controller Scalability for Off-chip Memories

    Get PDF
    Given the maintenance of Moore's law behavior, core count is expected to continue growing, which keeps demanding more memory bandwidth destined to feed them. Memory controller (MC) scalability is crucial to achieve these bandwidth needs, but constrained by I/O pin scaling. In this study, we introduce RFiof, a radio-frequency (RF) memory approach to address I/O pin constraints which restrict MC scalability in off-chip-memory systems, while keeping interconnection energy at lower levels. In this paper, we model, design, and demonstrate how RFiof achieves high MC I/O pin scalability for different memory technology generations, while evaluating its area and power/energy impact. By introducing the novel concept of RFpins -- to replace traditional MC I/O pins, and using RFMCs - MCs coupled to RF transmitters (TX)/receivers (RX), while employing a minimal RF-path between RFMC and ranks, we demonstrate that for a 32-out-of-order multicore configured with off-chip ranks with a 1:1 core-to-MC ratio, RFiof presents scalable 4 RFpins per RFMC -comparable to pin-scalable optical solutions - and is able to respectively improve bandwidth and performance by up to 7.2x and 8.6x, compared to the traditional baseline -- constrained to MC I/O pin counts. Furthermore, RFiof reduces about 65.6% of MC area usage, and 80% of memory path energy interconnection

    Floorplan-Aware High Performance NoC Design

    Full text link
    Las actuales arquitecturas de m�ltiples n�cleos como los chip multiprocesadores (CMP) y soluciones multiprocesador para sistemas dentro del chip (MPSoCs) han adoptado a las redes dentro del chip (NoC) como elemento -ptimo para la inter-conexi-n de los diversos elementos de dichos sistemas. En este sentido, fabricantes de CMPs y MPSoCs han adoptado NoCs sencillas, generalmente con una topolog'a en malla o anillo, ya que son suficientes para satisfacer las necesidades de los sistemas actuales. Sin embargo a medida que los requerimientos del sistema -- baja latencia y alto rendimiento -- se hacen m�s exigentes, estas redes tan simples dejan de ser una soluci-n real. As', la comunidad investigadora ha propuesto y analizado NoCs m�s complejas. No obstante, estas soluciones son m�s dif'ciles de implementar -- especialmente los enlaces largos -- haciendo que este tipo de topolog'as complejas sean demasiado costosas o incluso inviables. En esta tesis, presentamos una metodolog'a de dise-o que minimiza la p�rdida de prestaciones de la red debido a su implementaci-n real. Los principales problemas que se encuentran al implementar una NoC son los conmutadores y los enlaces largos. En esta tesis, el conmutador se ha hecho modular, es decir, formado como uni-n de m-dulos m�s peque-os. En nuestro caso, los m-dulos son id�nticos, donde cada m-dulo es capaz de arbitrar, conmutar, y almacenar los mensajes que le llegan. Posteriormente, flexibilizamos la colocaci-n de estos m-dulos en el chip, permitiendo que m-dulos de un mismo conmutador est�n distribuidos por el chip. Esta metodolog'a de dise-o la hemos aplicado a diferentes escenarios. Primeramente, hemos introducido nuestro conmutador modular en NoCs con topolog'as conocidas como la malla 2D. Los resultados muestran como la modularidad y la distribuci-n del conmutador reducen la latencia y el consumo de potencia de la red. En segundo lugar, hemos utilizado nuestra metodolog'a de dise-o para implementar un crossbar distribuidRoca Pérez, A. (2012). Floorplan-Aware High Performance NoC Design [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/17844Palanci
    corecore