
604 research outputs found

Timing Closure in Chip Design

Author: Held Stephan
Publication venue: Universitäts- und Landesbibliothek Bonn

Achieving timing closure is a major challenge in the physical design of a computer chip. The task is to find a physical realization that fulfills the speed specifications. In this thesis, we propose new algorithms for the key tasks of performance optimization, namely repeater tree construction, circuit sizing, clock skew scheduling, threshold voltage optimization, and plane assignment. Furthermore, a new program flow for timing closure is developed that integrates these algorithms with placement and clock tree construction. For repeater tree construction, a new algorithm for computing topologies, which are later filled with repeaters, is presented. To this end, we propose a new delay model for topologies that accounts not only for the path lengths, as existing approaches do, but also for the number of bifurcations on a path, which introduce extra capacitance and thereby delay. In the extreme cases of pure power optimization and pure delay optimization, the optimum topologies with respect to our delay model are minimum Steiner trees and alphabetic code trees with the shortest possible path lengths, respectively. We present a new, extremely fast algorithm that scales seamlessly between the two opposite objectives. For special cases, we prove the optimality of our algorithm. Its efficiency and effectiveness in practice are demonstrated by comprehensive experimental results. The task of circuit sizing is to assign millions of small elementary logic circuits to elements from a discrete set of logically equivalent, predefined physical layouts such that power consumption is minimized and all signal paths are sufficiently fast. In this thesis, we develop a fast heuristic approach for global circuit sizing, followed by a local search into a local optimum. In contrast to existing approaches, our algorithms use the available discrete layout choices and accurate delay models with slew propagation. The global approach iteratively assigns slew targets to all source pins of the chip and chooses a discrete layout of minimum size that preserves the slew targets. In comprehensive experiments on real instances, we demonstrate that the worst path delay is within 7% of its lower bound on average after a few iterations. The subsequent local search reduces this gap to 2% on average. Combining global and local sizing, we are able to size more than 5.7 million circuits within 3 hours. For the clock skew scheduling problem, we develop the first algorithm with a strongly polynomial running time for cycle time minimization in the presence of different cycle times and multi-cycle paths. In practice, an iterative local search method is much more efficient. We prove that this iterative method maximizes the worst slack, even when the feasible schedule is restricted to certain time intervals. Furthermore, we enhance the iterative local approach to determine a lexicographically optimum slack distribution. The clock skew scheduling problem is then generalized to allow for simultaneous data path optimization. In fact, this is a time-cost tradeoff problem. We develop the first combinatorial algorithm for computing time-cost tradeoff curves in graphs that may contain cycles. Starting from the lowest-cost solution, the algorithm iteratively computes a descent direction by a minimum cost flow computation. The maximum feasible step length is then determined by a minimum ratio cycle computation. This approach can be used in chip design for several optimization tasks, e.g. threshold voltage optimization or plane assignment. Finally, the optimization routines are combined into a timing closure flow. Here, global placement is alternated with global performance optimization. Net weights are used to penalize the length of critical nets during placement. After the global phase, the performance is improved further by applying more comprehensive optimization routines to the most critical paths. In the end, the clock schedule is optimized and clock trees are inserted. Computational results of the design flow are obtained on real-world computer chips.

bonndoc – Der Publikationsserver der Universität Bonn

On packet switch design

Author: Minkenberg C.J.A.
Publication venue: Technische Universiteit Eindhoven
Publication date: 01/01/2001

Repository TU/e

Pure OAI Repository


Nanometer VLSI placement and optimization for multi-objective design closure

Author: Luo Tao, Ph. D.
Publication date: 01/12/2007

In a VLSI physical synthesis flow, placement directly defines the interconnection, which affects many other design objectives, such as timing, power consumption, congestion, and thermal issues. With the scaling of technology, the relative interconnect delay increases dramatically. As a result, placement has become a bottleneck in deep sub-micron physical synthesis. In this dissertation, I propose several optimization algorithms, ranging from global placement, placement migration, and timing driven placement to incremental power optimization, for multi-objective VLSI design closure. The first work is DPlace, a new global placement algorithm that scales well to modern large-scale circuit placement problems. DPlace simulates the natural diffusion process to spread cells smoothly over the placement region, and uses both analytical and discrete techniques to improve the wire length. However, global placement is never sufficient for multi-objective design closure; a variety of design objectives have to be improved incrementally, such as timing, routing congestion, signal integrity, and heat distribution. Placement migration is a critical step to address the cell overlaps that appear during incremental optimizations. To achieve high placement stability, I propose a computational geometry based placement migration flow to cope with placement changes, and a new stability metric to measure the “similarity” between two placements accurately. Our placement migration algorithm has a clear advantage over conventional legalization algorithms in that the neighborhood characteristics of the original placement are preserved. For timing closure in high performance designs, I present a linear programming based incremental timing driven placement to improve the timing on critical paths directly. I further present an efficient timing driven placement algorithm (Pyramids). Two formulations of Pyramids are proposed, which are suitable for different optimization stages in a physical synthesis flow. Both approaches find the timing-optimal location of a cell in constant time, using computational geometry based techniques. For fast convergence of design closure, placement should be integrated with other optimization techniques. I propose to combine placement, gate sizing, and Vt swapping techniques to reduce the total power consumption, especially the leakage power, which is becoming increasingly critical for nanometer VLSI design closure.

Electrical and Computer Engineering

Texas ScholarWorks

Variation and power issues in VLSI clock networks

Author: Venkataraman Ganesh
Publication date: 15/05/2009

The Clock Distribution Network (CDN) is an important component of any synchronous logic circuit. The function of the CDN is to deliver the clock signal to the clock sinks. Clock skew is defined as the difference in the arrival time of the clock signal at the clock sinks. Higher uncertainty in skew (due to PVT variations) degrades circuit performance by decreasing the maximum possible delay between any two sequential elements. Aggressive frequency scaling has also led to high power consumption, especially in the CDN. This dissertation addresses variation and power issues in the design of current and potential future CDNs. The research detailed in this work presents algorithmic techniques for the following problems: (1) variation tolerance in useful skew design, (2) link insertion for buffered clock nets, (3) methodology and algorithms for rotary clocking, and (4) clock mesh optimization for skew-power trade-off. For clock trees, this dissertation presents techniques to integrate the different aspects of clock tree synthesis (skew scheduling, abstract topology, and layout embedding) into one framework: tolerance to variations. This research addresses the issues involved in inserting cross-links in a buffered clock tree and proposes design criteria to avoid the risk of short-circuit current. Rotary clocking is a promising new clocking scheme that consists of unterminated rings formed by differential transmission lines. Rotary clocking achieves a reduction in power dissipation and clock skew. This dissertation addresses the issues in adapting current CAD methodology to rotary clocks. An alternative methodology and corresponding algorithmic techniques are detailed. The clock mesh is a popular form of CDN used in high performance systems. The problem of simultaneous sizing and placement of mesh buffers in a clock mesh is addressed. The algorithms presented remove edges from the clock mesh to trade off skew tolerance for low power. For clock trees as well as link insertion, our experiments indicate a significant reduction in clock skew due to variations. For clock mesh, experimental results indicate an 18.5% reduction in power with a 1.3% delay penalty on average. In summary, this dissertation details methodologies and algorithms that address two critical issues, variation and power dissipation, in current and potential future CDNs.

Texas A&M Repository

Scheduling in Networks with Limited Buffers

Author: Elhaddad Mahmoud
Publication date: 30/09/2010

In networks with limited buffer capacity, packet loss can occur at a link even when the average packet arrival rate is low compared to the link's speed. To offer strong loss-rate guarantees, ISPs may need to adopt stringent routing constraints to limit the load at the network links and the routing path length. However, to simultaneously maximize revenue, ISPs should be interested in scheduling algorithms that lead to the least stringent routing constraints. This work attempts to address the ISPs' needs as follows: first, by proposing an algorithm that performs well (in terms of routing constraints) on networks of output queued (OQ) routers (that is, ideal routers), and second, by bounding the extra switch fabric speed and buffer capacity required for the emulation of these algorithms in combined input-output queued (CIOQ) routers. The first part of the thesis studies the problem of minimizing the maximum session loss rate in networks of OQ routers. It introduces the Rolling Priority algorithm, a local online scheduling algorithm that offers superior loss guarantees compared to FCFS/Drop Tail and FCFS/Random Drop. Rolling Priority has the following properties: (1) it does not favor any sessions over others at any link, (2) it ensures that a proportion of packets from each session is subject to a negligibly small loss probability at every link along the session's path, and (3) it maximizes the proportion of packets subject to negligible loss probability. The second part of the thesis studies the emulation of OQ routers using CIOQ. The OQ routers are equipped with a buffer of capacity B packets at every output. For the family of work-conserving scheduling algorithms, we find that whereas every greedy CIOQ policy is valid for the emulation of every OQ algorithm at speedup B, no CIOQ policy is valid at speedup less than the cube root of B-2 when preemption is allowed. We also find that CCF, a well-studied CIOQ policy, is not valid at any speedup less than B. We then introduce a CIOQ policy, CEH, that is valid at speedup greater than the square root of 2(B-1).

D-Scholarship@Pitt

NASA Tech Briefs, December 2011


Topics covered include: 1) SNE Industrial Fieldbus Interface; 2) Composite Thermal Switch; 3) XMOS XC-2 Development Board for Mechanical Control and Data Collection; 4) Receiver Gain Modulation Circuit; 5) NEXUS Scalable and Distributed Next-Generation Avionics Bus for Space Missions; 6) Digital Interface Board to Control Phase and Amplitude of Four Channels; 7) CoNNeCT Baseband Processor Module; 8) Cryogenic 160-GHz MMIC Heterodyne Receiver Module; 9) Ka-Band, Multi-Gigabit-Per-Second Transceiver; 10) All-Solid-State 2.45-to-2.78-THz Source; 11) Onboard Interferometric SAR Processor for the Ka-Band Radar Interferometer (KaRIn); 12) Space Environments Testbed; 13) High-Performance 3D Articulated Robot Display; 14) Athena; 15) In Situ Surface Characterization; 16) Ndarts; 17) Cryo-Etched Black Silicon for Use as Optical Black; 18) Advanced CO2 Removal and Reduction System; 19) Correcting Thermal Deformations in an Active Composite Reflector; 20) Umbilical Deployment Device; 21) Space Mirror Alignment System; 22) Thermionic Power Cell To Harness Heat Energies for Geothermal Applications; 23) Graph Theory Roots of Spatial Operators for Kinematics and Dynamics; 24) Spacesuit Soft Upper Torso Sizing Systems; 25) Radiation Protection Using Single-Wall Carbon Nanotube Derivatives; 26) PMA-PhyloChip DNA Microarray to Elucidate Viable Microbial Community Structure; 27) Lidar Luminance Quantizer; 28) Distributed Capacitive Sensor for Sample Mass Measurement; 29) Base Flow Model Validation; 30) Minimum Landing Error Powered-Descent Guidance for Planetary Missions; 31) Framework for Integrating Science Data Processing Algorithms Into Process Control Systems; 32) Time Synchronization and Distribution Mechanisms for Space Networks; 33) Local Estimators for Spacecraft Formation Flying; 34) Software-Defined Radio for Space-to-Space Communications; 35) Reflective Occultation Mask for Evaluation of Occulter Designs for Planet Finding; and 36) Molecular Adsorber Coating.

NASA Technical Reports Server

HIGH PERFORMANCE CLOCK DISTRIBUTION FOR HIGH-SPEED VLSI SYSTEMS

Author: Xu Zhang
Publication date: 12/05/2008

Tohoku University堀口進課

Tohoku University Repository (TOUR) / 東北大学機関リポジトリ

Institutional Repositories DataBase (IRDB)

Space station data system analysis/architecture study. Task 4: System definition report


Functional and performance requirements for the Space Station Data System (SSDS) are analyzed, and architectural design concepts are derived and evaluated in terms of their performance and growth potential, technical feasibility and risk, and cost effectiveness. The design concepts discussed are grouped under five major areas: SSDS top-level architecture overview, end-to-end SSDS design and operations perspective, communications assumptions and traffic analysis, onboard SSDS definition, and ground SSDS definition.

NASA Technical Reports Server

Contribution to the improvement of the performance of wireless mesh networks providing real time services

Author: Vázquez Rodas Andrés Marcelo
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2015

Nowadays, people's expectations for ubiquitous connectivity are continuously growing. Cities are now moving towards the smart city paradigm. Electricity companies aim to become part of smart grids. The Internet is no longer exclusive to humans; we now assume the Internet of Everything. We consider that Wireless Mesh Networks (WMNs) have a set of valuable features that will make them an important part of such environments. WMNs can also be used in less favored areas thanks to their low-cost deployment. This is socially relevant since it helps reduce the digital divide and could help to improve the population's quality of life. Research and industry have been working in recent years on open and proprietary mesh solutions. Standardization efforts and real deployments establish a solid starting point. We expect that WMNs will support a large number of new applications from a variety of fields: community networking, intelligent transportation systems, health systems, public safety, disaster management, advanced metering, etc. In all these cases, the users' growing need for real-time and multimedia information is evident. On this basis, this thesis proposes a set of contributions to improve the performance of an application service of this type and to promote better use of two critical resources (memory and energy) of WMNs. For the offered service, this work focuses on a Video on Demand (VoD) system. One of the requirements of this system is support for high capacity. This is mainly achieved by distributing the video contents among various distribution points, which in turn consist of several video servers. Each client request that arrives at such a video server cluster must be handled by a specific server in a way that balances the load. For this task, this thesis proposes a mechanism to appropriately select a specific video server such that the transfer time at the cluster is minimized. On the other hand, the mesh routers that create the mesh backbone are equipped with multiple interfaces of different technologies and channel types. An important resource is the amount of memory intended for buffers. The quality of service perceived by the users is largely affected by the size of such buffers, because important network performance parameters such as packet loss probability, delay, and channel utilization depend strongly on the buffer sizes. An efficient use of memory for buffering, in addition to facilitating the scalability of the mesh devices, also prevents the problems associated with excessively large buffers. Most current works associate the buffer sizing problem with the dynamics of the TCP congestion control mechanism. Since this work focuses on real-time services, in which the use of TCP is unfeasible, this thesis proposes a dynamic buffer sizing mechanism mainly dedicated to such real-time flows. The approach is based on the maximum entropy principle and allows each device to dynamically self-configure its buffers to achieve more efficient memory utilization. The performance of the proposal has been extensively evaluated on wired and wireless interfaces. Classical infrastructure-based wireless interfaces and multi-hop mesh interfaces have been considered. Finally, when the WMN is built by interconnecting user hand-held devices, energy is a limited and scarce resource, and therefore any approach to optimize its use is valuable. For this case, this thesis proposes a topology control mechanism based on centrality metrics. The main idea is that, instead of having all the devices execute routing functions, just a subset of nodes is selected for this task. We evaluate different centralities, from both centralized and distributed perspectives. In addition to the common random mobility models, we include an analysis of the proposal with a socially-aware mobility model that generates networks with a community structure.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Tesis Doctorals en Xarxa