
    Algorithms for Circuit Sizing in VLSI Design

    One of the key problems in the physical design of computer chips, also known as integrated circuits, consists of choosing a physical layout for the logic gates and memory circuits (registers) on the chip. The layouts have a high influence on the power consumption and area of the chip and on the delay of signal paths. A discrete set of predefined layouts for each logic function and register type, with different physical properties, is given by a library. One of the most influential characteristics of a circuit defined by the layout is its size. In this thesis we present new algorithms for the problem of choosing sizes for the circuits and for its continuous relaxation, and evaluate them in theory and practice. A popular approach is based on Lagrangian relaxation and projected subgradient methods. We show that seemingly heuristic modifications that have been proposed for this approach can be theoretically justified by applying the well-known multiplicative weights algorithm. Subsequently, we propose a new model for the sizing problem as a min-max resource sharing problem. In our context, power consumption and signal delays are represented by resources that are distributed to customers. Under certain assumptions we obtain a polynomial time approximation for the continuous relaxation of the sizing problem that improves over the Lagrangian relaxation based approach. The new resource sharing algorithm has been implemented as part of the BonnTools software package, which is developed at the Research Institute for Discrete Mathematics at the University of Bonn in cooperation with IBM. Our experiments on the ISPD 2013 benchmarks and state-of-the-art microprocessor designs provided by IBM illustrate that the new algorithm exhibits more stable convergence behavior than a Lagrangian relaxation based algorithm. Additionally, better timing and reduced power consumption were achieved on almost all instances.
A subproblem of the new algorithm consists of finding sizes minimizing a weighted sum of power consumption and signal delays. We describe a method that approximates the continuous relaxation of this problem in polynomial time under certain assumptions. For the discrete problem we provide a fully polynomial approximation scheme under certain assumptions on the topology of the chip. Finally, we present a new algorithm for timing-driven optimization of registers. Their sizes and locations on a chip are usually determined during the clock network design phase, and remain mostly unchanged afterwards, although the timing criticalities on which they were based can change. Our algorithm permutes register positions and sizes within so-called clusters without impairing the clock network, so that it can be applied late in a design flow. Under mild assumptions, our algorithm finds an optimal solution which maximizes the worst cluster slack. It is implemented as part of the BonnTools and improves timing of registers on state-of-the-art microprocessor designs by up to 7.8% of design cycle time.
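The abstract above refers to the multiplicative weights algorithm without spelling out an update rule. As a rough illustration only, the generic textbook update can be sketched as follows. The function name, the `eta` step parameter, and the reading of `losses` as normalised constraint violations in [-1, 1] are assumptions made for this sketch, not details taken from the thesis.

```python
def multiplicative_weights_update(weights, losses, eta=0.5):
    """One generic multiplicative weights step (illustrative sketch).

    weights: current non-negative weights, one per constraint.
    losses:  hypothetical normalised violations in [-1, 1]; a larger value
             means the constraint is more violated and should gain weight.
    eta:     step size in (0, 1]; an assumed tuning parameter.
    """
    # Scale every weight by a factor that grows with its loss.
    scaled = [w * (1.0 + eta * loss) for w, loss in zip(weights, losses)]
    # Renormalise so the weights again sum to one.
    total = sum(scaled)
    return [s / total for s in scaled]


if __name__ == "__main__":
    # Two constraints, the first one fully violated, the second one tight.
    print(multiplicative_weights_update([0.5, 0.5], [1.0, 0.0]))
```

In this sketch the weights change by a multiplicative factor rather than by an additive subgradient step, which is the generic difference between the two update styles named in the abstract.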

    Timing Closure in Chip Design

    Achieving timing closure is a major challenge in the physical design of a computer chip. The task is to find a physical realization fulfilling the speed specifications. In this thesis, we propose new algorithms for the key tasks of performance optimization, namely repeater tree construction, circuit sizing, clock skew scheduling, threshold voltage optimization, and plane assignment. Furthermore, a new program flow for timing closure is developed that integrates these algorithms with placement and clocktree construction. For repeater tree construction, a new algorithm for computing topologies, which are later filled with repeaters, is presented. To this end, we propose a new delay model for topologies that accounts not only for the path lengths, as existing approaches do, but also for the number of bifurcations on a path, which introduce extra capacitance and thereby delay. In the extreme cases of pure power optimization and pure delay optimization, the optimum topologies regarding our delay model are minimum Steiner trees and alphabetic code trees with the shortest possible path lengths. We present a new, extremely fast algorithm that scales seamlessly between the two opposite objectives. For special cases, we prove the optimality of our algorithm. Its efficiency and effectiveness in practice are demonstrated by comprehensive experimental results. The task of circuit sizing is to assign millions of small elementary logic circuits to elements from a discrete set of logically equivalent, predefined physical layouts such that power consumption is minimized and all signal paths are sufficiently fast. In this thesis we develop a fast heuristic approach for global circuit sizing, followed by a local search into a local optimum. Our algorithms use, in contrast to existing approaches, the available discrete layout choices and accurate delay models with slew propagation.
The global approach iteratively assigns slew targets to all source pins of the chip and chooses a discrete layout of minimum size preserving the slew targets. In comprehensive experiments on real instances, we demonstrate that the worst path delay is within 7% of its lower bound on average after a few iterations. The subsequent local search reduces this gap to 2% on average. Combining global and local sizing, we are able to size more than 5.7 million circuits within 3 hours. For the clock skew scheduling problem we develop the first algorithm with a strongly polynomial running time for cycle time minimization in the presence of different cycle times and multi-cycle paths. In practice, an iterative local search method is much more efficient. We prove that this iterative method maximizes the worst slack, even when restricting the feasible schedule to certain time intervals. Furthermore, we enhance the iterative local approach to determine a lexicographically optimum slack distribution. The clock skew scheduling problem is then generalized to allow for simultaneous data path optimization. In fact, this is a time-cost tradeoff problem. We develop the first combinatorial algorithm for computing time-cost tradeoff curves in graphs that may contain cycles. Starting from the lowest-cost solution, the algorithm iteratively computes a descent direction by a minimum cost flow computation. The maximum feasible step length is then determined by a minimum ratio cycle computation. This approach can be used in chip design for several optimization tasks, e.g. threshold voltage optimization or plane assignment. Finally, the optimization routines are combined into a timing closure flow. Here, global placement is alternated with global performance optimization. Net weights are used to penalize the length of critical nets during placement. After the global phase, the performance is improved further by applying more comprehensive optimization routines on the most critical paths.
In the end, the clock schedule is optimized and clocktrees are inserted. Computational results of the design flow are obtained on real-world computer chips.

    Large-scale mixed integer optimization approaches for scheduling airline operations under irregularity

    Perhaps no single industry has benefited more from advancements in computation, analytics, and optimization than the airline industry. Operations Research (OR) is now ubiquitous in the way airlines develop their schedules, price their itineraries, manage their fleet, route their aircraft, and schedule their crew. These problems, among others, are well known to industry practitioners and academics alike and arise within the context of the planning environment, which takes place well in advance of the date of departure. One salient feature of the planning environment is that decisions are made in a frictionless environment that does not consider perturbations to an existing schedule. Airline operations are rife with disruptions caused by factors such as convective weather, aircraft failure, air traffic control restrictions, and network effects, among other irregularities. Substantially less work in the OR community has examined the real-time operational environment. While problems in the planning and operational environments are similar from a mathematical perspective, the complexity of the operational environment is exacerbated by two factors. First, decisions need to be made in as close to real time as possible. Unlike in the planning phase, decision-makers do not have hours of time to return a decision. Second, there is a host of operational considerations, including complex rules mandated by regulatory agencies like the Federal Aviation Administration (FAA), airline requirements, and union rules. Such restrictions often make finding even a feasible set of re-scheduling decisions an arduous task, let alone the global optimum. The goals and objectives of this thesis are found in Chapter 1. Chapter 2 provides an overview of airline operations and the current practices of disruption management employed at most airlines. Both the causes and the costs associated with irregular operations are surveyed.
The role of the airline Operations Control Center (OCC), which serves as the real-time decision-making environment, is discussed, since it is important to understand for the body of this work. Chapter 3 introduces an optimization-based approach to solve the Airline Integrated Recovery (AIR) problem that simultaneously solves re-scheduling decisions for the operating schedule, aircraft routings, crew assignments, and passenger itineraries. The methodology is validated using real-world industrial data from a U.S. hub-and-spoke regional carrier, and we show how the integrated approach can dominate the incumbent sequential approach in a way that is amenable to the operational constraints imposed by a decision-making environment. Computational effort is central to the efficacy of any algorithm used in a real-time decision-making environment such as an OCC. The latter two chapters illustrate various methods that are shown to expedite more traditional large-scale optimization methods and that are applicable to a wide family of optimization problems, including the AIR problem. Chapter 4 shows how delayed constraint generation and column generation may be used simultaneously through the use of alternate polyhedra that verify whether or not a given cut generated from a subset of variables remains globally valid. While Benders' decomposition is a well-known algorithm for solving problems exhibiting a block structure, one possible drawback is slow convergence. Expediting Benders' decomposition has been explored in the literature through model reformulation, improving bounds, and cut selection strategies, but little has been studied on how to strengthen a standard cut. Chapter 5 examines four methods by which the convergence may be accelerated: an affine transformation into the interior of the feasible set, generating a split cut induced by a standard Benders' inequality, sequential lifting, and superadditive lifting over a relaxation of a multi-row system.
It is shown that the first two methods yield the most promising results within the context of an AIR model. PhD. Committee Co-Chair: Clarke, John-Paul; Committee Co-Chair: Johnson, Ellis; Committee Member: Ahmed, Shabbir; Committee Member: Clarke, Michael; Committee Member: Nemhauser, Georg

    Timing optimization during the physical synthesis of cell-based VLSI circuits

    Doctoral thesis, Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Engenharia de Automação e Sistemas, Florianópolis, 2016. Abstract: The evolution of CMOS technology made possible integrated circuits with billions of transistors assembled into a single silicon chip, giving rise to the term Very-Large-Scale Integration (VLSI). The required clock frequency affects the performance of a VLSI circuit and induces timing constraints that must be properly handled by synthesis tools. During the physical synthesis of VLSI circuits, several optimization techniques are used to iteratively reduce the number of timing violations until the target clock frequency is met. The dramatic increase of interconnect delay under technology scaling represents one of the major challenges for the timing closure of modern VLSI circuits. In this scenario, effective interconnect synthesis techniques play a major role. That is why this thesis targets two timing optimization problems for effective interconnect synthesis: Incremental Timing-Driven Placement (ITDP) and Incremental Timing-Driven Layer Assignment (ITLA). For solving the ITDP problem, this thesis proposes a new Lagrangian Relaxation formulation that minimizes timing violations for both setup and hold timing constraints. This work also proposes a net-based technique that uses Lagrange multipliers as net weights, which are dynamically updated using an accurate timing analyzer. The net-based technique makes use of a novel discrete search to relocate cells, employing the Euclidean distance to define a proper neighborhood. For solving the ITLA problem, this thesis proposes a network flow approach that handles critical and non-critical segments simultaneously, and exploits a few flow conservation conditions to extract timing information for each net segment individually, thereby enabling the use of an external timing engine.
The experimental validation using benchmark suites derived from industrial circuits demonstrates the effectiveness of the proposed techniques when compared with state-of-the-art works.

    Physical design algorithms for asynchronous circuits

    Asynchronous designs have been demonstrated to achieve both higher performance and lower power than their synchronous counterparts. They provide a very promising solution to the emerging challenges in advanced technology. However, due to the lack of proper EDA tool support, the design cycle for asynchronous circuits is much longer than the one for synchronous circuits. Thus, even with many advantages, asynchronous circuits are still not mainstream in the industry. In this thesis, we provide several algorithms to resolve the emerging issues in the physical design of asynchronous circuits. Our proposed algorithms optimize asynchronous circuits using placement, gate sizing, repeater insertion, and pipeline buffer insertion techniques. An incremental maximum cycle ratio algorithm is also proposed to speed up the timing analysis of asynchronous circuits.

    Some Applications of the Weighted Combinatorial Laplacian

    The weighted combinatorial Laplacian of a graph is a symmetric matrix which is the discrete analogue of the Laplacian operator. In this thesis, we study a new application of this matrix to matching theory, yielding a new characterization of factor-criticality in graphs and matroids. Other applications are from the area of the physical design of very large scale integrated circuits. The placement of the gates includes the minimization of a quadratic form given by a weighted Laplacian. A method based on the dual constrained subgradient method is proposed to solve the simultaneous placement and gate-sizing problem. A crucial step of this method is the projection to the flow space of an associated graph, which can be performed by minimizing a quadratic form given by the unweighted combinatorial Laplacian.

    A Survey on Delay-Aware Resource Control for Wireless Systems --- Large Deviation Theory, Stochastic Lyapunov Drift and Distributed Stochastic Learning

    In this tutorial paper, a comprehensive survey is given of several major systematic approaches to delay-aware control problems, namely the equivalent rate constraint approach, the Lyapunov stability drift approach, and the approximate Markov Decision Process (MDP) approach using stochastic learning. These approaches essentially embrace most of the existing literature regarding delay-aware resource control in wireless systems. They have their relative pros and cons in terms of performance, complexity, and implementation issues. For each of the approaches, the problem setup, the general solution, and the design methodology are discussed. Applications of these approaches to delay-aware resource allocation are illustrated with examples in single-hop wireless networks. Furthermore, recent results regarding delay-aware multi-hop routing designs in general multi-hop networks are elaborated. Finally, the delay performance of the various approaches is compared through simulations using an example of uplink OFDMA systems. Comment: 58 pages, 8 figures; IEEE Transactions on Information Theory, 201

    Convexification of Queueing Formulas by Mixed-Integer Second-Order Cone Programming: An Application to a Discrete Location Problem with Congestion

    Mixed-Integer Second-Order Cone Programs (MISOCPs) form a nice class of mixed-integer convex programs, which can be solved very efficiently due to recent advances in optimization solvers. Our paper bridges the gap between modeling a class of optimization problems and using MISOCP solvers. It is shown how various performance metrics of M/G/1 queues can be modeled by different MISOCPs. To motivate our method practically, it is first applied to a challenging stochastic location problem with congestion, which is broadly used to design socially optimal service networks. Four different MISOCPs are developed and compared on sets of benchmark test problems. The new formulations efficiently solve large-size test problems, which cannot be solved by the best existing method. Then, the general applicability of our method is shown for similar optimization problems that use queue-theoretic performance measures to address customer satisfaction and service quality.
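The abstract above does not list which M/G/1 performance metrics it treats. For orientation only, the standard Pollaczek-Khinchine formula for the mean waiting time in queue of an M/G/1 system is given below. The symbols are assumptions chosen for this note: arrival rate $\lambda$, service time $S$, and utilisation $\rho$.

```latex
\[
  W_q \;=\; \frac{\lambda\,\mathbb{E}[S^2]}{2\,(1-\rho)},
  \qquad \rho \;=\; \lambda\,\mathbb{E}[S] \;<\; 1 .
\]
```

The $1-\rho$ term in the denominator makes the expression nonlinear in the arrival rate. Nonlinearity of this kind is what a convex reformulation such as the MISOCP models named in the abstract would need to handle, though the paper's exact constructions are not reproduced here.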

    Network Maintenance and Capacity Management with Applications in Transportation

    abstract: This research develops heuristics to manage both mandatory and optional network capacity reductions to better serve the network flows. The main application discussed relates to transportation networks, and flow cost relates to travel cost of users of the network. Temporary mandatory capacity reductions are required by maintenance activities. The objective of managing maintenance activities and the attendant temporary network capacity reductions is to schedule the required segment closures so that all maintenance work can be completed on time, and the total flow cost over the maintenance period is minimized for different types of flows. The goal of optional network capacity reduction is to selectively reduce the capacity of some links to improve the overall efficiency of user-optimized flows, where each traveler takes the route that minimizes the traveler’s trip cost. In this dissertation, both managing mandatory and optional network capacity reductions are addressed with the consideration of network-wide flow diversions due to changed link capacities. This research first investigates the maintenance scheduling in transportation networks with service vehicles (e.g., truck fleets and passenger transport fleets), where these vehicles are assumed to take the system-optimized routes that minimize the total travel cost of the fleet. This problem is solved with the randomized fixed-and-optimize heuristic developed. This research also investigates the maintenance scheduling in networks with multi-modal traffic that consists of (1) regular human-driven cars with user-optimized routing and (2) self-driving vehicles with system-optimized routing. An iterative mixed flow assignment algorithm is developed to obtain the multi-modal traffic assignment resulting from a maintenance schedule. The genetic algorithm with multi-point crossover is applied to obtain a good schedule. 
Based on Braess' paradox, which states that removing some links may alleviate the congestion of user-optimized flows, this research generalizes the paradox to reducing the capacity of selected links in order to improve the efficiency of the resultant user-optimized flows. A heuristic is developed to identify the links whose capacity should be reduced, and the corresponding capacity reduction amounts, to get more efficient total flows. Experiments on real networks demonstrate that the generalized Braess' paradox exists in reality, and the heuristic developed solves real-world test cases even when commercial solvers fail. Dissertation/Thesis. Doctoral Dissertation, Industrial Engineering 201
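The classic textbook instance of Braess' paradox, which the abstract above builds on, can be checked with a few lines of arithmetic. The sketch below is not the dissertation's heuristic or its capacity-reduction model. The network, the 4000 drivers, the cost functions, and the function name `route_times` are all assumptions of the standard illustrative example.

```python
def route_times(n_top, n_bottom, n_shortcut):
    """Travel times in minutes on three routes of the textbook Braess network.

    Assumed network (not from the dissertation):
      Start->A costs load/100, A->End costs 45,
      Start->B costs 45,       B->End costs load/100,
      optional shortcut A->B costs 0.
    Routes: top = Start->A->End, bottom = Start->B->End,
            shortcut = Start->A->B->End.
    """
    load_start_a = n_top + n_shortcut    # drivers on the Start->A link
    load_b_end = n_bottom + n_shortcut   # drivers on the B->End link
    t_start_a = load_start_a / 100.0
    t_b_end = load_b_end / 100.0
    return {
        "top": t_start_a + 45.0,
        "bottom": 45.0 + t_b_end,
        "shortcut": t_start_a + 0.0 + t_b_end,
    }


if __name__ == "__main__":
    # Without the shortcut, 4000 drivers split evenly between top and bottom.
    print(route_times(2000, 2000, 0))
    # With the shortcut open, all 4000 drivers using it is an equilibrium:
    # neither alternative route is faster for a single deviating driver.
    print(route_times(0, 0, 4000))
```

In this assumed instance, every driver spends 65 minutes without the shortcut and 80 minutes with it, so removing the link improves the user-optimized flow.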