7,557 research outputs found
Recommended from our members
Nanometer VLSI placement and optimization for multi-objective design closure
In a VLSI physical synthesis flow, placement directly defines the interconnection,
which affects many other design objectives, such as timing, power consumption,
congestion, and thermal issues. With the scaling of technology, the relative interconnect
delay increases dramatically. As a result, placement has become a bottleneck
in deep sub-micron physical synthesis. In this dissertation, I propose several
optimization algorithms from global placement, placement migration, timing driven
placements, to incremental power optimizations for multi-objective VLSI design
closure. The first work is DPlace, a new global placement algorithm that scales
well to the modern large-scale circuit placement problems. DPlace simulates the
natural diffusion process to spread cells smoothly over the placement region, and
uses both analytical and discrete techniques to improve the wire length. However,
global placement is never sufficient for multi-objective design closure, a variety of
design objectives have to be improved incrementally, such as timing, routing congestion,
signal integrity, and heat distribution. Placement migration is a critical step
to address the cell overlaps appearing during incremental optimizations. To achieve
high placement stability, I propose a computational geometry based placement migration
flow to cope with placement changes, and a new stability metric to measure
the “similarity” between two placements accurately. Our placement migration algorithm
has clear advantage over conventional legalization algorithms such that the
neighborhood characteristics of the original placement are preserved. For timing
closure in high performance designs, I present a linear programming based incremental
timing driven placement to improve the timing on critical paths directly.
I further present an efficient timing driven placement algorithm (Pyramids). Two
formulations of Pyramids are proposed, which are suitable for different optimization
stages in a physical synthesis flow. Both approaches find the optimal location
for timing of a cell in constant time, through computational geometry based approaches.
For fast convergence of design closure, placement should be integrated
with other optimization techniques. I propose to combine placement, gate sizing
and Vt swapping techniques to reduce the total power consumption, especially the
leakage power, which is becoming increasingly critical for nanometer VLSI design
closure.Electrical and Computer Engineerin
Intelligent visual media processing: when graphics meets vision
The computer graphics and computer vision communities have been working closely together in recent
years, and a variety of algorithms and applications have been developed to analyze and manipulate the visual media
around us. There are three major driving forces behind this phenomenon: i) the availability of big data from the
Internet has created a demand for dealing with the ever increasing, vast amount of resources; ii) powerful processing
tools, such as deep neural networks, provide e�ective ways for learning how to deal with heterogeneous visual data;
iii) new data capture devices, such as the Kinect, bridge between algorithms for 2D image understanding and
3D model analysis. These driving forces have emerged only recently, and we believe that the computer graphics
and computer vision communities are still in the beginning of their honeymoon phase. In this work we survey
recent research on how computer vision techniques bene�t computer graphics techniques and vice versa, and cover
research on analysis, manipulation, synthesis, and interaction. We also discuss existing problems and suggest
possible further research directions
Timing and Congestion Driven Algorithms for FPGA Placement
Placement is one of the most important steps in physical design for VLSI circuits. For field programmable gate arrays (FPGAs), the placement step determines the location of each logic block. I present novel timing and congestion driven placement algorithms for FPGAs with minimal runtime overhead. By predicting the post-routing timing-critical edges and estimating congestion accurately, this algorithm is able to simultaneously reduce the critical path delay and the minimum number of routing tracks. The core of the algorithm consists of a criticality-history record of connection edges and a congestion map. This approach is applied to the 20 largest Microelectronics Center of North Carolina (MCNC) benchmark circuits. Experimental results show that compared with the state-of-the-art FPGA place and route package, the Versatile Place and Route (VPR) suite, this algorithm yields an average of 8.1% reduction (maximum 30.5%) in the critical path delay and 5% reduction in channel width. Meanwhile, the average runtime of the algorithm is only 2.3X as of VPR
Pyramid: Enhancing Selectivity in Big Data Protection with Count Featurization
Protecting vast quantities of data poses a daunting challenge for the growing
number of organizations that collect, stockpile, and monetize it. The ability
to distinguish data that is actually needed from data collected "just in case"
would help these organizations to limit the latter's exposure to attack. A
natural approach might be to monitor data use and retain only the working-set
of in-use data in accessible storage; unused data can be evicted to a highly
protected store. However, many of today's big data applications rely on machine
learning (ML) workloads that are periodically retrained by accessing, and thus
exposing to attack, the entire data store. Training set minimization methods,
such as count featurization, are often used to limit the data needed to train
ML workloads to improve performance or scalability. We present Pyramid, a
limited-exposure data management system that builds upon count featurization to
enhance data protection. As such, Pyramid uniquely introduces both the idea and
proof-of-concept for leveraging training set minimization methods to instill
rigor and selectivity into big data management. We integrated Pyramid into
Spark Velox, a framework for ML-based targeting and personalization. We
evaluate it on three applications and show that Pyramid approaches
state-of-the-art models while training on less than 1% of the raw data
ADVANCED MOTION MODELS FOR RIGID AND DEFORMABLE REGISTRATION IN IMAGE-GUIDED INTERVENTIONS
Image-guided surgery (IGS) has been a major area of interest in recent decades that continues to transform surgical interventions and enable safer, less invasive procedures. In the preoperative contexts, diagnostic imaging, including computed tomography (CT) and magnetic resonance (MR) imaging, offers a basis for surgical planning (e.g., definition of target, adjacent anatomy, and the surgical path or trajectory to the target). At the intraoperative stage, such preoperative images and the associated planning information are registered to intraoperative coordinates via a navigation system to enable visualization of (tracked) instrumentation relative to preoperative images. A major limitation to such an approach is that motions during surgery, either rigid motions of bones manipulated during orthopaedic surgery or brain soft-tissue deformation in neurosurgery, are not captured, diminishing the accuracy of navigation systems.
This dissertation seeks to use intraoperative images (e.g., x-ray fluoroscopy and cone-beam CT) to provide more up-to-date anatomical context that properly reflects the state of the patient during interventions to improve the performance of IGS. Advanced motion models for inter-modality image registration are developed to improve the accuracy of both preoperative planning and intraoperative guidance for applications in orthopaedic pelvic trauma surgery and minimally invasive intracranial neurosurgery. Image registration algorithms are developed with increasing complexity of motion that can be accommodated (single-body rigid, multi-body rigid, and deformable) and increasing complexity of registration models (statistical models, physics-based models, and deep learning-based models).
For orthopaedic pelvic trauma surgery, the dissertation includes work encompassing: (i) a series of statistical models to model shape and pose variations of one or more pelvic bones and an atlas of trajectory annotations; (ii) frameworks for automatic segmentation via registration of the statistical models to preoperative CT and planning of fixation trajectories and dislocation / fracture reduction; and (iii) 3D-2D guidance using intraoperative fluoroscopy. For intracranial neurosurgery, the dissertation includes three inter-modality deformable registrations using physic-based Demons and deep learning models for CT-guided and CBCT-guided procedures
Timing optimization during the physical synthesis of cell-based VLSI circuits
Tese (doutorado) - Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Engenharia de Automação e Sistemas, Florianópolis, 2016.Abstract : The evolution of CMOS technology made possible integrated circuits with billions of transistors assembled into a single silicon chip, giving rise to the jargon Very-Large-Scale Integration (VLSI). The required clock frequency affects the performance of a VLSI circuit and induces timing constraints that must be properly handled by synthesis tools. During the physical synthesis of VLSI circuits, several optimization techniques are used to iteratively reduce the number of timing violations until the target clock frequency is met. The dramatic increase of interconnect delay under technology scaling represents one of the major challenges for the timing closure of modern VLSI circuits. In this scenario, effective interconnect synthesis techniques play a major role. That is why this thesis targets two timing optimization problems for effective interconnect synthesis: Incremental Timing-Driven Placement (ITDP) and Incremental Timing-Driven Layer Assignment (ITLA). For solving the ITDP problem, this thesis proposes a new Lagrangian Relaxation formulation that minimizes timing violations for both setup and hold timing constraints. This work also proposes a netbased technique that uses Lagrange multipliers as net-weights, which are dynamically updated using an accurate timing analyzer. The netbased technique makes use of a novel discrete search to relocate cells by employing the Euclidean distance to define a proper neighborhood. For solving the ITLA problem, this thesis proposes a network flow approach that handles simultaneously critical and non-critical segments, and exploits a few flow conservation conditions to extract timing information for each net segment individually, thereby enabling the use of an external timing engine. The experimental validation using benchmark suites derived from industrial circuits demonstrates the effectiveness of the proposed techniques when compared with state-of-the-art works.A evolução da tecnologia CMOS viabilizou a fabricação de circuitos integrados contendo bilhões de transistores em uma única pastilha de silício, dando origem ao jargão Very-Large-Scale Integration (VLSI). A frequência-alvo de operação de um circuito VLSI afeta o seu desempenho e induz restrições de timing que devem ser manipuladas pelas ferramentas de síntese. Durante a síntese física de circuitos VLSI, diversas técnicas de otimização são usadas para iterativamente reduzir o número de violações de timing até que a frequência-alvo de operação seja atingida. O aumento dramático do atraso das interconexões devido à evolução tecnológica representa um dos maiores desafios para o fluxo de timing closure de circuitos VLSI contemporâneos. Nesse cenário, técnicas de síntese de interconexão eficientes têm um papel fundamental. Por este motivo, esta tese aborda dois problemas de otimização de timing para uma síntese eficiente das interconexões de um circuito VLSI: Incremental Timing-Driven Placement (ITDP) e Incremental Timing-Driven Layer Assignment (ITLA). Para resolver o problema de ITDP, esta tese propõe uma nova formulação utilizando Relaxação Lagrangeana que tem por objetivo a minimização simultânea das violações de timing para restrições do tipo setup e hold. Este trabalho também propõe uma técnica que utiliza multiplicadores de Lagrange como pesos para as interconexões, os quais são atualizados dinamicamente através dos resultados de uma ferramenta de análise de timing. Tal técnica realoca as células do circuito por meio de uma nova busca discreta que adota a distância Euclidiana como vizinhança.Para resolver o problema de ITLA, esta tese propõe uma abordagem em fluxo em redes que otimiza simultaneamente segmentos críticos e não-críticos, e explora algumas condições de fluxo para extrair as informações de timing para cada segmento individualmente, permitindo assim o uso de uma ferramenta de timing externa. A validação experimental, utilizando benchmarks derivados de circuitos industriais, demonstra a eficiência das técnicas propostas quando comparadas com trabalhos estado da arte
Incremental Timing-Driven Placement with Displacement Constraint
In the modern deep-submicron Very Large Integrated Circuit(VLSI) design flow intercon-
nect delays are becoming major limiting factor for timing closure. Traditional placement
algorithms such as routability-driven placement (improves routability) and wirelength-
driven placement (reduces total wirelength) are no longer sufficient to close timing. To
this end, timing-driven placement plays a crucial role in reducing the interconnect delay
through timing critical paths (paths with timing violations/negative slacks) of the design
and thereby achieving specific performance/clock frequency.
In the placement flow, timing information about the design can be incorporated during
global placement and/or incremental/detailed placement. Although, over the years, there
has been significant advances in the quality of the global placement, there is a growing need
for high performance incremental timing-driven placement due to the lack of accurate
interconnect information during global placement. Moreover, incremental timing-driven
placement is essential to recover timing while preserving the other optimization objectives
such as total wirelength, routing congestion, and so forth which are optimized at the early
stages of the design flow.
This thesis proposes a simple, yet efficient, incremental timing-driven placement algo-
rithm that seeks to find optimized locations for standard cells so that the total negative
slack of the design can be maximized. Our algorithm consists two stages: (1) Global Move
which positions standard cells inside a critical bounding box to eliminate timing violations
on timing critical nets; and (2) Local Move which provides further timing improvement by
finely adjusting the current locations of the standard cells within a local region.
We evaluate our algorithm using ICCAD-2014 timing-driven placement contest bench-
marks. The results show that, on average, our technique eliminates 94% and 30% of the
late and early total negative slacks, respectively, and, 82% and 27% of the late and early
worst negative slacks, respectively, under short and long displacement constraints. The
1st-place team of the contest improves late and early total negative slacks by 90% and
39%, respectively, and improves late and early worst negative slack by 76% and 32%, re-
spectively. Taking into account both timing violation improvement and the placement
quality (i.e., other objectives), on average, we outperform the 1st-place team by 3% in
terms of the ICCAD-2014 contest quality score and our technique is 4.6× faster in terms
of runtime
Role of Proteome Physical Chemistry in Cell Behavior.
We review how major cell behaviors, such as bacterial growth laws, are derived from the physical chemistry of the cell's proteins. On one hand, cell actions depend on the individual biological functionalities of their many genes and proteins. On the other hand, the common physics among proteins can be as important as the unique biology that distinguishes them. For example, bacterial growth rates depend strongly on temperature. This dependence can be explained by the folding stabilities across a cell's proteome. Such modeling explains how thermophilic and mesophilic organisms differ, and how oxidative damage of highly charged proteins can lead to unfolding and aggregation in aging cells. Cells have characteristic time scales. For example, E. coli can duplicate as fast as 2-3 times per hour. These time scales can be explained by protein dynamics (the rates of synthesis and degradation, folding, and diffusional transport). It rationalizes how bacterial growth is slowed down by added salt. In the same way that the behaviors of inanimate materials can be expressed in terms of the statistical distributions of atoms and molecules, some cell behaviors can be expressed in terms of distributions of protein properties, giving insights into the microscopic basis of growth laws in simple cells
Multidisciplinary aerospace design optimization: Survey of recent developments
The increasing complexity of engineering systems has sparked increasing interest in multidisciplinary optimization (MDO). This paper presents a survey of recent publications in the field of aerospace where interest in MDO has been particularly intense. The two main challenges of MDO are computational expense and organizational complexity. Accordingly the survey is focussed on various ways different researchers use to deal with these challenges. The survey is organized by a breakdown of MDO into its conceptual components. Accordingly, the survey includes sections on Mathematical Modeling, Design-oriented Analysis, Approximation Concepts, Optimization Procedures, System Sensitivity, and Human Interface. With the authors' main expertise being in the structures area, the bulk of the references focus on the interaction of the structures discipline with other disciplines. In particular, two sections at the end focus on two such interactions that have recently been pursued with a particular vigor: Simultaneous Optimization of Structures and Aerodynamics, and Simultaneous Optimization of Structures Combined With Active Control
- …