18,774 research outputs found
Throughput-driven floorplanning with wire pipelining
The size of future high-performance SoC is such that the time-of-flight of wires connecting distant pins in the layout can be much higher than the clock period. In order to keep the frequency as high as possible, the wires may be pipelined. However, the insertion of flip-flops may alter the throughput of the system due to the presence of loops in the logic netlist. In this paper, we address the problem of floorplanning a large design where long interconnects are pipelined by inserting the throughput in the cost function of a tool based on simulated annealing. The results obtained on a series of benchmarks are then validated using a simple router that breaks long interconnects by suitably placing flip-flops along the wires
Recommended from our members
Nanometer VLSI placement and optimization for multi-objective design closure
In a VLSI physical synthesis flow, placement directly defines the interconnection,
which affects many other design objectives, such as timing, power consumption,
congestion, and thermal issues. With the scaling of technology, the relative interconnect
delay increases dramatically. As a result, placement has become a bottleneck
in deep sub-micron physical synthesis. In this dissertation, I propose several
optimization algorithms from global placement, placement migration, timing driven
placements, to incremental power optimizations for multi-objective VLSI design
closure. The first work is DPlace, a new global placement algorithm that scales
well to the modern large-scale circuit placement problems. DPlace simulates the
natural diffusion process to spread cells smoothly over the placement region, and
uses both analytical and discrete techniques to improve the wire length. However,
global placement is never sufficient for multi-objective design closure, a variety of
design objectives have to be improved incrementally, such as timing, routing congestion,
signal integrity, and heat distribution. Placement migration is a critical step
to address the cell overlaps appearing during incremental optimizations. To achieve
high placement stability, I propose a computational geometry based placement migration
flow to cope with placement changes, and a new stability metric to measure
the “similarity” between two placements accurately. Our placement migration algorithm
has clear advantage over conventional legalization algorithms such that the
neighborhood characteristics of the original placement are preserved. For timing
closure in high performance designs, I present a linear programming based incremental
timing driven placement to improve the timing on critical paths directly.
I further present an efficient timing driven placement algorithm (Pyramids). Two
formulations of Pyramids are proposed, which are suitable for different optimization
stages in a physical synthesis flow. Both approaches find the optimal location
for timing of a cell in constant time, through computational geometry based approaches.
For fast convergence of design closure, placement should be integrated
with other optimization techniques. I propose to combine placement, gate sizing
and Vt swapping techniques to reduce the total power consumption, especially the
leakage power, which is becoming increasingly critical for nanometer VLSI design
closure.Electrical and Computer Engineerin
High performance algorithms for large scale placement problem
Placement is one of the most important problems in electronic design automation (EDA). An inferior placement solution will not only affect the chip’s performance but might also make it nonmanufacturable by producing excessive wirelength, which is beyond available routing resources. Although placement has been extensively investigated for several decades, it is still a very challenging problem mainly due to that design scale has been dramatically increased by order of magnitudes and the increasing trend seems unstoppable. In modern design, chips commonly integrate millions of gates that require over tens of metal routing layers. Besides, new manufacturing techniques bring out new requests leading to that multi-objectives should be optimized simultaneously during placement.
Our research provides high performance algorithms for placement problem. We propose (i) a high performance global placement core engine POLAR; (ii) an efficient routability-driven placer POLAR 2.0, which is an extension of POLAR to deal with routing congestion; (iii) an ultrafast global placer POLAR 3.0, which explore parallelism on POLAR and can make full use of multi-core system; (iv) some efficient triple patterning lithography (TPL) aware detailed placement algorithms
Localizing Region-Based Active Contours
©2008 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or distribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.DOI: 10.1109/TIP.2008.2004611In this paper, we propose a natural framework that allows any region-based segmentation energy to be re-formulated in a local way. We consider local rather than global image statistics and evolve a contour based on local information. Localized contours are capable of segmenting objects with heterogeneous feature profiles that would be difficult to capture correctly using a standard global method. The presented technique is versatile enough to be used with any global region-based active contour energy and instill in it the benefits of localization. We describe this framework and demonstrate the localization of three well-known energies in order to illustrate how our framework can be applied to any energy. We then compare each localized energy to its global counterpart to show the improvements that can be achieved. Next, an in-depth study of the behaviors of these energies in response to the degree of localization is given. Finally, we show results on challenging images to illustrate the robust and accurate segmentations that are possible with this new class of active contour models
Push recovery with stepping strategy based on time-projection control
In this paper, we present a simple control framework for on-line push
recovery with dynamic stepping properties. Due to relatively heavy legs in our
robot, we need to take swing dynamics into account and thus use a linear model
called 3LP which is composed of three pendulums to simulate swing and torso
dynamics. Based on 3LP equations, we formulate discrete LQR controllers and use
a particular time-projection method to adjust the next footstep location
on-line during the motion continuously. This adjustment, which is found based
on both pelvis and swing foot tracking errors, naturally takes the swing
dynamics into account. Suggested adjustments are added to the Cartesian 3LP
gaits and converted to joint-space trajectories through inverse kinematics.
Fixed and adaptive foot lift strategies also ensure enough ground clearance in
perturbed walking conditions. The proposed structure is robust, yet uses very
simple state estimation and basic position tracking. We rely on the physical
series elastic actuators to absorb impacts while introducing simple laws to
compensate their tracking bias. Extensive experiments demonstrate the
functionality of different control blocks and prove the effectiveness of
time-projection in extreme push recovery scenarios. We also show self-produced
and emergent walking gaits when the robot is subject to continuous dragging
forces. These gaits feature dynamic walking robustness due to relatively soft
springs in the ankles and avoiding any Zero Moment Point (ZMP) control in our
proposed architecture.Comment: 20 pages journal pape
AI/ML Algorithms and Applications in VLSI Design and Technology
An evident challenge ahead for the integrated circuit (IC) industry in the
nanometer regime is the investigation and development of methods that can
reduce the design complexity ensuing from growing process variations and
curtail the turnaround time of chip manufacturing. Conventional methodologies
employed for such tasks are largely manual; thus, time-consuming and
resource-intensive. In contrast, the unique learning strategies of artificial
intelligence (AI) provide numerous exciting automated approaches for handling
complex and data-intensive tasks in very-large-scale integration (VLSI) design
and testing. Employing AI and machine learning (ML) algorithms in VLSI design
and manufacturing reduces the time and effort for understanding and processing
the data within and across different abstraction levels via automated learning
algorithms. It, in turn, improves the IC yield and reduces the manufacturing
turnaround time. This paper thoroughly reviews the AI/ML automated approaches
introduced in the past towards VLSI design and manufacturing. Moreover, we
discuss the scope of AI/ML applications in the future at various abstraction
levels to revolutionize the field of VLSI design, aiming for high-speed, highly
intelligent, and efficient implementations
ANALYSIS AND VISUALIZATION OF FLOW FIELDS USING INFORMATION-THEORETIC TECHNIQUES AND GRAPH-BASED REPRESENTATIONS
Three-dimensional flow visualization plays an essential role in many areas of science and engineering, such as aero- and hydro-dynamical systems which dominate various physical and natural phenomena. For popular methods such as the streamline visualization to be effective, they should capture the underlying flow features while facilitating user observation and understanding of the flow field in a clear manner.
My research mainly focuses on the analysis and visualization of flow fields using various techniques, e.g. information-theoretic techniques and graph-based representations. Since the streamline visualization is a popular technique in flow field visualization, how to select good streamlines to capture flow patterns and how to pick good viewpoints to observe flow fields become critical. We treat streamline selection and viewpoint selection as symmetric problems and solve them simultaneously using the dual information channel [81]. To the best of my knowledge, this is the first attempt in flow visualization to combine these two selection problems in a unified approach. This work selects streamline in a view-independent manner and the selected streamlines will not change for all viewpoints. My another work [56] uses an information-theoretic approach to evaluate the importance of each streamline under various sample viewpoints and presents a solution for view-dependent streamline selection that guarantees coherent streamline update when the view changes gradually. When projecting 3D streamlines to 2D images for viewing, occlusion and clutter become inevitable. To address this challenge, we design FlowGraph [57, 58], a novel compound graph representation that organizes field line clusters and spatiotemporal regions hierarchically for occlusion-free and controllable visual exploration. We enable observation and exploration of the relationships among field line clusters, spatiotemporal regions and their interconnection in the transformed space. Most viewpoint selection methods only consider the external viewpoints outside of the flow field. This will not convey a clear observation when the flow field is clutter on the boundary side. Therefore, we propose a new way to explore flow fields by selecting several internal viewpoints around the flow features inside of the flow field and then generating a B-Spline curve path traversing these viewpoints to provide users with closeup views of the flow field for detailed observation of hidden or occluded internal flow features [54]. This work is also extended to deal with unsteady flow fields.
Besides flow field visualization, some other topics relevant to visualization also attract my attention. In iGraph [31], we leverage a distributed system along with a tiled display wall to provide users with high-resolution visual analytics of big image and text collections in real time. Developing pedagogical visualization tools forms my other research focus. Since most cryptography algorithms use sophisticated mathematics, it is difficult for beginners to understand both what the algorithm does and how the algorithm does that. Therefore, we develop a set of visualization tools to provide users with an intuitive way to learn and understand these algorithms
Custom Cell Placement Automation for Asynchronous VLSI
Asynchronous Very-Large-Scale-Integration (VLSI) integrated circuits have demonstrated many advantages over their synchronous counterparts, including low power consumption, elastic pipelining, robustness against manufacturing and temperature variations, etc. However, the lack of dedicated electronic design automation (EDA) tools, especially physical layout automation tools, largely limits the adoption of asynchronous circuits. Existing commercial placement tools are optimized for synchronous circuits, and require a standard cell library provided by semiconductor foundries to complete the physical design. The physical layouts of cells in this library have the same height to simplify the placement problem and the power distribution network. Although the standard cell methodology also works for asynchronous designs, the performance is inferior compared with counterparts designed using the full-custom design methodology. To tackle this challenge, we propose a gridded cell layout methodology for asynchronous circuits, in which the cell height and cell width can be any integer multiple of two grid values. The gridded cell approach combines the shape regularity of standard cells with the size flexibility of full-custom layouts. Therefore, this approach can achieve a better space utilization ratio and lower wire length for asynchronous designs. Experiments have shown that the gridded cell placement approach reduces area without impacting the routability. We have also used this placer to tape out a chip in a 65nm process technology, demonstrating that our placer generates design-rule clean results
Dance-the-music : an educational platform for the modeling, recognition and audiovisual monitoring of dance steps using spatiotemporal motion templates
In this article, a computational platform is presented, entitled “Dance-the-Music”, that can be used in a dance educational context to explore and learn the basics of dance steps. By introducing a method based on spatiotemporal motion templates, the platform facilitates to train basic step models from sequentially repeated dance figures performed by a dance teacher. Movements are captured with an optical motion capture system. The teachers’ models can be visualized from a first-person perspective to instruct students how to perform the specific dance steps in the correct manner. Moreover, recognition algorithms-based on a template matching method can determine the quality of a student’s performance in real time by means of multimodal monitoring techniques. The results of an evaluation study suggest that the Dance-the-Music is effective in helping dance students to master the basics of dance figures
GraPhSyM: Graph Physical Synthesis Model
In this work, we introduce GraPhSyM, a Graph Attention Network (GATv2) model
for fast and accurate estimation of post-physical synthesis circuit delay and
area metrics from pre-physical synthesis circuit netlists. Once trained,
GraPhSyM provides accurate visibility of final design metrics to early EDA
stages, such as logic synthesis, without running the slow physical synthesis
flow, enabling global co-optimization across stages. Additionally, the swift
and precise feedback provided by GraPhSym is instrumental for
machine-learning-based EDA optimization frameworks. Given a gate-level netlist
of a circuit represented as a graph, GraPhSyM utilizes graph structure,
connectivity, and electrical property features to predict the impact of
physical synthesis transformations such as buffer insertion and gate sizing.
When trained on a dataset of 6000 prefix adder designs synthesized at an
aggressive delay target, GraPhSyM can accurately predict the post-synthesis
delay (98.3%) and area (96.1%) metrics of unseen adders with a fast 0.22s
inference time. Furthermore, we illustrate the compositionality of GraPhSyM by
employing the model trained on a fixed delay target to accurately anticipate
post-synthesis metrics at a variety of unseen delay targets. Lastly, we report
promising generalization capabilities of the GraPhSyM model when it is
evaluated on circuits different from the adders it was exclusively trained on.
The results show the potential for GraPhSyM to serve as a powerful tool for
advanced optimization techniques and as an oracle for EDA machine learning
frameworks.Comment: Accepted at ICCAD'2
- …