Sampling-based Buffer Insertion for Post-Silicon Yield Improvement under Process Variability
At submicron manufacturing technology nodes, process variations affect circuit
performance significantly. This trend leads to a large timing margin and thus
overdesign to maintain yield. To combat this pessimism, post-silicon clock
tuning buffers can be inserted into circuits to balance timing budgets of
critical paths with their neighbors. After manufacturing, these clock buffers
can be configured for each chip individually so that chips with timing failures
may be rescued to improve yield. In this paper, we propose a sampling-based
method to determine the proper locations of these buffers. The goal of this
buffer insertion is to reduce the number of buffers and their ranges, while
still maintaining a good yield improvement. Experimental results demonstrate
that our algorithm can achieve a significant yield improvement (up to 35%) with
only a small number of buffers. Comment: Design, Automation and Test in Europe (DATE), 201
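To make the yield argument above concrete, here is a minimal Monte Carlo sketch in Python. It is not the paper's buffer-insertion algorithm. It assumes a hypothetical three-flip-flop pipeline with two path delays drawn from independent Gaussians, one tuning buffer on the middle flip-flop's clock, and setup constraints only. The function name `estimate_yield` and all parameters are illustrative assumptions.

```python
import random


def estimate_yield(num_samples, period, mean, sigma, tuning_range, seed=0):
    """Estimate yield with and without one post-silicon tuning buffer.

    Assumed toy model (not from the paper): path 1 ends at the tuned
    flip-flop and path 2 starts at it. Shifting that flip-flop's clock by x
    gives path 1 a budget of period + x and path 2 a budget of period - x.
    Hold constraints and setup time are ignored.
    """
    rng = random.Random(seed)
    passed_plain = 0
    passed_tuned = 0
    for _ in range(num_samples):
        d1 = rng.gauss(mean, sigma)  # sampled delay of path 1
        d2 = rng.gauss(mean, sigma)  # sampled delay of path 2
        # Without tuning, both paths must meet the clock period.
        if d1 <= period and d2 <= period:
            passed_plain += 1
        # With tuning, some shift x in [-tuning_range, tuning_range] must
        # satisfy d1 - period <= x <= period - d2.
        lowest_shift = max(d1 - period, -tuning_range)
        highest_shift = min(period - d2, tuning_range)
        if lowest_shift <= highest_shift:
            passed_tuned += 1
    return passed_plain / num_samples, passed_tuned / num_samples


if __name__ == "__main__":
    plain, tuned = estimate_yield(10000, period=1.0, mean=0.9, sigma=0.08,
                                  tuning_range=0.1)
    print(f"yield without buffer: {plain:.3f}, with buffer: {tuned:.3f}")
```

In this toy model every chip that passes without tuning also passes with it (x = 0 is always allowed), so the tuned yield can only match or exceed the untuned one. A smaller `tuning_range` rescues fewer chips, which mirrors the trade-off between buffer range and yield that the abstract describes.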
Vocal rhythms in nesting Lusitanian toadfish, Halobatrachus didactylus
Males of several fish species aggregate and vocalize together, increasing the detection range of the sounds and
their chances of mating. In the Lusitanian toadfish (Halobatrachus didactylus), breeding males build nests under
rocks in close proximity and produce hundreds of boatwhistles (BW) an hour to attract females to lay their
demersal eggs on their nests. Chorusing behaviour includes fine-scale interactions between individuals, a
behavioural dynamic worth investigating in this highly vocal fish. Here we present a study to further investigate
this species’ vocal temporal patterns at fine (individual rhythms and male-male interactions) and large (chorus
daily patterns) scales. Several datasets recorded in the Tagus estuary were labelled with the support of an
automatic recognition system based on hidden Markov models. Fine-scale vocal temporal patterns exhibit high
variability between and within individuals, varying from an almost isochronous to an apparent aperiodic pattern.
When in a chorus, males exhibited alternation or synchrony calling patterns, possibly depending on motivation
and social context (mating or male-male competition). When engaged in sustained calling, males usually
alternated vocalizations with their close neighbours thus avoiding superposition of calls. Synchrony was
observed mostly in fish with lower mean calling rate. Interaction patterns were less obvious in more distanced
males. Daily choruses showed periods with several active calling males and periods of low activity with no
significant diel patterns in shallower intertidal waters. Here, chorusing activity was mainly affected by tide level.
In contrast, at a deeper location, although tidal currents caused a decrease in calling rate, tide level did not
significantly influence calling, and there was a higher calling rate at night. These data show that photoperiod and
tide levels can influence broad patterns of Lusitanian toadfish calling activity as in other shallow-water fishes,
but fine temporal patterns in acoustic interactions among nesting males are more complex than previously known
for fishes.
Fundação para a Ciência e Tecnologia - FCT
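The fine-scale patterns described above (near-isochronous versus aperiodic calling, alternation versus synchrony) can be illustrated with two simple measures. This Python sketch is not the study's method. It assumes call onset times and (start, end) call intervals in seconds are already available, for example from labelled recordings. The coefficient of variation and the overlap fraction are metrics chosen here for illustration, and both function names are hypothetical.

```python
from statistics import mean, pstdev


def interval_cv(onsets):
    """Coefficient of variation of inter-call intervals.

    Values near 0 suggest an almost isochronous rhythm; larger values
    suggest a more irregular pattern. `onsets` must be sorted call onset
    times in seconds, with at least three calls.
    """
    intervals = [later - earlier for earlier, later in zip(onsets, onsets[1:])]
    if len(intervals) < 2:
        raise ValueError("need at least three call onsets")
    return pstdev(intervals) / mean(intervals)


def overlap_fraction(calls_a, calls_b):
    """Fraction of male A's calls that overlap in time with any call of male B.

    Each call is a (start, end) tuple in seconds. A value near 0 is
    consistent with alternation; a value near 1 with synchrony.
    """
    if not calls_a:
        raise ValueError("calls_a must not be empty")
    overlapping = 0
    for a_start, a_end in calls_a:
        if any(a_start < b_end and b_start < a_end for b_start, b_end in calls_b):
            overlapping += 1
    return overlapping / len(calls_a)


if __name__ == "__main__":
    print(interval_cv([0.0, 1.0, 2.0, 3.0]))                  # perfectly regular
    print(overlap_fraction([(0, 1), (2, 3)], [(1, 2), (3, 4)]))  # strict alternation
```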
Distributed timing analysis
As design complexity continues to grow, efficiently analyzing the timing of circuits with billions of transistors across multiple modes and corners is quickly becoming the major bottleneck in the overall chip design closure process. To alleviate the long runtimes, recent trends are driving the need for distributed timing analysis (DTA) in electronic design automation (EDA) tools. However, DTA has received little research attention so far and remains a critical problem. In this thesis, we introduce several methods to approach DTA problems. In Chapter 1, we present a near-optimal algorithm to speed up path-based timing analysis. Path-based timing analysis is a key step in the overall timing flow for reducing unwanted pessimism, for example through common path pessimism removal (CPPR). In Chapter 2, we introduce a MapReduce-based distributed path-based timing analysis framework that can scale up to hundreds of machines. In Chapter 3, we introduce our standalone timer, OpenTimer, an open-source high-performance timing analysis tool for very large scale integration (VLSI) systems. OpenTimer efficiently supports (1) both block-based and path-based timing propagation, (2) CPPR, and (3) incremental timing. OpenTimer works on industry formats (e.g., .v, .spef, .lib, .sdc) and is designed to be parallel and portable. To further facilitate integration between timing and timing-driven optimizations, OpenTimer provides a user-friendly application programming interface (API) for interactive analysis. Experimental results on industry benchmarks released from the TAU 2015 timing analysis contest have demonstrated remarkable results achieved by OpenTimer, especially its order-of-magnitude speedup over existing timers.
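As a rough illustration of the block-based timing propagation mentioned above, the following Python sketch computes latest arrival times over a small timing graph by visiting nodes in topological order. It is a generic textbook-style sketch, not OpenTimer's implementation or API. The function name `latest_arrival_times`, the edge-dictionary format, and the example graph are all assumptions made for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def latest_arrival_times(edges, primary_inputs):
    """Propagate latest arrival times through an acyclic timing graph.

    edges: dict mapping (source, sink) node pairs to an edge delay.
    primary_inputs: dict mapping source nodes to their input arrival time;
    source nodes missing from it default to 0.0.
    """
    # Collect, for each node, its incoming (predecessor, delay) pairs.
    fanin = {}
    for (source, sink), delay in edges.items():
        fanin.setdefault(sink, []).append((source, delay))
        fanin.setdefault(source, [])

    # TopologicalSorter takes node -> predecessors and yields predecessors first.
    dependencies = {node: [src for src, _ in ins] for node, ins in fanin.items()}
    arrival = {}
    for node in TopologicalSorter(dependencies).static_order():
        if fanin[node]:
            # Latest arrival is the worst case over all fan-in edges.
            arrival[node] = max(arrival[src] + delay for src, delay in fanin[node])
        else:
            arrival[node] = primary_inputs.get(node, 0.0)
    return arrival


if __name__ == "__main__":
    example_edges = {("a", "c"): 2.0, ("b", "c"): 1.0, ("c", "d"): 3.0}
    print(latest_arrival_times(example_edges, {"a": 0.0, "b": 5.0}))
```

Each node is visited once, so the cost is linear in the size of the graph. Path-based analysis instead examines individual paths, which is where techniques such as CPPR apply.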
In Chapter 4, we present a DTA framework built on top of our standalone timer OpenTimer. We investigated existing cluster computing frameworks from the big data community and demonstrated that DTA is a difficult fit for them in terms of computation patterns and performance concerns. Our specialized DTA framework supports (1) general design partitions (logical, physical, hierarchical, etc.) stored in a distributed file system, (2) non-blocking IO with event-driven programming for effective communication and computation overlap, and (3) an efficient messaging interface between application and network layers. The effectiveness and scalability of our framework have been evaluated on large hierarchical industry designs over a cluster with hundreds of machines.
In Chapter 5, we present our system DtCraft, a distributed execution engine for compute-intensive applications. Motivated by our DTA framework, DtCraft introduces a high-level programming model that lets users without detailed experience of distributed computing utilize cluster resources. The major goal is to simplify the coding effort of building distributed applications on our system. In contrast to existing data-parallel cluster computing frameworks, DtCraft targets high-performance or compute-intensive applications including simulations, modeling, and most EDA applications. Users describe a program in terms of a sequential stream graph associated with computation units and data streams. The DtCraft runtime transparently deals with concurrency controls including work distribution, process communication, and fault tolerance. We have evaluated DtCraft on both micro-benchmarks and large-scale simulation and optimization problems, and showed promising performance from single multi-core machines to clusters of computers.
Project Final Report: HPC-Colony II
This report recounts the HPC Colony II Project, which was a computer science effort funded by DOE's Advanced Scientific Computing Research office. The project included researchers from ORNL, IBM, and the University of Illinois at Urbana-Champaign. The topic of the effort was adaptive system software for extreme scale parallel machines. A description of findings is included.