Optimal Cell Clustering and Activation for Energy Saving in Load-Coupled Wireless Networks
Optimizing activation and deactivation of base station transmissions provides
an instrument for improving energy efficiency in cellular networks. In this
paper, we study optimal cell clustering and scheduling of activation duration
for each cluster, with the objective of minimizing the sum energy, subject to a
time constraint of delivering the users' traffic demand. The cells within a
cluster are simultaneously in transmission and napping modes, with cluster
activation and deactivation, respectively. Our optimization framework accounts
for the coupling relation among cells due to the mutual interference. Thus, the
users' achievable rates in a cell depend on the cluster composition. On the
theoretical side, we provide mathematical formulation and structural
characterization for the energy-efficient cell clustering and scheduling
optimization problem, and prove its NP-hardness. On the algorithmic side, we
first show how column generation facilitates problem solving, and then present
our notion of local enumeration as a flexible and effective means for dealing
with the trade-off between optimality and the combinatorial nature of cluster
formation, as well as for the purpose of gauging the deviation from optimality.
Numerical results demonstrate that our solutions achieve more than 60% energy
saving over existing schemes, and that the solutions we obtain are within a few
percent of the global optimum.
Comment: Revision, IEEE Transactions on Wireless Communication
Timing Closure in Chip Design
Achieving timing closure is a major challenge in the physical design of a computer chip. Its task is to find a physical realization fulfilling the speed specifications. In this thesis, we propose new algorithms for the key tasks of performance optimization, namely repeater tree construction, circuit sizing, clock skew scheduling, threshold voltage optimization, and plane assignment. Furthermore, a new program flow for timing closure is developed that integrates these algorithms with placement and clocktree construction. For repeater tree construction a new algorithm for computing topologies, which are later filled with repeaters, is presented. To this end, we propose a new delay model for topologies that not only accounts for the path lengths, as existing approaches do, but also for the number of bifurcations on a path, which introduce extra capacitance and thereby delay. In the extreme cases of pure power optimization and pure delay optimization the optimum topologies regarding our delay model are minimum Steiner trees and alphabetic code trees with the shortest possible path lengths. We present a new, extremely fast algorithm that scales seamlessly between the two opposite objectives. For special cases, we prove the optimality of our algorithm. The efficiency and effectiveness in practice are demonstrated by comprehensive experimental results. The task of circuit sizing is to assign millions of small elementary logic circuits to elements from a discrete set of logically equivalent, predefined physical layouts such that power consumption is minimized and all signal paths are sufficiently fast. In this thesis we develop a fast heuristic approach for global circuit sizing, followed by a local search into a local optimum. Our algorithms use, in contrast to existing approaches, the available discrete layout choices and accurate delay models with slew propagation. 
The global approach iteratively assigns slew targets to all source pins of the chip and chooses a discrete layout of minimum size preserving the slew targets. In comprehensive experiments on real instances, we demonstrate that the worst path delay is within 7% of its lower bound on average after a few iterations. The subsequent local search reduces this gap to 2% on average. Combining global and local sizing we are able to size more than 5.7 million circuits within 3 hours. For the clock skew scheduling problem we develop the first algorithm with a strongly polynomial running time for the cycle time minimization in the presence of different cycle times and multi-cycle paths. In practice, an iterative local search method is much more efficient. We prove that this iterative method maximizes the worst slack, even when restricting the feasible schedule to certain time intervals. Furthermore, we enhance the iterative local approach to determine a lexicographically optimum slack distribution. The clock skew scheduling problem is then generalized to allow for simultaneous data path optimization. In fact, this is a time-cost tradeoff problem. We develop the first combinatorial algorithm for computing time-cost tradeoff curves in graphs that may contain cycles. Starting from the lowest-cost solution, the algorithm iteratively computes a descent direction by a minimum cost flow computation. The maximum feasible step length is then determined by a minimum ratio cycle computation. This approach can be used in chip design for several optimization tasks, e.g. threshold voltage optimization or plane assignment. Finally, the optimization routines are combined into a timing closure flow. Here, the global placement is alternated with global performance optimization. Netweights are used to penalize the length of critical nets during placement. After the global phase, the performance is improved further by applying more comprehensive optimization routines on the most critical paths. 
In the end, the clock schedule is optimized and clocktrees are inserted. Computational results of the design flow are obtained on real-world computer chips.
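The worst-slack objective in clock skew scheduling can be illustrated with a small sketch. It is not the thesis's strongly polynomial algorithm or its iterative local search; it swaps in a plain bisection over the slack value with a Bellman-Ford negative-cycle test. The simplified model is an assumption: each register v gets a clock arrival time a_v, every path (u, v, d) imposes a_u + d + s <= a_v + T for one common cycle time T, and the register graph contains at least one cycle. The names `has_negative_cycle` and `max_worst_slack` are hypothetical.

```python
def has_negative_cycle(n, edges):
    """Bellman-Ford from a virtual source (all distances start at 0)."""
    dist = [0.0] * n
    for _ in range(n):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v] - 1e-12:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            return False  # distances stabilized, so no negative cycle
    return True  # still relaxing after n passes


def max_worst_slack(n, paths, cycle_time, iters=60):
    """Largest s such that a_u + d + s <= a_v + T is satisfiable for all paths.

    Assumption: the graph given by `paths` has at least one cycle;
    otherwise the slack is unbounded and the returned value is meaningless.
    """
    # A slack s is feasible iff weights (T - d - s) contain no negative cycle.
    weights = [(u, v, cycle_time - d) for u, v, d in paths]
    lo = min(w for _, _, w in weights)        # always feasible
    hi = max(w for _, _, w in weights) + 1.0  # infeasible if a cycle exists
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        shifted = [(u, v, w - mid) for u, v, w in weights]
        if has_negative_cycle(n, shifted):
            hi = mid
        else:
            lo = mid
    return lo


# Two registers, path delays 3 and 5, cycle time 5.
print(max_worst_slack(2, [(0, 1, 3.0), (1, 0, 5.0)], 5.0))
```

In this model the optimum equals the minimum mean of T - d over all cycles, which is why a cycle-based computation appears in the abstract's description.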
Timing-Constrained Global Routing with Buffered Steiner Trees
This dissertation deals with the combination of two key problems that arise in the physical design of computer chips: global routing and buffering. The task of buffering is the insertion of buffers and inverters into the chip's netlist to reduce signal delays and to improve electrical properties of the chip. Insertion of buffers and inverters goes hand in hand with the construction of Steiner trees that connect logical sources with possibly many logical sinks and have buffers and inverters as parts of these connections. Classical global routing focuses on packing Steiner trees within the limited routing space. Buffering and global routing have been solved separately in the past. In this thesis we overcome the limitations of the classical approaches by considering the buffering problem as a global, multi-objective problem. We study its theoretical aspects and propose algorithms which we implement in the tool BonnRouteBuffer for timing-constrained global routing with buffered Steiner trees. At its core, we propose a new theoretically founded framework to model timing constraints inherently within global routing. As the most important sub-task we have to compute a buffered Steiner tree for a single net minimizing the sum of prices for delays, routing congestion, placement congestion, power consumption, and net length. For this sub-task we present a fully polynomial time approximation scheme to compute an almost-cheapest Steiner tree with a given routing topology and prove that an exact algorithm cannot exist unless P=NP. For topology computation we present a bicriteria approximation algorithm that bounds both the geometric length and the worst slack of the topology. To improve the practical results we present many heuristic modifications, speed-up and post-optimization techniques for buffered Steiner trees. We conduct experiments on challenging real-world test cases provided by our cooperation partner IBM to demonstrate the quality of our tool. 
Our new algorithm could produce better solutions with respect to both timing and routability. After post-processing with gate sizing and Vt-assignment, we can even reduce the power consumption on most instances. Overall, our results show that our tool BonnRouteBuffer for timing-constrained global routing is superior to industrial state-of-the-art tools.
Timing-Constrained Global Routing with RC-Aware Steiner Trees and Routing Based Optimization
In this thesis we consider the global routing problem, which arises as one of the major subproblems in the physical design step in VLSI design. In global routing, we are given a three-dimensional grid graph G with edge capacities representing available routing space, and we have to connect a set of nets in G without overusing any edge capacities. Here, each net consists of a set of pins corresponding to vertices of G, where one pin is the sender of signals, while all other pins are receivers. Traditionally, next to obeying all edge capacity constraints, the objective has been to minimize wire length and possibly via (edges in z-direction) count, and timing constraints on the chip were only modeled indirectly. We present a new approach, where timing constraints are modeled directly during global routing: In joint work with Stephan Held, Dirk Mueller, Daniel Rotter, Vera Traub and Jens Vygen, we extend the modeling of global routing as a Min-Max Resource Sharing Problem to also incorporate timing constraints. For measuring signal delays we use the well-established Elmore delay model. One of the key subproblems here is the computation of Steiner trees minimizing a weighted sum of routing space usages and signal delays. For k pins, this problem is NP-hard to approximate within o(log k), and even the special case k = 2 is NP-hard, as was shown by Haehnle and Rotter. We present a fast approximation algorithm with strong approximation bounds for the case k = 2. For k > 2 we use a multi-stage approach based on modifying the topology of a short Steiner tree and using our algorithm for the two-pin case for computing new connections. Moreover, we present a layer assignment algorithm that assigns z-coordinates to the edges of a given two-dimensional tree. We also discuss the topic of routing based optimization. Here, the starting point is a complete routing, and timing optimization tools make changes that require incremental adaptations of the underlying routing. 
We investigate several aspects of this problem and derive a new routing flow that includes our timing-aware global router and routing based optimization steps. We evaluate our results from this thesis in practice on industrial 14nm microprocessor designs from IBM. Our theoretical results are validated in practice by a strong performance of our timing-aware global routing framework and our new routing flow, yielding significant improvements over the traditional global routing method and the previously used routing flow. Therefore, we conclude that our approaches and results from this thesis are not only theoretically sound but also give compelling results in practice.
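The Elmore delay model named in this abstract can be sketched in a few lines for a routed tree. This is a minimal illustration and not code from the thesis: it assumes a lumped RC tree in which node 0 is the signal source, every other node v has a parent with a smaller index, `res[v]` is the resistance of the wire from the parent to v, and `cap[v]` is the capacitance lumped at v. The function name `elmore_delays` and the input layout are hypothetical.

```python
def elmore_delays(parent, res, cap):
    """Elmore delay from the root (node 0) to every node of an RC tree.

    Assumes parent[v] < v for all v >= 1, so nodes are topologically ordered.
    """
    n = len(parent)
    # Downstream capacitance: total capacitance in the subtree rooted at v.
    down = list(cap)
    for v in range(n - 1, 0, -1):
        down[parent[v]] += down[v]
    # Each wire contributes its resistance times the capacitance it drives.
    delay = [0.0] * n
    for v in range(1, n):
        delay[v] = delay[parent[v]] + res[v] * down[v]
    return delay


# Source 0 drives node 1, which branches to sinks 2 and 3.
print(elmore_delays([-1, 0, 1, 1], [0, 2, 1, 3], [0, 1, 2, 3]))
# prints [0.0, 12.0, 14.0, 21.0]
```

The example shows why tree topology matters for timing: the shared wire into node 1 is charged with the capacitance of both branches, so adding a sink on one branch slows down the other.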
A Survey on Delay-Aware Resource Control for Wireless Systems --- Large Deviation Theory, Stochastic Lyapunov Drift and Distributed Stochastic Learning
In this tutorial paper, a comprehensive survey is given on several major
systematic approaches in dealing with delay-aware control problems, namely the
equivalent rate constraint approach, the Lyapunov stability drift approach and
the approximate Markov Decision Process (MDP) approach using stochastic
learning. These approaches essentially embrace most of the existing literature
regarding delay-aware resource control in wireless systems. They have their
relative pros and cons in terms of performance, complexity and implementation
issues. For each of the approaches, the problem setup, the general solution and
the design methodology are discussed. Applications of these approaches to
delay-aware resource allocation are illustrated with examples in single-hop
wireless networks. Furthermore, recent results regarding delay-aware multi-hop
routing designs in general multi-hop networks are elaborated. Finally, the
delay performance of the various approaches is compared through simulations
using an example of an uplink OFDMA system.
Comment: 58 pages, 8 figures; IEEE Transactions on Information Theory, 201
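As a rough illustration of the Lyapunov drift approach mentioned in this abstract, the sketch below implements the textbook max-weight rule for a single-hop system. It is not taken from the survey. It assumes one user is served per slot and that the scheduler knows each user's queue backlog and achievable rate for the slot; the function name `max_weight_step` is hypothetical. Serving the user with the largest backlog-rate product minimizes the standard upper bound on the one-slot drift of the quadratic Lyapunov function L(Q) = 1/2 * sum of Q_i^2.

```python
def max_weight_step(queues, rates, arrivals):
    """One slot of max-weight scheduling.

    Serves the user with the largest queue * rate product (lowest index
    wins ties), then adds the new arrivals. Returns (chosen, new_queues).
    """
    chosen = max(range(len(queues)), key=lambda i: queues[i] * rates[i])
    new_queues = []
    for i, q in enumerate(queues):
        served = rates[i] if i == chosen else 0
        # A queue cannot go negative; arrivals join after service.
        new_queues.append(max(q - served, 0) + arrivals[i])
    return chosen, new_queues


# Weights are 12, 10 and 5, so user 0 is served.
print(max_weight_step([4, 10, 1], [3, 1, 5], [1, 0, 2]))
# prints (0, [2, 10, 3])
```

The rule needs no knowledge of arrival statistics, which is one reason drift-based designs are attractive when complexity and implementation cost matter.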
Current-Mode Techniques for the Implementation of Continuous- and Discrete-Time Cellular Neural Networks
This paper presents a unified, comprehensive approach
to the design of continuous-time (CT) and discrete-time
(DT) cellular neural networks (CNN) using CMOS current-mode
analog techniques. The net input signals are currents instead
of voltages as presented in previous approaches, thus avoiding
the need for current-to-voltage dedicated interfaces in image
processing tasks with photosensor devices. Outputs may be either
currents or voltages. Cell design relies on exploitation of current
mirror properties for the efficient implementation of both linear
and nonlinear analog operators. These cells are simpler and
easier to design than those found in previously reported CT
and DT-CNN devices. Basic design issues are covered, together
with discussions on the influence of nonidealities and advanced
circuit design issues as well as design for manufacturability
considerations associated with statistical analysis. Three prototypes
have been designed for 1.6-μm n-well CMOS technologies.
One is discrete-time and can be reconfigured via local logic for
noise removal, feature extraction (borders and edges), shadow
detection, hole filling, and connected component detection (CCD)
on a rectangular grid with unity neighborhood radius. The other
two prototypes are continuous-time and fixed template: one for
CCD and the other for noise removal. Experimental results are given
illustrating the performance of these prototypes.
Energy Consumption Of Visual Sensor Networks: Impact Of Spatio-Temporal Coverage
Wireless visual sensor networks (VSNs) are expected to play a major role in
future IEEE 802.15.4 personal area networks (PAN) under recently-established
collision-free medium access control (MAC) protocols, such as the IEEE
802.15.4e-2012 MAC. In such environments, the VSN energy consumption is
affected by the number of camera sensors deployed (spatial coverage), as well
as the number of captured video frames out of which each node processes and
transmits data (temporal coverage). In this paper, we explore this aspect for
uniformly-formed VSNs, i.e., networks comprising identical wireless visual
sensor nodes connected to a collection node via a balanced cluster-tree
topology, with each node producing independent identically-distributed
bitstream sizes after processing the video frames captured within each network
activation interval. We derive analytic results for the energy-optimal
spatio-temporal coverage parameters of such VSNs under a-priori known bounds
for the number of frames to process per sensor and the number of nodes to
deploy within each tier of the VSN. Our results are parametric to the
probability density function characterizing the bitstream size produced by each
node and the energy consumption rates of the system of interest. Experimental
results reveal that our analytic results are always within 7% of the energy
consumption measurements for a wide range of settings. In addition, results
obtained via a multimedia subsystem show that the optimal spatio-temporal
settings derived by the proposed framework allow for substantial reduction of
energy consumption in comparison to ad-hoc settings. As such, our analytic
modeling is useful for early-stage studies of possible VSN deployments under
collision-free MAC protocols prior to costly and time-consuming experiments in
the field.
Comment: to appear in IEEE Transactions on Circuits and Systems for Video
Technology, 201
HIGH PERFORMANCE CLOCK DISTRIBUTION FOR HIGH-SPEED VLSI SYSTEMS
Tohoku University堀口 進課