95 research outputs found
A Lower Bound Technique for Communication in BSP
Communication is a major factor determining the performance of algorithms on
current computing systems; it is therefore valuable to provide tight lower
bounds on the communication complexity of computations. This paper presents a
lower bound technique for the communication complexity in the bulk-synchronous
parallel (BSP) model of a given class of DAG computations. The derived bound is
expressed in terms of the switching potential of a DAG, that is, the number of
permutations that the DAG can realize when viewed as a switching network. The
proposed technique yields tight lower bounds for the fast Fourier transform
(FFT), and for any sorting and permutation network. A stronger bound is also
derived for the periodic balanced sorting network, by applying this technique
to suitable subnetworks. Finally, we demonstrate that the switching potential
captures communication requirements even in computational models different from
BSP, such as the I/O model and the LPRAM.
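As an illustration of the switching potential, the following minimal sketch counts the distinct permutations realized by a small switching network under all switch settings. The choice of an omega network of 2x2 switches, and the helper names `shuffle`, `omega_route`, and `count_realized_permutations`, are assumptions made for this example and are not taken from the paper.

```python
from itertools import product


def shuffle(seq):
    """Perfect shuffle: interleave the first and second halves of seq."""
    half = len(seq) // 2
    out = []
    for a, b in zip(seq[:half], seq[half:]):
        out.extend((a, b))
    return out


def omega_route(n, settings):
    """Route tokens 0..n-1 through an omega network of 2x2 switches.

    settings[s][k] is True when switch k of stage s is crossed.
    Returns the arrangement of tokens at the network outputs.
    """
    tokens = list(range(n))
    for stage in settings:
        tokens = shuffle(tokens)
        for k, crossed in enumerate(stage):
            if crossed:
                tokens[2 * k], tokens[2 * k + 1] = tokens[2 * k + 1], tokens[2 * k]
    return tuple(tokens)


def count_realized_permutations(n):
    """Brute-force count of distinct permutations over all switch settings.

    Assumes n is a power of two; the cost is exponential in the number of
    switches, so this is only meant for very small n.
    """
    stages = n.bit_length() - 1          # log2(n) stages
    per_stage = n // 2                   # switches per stage
    seen = set()
    for bits in product((False, True), repeat=stages * per_stage):
        settings = [bits[s * per_stage:(s + 1) * per_stage] for s in range(stages)]
        seen.add(omega_route(n, settings))
    return len(seen)
```

For 4 inputs the count is smaller than 4! = 24, which shows that this small network cannot realize every permutation.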
The role of terminal cost/reward in finite-horizon discrete-time LQ optimal control
The optimal control problem for time-invariant linear systems with quadratic cost is considered for arbitrary, i.e., not necessarily positive semidefinite, terminal cost matrices. A classification of such matrices is proposed, based on the maximum horizon for which there is a finite minimum cost for all initial states. When such a horizon is infinite, the classification is further refined, based on the asymptotic behavior of the optimal control law. A number of characterizations and other properties of the proposed classification are derived. In the study of the asymptotic behavior, a characterization is given of those matrices A such that the image of A^t S_0 converges in the gap metric for any subspace S_0.
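The notion of a maximum horizon with finite minimum cost can be illustrated in the scalar case. The sketch below is an assumption-laden toy, not the paper's matrix classification: it takes a scalar system x[t+1] = a*x[t] + b*u[t] with stage cost q*x^2 + r*u^2 and terminal cost p_terminal*x^2, runs the backward Riccati recursion, and stops as soon as the one-step cost becomes unbounded below in u. The function name `max_finite_horizon` and the cap `limit` are hypothetical.

```python
from fractions import Fraction


def max_finite_horizon(a, b, q, r, p_terminal, limit=50):
    """Count backward Riccati steps that keep the minimum cost finite.

    Scalar system x[t+1] = a*x[t] + b*u[t], stage cost q*x^2 + r*u^2,
    terminal cost p_terminal*x^2 (p_terminal may be negative).
    Returns the largest horizon found, capped at `limit`.
    """
    a, b, q, r, p = (Fraction(v) for v in (a, b, q, r, p_terminal))
    for steps in range(limit):
        curvature = r + b * b * p   # coefficient of u^2 in the one-step cost
        cross = a * b * p           # half the coefficient of x*u
        if curvature < 0 or (curvature == 0 and cross != 0):
            return steps            # cost unbounded below for some initial state
        if curvature == 0:
            p = q + a * a * p       # u does not affect the cost at this step
        else:
            p = q + a * a * p - cross * cross / curvature
    return limit
```

With a = b = q = r = 1, a terminal weight of -2 already makes a one-step problem unbounded, while -1/2 leads to a recursion that stays well defined up to the cap.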
Network-Oblivious Algorithms
A framework is proposed for the design and analysis of network-oblivious algorithms, namely algorithms that can run unchanged, yet efficiently, on a variety of machines characterized by different degrees of parallelism and communication capabilities. The framework prescribes that a network-oblivious algorithm be specified on a parallel model of computation where the only parameter is the problem's input size, and then evaluated on a model with two parameters, capturing parallelism granularity and communication latency. It is shown that for a wide class of network-oblivious algorithms, optimality in the latter model implies optimality in the decomposable bulk synchronous parallel model, which is known to effectively describe a wide and significant class of parallel platforms. The proposed framework can be regarded as an attempt to port the notion of obliviousness, well established in the context of cache hierarchies, to the realm of parallel computation. Its effectiveness is illustrated by providing optimal network-oblivious algorithms for a number of key problems. Some limitations of the oblivious approach are also discussed.
The DAG Visit Approach for Pebbling and I/O Lower Bounds
We introduce the notion of an r-visit of a Directed Acyclic Graph (DAG) G = (V,E), a sequence of the vertices of the DAG complying with a given rule r. A rule r specifies for each vertex v ∈ V a family of r-enabling sets of (immediate) predecessors: before visiting v, at least one of its enabling sets must have been visited. Special cases are the r^(top)-rule (or, topological rule), for which the only enabling set is the set of all predecessors, and the r^(sin)-rule (or, singleton rule), for which the enabling sets are the singletons containing exactly one predecessor. The r-boundary complexity of a DAG G, b_r(G), is the minimum integer b such that there is an r-visit where, at each stage, for at most b of the vertices yet to be visited an enabling set has already been visited. By a reformulation of known results, it is shown that the boundary complexity of a DAG G is a lower bound to the pebbling number of the reverse DAG, G^R. Several known pebbling lower bounds can be cast in terms of the r^(sin)-boundary complexity. The main contributions of this paper are as follows:
- An existentially tight O(√(d_{out} n)) upper bound to the r^(sin)-boundary complexity of any DAG of n vertices and out-degree d_{out}.
- An existentially tight O(d_{out}/(log_2 d_{out}) log_2 n) upper bound to the r^(top)-boundary complexity of any DAG. (There are DAGs for which r^(top) provides a tight pebbling lower bound, whereas r^(sin) does not.)
- A visit partition technique for I/O lower bounds, which generalizes the S-partition I/O technique introduced by Hong and Kung in their classic paper "I/O complexity: The Red-Blue pebble game". The visit partition approach yields tight I/O bounds for some DAGs for which the S-partition technique can only yield a trivial lower bound.
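The definitions of r-visit and boundary complexity given in this abstract can be made concrete with a small brute-force sketch. Several points are assumptions of this example rather than statements from the paper: a DAG is represented as a dictionary mapping each vertex to its list of predecessors, vertices without predecessors are treated as always enabled, and the boundary is measured after each visited vertex. The names `enabling_sets`, `boundary_of_visit`, and `boundary_complexity` are hypothetical.

```python
from itertools import permutations


def enabling_sets(preds, v, rule):
    """Enabling sets of v under the 'top' or 'sin' rule."""
    ps = preds[v]
    if not ps:
        return [frozenset()]                 # assumption: sources are always enabled
    if rule == "top":
        return [frozenset(ps)]               # all predecessors must be visited
    if rule == "sin":
        return [frozenset([p]) for p in ps]  # any single predecessor suffices
    raise ValueError(f"unknown rule: {rule}")


def is_enabled(preds, v, visited, rule):
    return any(s <= visited for s in enabling_sets(preds, v, rule))


def boundary_of_visit(preds, order, rule):
    """Largest boundary seen along `order`; raises ValueError if not an r-visit."""
    visited = set()
    worst = 0
    for v in order:
        if v in visited or not is_enabled(preds, v, visited, rule):
            raise ValueError(f"{v!r} cannot be visited at this point")
        visited.add(v)
        # Boundary: unvisited vertices with an already visited enabling set.
        boundary = [u for u in preds
                    if u not in visited and is_enabled(preds, u, visited, rule)]
        worst = max(worst, len(boundary))
    if len(visited) != len(preds):
        raise ValueError("a visit must include every vertex")
    return worst


def boundary_complexity(preds, rule):
    """Minimum over all r-visits, by exhaustive search (tiny DAGs only)."""
    best = None
    for order in permutations(preds):
        try:
            b = boundary_of_visit(preds, order, rule)
        except ValueError:
            continue
        best = b if best is None else min(best, b)
    return best


# Diamond DAG: a -> b, a -> c, b -> d, c -> d.
diamond = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
```

On the diamond DAG, the order a, b, d, c is a valid visit under the singleton rule but not under the topological rule, since d requires both b and c in the latter.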
The Correctness of Tison's Method for Generating Prime Implicants
Coordinated Science Laboratory was formerly known as Control Systems Laboratory. Joint Services Electronics Program / N00014-79-C-0424. National Science Foundation / MCS 81-0555
The VLSI Optimality of the AKS Sorting Network
Coordinated Science Laboratory was formerly known as Control Systems Laboratory. Joint Services Electronics Program / N00014-79-C-0424. IBM Predoctoral Fellowship Program.
Optimal Eviction Policies for Stochastic Address Traces
The eviction problem for memory hierarchies is studied for the Hidden Markov
Reference Model (HMRM) of the memory trace, showing how miss minimization can
be naturally formulated in the optimal control setting. In addition to the
traditional version assuming a buffer of fixed capacity, a relaxed version is
also considered, in which buffer occupancy can vary and its average is
constrained. Resorting to multiobjective optimization, viewing occupancy as a
cost rather than as a constraint, the optimal eviction policy is obtained by
composing solutions for the individual addressable items.
This approach is then specialized to the Least Recently Used Stack Model
(LRUSM), a type of HMRM often considered for traces, which includes V-1
parameters, where V is the size of the virtual space. A gain optimal policy for
any target average occupancy is obtained which (i) is computable in time O(V)
from the model parameters, (ii) is optimal also for the fixed capacity case,
and (iii) is characterized in terms of priorities, with the name of Least
Profit Rate (LPR) policy. An O(log C) upper bound (where C is the buffer capacity)
is derived for the ratio between the expected miss rate of LPR and that of OPT,
the optimal off-line policy; the upper bound is tightened to O(1), under
reasonable constraints on the LRUSM parameters. Using the stack-distance
framework, an algorithm is developed to compute the number of misses incurred
by LPR on a given input trace, simultaneously for all buffer capacities, in
time O(log V) per access.
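The stack-distance idea can be sketched for plain LRU, which is a substitution made for illustration: the code below is the classic stack-distance computation for LRU, not the paper's LPR algorithm, and it uses a simple list with linear cost per access rather than the O(log V) bound stated above. The function name `lru_misses_all_capacities` is hypothetical.

```python
def lru_misses_all_capacities(trace, max_capacity):
    """Misses of LRU for every capacity 1..max_capacity in one pass.

    An access hits in a buffer of capacity c exactly when its stack
    distance (1-based position in the LRU stack) is at most c.
    Returns a dict mapping capacity -> number of misses.
    """
    stack = []                                   # most recently used first
    hits_at_distance = [0] * (max_capacity + 1)  # index d counts hits at distance d
    for item in trace:
        if item in stack:
            distance = stack.index(item) + 1
            stack.remove(item)
            if distance <= max_capacity:
                hits_at_distance[distance] += 1
        stack.insert(0, item)                    # item becomes most recently used

    misses = {}
    hits = 0
    for capacity in range(1, max_capacity + 1):
        hits += hits_at_distance[capacity]       # hits accumulate with capacity
        misses[capacity] = len(trace) - hits
    return misses
```

For the trace a, b, c, a, b, both repeated accesses have stack distance 3, so they hit only when the capacity is at least 3.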
Finally, some results are provided for miss minimization over a finite
horizon and over an infinite horizon under bias optimality, a criterion more
stringent than gain optimality.
Comment: 37 pages, 3 figures
Merging and Sorting Networks with the Topology of the Omega Network
We consider a class of comparator networks obtained from the omega permutation network by replacing each switch with a comparator exchanger of arbitrary direction. These networks are all isomorphic to each other, have merging capabilities, and can be used as building blocks of sorting networks in ways different from the standard merge-sort scheme. It is shown that the bitonic merger and the balanced merger are members of the class. These two networks were not previously known to be isomorphic.
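A minimal sketch of one network of this kind follows, under stated assumptions: each stage is a perfect shuffle followed by comparators on adjacent pairs, and every comparator is oriented the same way (smaller value to the upper output). Applied to a bitonic input, this particular orientation behaves as a bitonic merger on the cases checked here. The function names `shuffle` and `omega_merge` are hypothetical, and other comparator orientations from the class are not modeled.

```python
def shuffle(seq):
    """Perfect shuffle: interleave the first and second halves of seq."""
    half = len(seq) // 2
    out = []
    for a, b in zip(seq[:half], seq[half:]):
        out.extend((a, b))
    return out


def omega_merge(seq):
    """Pass seq through log2(n) shuffle-then-compare stages.

    Assumes len(seq) is a power of two and seq is bitonic; every
    comparator sends the smaller value to the even (upper) position.
    """
    n = len(seq)
    if n < 1 or n & (n - 1):
        raise ValueError("length must be a power of two")
    data = list(seq)
    for _ in range(n.bit_length() - 1):
        data = shuffle(data)
        for k in range(0, n, 2):
            if data[k] > data[k + 1]:
                data[k], data[k + 1] = data[k + 1], data[k]
    return data
```

To merge two sorted halves with this sketch, reverse the second half first so that the concatenation is bitonic, for example `omega_merge([1, 4, 6, 7] + [8, 5, 3, 2])`.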
The Area-Time Complexity of Sorting
Coordinated Science Laboratory changed its name from Control Systems Laboratory. IBM fellowship. National Science Foundation / MCS 81-05552. Joint Services Electronics Program / N00014-84-C-0149. U of I Only: restricted to UIUC community.
- …