Search CORE

6,168 research outputs found

Efficient Interconnection Schemes for VLSI and Parallel Computation

Author: Greenberg Ronald I.
Publication venue: Loyola eCommons
Publication date: 01/09/1989
Field of study

This thesis is primarily concerned with two problems of interconnecting components in VLSI technologies. In the first case, the goal is to construct efficient interconnection networks for general-purpose parallel computers. The second problem is a more specialized problem in the design of VLSI chips, namely multilayer channel routing. In addition, a final part of this thesis provides lower bounds on the area required for VLSI implementations of finite-state machines. This thesis shows that networks based on Leiserson\u27s fat-tree architecture are nearly as good as any network built in a comparable amount of physical space. It shows that these universal networks can efficiently simulate competing networks by means of an appropriate correspondence between network components and efficient algorithms for routing messages on the universal network. In particular, a universal network of area A can simulate competing networks with O(lg^3A) slowdown (in bit-times), using a very simple randomized routing algorithm and simple network components. Alternatively, a packet routing scheme of Leighton, Maggs, and Rao can be used in conjunction with more sophisticated switching components to achieve O(lg^2 A) slowdown. Several other important aspects of universality are also discussed. It is shown that universal networks can be constructed in area linear in the number of processors, so that there is no need to restrict the density of processors in competing networks. Also results are presented for comparisons between networks of different size or with processors of different sizes (as determined by the amount of attached memory). Of particular interest is the fact that a universal network built from sufficiently small processors can simulate (with the slowdown already quoted) any competing network of comparable size regardless of the size of processors in the competing network. In addition, many of the results given do not require the usual assumption of unit wire delay. Finally, though most of the discussion is in the two-dimensional world, the results are shown to apply in three dimensions by way of a simple demonstration of general results on graph layout in three dimensions. The second main problem considered in this thesis is channel routing when many layers of interconnect are available, a scenario that is becoming more and more meaningful as chip fabrication technologies advance. This thesis describes a system MulCh for multilayer channel routing which extends the Chameleon system developed at U. C. Berkeley. Like Chameleon, MulCh divides a multilayer problem into essentially independent subproblems of at most three layers, but unlike Chameleon, MulCh considers the possibility of using partitions comprised of a single layer instead of only partitions of two or three layers. Experimental results show that MulCh often performs better than Chameleon in terms of channel width, total net length, and number of vias. In addition to a description of MulCh as implemented, this thesis provides improved algorithms for subtasks performed by MulCh, thereby indicating potential improvements in the speed and performance of multilayer channel routing. In particular, a linear time algorithm is given for determining the minimum width required for a single-layer channel routing problem, and an algorithm is given for maintaining the density of a collection of nets in logarithmic time per net insertion. The last part of this thesis shows that straightforward techniques for implementing finite-state machines are optimal in the worst case. Specifically, for any s and k, there is a deterministic finite-state machine with s states and k symbols such that any layout algorithm requires (ks lg s) area to lay out its realization. For nondeterministic machines, there is an analogous lower bound of (ks^2) area

Loyola eCommons

Gossip Algorithms for Distributed Signal Processing

Author: Dimakis Alexandros G.
Kar Soummya
Moura Jose M. F.
Rabbat Michael G.
Scaglione Anna
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2010
Field of study

Gossip algorithms are attractive for in-network processing in sensor networks because they do not require any specialized routing, there is no bottleneck or single point of failure, and they are robust to unreliable wireless network conditions. Recently, there has been a surge of activity in the computer science, control, signal processing, and information theory communities, developing faster and more robust gossip algorithms and deriving theoretical performance guarantees. This article presents an overview of recent work in the area. We describe convergence rate results, which are related to the number of transmitted messages and thus the amount of energy consumed in the network for gossiping. We discuss issues related to gossiping over wireless links, including the effects of quantization and noise, and we illustrate the use of gossip algorithms for canonical signal processing tasks including distributed estimation, source localization, and compression.Comment: Submitted to Proceedings of the IEEE, 29 page

arXiv.org e-Print Archive

CiteSeerX

Crossref

MPC for MPC: Secure Computation on a Massively Parallel Computing Architecture

Author: Chan T-H. Hubert
Chung Kai-Min
Lin Wei-Kai
Shi Elaine
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 11th Innovations in Theoretical Computer Science Conference (ITCS 2020)
Publication date: 01/01/2020
Field of study

Massively Parallel Computation (MPC) is a model of computation widely believed to best capture realistic parallel computing architectures such as large-scale MapReduce and Hadoop clusters. Motivated by the fact that many data analytics tasks performed on these platforms involve sensitive user data, we initiate the theoretical exploration of how to leverage MPC architectures to enable efficient, privacy-preserving computation over massive data. Clearly if a computation task does not lend itself to an efficient implementation on MPC even without security, then we cannot hope to compute it efficiently on MPC with security. We show, on the other hand, that any task that can be efficiently computed on MPC can also be securely computed with comparable efficiency. Specifically, we show the following results: - any MPC algorithm can be compiled to a communication-oblivious counterpart while asymptotically preserving its round and space complexity, where communication-obliviousness ensures that any network intermediary observing the communication patterns learn no information about the secret inputs; - assuming the existence of Fully Homomorphic Encryption with a suitable notion of compactness and other standard cryptographic assumptions, any MPC algorithm can be compiled to a secure counterpart that defends against an adversary who controls not only intermediate network routers but additionally up to 1/3 - ? fraction of machines (for an arbitrarily small constant ?) - moreover, this compilation preserves the round complexity tightly, and preserves the space complexity upto a multiplicative security parameter related blowup. As an initial exploration of this important direction, our work suggests new definitions and proposes novel protocols that blend algorithmic and cryptographic techniques

Dagstuhl Research Online Publication Server

Cryptology ePrint Archive

Throughput Optimal On-Line Algorithms for Advanced Resource Reservation in Ultra High-Speed Networks

Author: Cohen Reuven
Fazlollahi Niloofar
Starobinski David
Publication venue
Publication date: 02/11/2007
Field of study

Advanced channel reservation is emerging as an important feature of ultra high-speed networks requiring the transfer of large files. Applications include scientific data transfers and database backup. In this paper, we present two new, on-line algorithms for advanced reservation, called BatchAll and BatchLim, that are guaranteed to achieve optimal throughput performance, based on multi-commodity flow arguments. Both algorithms are shown to have polynomial-time complexity and provable bounds on the maximum delay for 1+epsilon bandwidth augmented networks. The BatchLim algorithm returns the completion time of a connection immediately as a request is placed, but at the expense of a slightly looser competitive ratio than that of BatchAll. We also present a simple approach that limits the number of parallel paths used by the algorithms while provably bounding the maximum reduction factor in the transmission throughput. We show that, although the number of different paths can be exponentially large, the actual number of paths needed to approximate the flow is quite small and proportional to the number of edges in the network. Simulations for a number of topologies show that, in practice, 3 to 5 parallel paths are sufficient to achieve close to optimal performance. The performance of the competitive algorithms are also compared to a greedy benchmark, both through analysis and simulation.Comment: 9 pages, 8 figure

arXiv.org e-Print Archive

CiteSeerX