Search CORE

12,601 research outputs found

Efficient parallel computation on multiprocessors with optical interconnection networks

Author: He Min
Publication venue: LSU Digital Commons
Publication date: 01/01/2002
Field of study

This dissertation studies optical interconnection networks, their architecture, address schemes, and computation and communication capabilities. We focus on a simple but powerful optical interconnection network model - the Linear Array with Reconfigurable pipelined Bus System (LARPBS). We extend the LARPBS model to a simplified higher dimensional LAPRBS and provide a set of basic computation operations. We then study the following two groups of parallel computation problems on both one dimensional LARPBS\u27s as well as multi-dimensional LARPBS\u27s: parallel comparison problems, including sorting, merging, and selection; Boolean matrix multiplication, transitive closure and their applications to connected component problems. We implement an optimal sorting algorithm on an n-processor LARPBS. With this optimal sorting algorithm at disposal, we study the sorting problem for higher dimensional LARPBS\u27s and obtain the following results: • An optimal basic Columnsort algorithm on a 2D LARPBS. • Two optimal two-way merge sort algorithms on a 2D LARPBS. • An optimal multi-way merge sorting algorithm on a 2D LARPBS. • An optimal generalized column sort algorithm on a 2D LARPBS. • An optimal generalized column sort algorithm on a 3D LARPBS. • An optimal 5-phase sorting algorithm on a 3D LARPBS. Results for selection problems are as follows: • A constant time maximum-finding algorithm on an LARPBS. • An optimal maximum-finding algorithm on an LARPBS. • An O((log log n)2) time parallel selection algorithm on an LARPBS. • An O(k(log log n)2) time parallel multi-selection algorithm on an LARPBS. While studying the computation and communication properties of the LARPBS model, we find Boolean matrix multiplication and its applications to the graph are another set of problem that can be solved efficiently on the LARPBS. Following is a list of results we have obtained in this area. • A constant time Boolean matrix multiplication algorithm. • An O(log n)-time transitive closure algorithm. • An O(log n)-time connected components algorithm. • An O(log n)-time strongly connected components algorithm. The results provided in this dissertation show the strong computation and communication power of optical interconnection networks

Louisiana State University

Recommended from our members

Survey of switching techniques in high-speed networks and their performance

Author: Kolson David
Miyahara Hideo
Murata Masayuki
Oie Yuji
Suda Tatsuya
Publication venue: eScholarship, University of California
Publication date: 12/10/1989
Field of study

One of the most promising approaches for high speed networks for integrated service applications is fast packet switching, or ATM (Asynchronous Transfer Mode). ATM can be characterized by very high speed transmission links and simple, hard wired protocols within a network. To match the transmission speed of the network links, and to minimize the overhead due to the processing of network protocols, the switching of cells is done in hardware switching fabrics in ATM networks.A number of designs has been proposed for implementing ATM switches. While many differences exist among the proposals, the vast majority of them is based on self-routing multi-stage interconnection networks. This is because of the desirable features of multi-stage interconnection networks such as self-routing capability and suitability for VLSI implementation.Existing ATM switch architectures can be classified into two major classes: blocking switches, where blockings of cells may occur within a switch when more than one cell contends for the same internal link, and non-blocking switches, where no internal blocking occurs. A large number of techniques has also been proposed to improve the performance of blocking and nonblocking switches. In this paper, we present an extensive survey of the existing proposals for ATM switch architectures, focusing on their performance issues

eScholarship - University of California

An empirical evaluation of High-Level Synthesis languages and tools for database acceleration

Author: Arcas Abella Oriol
Armejach Adrià
Cristal Kestelman Adrián
Ghasempour Mohsen
Lujan Mikel
Mawer John
Navaridas Javier
Ndu Geoffrey
Song Wei
Sönmez Nehir
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2014
Field of study

High Level Synthesis (HLS) languages and tools are emerging as the most promising technique to make FPGAs more accessible to software developers. Nevertheless, picking the most suitable HLS for a certain class of algorithms depends on requirements such as area and throughput, as well as on programmer experience. In this paper, we explore the different trade-offs present when using a representative set of HLS tools in the context of Database Management Systems (DBMS) acceleration. More specifically, we conduct an empirical analysis of four representative frameworks (Bluespec SystemVerilog, Altera OpenCL, LegUp and Chisel) that we utilize to accelerate commonly-used database algorithms such as sorting, the median operator, and hash joins. Through our implementation experience and empirical results for database acceleration, we conclude that the selection of the most suitable HLS depends on a set of orthogonal characteristics, which we highlight for each HLS framework.Peer ReviewedPostprint (author’s final draft

Crossref

UPCommons. Portal del coneixement obert de la UPC

Random induced subgraphs of Cayley graphs induced by transpositions

Author: Jin Emma Y.
Reidys Christian M.
Publication venue
Publication date: 01/01/2011
Field of study

In this paper we study random induced subgraphs of Cayley graphs of the symmetric group induced by an arbitrary minimal generating set of transpositions. A random induced subgraph of this Cayley graph is obtained by selecting permutations with independent probability,

\lambda_n

. Our main result is that for any minimal generating set of transpositions, for probabilities

\lambda_n=\frac{1+\epsilon_n}{n-1}

where

n^{-{1/3}+\delta}\le \epsilon_n0

, a random induced subgraph has a.s. a unique largest component of size

\wp(\epsilon_n)\frac{1+\epsilon_n}{n-1}n!

, where

\wp(\epsilon_n)

is the survival probability of a specific branching process.Comment: 18 pages, 1 figur

arXiv.org e-Print Archive

Elsevier - Publisher Connector

University of Southern Denmark Research Output

A system for routing arbitrary directed graphs on SIMD architectures

Author: Tomboulian Sherryl
Publication venue
Publication date
Field of study

There are many problems which can be described in terms of directed graphs that contain a large number of vertices where simple computations occur using data from connecting vertices. A method is given for parallelizing such problems on an SIMD machine model that is bit-serial and uses only nearest neighbor connections for communication. Each vertex of the graph will be assigned to a processor in the machine. Algorithms are given that will be used to implement movement of data along the arcs of the graph. This architecture and algorithms define a system that is relatively simple to build and can do graph processing. All arcs can be transversed in parallel in time O(T), where T is empirically proportional to the diameter of the interconnection network times the average degree of the graph. Modifying or adding a new arc takes the same time as parallel traversal

NASA Technical Reports Server

Online Permutation Routing in Partitioned Optical Passive Star Networks

Author: Mei Alessandro
Rizzi Romeo
Publication venue
Publication date: 25/02/2005
Field of study

This paper establishes the state of the art in both deterministic and randomized online permutation routing in the POPS network. Indeed, we show that any permutation can be routed online on a POPS network either with

O(\frac{d}{g}\log g)

deterministic slots, or, with high probability, with

5c\lceil d/g\rceil+o(d/g)+O(\log\log g)

randomized slots, where constant

c=\exp (1+e^{-1})\approx 3.927

. When

d=\Theta(g)

, that we claim to be the "interesting" case, the randomized algorithm is exponentially faster than any other algorithm in the literature, both deterministic and randomized ones. This is true in practice as well. Indeed, experiments show that it outperforms its rivals even starting from as small a network as a POPS(2,2), and the gap grows exponentially with the size of the network. We can also show that, under proper hypothesis, no deterministic algorithm can asymptotically match its performance

arXiv.org e-Print Archive

Catalogo dei prodotti della ricerca

Archivio della ricerca- Università di Roma La Sapienza