
273 research outputs found

On the average running time of odd-even merge sort

Author: Rüb C.
Publication venue: Max-Planck-Institut für Informatik
Publication date: 01/01/1995

This paper is concerned with the average running time of Batcher's odd-even merge sort when implemented on a collection of processors. We consider the case where $n$, the size of the input, is an arbitrary multiple of the number $p$ of processors used. We show that Batcher's odd-even merge (for two sorted lists of length $n$ each) can be implemented to run in time $O((n/p)(\log (2+p^2/n)))$ on the average, and that odd-even merge sort can be implemented to run in time $O((n/p)(\log n+\log p\log (2+p^2/n)))$ on the average. In the case of merging (sorting), the average is taken over all possible outcomes of the merging (all possible permutations of $n$ elements). That means that odd-even merge and odd-even merge sort have an optimal average running time if $n\geq p^2$. The constants involved are also quite small.
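As an illustration of the merging scheme the abstract refers to, the following is a minimal sequential sketch of Batcher's odd-even merge and the sort built on it. It is not the paper's multiprocessor implementation: it ignores the $p$ processors entirely, assumes input lengths that are powers of two, and the function names `odd_even_merge` and `odd_even_merge_sort` are chosen here for illustration only.

```python
def odd_even_merge(a, b):
    """Merge two sorted lists of equal power-of-two length (Batcher's scheme)."""
    n = len(a)
    if n != len(b) or n == 0 or n & (n - 1):
        raise ValueError("both lists must have the same power-of-two length")
    if n == 1:
        # Base case: a single compare-exchange.
        return [min(a[0], b[0]), max(a[0], b[0])]
    # Recursively merge the even-indexed and the odd-indexed elements.
    evens = odd_even_merge(a[0::2], b[0::2])
    odds = odd_even_merge(a[1::2], b[1::2])
    # Final step: compare-exchange odds[i] with evens[i + 1].
    merged = [evens[0]]
    for i in range(n - 1):
        x, y = odds[i], evens[i + 1]
        merged.append(min(x, y))
        merged.append(max(x, y))
    merged.append(odds[-1])
    return merged


def odd_even_merge_sort(values):
    """Sort a list whose length is a power of two (assumption of this sketch)."""
    n = len(values)
    if n <= 1:
        return list(values)
    if n & (n - 1):
        raise ValueError("length must be a power of two")
    mid = n // 2
    return odd_even_merge(odd_even_merge_sort(values[:mid]),
                          odd_even_merge_sort(values[mid:]))


if __name__ == "__main__":
    print(odd_even_merge_sort([5, 3, 8, 1, 9, 2, 7, 4]))
```

Because the sequence of comparisons does not depend on the data, each recursion level corresponds to a stage of compare-exchange operations that a parallel implementation could run concurrently; the sketch above simply executes them one after another.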

MPG.PuRe

An empirical evaluation of High-Level Synthesis languages and tools for database acceleration

Author: Arcas Abella Oriol
Armejach Adrià
Cristal Kestelman Adrián
Ghasempour Mohsen
Lujan Mikel
Mawer John
Navaridas Javier
Ndu Geoffrey
Song Wei
Sönmez Nehir
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2014

High Level Synthesis (HLS) languages and tools are emerging as the most promising technique to make FPGAs more accessible to software developers. Nevertheless, picking the most suitable HLS for a certain class of algorithms depends on requirements such as area and throughput, as well as on programmer experience. In this paper, we explore the different trade-offs present when using a representative set of HLS tools in the context of Database Management Systems (DBMS) acceleration. More specifically, we conduct an empirical analysis of four representative frameworks (Bluespec SystemVerilog, Altera OpenCL, LegUp and Chisel) that we utilize to accelerate commonly-used database algorithms such as sorting, the median operator, and hash joins. Through our implementation experience and empirical results for database acceleration, we conclude that the selection of the most suitable HLS depends on a set of orthogonal characteristics, which we highlight for each HLS framework. Peer Reviewed. Postprint (author’s final draft).

Crossref

UPCommons. Portal del coneixement obert de la UPC

An FPGA Implementation of Kak's Instantaneously-Trained, Fast-Classification Neural Networks

Author: Sutton P. R.
Zhu J.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2003

Motivated by a biologically plausible short-memory sketchpad, Kak's Fast Classification (FC) neural networks are instantaneously trained by using a prescriptive training scheme. Both weights and the topology for an FC network are specified with only two presentations of the training samples. Compared with iterative learning algorithms such as Backpropagation (which may require many thousands of presentations of the training data), the training of FC networks is extremely fast and learning convergence is always guaranteed. Thus FC networks are suitable for applications where real-time classification and adaptive filtering are needed. In this paper we show that FC networks are "hardware friendly" for implementation on FPGAs. Their unique prescriptive learning scheme can be integrated with the hardware design of the FC network through parameterization and compile-time constant folding.

Crossref

University of Queensland eSpace

A taxonomy of parallel sorting

Author: Bitton Dina
DeWitt David J.
Hsiao David K.
Menon Jaishankar
Publication date: 01/04/1984

TR 84-601. In this paper, we propose a taxonomy of parallel sorting that includes a broad range of array and file sorting algorithms. We analyze the evolution of research on parallel sorting, from the earliest sorting networks to the shared memory algorithms and the VLSI sorters. In the context of sorting networks, we describe two fundamental parallel merging schemes - the odd-even and the bitonic merge. Sorting algorithms have been derived from these merging algorithms for parallel computers where processors communicate through interconnection networks such as the perfect shuffle, the mesh and a number of other sparse networks. After describing the network sorting algorithms, we show that, with a shared memory model of parallel computation, faster algorithms have been derived from parallel enumeration sorting schemes, where keys are first ranked and then rearranged according to their rank.
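The "rank, then rearrange" idea behind enumeration sorting can be sketched sequentially as follows. This is an illustrative sketch only, not an algorithm taken from the report: the function name `enumeration_sort` and the tie-breaking rule (equal keys are ordered by original position) are assumptions made here so that every key receives a distinct rank.

```python
def enumeration_sort(keys):
    """Sort by computing each key's rank, then placing it at that position."""
    n = len(keys)
    ranks = []
    for i, key in enumerate(keys):
        # Rank = number of keys that must precede this one.
        # Ties are broken by original index (an assumption of this sketch).
        rank = sum(
            1
            for j, other in enumerate(keys)
            if other < key or (other == key and j < i)
        )
        ranks.append(rank)
    # Rearrangement phase: each key moves directly to its ranked slot.
    result = [None] * n
    for i, rank in enumerate(ranks):
        result[rank] = keys[i]
    return result


if __name__ == "__main__":
    print(enumeration_sort([3, 1, 2, 3, 0]))
```

Each rank depends only on the input, not on any other rank, so in a shared memory setting the ranking loop could in principle be distributed across processors; the sketch runs it serially with a quadratic number of comparisons.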

eCommons@Cornell

Calhoun, Institutional Archive of the Naval Postgraduate School