Search CORE

1,596 research outputs found

Energy-Efficient Algorithms

Author: Axelsen H. B.
Frank M. P.
Koomey J. G.
Li M.
Maroney O. J. E.
Vieri C.
Vieri C. J.
Yates S.
Publication venue
Publication date: 01/01/2016
Field of study

We initiate the systematic study of the energy complexity of algorithms (in addition to time and space complexity) based on Landauer's Principle in physics, which gives a lower bound on the amount of energy a system must dissipate if it destroys information. We propose energy-aware variations of three standard models of computation: circuit RAM, word RAM, and transdichotomous RAM. On top of these models, we build familiar high-level primitives such as control logic, memory allocation, and garbage collection with zero energy complexity and only constant-factor overheads in space and time complexity, enabling simple expression of energy-efficient algorithms. We analyze several classic algorithms in our models and develop low-energy variations: comparison sort, insertion sort, counting sort, breadth-first search, Bellman-Ford, Floyd-Warshall, matrix all-pairs shortest paths, AVL trees, binary heaps, and dynamic arrays. We explore the time/space/energy trade-off and develop several general techniques for analyzing algorithms and reducing their energy complexity. These results lay a theoretical foundation for a new field of semi-reversible computing and provide a new framework for the investigation of algorithms.Comment: 40 pages, 8 pdf figures, full version of work published in ITCS 201

arXiv.org e-Print Archive

DSpace@MIT

Crossref

A sparse octree gravitational N-body code that runs entirely on the GPU processor

Author: Barnes
Barnes
Barnes
Belleman
Billeter
Buck
Burtscher
de Berg
Dehnen
Dubinski
Evghenii Gaburov
Fukushige
Gaburov
Gaburov
Hamada
Hamada
Harfst
Hut
Jeroen Bédorf
Knuth
Lauterbach
Makino
Makino
McMillan
Nyland
Plummer
Portegies Zwart
Portegies Zwart
Raman
Salmon
Satish
Simon Portegies Zwart
Springel
Warren
Yokota
Publication venue: 'Elsevier BV'
Publication date: 01/04/2012
Field of study

We present parallel algorithms for constructing and traversing sparse octrees on graphics processing units (GPUs). The algorithms are based on parallel-scan and sort methods. To test the performance and feasibility, we implemented them in CUDA in the form of a gravitational tree-code which completely runs on the GPU.(The code is publicly available at: http://castle.strw.leidenuniv.nl/software.html) The tree construction and traverse algorithms are portable to many-core devices which have support for CUDA or OpenCL programming languages. The gravitational tree-code outperforms tuned CPU code during the tree-construction and shows a performance improvement of more than a factor 20 overall, resulting in a processing rate of more than 2.8 million particles per second.Comment: Accepted version. Published in Journal of Computational Physics. 35 pages, 12 figures, single colum

arXiv.org e-Print Archive

Crossref

Leiden University Scholary Publications

Linear pattern matching on sparse suffix trees

Author: Kolpakov Roman
Kucherov Gregory
Starikovskaya Tatiana
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 14/03/2011
Field of study

Packing several characters into one computer word is a simple and natural way to compress the representation of a string and to speed up its processing. Exploiting this idea, we propose an index for a packed string, based on a {\em sparse suffix tree} \cite{KU-96} with appropriately defined suffix links. Assuming, under the standard unit-cost RAM model, that a word can store up to

\log_{\sigma}n

characters (

\sigma

the alphabet size), our index takes

O(n/\log_{\sigma}n)

space, i.e. the same space as the packed string itself. The resulting pattern matching algorithm runs in time

O(m+r^2+r\cdot occ)

, where

m

is the length of the pattern,

r

is the actual number of characters stored in a word and

occ

is the number of pattern occurrences

arXiv.org e-Print Archive

CiteSeerX

Crossref

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

Parallel Weighted Random Sampling

Author: Sanders Peter
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 27th Annual European Symposium on Algorithms (ESA 2019)
Publication date: 01/01/2019
Field of study

Data structures for efficient sampling from a set of weighted items are an important building block of many applications. However, few parallel solutions are known. We close many of these gaps both for shared-memory and distributed-memory machines. We give efficient, fast, and practicable algorithms for sampling single items, k items with/without replacement, permutations, subsets, and reservoirs. We also give improved sequential algorithms for alias table construction and for sampling with replacement. Experiments on shared-memory parallel machines with up to 158 threads show near linear speedups both for construction and queries

Dagstuhl Research Online Publication Server

Realizable paths and the NL vs L problem

Author: Prasad Kintali Shiva
Publication venue: Georgia Institute of Technology
Publication date: 29/08/2011
Field of study

A celebrated theorem of Savitch [Savitch'70] states that NSPACE(S) is contained in DSPACE(S²). In particular, Savitch gave a deterministic algorithm to solve ST-Connectivity (an NL-complete problem) using O({log}²{n}) space, implying NL (non-deterministic logspace) is contained in DSPACE({log}²{n}). While Savitch's theorem itself has not been improved in the last four decades, several graph connectivity problems are shown to lie between L and NL, providing new insights into the space-bounded complexity classes. All the connectivity problems considered in the literature so far are essentially special cases of ST-Connectivity. In this dissertation, we initiate the study of auxiliary PDAs as graph connectivity problems and define sixteen different "graph realizability problems" and study their relationships. The complexity of these connectivity problems lie between L (logspace) and P (polynomial time). ST-Realizability, the most general graph realizability problem is P-complete. 1DSTREAL(poly), the most specific graph realizability problem is L-complete. As special cases of our graph realizability problems we define two natural problems, Balanced ST-Connectivity and Positive Balanced ST-Connectivity, that lie between L and NL. We study the space complexity of SGSLOGCFL, a graph realizability problem lying between L and LOGCFL. We define generalizations of graph squaring and transitive closure, present efficient parallel algorithms for SGSLOGCFL and use the techniques of Trifonov to show that SGSLOGCFL is contained in DSPACE(lognloglogn). This implies that Balanced ST-Connectivity is contained in DSPACE(lognloglogn). We conclude with several interesting new research directions.PhDCommittee Chair: Richard Lipton; Committee Member: Anna Gal; Committee Member: Maria-Florina Balcan; Committee Member: Merrick Furst; Committee Member: William Coo

Scholarly Materials And Research @ Georgia Tech