
    High Performance Issues in Image Processing and Computer Vision

    Typical image processing and computer vision tasks found in industrial, medical, and military applications require real-time solutions. These requirements have motivated the design of many parallel architectures and algorithms. Recently, a new architecture called the reconfigurable mesh has been proposed. This thesis addresses a number of problems in image processing and computer vision on reconfigurable meshes. We first show that a number of low-level descriptors of a digitized image, such as the perimeter, area, histogram, and median row, can be reduced to computing the sum of all the integers in a matrix, which in turn can be reduced to computing the prefix sums of a binary sequence and the prefix sums of an integer sequence. We then propose a new computational paradigm for reconfigurable meshes: identifying an entity by a bus and performing computations on the bus to obtain properties of the entity. Using the new paradigm, we solve a number of mid-level vision tasks, including the Hough transform and component labeling. Finally, a VLSI-optimal constant-time algorithm for computing the convex hull of a set of planar points is presented, based on a VLSI-optimal constant-time sorting algorithm. As by-products, two basic data movement techniques (computing the prefix sums of a binary sequence and computing the prefix maxima of a sequence of real numbers) and a VLSI-optimal constant-time sorting algorithm have been developed. These by-products are interesting in their own right. In addition, they can be exploited to obtain efficient algorithms for a number of computational problems.
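    The reduction described above can be illustrated with a small sequential sketch: the area (and, analogously, the perimeter and histogram) of a binary image is a sum of matrix entries, which is exactly the last prefix sum of the flattened 0/1 sequence. The function names and the 0/1 pixel encoding below are illustrative assumptions, not the thesis's notation, and nothing here models the reconfigurable-mesh buses themselves.

        # Sequential illustration only: descriptors reduce to sums of 0/1 matrices,
        # and such a sum is the final entry of the prefix sums of the flattened sequence.
        from itertools import accumulate

        def area_via_prefix_sums(image):
            """Area (number of 1-pixels) = last prefix sum of the flattened binary sequence."""
            flat = [pixel for row in image for pixel in row]
            prefix = list(accumulate(flat))   # prefix sums of a binary sequence
            return prefix[-1] if prefix else 0

        def histogram_via_indicator_sums(image, levels):
            """Each histogram bin is again the sum of a 0/1 indicator matrix."""
            return [sum(1 for row in image for p in row if p == g) for g in range(levels)]

        if __name__ == "__main__":
            img = [[0, 1, 1],
                   [1, 0, 1],
                   [0, 0, 1]]
            print(area_via_prefix_sums(img))             # 5
            print(histogram_via_indicator_sums(img, 2))  # [4, 5]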

    Optimistic Parallelization of Floating-Point Accumulation

    Floating-point arithmetic is notoriously non-associative due to the limited-precision representation, which demands that intermediate values be rounded to fit in the available precision. The resulting cyclic dependency in floating-point accumulation inhibits parallelization of the computation, including efficient use of pipelining. In practice, however, we observe that floating-point operations are "mostly" associative. This observation can be exploited to parallelize floating-point accumulation using a form of optimistic concurrency. In this scheme, we first compute an optimistic associative approximation to the sum and then relax the computation by iteratively propagating errors until the correct sum is obtained. We map this computation to a network of 16 statically scheduled, pipelined, double-precision floating-point adders on the Virtex-4 LX160 (-12) device, where each floating-point adder runs at 296 MHz and has a pipeline depth of 10. On this 16-PE design, we demonstrate an average speedup of 6× with randomly generated data and 3-7× with summations extracted from Conjugate Gradient benchmarks.
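    A rough software analogue of this idea is sketched below; it is not the authors' FPGA pipeline, and it uses the standard TwoSum error-free transformation (an assumption on my part) rather than their hardware error-propagation network. A first pass produces an approximate sum, and the captured rounding errors are folded back in until they vanish.

        # Floating-point addition is not associative, so reordering changes the result;
        # the sketch recovers the exact sum by iteratively propagating rounding errors.
        def two_sum(a, b):
            """Knuth's TwoSum: returns (s, e) with s = fl(a + b) and a + b = s + e exactly."""
            s = a + b
            bb = s - a
            aa = s - bb
            return s, (a - aa) + (b - bb)

        def sum_with_error_propagation(values):
            """Approximate first, then relax: re-add captured errors until none remain.

            Converges for well-behaved finite inputs (no overflow or NaN)."""
            terms = list(values)
            while True:
                total, errors = 0.0, []
                for v in terms:
                    total, e = two_sum(total, v)
                    if e != 0.0:
                        errors.append(e)
                if not errors:
                    return total
                terms = [total] + errors

        if __name__ == "__main__":
            data = [1e16, 1.0, -1e16, 1.0]
            print(sum(data))                         # 1.0 -- one unit is lost to rounding
            print(sum_with_error_propagation(data))  # 2.0 -- exact sum recovered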

    On additive properties of sets defined by the Thue-Morse word

    In this paper we study some additive properties of subsets of the set $\mathbb{N}$ of positive integers: a subset $A$ of $\mathbb{N}$ is called $k$-summable (where $k \in \mathbb{N}$) if $A$ contains $\big\{\sum_{n\in F}x_n \mid \emptyset \neq F\subseteq \{1,2,\ldots,k\}\big\}$ for some $k$-term sequence of natural numbers $x_1<x_2<\cdots<x_k$. We say $A \subseteq \mathbb{N}$ is finite FS-big if $A$ is $k$-summable for each positive integer $k$. We say $A \subseteq \mathbb{N}$ is infinite FS-big if for each positive integer $k$, $A$ contains $\{\sum_{n\in F}x_n \mid \emptyset\neq F\subseteq \mathbb{N} \text{ and } \#F\leq k\}$ for some infinite sequence of natural numbers $x_1<x_2<\cdots$. We say $A\subseteq \mathbb{N}$ is an IP-set if $A$ contains $\{\sum_{n\in F}x_n \mid \emptyset\neq F\subseteq \mathbb{N} \text{ and } \#F<\infty\}$ for some infinite sequence of natural numbers $x_1<x_2<\cdots$. By the Finite Sums Theorem [5], the collection of all IP-sets is partition regular, i.e., if $A$ is an IP-set then for any finite partition of $A$, one cell of the partition is an IP-set. Here we prove that the collection of all finite FS-big sets is also partition regular. Let $\mathbf{T} = 011010011001011010\ldots$ denote the Thue-Morse word, fixed by the morphism $0\mapsto 01$ and $1\mapsto 10$. For each factor $u$ of $\mathbf{T}$ we consider the set $\mathbf{T}\big|_u\subseteq \mathbb{N}$ of all occurrences of $u$ in $\mathbf{T}$. In this note we characterize the sets $\mathbf{T}\big|_u$ in terms of the additive properties defined above. Using the Thue-Morse word we show that the collection of all infinite FS-big sets is not partition regular.
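    The objects above are easy to experiment with numerically. A small sketch (0-based positions and a finite prefix of the word are simplifying assumptions) generates the Thue-Morse word by iterating the morphism and lists the occurrence set of a factor u within that prefix:

        # Generate a prefix of the Thue-Morse word via the morphism 0 -> 01, 1 -> 10,
        # and compute the occurrence set of a factor u within that prefix.
        def thue_morse_prefix(iterations=8):
            word = "0"
            for _ in range(iterations):
                word = "".join("01" if c == "0" else "10" for c in word)
            return word

        def occurrences(word, factor):
            return [i for i in range(len(word) - len(factor) + 1) if word.startswith(factor, i)]

        if __name__ == "__main__":
            t = thue_morse_prefix()
            print(t[:18])                     # 011010011001011010
            print(occurrences(t, "010")[:8])  # first few positions where "010" occurs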

    The Alternating Stock Size Problem and the Gasoline Puzzle

    Given a set S of integers whose sum is zero, consider the problem of finding a permutation of these integers such that: (i) all prefix sums of the ordering are nonnegative, and (ii) the maximum value of a prefix sum is minimized. Kellerer et al. referred to this problem as the "Stock Size Problem" and showed that it can be approximated to within 3/2. They also showed that an approximation ratio of 2 can be achieved via several simple algorithms. We consider a related problem, which we call the "Alternating Stock Size Problem", where the numbers of positive and negative integers in the input set S are equal. The problem is the same as above, but we are additionally required to alternate the positive and negative numbers in the output ordering. This problem also has several simple 2-approximations. We show that it can be approximated to within 1.79. Then we show that this problem is closely related to an optimization version of the gasoline puzzle due to Lovász, in which we want to minimize the size of the gas tank necessary to go around the track. We present a 2-approximation for this problem, using a natural linear programming relaxation whose feasible solutions are doubly stochastic matrices. Our novel rounding algorithm is based on a transformation that yields another doubly stochastic matrix with special properties, from which we can extract a suitable permutation.
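    To make the objective concrete, a tiny brute-force sketch (feasible only for very small instances, and not one of the approximation algorithms discussed above) searches all orderings of a zero-sum set, keeps those with nonnegative prefix sums, and returns one minimizing the maximum prefix sum:

        # Brute force over permutations: feasibility = all prefix sums nonnegative,
        # objective = the peak prefix sum, to be minimized.
        from itertools import accumulate, permutations

        def best_stock_ordering(values):
            assert sum(values) == 0, "the stock size problem assumes a zero-sum input"
            best, best_peak = None, float("inf")
            for perm in permutations(values):
                prefixes = list(accumulate(perm))
                if min(prefixes) >= 0 and max(prefixes) < best_peak:
                    best, best_peak = perm, max(prefixes)
            return best, best_peak

        if __name__ == "__main__":
            print(best_stock_ordering([3, 2, -1, -4]))  # ((3, -1, 2, -4), 4)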

    Succinct Indexable Dictionaries with Applications to Encoding k-ary Trees, Prefix Sums and Multisets

    We consider the indexable dictionary problem, which consists of storing a set $S \subseteq \{0,\ldots,m-1\}$ for some integer $m$, while supporting the operations of $\mathrm{rank}(x)$, which returns the number of elements in $S$ that are less than $x$ if $x \in S$, and $-1$ otherwise; and $\mathrm{select}(i)$, which returns the $i$-th smallest element in $S$. We give a data structure that supports both operations in $O(1)$ time on the RAM model and requires $\mathcal{B}(n,m) + o(n) + O(\lg \lg m)$ bits to store a set of size $n$, where $\mathcal{B}(n,m) = \lceil \lg \binom{m}{n} \rceil$ is the minimum number of bits required to store any $n$-element subset from a universe of size $m$. Previous dictionaries taking this space only supported (yes/no) membership queries in $O(1)$ time. In the cell probe model we can remove the $O(\lg \lg m)$ additive term in the space bound, answering a question raised by Fich and Miltersen, and Pagh. We present extensions and applications of our indexable dictionary data structure, including: an information-theoretically optimal representation of a $k$-ary cardinal tree that supports standard operations in constant time; a representation of a multiset of size $n$ from $\{0,\ldots,m-1\}$ in $\mathcal{B}(n,m+n) + o(n)$ bits that supports (appropriate generalizations of) $\mathrm{rank}$ and $\mathrm{select}$ operations in constant time; and a representation of a sequence of $n$ non-negative integers summing up to $m$ in $\mathcal{B}(n,m+n) + o(n)$ bits that supports prefix sum queries in constant time. Comment: Final version of SODA 2002 paper; supersedes Leicester Tech report 2002/1.
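    The following plain, non-succinct reference implementation only pins down the rank/select semantics used above (select is taken to be 1-based here, an assumption); the paper's actual contribution is answering these queries in O(1) time within B(n,m) + o(n) + O(lg lg m) bits, which this sketch makes no attempt to match.

        # Reference semantics only: a sorted list answers rank/select in O(log n)/O(1)
        # time using Theta(n log m) bits, far from the succinct bounds in the paper.
        import bisect

        class IndexableDictionary:
            def __init__(self, elements):
                self.elems = sorted(set(elements))

            def rank(self, x):
                """Number of elements of S smaller than x if x is in S, else -1."""
                i = bisect.bisect_left(self.elems, x)
                return i if i < len(self.elems) and self.elems[i] == x else -1

            def select(self, i):
                """The i-th smallest element of S (1-based, by assumption)."""
                return self.elems[i - 1]

        if __name__ == "__main__":
            d = IndexableDictionary([3, 9, 14, 27])
            print(d.rank(14), d.rank(10), d.select(2))  # 2 -1 9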

    Canonical Trees, Compact Prefix-free Codes and Sums of Unit Fractions: A Probabilistic Analysis

    For fixed $t\ge 2$, we consider the class of representations of $1$ as a sum of unit fractions whose denominators are powers of $t$, or equivalently the class of canonical compact $t$-ary Huffman codes, or equivalently rooted $t$-ary plane "canonical" trees. We study the probabilistic behaviour of the height (the limit distribution is shown to be normal), the number of distinct summands (normal distribution), the path length (normal distribution), the width (main term of the expectation and a concentration property), and the number of leaves at maximum distance from the root (discrete distribution).
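    The correspondence in the first sentence can be made concrete with a tiny enumeration; the depth bound h below is an artificial cut-off added so the search is finite, and each tuple records how many summands 1/t^j a representation uses.

        # Enumerate representations of 1 as sums of unit fractions 1/t^j with j <= h;
        # each such multiset of unit fractions is the length profile of a compact
        # t-ary code satisfying Kraft's equality.
        from fractions import Fraction

        def representations(t, h, level=1, remaining=Fraction(1)):
            """Yield tuples (a_1, ..., a_h) with sum_j a_j / t**j == 1 and a_j >= 0."""
            if level > h:
                if remaining == 0:
                    yield ()
                return
            unit = Fraction(1, t ** level)
            for count in range(int(remaining / unit) + 1):
                for rest in representations(t, h, level + 1, remaining - count * unit):
                    yield (count,) + rest

        if __name__ == "__main__":
            for rep in representations(t=2, h=3):
                print(rep)   # e.g. (2, 0, 0) encodes 1 = 1/2 + 1/2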

    Dynamic Relative Compression, Dynamic Partial Sums, and Substring Concatenation

    Given a static reference string $R$ and a source string $S$, a relative compression of $S$ with respect to $R$ is an encoding of $S$ as a sequence of references to substrings of $R$. Relative compression schemes are a classic model of compression and have recently proved very successful for compressing highly-repetitive massive data sets such as genomes and web-data. We initiate the study of relative compression in a dynamic setting where the compressed source string $S$ is subject to edit operations. The goal is to maintain the compressed representation compactly, while supporting edits and allowing efficient random access to the (uncompressed) source string. We present new data structures that achieve optimal time for updates and queries while using space linear in the size of the optimal relative compression, for nearly all combinations of parameters. We also present solutions for restricted and extended sets of updates. To achieve these results, we revisit the dynamic partial sums problem and the substring concatenation problem. We present new optimal or near optimal bounds for these problems. Plugging in our new results, we also immediately obtain new bounds for the string indexing for patterns with wildcards problem and the dynamic text and static pattern matching problem.
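    As a minimal static illustration of the compression model only (the dynamic maintenance and fast random access that are the paper's contribution are not attempted), a greedy parse encodes S as (position, length) references into R:

        # Greedy relative compression of S against a static reference R, plus decoding.
        def relative_compress(R, S):
            refs, i = [], 0
            while i < len(S):
                best_pos, best_len = 0, 0
                for j in range(len(R)):                      # naive longest-match search
                    k = 0
                    while i + k < len(S) and j + k < len(R) and R[j + k] == S[i + k]:
                        k += 1
                    if k > best_len:
                        best_pos, best_len = j, k
                if best_len == 0:
                    raise ValueError("S contains a character that never occurs in R")
                refs.append((best_pos, best_len))
                i += best_len
            return refs

        def decompress(R, refs):
            return "".join(R[p:p + l] for (p, l) in refs)

        if __name__ == "__main__":
            R, S = "abracadabra", "cadabraabra"
            refs = relative_compress(R, S)
            print(refs)                          # [(4, 7), (0, 4)]
            print(decompress(R, refs) == S)      # True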

    Distributed Sparse Cut Approximation

    We study the problem of computing a sparse cut in an undirected network graph G=(V,E). We measure the sparsity of a cut (S, V\S) by its conductance phi(S), i.e., by the ratio between the number of edges crossing the cut and the sum of the degrees on the smaller of the two sides. We present an efficient distributed algorithm to compute a cut of low conductance. Specifically, given two parameters b and phi, if there exists a cut of balance at least b and conductance at most phi, our algorithm outputs a cut of balance at least b/2 and conductance at most ~O(sqrt{phi}), where ~O(.) hides polylogarithmic factors in the number of nodes n. Our distributed algorithm works in the CONGEST model, i.e., it only requires sending messages of size at most O(log(n)) bits. The time complexity of the algorithm is ~O(D + 1/b*phi), where D is the diameter of G. This is a significant improvement over a result by Das Sarma et al. [ICDCN 2015], where it is shown that a cut of the same quality can be computed in time ~O(n + 1/b*phi). The improved running time is in particular achieved by devising and applying an efficient distributed algorithm for the all-prefix-sums problem in a distributed search tree. This algorithm, which is based on the classic parallel all-prefix-sums algorithm, might be of independent interest.
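    The conductance measure used above is straightforward to compute directly for a given cut; the sketch below does exactly that on a small example graph (it is not the distributed algorithm of the paper).

        # phi(S) = (# edges crossing the cut) / min(vol(S), vol(V \ S)),
        # where vol(X) is the sum of the degrees of the vertices in X.
        def conductance(edges, S, V):
            S = set(S)
            crossing = sum(1 for (u, v) in edges if (u in S) != (v in S))
            deg = {v: 0 for v in V}
            for (u, v) in edges:
                deg[u] += 1
                deg[v] += 1
            vol_S = sum(deg[v] for v in S)
            vol_rest = sum(deg[v] for v in V if v not in S)
            return crossing / min(vol_S, vol_rest)

        if __name__ == "__main__":
            V = [0, 1, 2, 3, 4, 5]
            edges = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3), (2, 3)]  # two triangles + a bridge
            print(conductance(edges, S=[0, 1, 2], V=V))   # 1/7 ~ 0.143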