48 research outputs found
Duel and sweep algorithm for order-preserving pattern matching
Given a text and a pattern over an alphabet, the classic exact
matching problem searches for all occurrences of the pattern in the text.
Unlike the exact matching problem, order-preserving pattern matching (OPPM)
considers the relative order of elements, rather than their real values. In
this paper, we propose an efficient algorithm for the OPPM problem using the
"duel-and-sweep" paradigm. Our algorithm runs in time in
general and time under an assumption that the characters in a string
can be sorted in linear time with respect to the string size. We also perform
experiments and show that our algorithm is faster than the KMP-based algorithm.
Last, we introduce two-dimensional order-preserving pattern matching and
give a duel-and-sweep algorithm that runs in time for the duel stage and
time for the sweeping stage with preprocessing time.
Comment: 13 pages, 5 figures
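To make the matching criterion in the abstract above concrete, here is a minimal sketch of order-preserving matching by brute force. It is not the duel-and-sweep algorithm from the paper; it only illustrates what counts as an occurrence. The function names `rank_signature` and `naive_oppm` are hypothetical, and treating equal values as sharing a rank is an assumption made for this example.

```python
def rank_signature(window):
    """Return the relative-order signature of a sequence.

    Equal values share a rank (an assumption of this sketch), so
    [10, 30, 20] and [1, 9, 5] both map to (0, 2, 1).
    """
    ranks = {value: rank for rank, value in enumerate(sorted(set(window)))}
    return tuple(ranks[value] for value in window)


def naive_oppm(text, pattern):
    """Return every start index i where text[i:i+m] has the same
    relative order as pattern. Brute force: each window is re-ranked."""
    m = len(pattern)
    target = rank_signature(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if rank_signature(text[i:i + m]) == target
    ]


# The pattern "low, high, middle" occurs at positions 0 and 4.
matches = naive_oppm([10, 30, 20, 40, 15, 50, 25], [1, 3, 2])
print(matches)  # [0, 4]
```

Each window is sorted from scratch, so this baseline is far slower than the bounds the abstract refers to. It is useful mainly as a reference implementation for checking a faster matcher.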
Empirical Speedup Study of Truly Parallel Data Compression
We present an empirical study of novel work-optimal parallel
algorithms for Burrows-Wheeler compression and decompression
of strings over a constant alphabet. To validate
these theoretical algorithms, we implement them on the experimental
XMT computing platform developed especially
for supporting parallel algorithms at the University of Maryland.
We show speedups of up to 25x for compression, and
13x for decompression, versus bzip2, the de facto standard
implementation of Burrows-Wheeler compression. Unlike
existing approaches, which assign an entire (e.g., 900KB)
block to a processor that processes the block serially, our
approach is “truly parallel” as it processes in parallel the
entire input. Besides the theoretical interest in solving the
“right” problem, the importance of data compression speed
for small inputs even at great expense of quality (compressed
size of data) is demonstrated by the introduction of Google’s
Snappy for MapReduce. Perhaps surprisingly, we show feasibility
of holding on to quality, while even beating Snappy
on speed.
In turn, this work adds new evidence in support of the
XMT/PRAM thesis: that an XMT-like many-core hardware/
software platform may be necessary for enabling general-purpose
parallel computing. Comparison of our results to recently
published work suggests 70x improvement over what
current commercial parallel hardware can achieve.
NSF grants CCF-0811504 and CNS116185
Parallel Algorithms for Burrows-Wheeler Compression and Decompression
We present work-optimal PRAM algorithms for Burrows-Wheeler compression
and decompression of strings over a constant alphabet. For a string of
length n, the depth of the compression algorithm is O(log^2 n), and the depth
of the corresponding decompression algorithm is O(log n). These appear
to be the first polylogarithmic-time work-optimal parallel algorithms for any
standard lossless compression scheme.
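For readers unfamiliar with the transform at the core of this compression scheme, the following is a sequential sketch of the Burrows-Wheeler transform and its inverse. It is not the PRAM algorithm described in the abstract. The names `bwt` and `inverse_bwt` and the use of a NUL character as end-of-string sentinel are assumptions made for illustration.

```python
def bwt(s, sentinel="\0"):
    """Naive Burrows-Wheeler transform: last column of the sorted rotations.

    The sentinel (assumed absent from s) is smaller than every other
    character and marks the end of the string.
    """
    assert sentinel not in s
    s += sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last):
    """Invert the transform by following the last-to-first mapping."""
    n = len(last)
    # A stable sort of positions by character pairs each entry of the
    # last column with its row in the (sorted) first column.
    order = sorted(range(n), key=lambda i: last[i])
    lf = [0] * n
    for first_row, last_row in enumerate(order):
        lf[last_row] = first_row
    # Row 0 starts with the sentinel, so its last character is the final
    # character of the original string; walk backwards from there.
    out = []
    row = 0
    for _ in range(n - 1):
        out.append(last[row])
        row = lf[row]
    return "".join(reversed(out))


encoded = bwt("banana")
print(repr(encoded))         # 'annb\x00aa'
print(inverse_bwt(encoded))  # banana
```

The decoding loop follows one link of a chain per character, which gives a sense of why a pointer-chasing step of this kind is a natural candidate for list ranking in a parallel setting.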
The algorithms for the individual stages of compression and decompression
may also be of independent interest: 1. a novel O(log n)-time, O(n)-work
PRAM algorithm for Huffman decoding; 2. original insights into the stages of
the BW compression and decompression problems, bringing out parallelism
that was not readily apparent, allowing them to be mapped to elementary
parallel routines that have O(log n)-time, O(n)-work solutions, such as: (i)
prefix-sums problems with an appropriately-defined associative binary operator
for several stages, and (ii) list ranking for the final stage of decompression.
NSF grant CCF-081150
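The prefix-sums routine mentioned in the abstract above works for any associative binary operator, not only addition. The sketch below simulates the standard pair, recurse, and expand scheme sequentially; on a PRAM each list comprehension or loop level would be one parallel step. The function name `inclusive_scan` is hypothetical, and this is a generic textbook scheme rather than the operators defined in the paper.

```python
from operator import add


def inclusive_scan(items, op):
    """Inclusive prefix 'sums' of items under an associative operator op.

    Sequential simulation of a recursive parallel scan: combine adjacent
    pairs, scan the half-length list, then fill in the remaining positions.
    Operand order is preserved, so op need not be commutative.
    """
    n = len(items)
    if n <= 1:
        return list(items)
    # Combine adjacent pairs (independent operations).
    pairs = [op(items[i], items[i + 1]) for i in range(0, n - 1, 2)]
    scanned = inclusive_scan(pairs, op)
    out = [None] * n
    out[0] = items[0]
    for i in range(1, n):
        if i % 2 == 1:
            # Odd positions end exactly on a pair boundary.
            out[i] = scanned[i // 2]
        else:
            # Even positions extend the previous pair prefix by one item.
            out[i] = op(scanned[i // 2 - 1], items[i])
    return out


print(inclusive_scan([1, 2, 3, 4, 5], add))  # [1, 3, 6, 10, 15]
print(inclusive_scan(list("abc"), add))      # ['a', 'ab', 'abc']
```

String concatenation is associative but not commutative, so the second call checks that the scheme relies only on associativity.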
Optimal (Randomized) Parallel Algorithms in the Binary-Forking Model
In this paper we develop optimal algorithms in the binary-forking model for a
variety of fundamental problems, including sorting, semisorting, list ranking,
tree contraction, range minima, and ordered set union, intersection and
difference. In the binary-forking model, tasks can only fork into two child
tasks, but can do so recursively and asynchronously. The tasks share memory,
supporting reads, writes and test-and-sets. Costs are measured in terms of work
(total number of instructions), and span (longest dependence chain).
The binary-forking model is meant to capture both algorithm performance and
algorithm-design considerations on many existing multithreaded languages, which
are also asynchronous and rely on binary forks either explicitly or under the
covers. In contrast to the widely studied PRAM model, it does not assume
arbitrary-way forks nor synchronous operations, both of which are hard to
implement in modern hardware. While optimal PRAM algorithms are known for the
problems studied herein, it turns out that arbitrary-way forking and strict
synchronization are powerful, if unrealistic, capabilities. Natural simulations
of these PRAM algorithms in the binary-forking model (i.e., implementations in
existing parallel languages) incur an overhead in span. This
paper explores techniques for designing optimal algorithms when limited to
binary forking and assuming asynchrony. All algorithms described in this paper
are the first algorithms with optimal work and span in the binary-forking
model. Most of the algorithms are simple. Many are randomized.
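The cost measures described in the abstract above can be illustrated with a small sequential simulation of a computation that only ever forks into two child tasks. The sketch sums a list by recursive halving and reports work and span alongside the result. The function name `fork_join_sum` and the unit-cost accounting (one unit per leaf and one per fork/join node) are assumptions for illustration, not definitions taken from the paper.

```python
def fork_join_sum(values, lo=0, hi=None):
    """Sum values[lo:hi] by recursive binary forking.

    Returns (total, work, span), where work counts every task node and
    span is the longest chain of dependent nodes. No real threads are
    created; the two recursive calls stand in for the two forked tasks.
    """
    if hi is None:
        hi = len(values)
    if hi - lo == 0:
        return 0, 1, 1
    if hi - lo == 1:
        return values[lo], 1, 1
    mid = (lo + hi) // 2
    # The two halves are independent, so they could run as forked children.
    left_total, left_work, left_span = fork_join_sum(values, lo, mid)
    right_total, right_work, right_span = fork_join_sum(values, mid, hi)
    # Work adds up across children; span follows the slower child.
    return (
        left_total + right_total,
        left_work + right_work + 1,
        max(left_span, right_span) + 1,
    )


print(fork_join_sum(list(range(8))))  # (28, 15, 4)
```

For 8 inputs the recursion tree has 8 leaves and 7 internal nodes, giving work 15, while the span of 4 is the height of the tree. Work grows linearly with the input and span logarithmically under this accounting.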
A many-analysts approach to the relation between religiosity and well-being
The relation between religiosity and well-being is one of the most researched topics in the psychology of religion, yet the directionality and robustness of the effect remain debated. Here, we adopted a many-analysts approach to assess the robustness of this relation based on a new cross-cultural dataset (N=10,535 participants from 24 countries). We recruited 120 analysis teams to investigate (1) whether religious people self-report higher well-being, and (2) whether the relation between religiosity and self-reported well-being depends on perceived cultural norms of religion (i.e., whether it is considered normal and desirable to be religious in a given country). In a two-stage procedure, the teams first created an analysis plan and then executed their planned analysis on the data. For the first research question, all but 3 teams reported positive effect sizes with credible/confidence intervals excluding zero (median reported β=0.120). For the second research question, this was the case for 65% of the teams (median reported β=0.039). While most teams applied (multilevel) linear regression models, there was considerable variability in the choice of items used to construct the independent variables, the dependent variable, and the included covariates.
A Many-analysts Approach to the Relation Between Religiosity and Well-being
The relation between religiosity and well-being is one of the most researched topics in the psychology of religion, yet the directionality and robustness of the effect remain debated. Here, we adopted a many-analysts approach to assess the robustness of this relation based on a new cross-cultural dataset (N = 10,535 participants from 24 countries). We recruited 120 analysis teams to investigate (1) whether religious people self-report higher well-being, and (2) whether the relation between religiosity and self-reported well-being depends on perceived cultural norms of religion (i.e., whether it is considered normal and desirable to be religious in a given country). In a two-stage procedure, the teams first created an analysis plan and then executed their planned analysis on the data. For the first research question, all but 3 teams reported positive effect sizes with credible/confidence intervals excluding zero (median reported β = 0.120). For the second research question, this was the case for 65% of the teams (median reported β = 0.039). While most teams applied (multilevel) linear regression models, there was considerable variability in the choice of items used to construct the independent variables, the dependent variable, and the included covariates.