229 research outputs found
Achieving my dream: bringing LSE MPA students to compete with MBA students on impact analysis for business
I dream big when it comes to my aspiration to lead and deliver social change. I am thankful to the LSE Master of Public Administration (MPA) programme which supported my hope of leading a team of MPA students to represent LSE in the MBA Impact Investing Network Training (MIINT) at the Wharton Business School
An optimal parallel connectivity algorithm
A synchronized parallel algorithm of depth O(n²/p) for p (≤ n²/log² n) processors is given for the problem of computing connected components of an undirected graph. The speed-up of this algorithm is optimal in the sense that the depth of the algorithm is of the order of the running time of the fastest known sequential algorithm over the number of processors used.
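To make the connectivity problem above concrete, here is a minimal sketch of a synchronous, round-based labelling scheme, emulated serially. It is not the algorithm of the paper and does not achieve its depth bound; the function name `connected_components` and the adjacency-matrix input are assumptions chosen for illustration.

```python
def connected_components(adj):
    """Label components of an undirected graph by synchronous min-label propagation.

    adj is an n x n symmetric 0/1 adjacency matrix (an assumed input format).
    Returns a list where labels[v] is the smallest vertex index in v's component.
    """
    n = len(adj)
    labels = list(range(n))
    changed = True
    while changed:
        changed = False
        # One synchronous round: every vertex reads only the labels of the
        # previous round, as parallel processors would.
        new_labels = labels[:]
        for v in range(n):
            for u in range(n):
                if adj[v][u] and labels[u] < new_labels[v]:
                    new_labels[v] = labels[u]
                    changed = True
        labels = new_labels
    return labels
```

Each round touches all n² matrix entries, and the number of rounds grows with the graph diameter, so this only illustrates the synchronous style, not an optimal speed-up.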
Empirical Challenge for NC Theory
Horn-satisfiability or Horn-SAT is the problem of deciding whether a
satisfying assignment exists for a Horn formula, a conjunction of clauses each
with at most one positive literal (also known as Horn clauses). It is a
well-known P-complete problem, which implies that unless P = NC, it is a hard
problem to parallelize. In this paper, we empirically show that, under a known
simple random model for generating the Horn formula, the ratio of
hard-to-parallelize instances (closer to the worst-case behavior) is
infinitesimally small. We show that the depth of a parallel algorithm for
Horn-SAT is polylogarithmic on average, for almost all instances, while keeping
the work linear. This challenges theoreticians and programmers to look beyond
worst-case analysis and come up with practical algorithms coupled with
respective performance guarantees.
Comment: 10 pages, 5 figures. Accepted at HOPC'2
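The notion of depth in the abstract above can be illustrated with round-based unit propagation: in each round, every clause whose body is already satisfied fires at once, and the number of rounds is the depth. The sketch below is an assumption-laden illustration, not the paper's algorithm: the clause encoding `(body, head)` and the name `horn_sat_rounds` are hypothetical, and the work here is not linear because every round rescans all clauses.

```python
def horn_sat_rounds(clauses):
    """Decide Horn-SAT by synchronous unit propagation.

    Each clause is (body, head): body is a list of variables that appear
    negated, head is the single positive variable or None if there is none.
    Returns (satisfiable, true_vars, rounds).
    """
    true_vars = set()
    rounds = 0
    while True:
        # All clauses whose body is fully true fire together in one round.
        newly = {
            head
            for body, head in clauses
            if head is not None
            and head not in true_vars
            and all(b in true_vars for b in body)
        }
        if not newly:
            break
        true_vars |= newly
        rounds += 1
    # A clause with no positive literal is violated if its whole body is true.
    satisfiable = not any(
        head is None and all(b in true_vars for b in body)
        for body, head in clauses
    )
    return satisfiable, true_vars, rounds
```

For a chain such as `1`, `1 -> 2`, `1 and 2 -> 3`, the propagation needs three rounds, which is the worst-case shape that the abstract reports to be rare under its random model.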
Duel and sweep algorithm for order-preserving pattern matching
Given a text T and a pattern P over alphabet Σ, the classic exact
matching problem searches for all occurrences of pattern P in text T.
Unlike the exact matching problem, order-preserving pattern matching (OPPM)
considers the relative order of elements, rather than their real values. In
this paper, we propose an efficient algorithm for the OPPM problem using the
"duel-and-sweep" paradigm. Our algorithm runs in time in
general and time under an assumption that the characters in a string
can be sorted in linear time with respect to the string size. We also perform
experiments and show that our algorithm is faster than the KMP-based algorithm.
Last, we introduce two-dimensional order-preserving pattern matching and
give a duel and sweep algorithm that runs in time for duel stage and
time for sweeping time with preprocessing time.
Comment: 13 pages, 5 figure
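To show what an order-preserving match is, the sketch below implements a naive baseline that compares the rank pattern of every text window with that of the pattern. It is not the duel-and-sweep algorithm of the abstract and is much slower; the names `rank_signature` and `oppm_naive` are hypothetical.

```python
def rank_signature(seq):
    """Replace each element by its dense rank among the distinct values of seq."""
    ranks = {value: r for r, value in enumerate(sorted(set(seq)))}
    return tuple(ranks[value] for value in seq)


def oppm_naive(text, pattern):
    """Return start positions where a text window has the same relative order
    as the pattern (ties included), checked window by window."""
    m = len(pattern)
    target = rank_signature(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if rank_signature(text[i:i + m]) == target
    ]
```

With `text = [10, 20, 15, 30, 40, 35, 5]` and `pattern = [1, 3, 2]`, the windows starting at positions 0 and 3 match, because both follow the shape low, high, middle even though their values differ from the pattern's.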
Granularity of parallel memories
Consider algorithms which are designed for shared memory models of parallel computation in which processors are allowed to have fairly unrestricted access patterns to the shared memory. General fast simulations of such algorithms by parallel machines in which the shared memory is organized in modules where only one cell of each module can be accessed at a time are proposed. The paper provides a comprehensive study of the problem. The solution involves three stages:
(a) Before a simulation, distribute randomly the memory addresses among the memory modules.
(b) Keep several copies of each address and assign memory requests of processors to the "right" copies at any time.
(c) Satisfy these assigned memory requests according to specifications of the parallel machine
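Stages (a) and (b) above can be sketched as follows. This is an illustration under stated assumptions, not the paper's scheme: the greedy least-loaded rule for choosing the "right" copy, the fixed seed, and the names `place_copies` and `assign_requests` are all hypothetical. Stage (c) depends on the target machine and is omitted.

```python
import random


def place_copies(num_addresses, num_modules, copies, seed=0):
    """Stage (a): give every address `copies` distinct modules chosen at random.

    Requires copies <= num_modules.
    """
    rng = random.Random(seed)
    return [rng.sample(range(num_modules), copies) for _ in range(num_addresses)]


def assign_requests(requests, placement, num_modules):
    """Stage (b), as an assumed greedy rule: serve each requested address
    from whichever of its copies sits in the currently least-loaded module.

    Returns (assignment, max_load), where max_load is the largest number of
    requests that any single module must serve.
    """
    load = [0] * num_modules
    assignment = {}
    for addr in requests:
        module = min(placement[addr], key=lambda m: load[m])
        load[module] += 1
        assignment[addr] = module
    return assignment, max(load)
```

Because a module serves one cell at a time, `max_load` is a rough proxy for how long one simulated step takes; keeping several copies gives the assignment room to lower it.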
Project for Developing Computer Science Agenda(s) for High-Performance Computing: An Organizer's Summary
Designing a coherent agenda for the implementation of the High
Performance Computing (HPC) program is a nontrivial technical challenge.
Many computer science and engineering researchers in the area of HPC, who
are affiliated with U.S. institutions, have been invited to contribute
their agendas. We have made a considerable effort to give many in that
research community the opportunity to write a position paper. This
explains why we view the project as placing a mirror in front of the
community, and hope that the mirror indeed reflects many of the opinions
on the topic.
The current paper is an organizer's summary and represents his reading
of the position papers. This summary is his sole responsibility. It is
respectfully submitted to the NSF.
(Also cross-referenced as UMIACS-TR-94-129)
An Immediate Concurrent Execution (ICE) Abstraction Proposal for Many-Cores
Settling on a simple abstraction that programmers aim at, and hardware and software systems people enable and support, is an important step towards convergence to a robust many-core platform.
The current paper: (i) advocates incorporating a quest for the simplest possible abstraction in the debate on the future of many-core computers, (ii) suggests “immediate concurrent execution (ICE)” as a new abstraction, and (iii) argues that an XMT architecture is one possible demonstration of ICE providing an easy-to-program general-purpose many-core platform
Can Parallel Algorithms Enhance Serial Implementation?
Consider the serial emulation of a parallel algorithm. The thesis
presented in this paper is rather broad. It suggests that such a serial
emulation has the potential advantage of running on a serial machine
faster than a standard serial algorithm for the same problem.
The main concrete observation is very simple: just before the serial
emulation of a round of the parallel algorithm begins, the whole list of
memory addresses needed during this round is readily available; and, we
can start fetching all these addresses from secondary memories at this time.
This permits prefetching the data that will be needed in the next "time
window", perhaps by means of pipelining; these data will then be ready at
the fast memories when requested by the CPU. The possibility of
distributing memory addresses (or memory fetch units) at random over
memory modules, as has been proposed in the context of implementing the
parallel-random-access machine (PRAM) design space, is discussed.
This work also suggests that a multi-stage effort to build a parallel
machine may start with "parallel memories" and serial processing,
deferring parallel processing to a later stage. The general approach has
the following advantage: a user-friendly parallel programming language
can be used already in its first stage. This is in contrast to a practice
of compromising user-friendliness of parallel computer interfaces (i.e.,
parallel programming languages), and may offer a way for alleviating a
so-called "parallel software crisis".
It is too early to reach conclusions regarding the significance of the
thesis of this paper. Preliminary experimental results with respect to
the fundamental and practical problem of constructing suffix trees
indicate that drastic improvements in running time might be possible.
Serious attempts to follow it up are needed to determine its usefulness.
Parts of this paper are intentionally written in an informal way,
suppressing issues that will have to be resolved in the context of a
concrete implementation. The intention is to stimulate debate and provoke
suggestions and other specific approaches.
Validity of our thesis would imply that a standard computer science
curriculum, which prepares young graduates for a professional career of
over forty years, will have to include the topic of parallel algorithms
irrespective of whether (or when) parallel processing will succeed serial
processing in the general purpose computing market.
(Also cross-referenced as UMIACS-TR-91-145.1)
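The main observation of the abstract above, that all addresses a round needs are known before the round starts, can be modelled with a small sketch. It serially emulates a balanced-tree summation and gathers each round's operands in one batch before computing. Everything here is an assumption for illustration: the dictionary stands in for slow memory, the batch copy stands in for prefetching, the input length is assumed to be a power of two, and the name `emulate_tree_sum` is hypothetical.

```python
def emulate_tree_sum(slow_memory):
    """Serially emulate a parallel tree summation, one round at a time.

    slow_memory maps addresses 0..n-1 to numbers, with n a power of two.
    The dictionary is updated in place. Returns (total, batch_fetches).
    """
    n = len(slow_memory)
    batch_fetches = 0
    stride = 1
    while stride < n:
        # The addresses this round touches are known before it begins.
        active = range(0, n, 2 * stride)
        needed = [a for i in active for a in (i, i + stride)]
        # One batched "prefetch" from slow memory into a fast buffer.
        fast = {a: slow_memory[a] for a in needed}
        batch_fetches += 1
        # Serial emulation of the round, served only from the fast buffer.
        for i in active:
            slow_memory[i] = fast[i] + fast[i + stride]
        stride *= 2
    return slow_memory[0], batch_fetches
```

Summing eight values takes three rounds and therefore three batched fetches, instead of one slow access per operand. Python itself does not prefetch, so this only models the access schedule.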
Parallel unit propagation: Optimal speedup 3CNF Horn SAT
A linear work parallel algorithm for 3CNF Horn SAT is presented, which is interesting since the problem is P-complete
Data-Oblivious Graph Algorithms in Outsourced External Memory
Motivated by privacy preservation for outsourced data, data-oblivious
external memory is a computational framework where a client performs
computations on data stored at a semi-trusted server in a way that does not
reveal her data to the server. This approach facilitates collaboration and
reliability over traditional frameworks, and it provides privacy protection,
even though the server has full access to the data and he can monitor how it is
accessed by the client. The challenge is that even if data is encrypted, the
server can learn information based on the client data access pattern; hence,
access patterns must also be obfuscated. We investigate privacy-preserving
algorithms for outsourced external memory that are based on the use of
data-oblivious algorithms, that is, algorithms where each possible sequence of
data accesses is independent of the data values. We give new efficient
data-oblivious algorithms in the outsourced external memory model for a number
of fundamental graph problems. Our results include new data-oblivious
external-memory methods for constructing minimum spanning trees, performing
various traversals on rooted trees, answering least common ancestor queries on
trees, computing biconnected components, and forming open ear decompositions.
None of our algorithms make use of constant-time random oracles.
Comment: 20 page
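The defining property of a data-oblivious algorithm, that the sequence of accessed locations does not depend on the data values, can be seen in a small sorting sketch. It uses odd-even transposition sort, a classic fixed comparison schedule, and is not one of the graph algorithms of the abstract; the name `oblivious_sort` and the recorded `trace` are hypothetical additions for illustration.

```python
def oblivious_sort(values):
    """Sort with odd-even transposition sort and record which index pairs
    were touched. The trace depends only on len(values), never on the data.
    """
    a = list(values)
    n = len(a)
    trace = []
    for r in range(n):
        # Even rounds compare (0,1), (2,3), ...; odd rounds compare (1,2), (3,4), ...
        for i in range(r % 2, n - 1, 2):
            trace.append((i, i + 1))
            # Compare-exchange: both cells are read and written every time,
            # whether or not the values actually swap.
            lo, hi = min(a[i], a[i + 1]), max(a[i], a[i + 1])
            a[i], a[i + 1] = lo, hi
    return a, trace
```

Running it on `[4, 1, 3, 2]` and on `[1, 2, 3, 4]` yields the same trace, so an observer of index accesses learns nothing about the values. This sketch says nothing about timing side channels or encryption, which a real outsourced setting would also need to address.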