25,026 research outputs found
On Characterizing the Data Access Complexity of Programs
Technology trends will cause data movement to account for the majority of
energy expenditure and execution time on emerging computers. Therefore,
computational complexity will no longer be a sufficient metric for comparing
algorithms, and a fundamental characterization of data access complexity will
be increasingly important. The problem of developing lower bounds for data
access complexity has been modeled using the formalism of Hong & Kung's
red/blue pebble game for computational directed acyclic graphs (CDAGs).
However, previously developed approaches to lower bounds analysis for the
red/blue pebble game are very limited in effectiveness when applied to CDAGs of
real programs, with computations comprised of multiple sub-computations with
differing DAG structure. We address this problem by developing an approach for
effectively composing lower bounds based on graph decomposition. We also
develop a static analysis algorithm to derive the asymptotic data-access lower
bounds of programs, as a function of the problem size and cache size
On Characterizing the Data Movement Complexity of Computational DAGs for Parallel Execution
Technology trends are making the cost of data movement increasingly dominant,
both in terms of energy and time, over the cost of performing arithmetic
operations in computer systems. The fundamental ratio of aggregate data
movement bandwidth to the total computational power (also referred to the
machine balance parameter) in parallel computer systems is decreasing. It is
there- fore of considerable importance to characterize the inherent data
movement requirements of parallel algorithms, so that the minimal architectural
balance parameters required to support it on future systems can be well
understood. In this paper, we develop an extension of the well-known red-blue
pebble game to develop lower bounds on the data movement complexity for the
parallel execution of computational directed acyclic graphs (CDAGs) on parallel
systems. We model multi-node multi-core parallel systems, with the total
physical memory distributed across the nodes (that are connected through some
interconnection network) and in a multi-level shared cache hierarchy for
processors within a node. We also develop new techniques for lower bound
characterization of non-homogeneous CDAGs. We demonstrate the use of the
methodology by analyzing the CDAGs of several numerical algorithms, to develop
lower bounds on data movement for their parallel execution
Route Planning in Transportation Networks
We survey recent advances in algorithms for route planning in transportation
networks. For road networks, we show that one can compute driving directions in
milliseconds or less even at continental scale. A variety of techniques provide
different trade-offs between preprocessing effort, space requirements, and
query time. Some algorithms can answer queries in a fraction of a microsecond,
while others can deal efficiently with real-time traffic. Journey planning on
public transportation systems, although conceptually similar, is a
significantly harder problem due to its inherent time-dependent and
multicriteria nature. Although exact algorithms are fast enough for interactive
queries on metropolitan transit systems, dealing with continent-sized instances
requires simplifications or heavy preprocessing. The multimodal route planning
problem, which seeks journeys combining schedule-based transportation (buses,
trains) with unrestricted modes (walking, driving), is even harder, relying on
approximate solutions even for metropolitan inputs.Comment: This is an updated version of the technical report MSR-TR-2014-4,
previously published by Microsoft Research. This work was mostly done while
the authors Daniel Delling, Andrew Goldberg, and Renato F. Werneck were at
Microsoft Research Silicon Valle
Optimization of mesh hierarchies in Multilevel Monte Carlo samplers
We perform a general optimization of the parameters in the Multilevel Monte
Carlo (MLMC) discretization hierarchy based on uniform discretization methods
with general approximation orders and computational costs. We optimize
hierarchies with geometric and non-geometric sequences of mesh sizes and show
that geometric hierarchies, when optimized, are nearly optimal and have the
same asymptotic computational complexity as non-geometric optimal hierarchies.
We discuss how enforcing constraints on parameters of MLMC hierarchies affects
the optimality of these hierarchies. These constraints include an upper and a
lower bound on the mesh size or enforcing that the number of samples and the
number of discretization elements are integers. We also discuss the optimal
tolerance splitting between the bias and the statistical error contributions
and its asymptotic behavior. To provide numerical grounds for our theoretical
results, we apply these optimized hierarchies together with the Continuation
MLMC Algorithm. The first example considers a three-dimensional elliptic
partial differential equation with random inputs. Its space discretization is
based on continuous piecewise trilinear finite elements and the corresponding
linear system is solved by either a direct or an iterative solver. The second
example considers a one-dimensional It\^o stochastic differential equation
discretized by a Milstein scheme
Neural Distributed Autoassociative Memories: A Survey
Introduction. Neural network models of autoassociative, distributed memory
allow storage and retrieval of many items (vectors) where the number of stored
items can exceed the vector dimension (the number of neurons in the network).
This opens the possibility of a sublinear time search (in the number of stored
items) for approximate nearest neighbors among vectors of high dimension. The
purpose of this paper is to review models of autoassociative, distributed
memory that can be naturally implemented by neural networks (mainly with local
learning rules and iterative dynamics based on information locally available to
neurons). Scope. The survey is focused mainly on the networks of Hopfield,
Willshaw and Potts, that have connections between pairs of neurons and operate
on sparse binary vectors. We discuss not only autoassociative memory, but also
the generalization properties of these networks. We also consider neural
networks with higher-order connections and networks with a bipartite graph
structure for non-binary data with linear constraints. Conclusions. In
conclusion we discuss the relations to similarity search, advantages and
drawbacks of these techniques, and topics for further research. An interesting
and still not completely resolved question is whether neural autoassociative
memories can search for approximate nearest neighbors faster than other index
structures for similarity search, in particular for the case of very high
dimensional vectors.Comment: 31 page
Survey of Distributed Decision
We survey the recent distributed computing literature on checking whether a
given distributed system configuration satisfies a given boolean predicate,
i.e., whether the configuration is legal or illegal w.r.t. that predicate. We
consider classical distributed computing environments, including mostly
synchronous fault-free network computing (LOCAL and CONGEST models), but also
asynchronous crash-prone shared-memory computing (WAIT-FREE model), and mobile
computing (FSYNC model)
- …