18,255 research outputs found
Importance sampling the union of rare events with an application to power systems analysis
We consider importance sampling to estimate the probability of a union
of rare events defined by a random variable . The
sampler we study has been used in spatial statistics, genomics and
combinatorics going back at least to Karp and Luby (1983). It works by sampling
one event at random, then sampling conditionally on that event
happening and it constructs an unbiased estimate of by multiplying an
inverse moment of the number of occuring events by the union bound. We prove
some variance bounds for this sampler. For a sample size of , it has a
variance no larger than where is the union
bound. It also has a coefficient of variation no larger than
regardless of the overlap pattern among the
events. Our motivating problem comes from power system reliability, where the
phase differences between connected nodes have a joint Gaussian distribution
and the rare events arise from unacceptably large phase differences. In the
grid reliability problems even some events defined by constraints in
dimensions, with probability below , are estimated with a
coefficient of variation of about with only sample
values
Lectures on Randomized Numerical Linear Algebra
This chapter is based on lectures on Randomized Numerical Linear Algebra from
the 2016 Park City Mathematics Institute summer school on The Mathematics of
Data.Comment: To appear in the edited volume of lectures from the 2016 PCMI summer
schoo
On the Distributed Complexity of Large-Scale Graph Computations
Motivated by the increasing need to understand the distributed algorithmic
foundations of large-scale graph computations, we study some fundamental graph
problems in a message-passing model for distributed computing where
machines jointly perform computations on graphs with nodes (typically, ). The input graph is assumed to be initially randomly partitioned among
the machines, a common implementation in many real-world systems.
Communication is point-to-point, and the goal is to minimize the number of
communication {\em rounds} of the computation.
Our main contribution is the {\em General Lower Bound Theorem}, a theorem
that can be used to show non-trivial lower bounds on the round complexity of
distributed large-scale data computations. The General Lower Bound Theorem is
established via an information-theoretic approach that relates the round
complexity to the minimal amount of information required by machines to solve
the problem. Our approach is generic and this theorem can be used in a
"cookbook" fashion to show distributed lower bounds in the context of several
problems, including non-graph problems. We present two applications by showing
(almost) tight lower bounds for the round complexity of two fundamental graph
problems, namely {\em PageRank computation} and {\em triangle enumeration}. Our
approach, as demonstrated in the case of PageRank, can yield tight lower bounds
for problems (including, and especially, under a stochastic partition of the
input) where communication complexity techniques are not obvious.
Our approach, as demonstrated in the case of triangle enumeration, can yield
stronger round lower bounds as well as message-round tradeoffs compared to
approaches that use communication complexity techniques
- …