Synthesizing Probabilistic Invariants via Doob's Decomposition
When analyzing probabilistic computations, a powerful approach is to first
find a martingale---an expression on the program variables whose expectation
remains invariant---and then apply the optional stopping theorem in order to
infer properties at termination time. One of the main challenges, then, is to
systematically find martingales.
We propose a novel procedure to synthesize martingale expressions from an
arbitrary initial expression. Unlike state-of-the-art approaches, we do
not rely on constraint solving. Instead, we use a symbolic construction based
on Doob's decomposition. This procedure can produce very complex martingales,
expressed in terms of conditional expectations.
We show how to automatically generate and simplify these martingales, as well
as how to apply the optional stopping theorem to infer properties at
termination time. This last step typically involves some simplification steps,
and is usually done manually in current approaches. We implement our techniques
in a prototype tool and demonstrate our process on several classical examples.
Some of them go beyond the capabilities of current semi-automatic approaches.
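To make the martingale-plus-optional-stopping pattern concrete, here is a minimal sketch (not the paper's synthesis procedure) of Doob's decomposition applied to a biased random walk: subtracting the compensator from the raw process yields a martingale whose expectation stays at zero. All names and parameters are illustrative:

```python
import random

def doob_martingale_walk(p, n_steps, n_trials, seed=0):
    """For a biased +/-1 random walk X_n with P(step = +1) = p, Doob's
    decomposition X_n = M_n + A_n has compensator A_n = n * (2p - 1)
    (the accumulated conditional drift), so M_n = X_n - n * (2p - 1)
    is a martingale with E[M_n] = 0. Returns a Monte Carlo estimate
    of E[M_n]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        x = 0
        for _ in range(n_steps):
            x += 1 if rng.random() < p else -1
        total += x - n_steps * (2 * p - 1)  # M_n for this trial
    return total / n_trials

est = doob_martingale_walk(p=0.7, n_steps=50, n_trials=20000)
print(abs(est) < 0.5)  # E[M_n] = 0, so the estimate should be near zero
```

Once a martingale like M_n is in hand, the optional stopping theorem gives E[M_tau] = 0 at a suitable stopping time tau, which is the step the abstract describes automating.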
Four lectures on probabilistic methods for data science
Methods of high-dimensional probability play a central role in applications
in statistics, signal processing, theoretical computer science, and related
fields. These lectures present a sample of particularly useful tools of
high-dimensional probability, focusing on the classical and matrix Bernstein's
inequality and the uniform matrix deviation inequality. We illustrate these
tools with applications for dimension reduction, network analysis, covariance
estimation, matrix completion and sparse signal recovery. The lectures are
geared towards beginning graduate students who have taken a rigorous course in
probability but may not have any experience in data science applications.
Comment: Lectures given at the 2016 PCMI Graduate Summer School in Mathematics of
Data. Some typos and inaccuracies fixed.
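As a toy illustration of the scalar Bernstein inequality featured in these lectures, the following sketch compares an empirical tail probability of a sum of bounded random variables against the Bernstein bound; the function name and parameter values are illustrative:

```python
import math
import random

def bernstein_tail_check(n=200, t=15.0, trials=20000, seed=1):
    """X_1, ..., X_n i.i.d. uniform on [-1, 1]: mean 0, Var = 1/3,
    |X_i| <= M = 1. Scalar Bernstein's inequality:
        P(|X_1 + ... + X_n| >= t) <= 2 exp(-t^2 / (2 (n Var + M t / 3))).
    Returns (empirical tail frequency, Bernstein bound)."""
    rng = random.Random(seed)
    var, M = 1.0 / 3.0, 1.0
    bound = 2 * math.exp(-t * t / (2 * (n * var + M * t / 3)))
    hits = sum(
        abs(sum(rng.uniform(-1, 1) for _ in range(n))) >= t
        for _ in range(trials)
    )
    return hits / trials, bound

emp, bound = bernstein_tail_check()
print(emp <= bound)  # the empirical tail should respect the bound
```

The matrix Bernstein inequality discussed in the lectures has the same shape, with the variance proxy and bound stated in terms of the operator norm of a sum of independent random matrices.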
Run Time Approximation of Non-blocking Service Rates for Streaming Systems
Stream processing is a compute paradigm that promises safe and efficient
parallelism. Modern big-data problems are often well suited for stream
processing's throughput-oriented nature. Realization of efficient stream
processing requires monitoring and optimization of multiple communications
links. Most techniques to optimize these links use queueing network models or
network flow models, which require some idea of the actual execution rate of
each independent compute kernel within the system. What we want to know is how
fast each kernel can process data independently of other communicating kernels.
This is known as the "service rate" of the kernel within the queueing
literature. Current approaches to divining service rates are static. Modern
workloads, however, are often dynamic. Shared cloud systems also present
applications with highly dynamic execution environments (multiple users,
hardware migration, etc.). It is therefore desirable to continuously re-tune an
application during run time (online) in response to changing conditions. Our
approach enables online service rate monitoring under most conditions,
obviating the need for reliance on steady state predictions for what are
probably non-steady state phenomena. First, some of the difficulties associated
with online service rate determination are examined. Second, the algorithm to
approximate the online non-blocking service rate is described. Lastly, the
algorithm is implemented within the open source RaftLib framework for
validation using a simple microbenchmark as well as two full streaming
applications.
Comment: Technical report.
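The general idea can be sketched as follows. This is a hedged illustration, not RaftLib's actual algorithm: an online estimator keeps an exponentially weighted moving average of the per-item service rate, sampling only intervals where the kernel was neither starved (empty input queue) nor blocked (full output queue), so that queueing effects do not contaminate the estimate. All names and parameters are illustrative:

```python
class OnlineServiceRate:
    """Toy online estimator of a kernel's non-blocking service rate
    (items per unit time), adapted continuously via an EWMA so the
    estimate tracks dynamic workloads and environments."""

    def __init__(self, alpha=0.1):
        self.alpha = alpha  # EWMA smoothing factor
        self.rate = None    # current estimate, items per second

    def observe(self, service_time, starved, blocked):
        """Record one service interval; skip samples taken while the
        kernel was starved or blocked, since those reflect the queueing
        network rather than the kernel's intrinsic service rate."""
        if starved or blocked or service_time <= 0:
            return self.rate
        sample = 1.0 / service_time
        if self.rate is None:
            self.rate = sample
        else:
            self.rate = (1 - self.alpha) * self.rate + self.alpha * sample
        return self.rate

est = OnlineServiceRate()
for st in [0.020, 0.021, 0.019, 0.020]:  # ~20 ms per item
    est.observe(st, starved=False, blocked=False)
print(round(est.rate))  # → 50 (items/sec)
```

A real implementation must additionally decide, at run time, whether a given interval was truly non-blocking, which is one of the difficulties the paper examines.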
Gordon's inequality and condition numbers in conic optimization
The probabilistic analysis of condition numbers has traditionally been
approached from different angles; one is based on Smale's program in complexity
theory and features integral geometry, while the other is motivated by
geometric functional analysis and makes use of the theory of Gaussian
processes. In this note we explore connections between the two approaches in
the context of the biconic homogeneous feasibility problem and the condition
numbers motivated by conic optimization theory. Key tools in the analysis are
Slepian's and Gordon's comparison inequalities for Gaussian processes,
interpreted as monotonicity properties of moment functionals, and their
interplay with ideas from conic integral geometry.
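For reference, Slepian's comparison inequality can be stated as follows; Gordon's inequality extends the same comparison to min-max functionals of doubly indexed Gaussian processes:

```latex
Let $(X_t)_{t \in T}$ and $(Y_t)_{t \in T}$ be mean-zero Gaussian processes
with $\mathbb{E} X_t^2 = \mathbb{E} Y_t^2$ for all $t \in T$. If
\[
  \mathbb{E}[X_s X_t] \ge \mathbb{E}[Y_s Y_t]
  \quad \text{for all } s, t \in T,
\]
then for every $\tau \in \mathbb{R}$,
\[
  \mathbb{P}\Big(\sup_{t \in T} X_t > \tau\Big)
    \le \mathbb{P}\Big(\sup_{t \in T} Y_t > \tau\Big),
  \qquad \text{and hence} \qquad
  \mathbb{E} \sup_{t \in T} X_t \le \mathbb{E} \sup_{t \in T} Y_t .
\]
```

Read as a monotonicity property of the moment functional $\mathbb{E}\sup_t X_t$ in the covariance structure, this is the interpretation the note exploits.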