Data-driven coarse graining in action: Modeling and prediction of complex systems
In many physical, technological, social, and economic applications, one is commonly faced with the task of estimating statistical properties, such as mean first passage times of a continuous-time process, from empirical data (experimental observations). Typically, however, an accurate and reliable estimation of such properties directly from the data alone is not possible, as the time series is often too short or the particular phenomenon of interest is only rarely observed. We propose here a theoretical-computational framework which provides a systematic and rational estimation of statistical quantities of a given temporal process, such as waiting times between subsequent bursts of activity in intermittent signals. Our framework is illustrated with applications from real-world data sets, ranging from marine biology to paleoclimatic data.
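The framework itself is not reproduced here; the sketch below only illustrates the quantity under discussion, namely the naive empirical estimate of the mean waiting time between threshold-crossing bursts of an intermittent signal (the helper name and the threshold parameter are illustrative assumptions). It is precisely this kind of direct estimate that becomes unreliable when the series is short or bursts are rare.

```python
import numpy as np

def empirical_mean_waiting_time(signal, threshold):
    """Naive estimate of the mean waiting time between bursts:
    a 'burst' starts whenever the signal crosses `threshold` from below."""
    above = signal > threshold
    # indices where the signal crosses the threshold upward
    burst_starts = np.flatnonzero(above[1:] & ~above[:-1]) + 1
    if len(burst_starts) < 2:
        raise ValueError("too few bursts observed to estimate a waiting time")
    return np.diff(burst_starts).mean()

# toy intermittent signal: rare spikes on top of Gaussian noise
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 10_000)
x[rng.random(10_000) < 0.01] += 8.0   # roughly 1% of samples are bursts
print(empirical_mean_waiting_time(x, threshold=4.0))  # waiting time in samples
```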
Efficient Density Estimation via Piecewise Polynomial Approximation
We give a highly efficient "semi-agnostic" algorithm for learning univariate probability distributions that are well approximated by piecewise polynomial density functions. Let $p$ be an arbitrary distribution over an interval $I$ which is $\tau$-close (in total variation distance) to an unknown probability distribution $q$ that is defined by an unknown partition of $I$ into $t$ intervals and $t$ unknown degree-$d$ polynomials specifying $q$ over each of the intervals. We give an algorithm that draws $\tilde{O}(t(d+1)/\epsilon^2)$ samples from $p$, runs in time $\mathrm{poly}(t, d, 1/\epsilon)$, and with high probability outputs a piecewise polynomial hypothesis distribution $h$ that is $(O(\tau)+\epsilon)$-close (in total variation distance) to $p$. This sample complexity is essentially optimal; we show that even for $\tau = 0$, any algorithm that learns an unknown $t$-piecewise degree-$d$ probability distribution over $I$ to accuracy $\epsilon$ must use $\Omega\big(\frac{t(d+1)}{\mathrm{poly}(1+\log(d+1))} \cdot \frac{1}{\epsilon^2}\big)$ samples from the distribution, regardless of its running time. Our algorithm combines tools from approximation theory, uniform convergence, linear programming, and dynamic programming.
We apply this general algorithm to obtain a wide range of results for many natural problems in density estimation over both continuous and discrete domains. These include state-of-the-art results for learning mixtures of log-concave distributions; mixtures of $t$-modal distributions; mixtures of Monotone Hazard Rate distributions; mixtures of Poisson Binomial Distributions; mixtures of Gaussians; and mixtures of $k$-monotone densities. Our general technique yields computationally efficient algorithms for all these problems, in many cases with provably optimal sample complexities (up to logarithmic factors) in all parameters.
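The paper's algorithm, which finds the partition via linear and dynamic programming and carries the guarantees above, is not reproduced here. The sketch below is only a minimal illustration of the piecewise-polynomial idea under simplifying assumptions: a hypothetical `piecewise_poly_density` helper that uses a fixed equal-width partition and fits a low-degree polynomial to a fine histogram on each interval by least squares.

```python
import numpy as np

def piecewise_poly_density(samples, t=4, d=2, bins_per_piece=20):
    """Crude piecewise-polynomial density estimate on [min, max]:
    split the range into t equal intervals, fit a degree-d polynomial
    to a fine histogram on each interval, and clip negatives to zero."""
    lo, hi = samples.min(), samples.max()
    edges = np.linspace(lo, hi, t + 1)
    pieces = []
    for a, b in zip(edges[:-1], edges[1:]):
        counts, bin_edges = np.histogram(samples, bins=bins_per_piece,
                                         range=(a, b), density=True)
        # weight each piece by the fraction of samples it contains
        mass = np.mean((samples >= a) & (samples < b))
        centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])
        coeffs = np.polyfit(centers, counts * mass, deg=d)
        pieces.append((a, b, coeffs))

    def pdf(x):
        for a, b, coeffs in pieces:
            if a <= x <= b:
                return max(np.polyval(coeffs, x), 0.0)
        return 0.0
    return pdf

# usage: estimate the density of a two-component Gaussian mixture from samples
rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(-2, 0.5, 5000), rng.normal(3, 1.0, 5000)])
f_hat = piecewise_poly_density(data, t=6, d=2)
print(f_hat(-2.0), f_hat(3.0), f_hat(10.0))
```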
Efficiently Learning Structured Distributions from Untrusted Batches
We study the problem, introduced by Qiao and Valiant, of learning from untrusted batches. Here, we assume $m$ users, all of whom have samples from some underlying distribution $p$ over $\{1, \dots, n\}$. Each user sends a batch of $k$ i.i.d. samples from this distribution; however, an $\epsilon$-fraction of users are untrustworthy and can send adversarially chosen responses. The goal is then to learn $p$ in total variation distance. When $k = 1$ this is the standard robust univariate density estimation setting, and it is well understood that $\Omega(\epsilon)$ error is unavoidable. Surprisingly, Qiao and Valiant gave an estimator which improves upon this rate when $k$ is large. Unfortunately, their algorithms run in time exponential in either $n$ or $k$.
We first give a sequence of polynomial time algorithms whose estimation error approaches the information-theoretically optimal bound for this problem. Our approach is based on recent algorithms derived from the sum-of-squares hierarchy, in the context of high-dimensional robust estimation. We show that algorithms for learning from untrusted batches can also be cast in this framework, but by working with a more complicated set of test functions.
It turns out this abstraction is quite powerful and can be generalized to incorporate additional problem-specific constraints. Our second and main result is to show that this technology can be leveraged to build in prior knowledge about the shape of the distribution. Crucially, this allows us to reduce the sample complexity of learning from untrusted batches to polylogarithmic in $n$ for most natural classes of distributions, which is important in many applications. To do so, we demonstrate that these sum-of-squares algorithms for robust mean estimation can be made to handle complex combinatorial constraints (e.g. those arising from VC theory), which may be of independent technical interest.
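The sum-of-squares machinery is not sketched here; the snippet below only illustrates the untrusted-batches setup, together with a naive coordinate-wise-median baseline (a hypothetical `learn_from_batches` helper). This baseline is not the paper's estimator and does not attain its error rate; it is included solely to make the setting concrete.

```python
import numpy as np

def learn_from_batches(batches, n):
    """Naive baseline for the untrusted-batches setting: take the
    coordinate-wise median of the per-batch empirical distributions,
    then renormalize. NOT the sum-of-squares estimator from the paper."""
    empiricals = np.stack([np.bincount(b, minlength=n) / len(b)
                           for b in batches])
    med = np.median(empiricals, axis=0)
    return med / med.sum()

# setup: m users, each sending k samples over the domain {0, ..., n-1};
# an eps-fraction of batches is replaced by adversarial responses
rng = np.random.default_rng(2)
n, m, k, eps = 10, 200, 50, 0.2
p = rng.dirichlet(np.ones(n))                       # true distribution
good = [rng.choice(n, size=k, p=p) for _ in range(int((1 - eps) * m))]
bad = [np.zeros(k, dtype=int) for _ in range(int(eps * m))]  # all mass on 0
p_hat = learn_from_batches(good + bad, n)
print(0.5 * np.abs(p_hat - p).sum())                # total variation error
```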
PAC learning using Nadaraya-Watson estimator based on orthonormal systems
Regression or function classes of Euclidean type with compact support and certain smoothness properties are shown to be PAC learnable by the Nadaraya-Watson estimator based on complete orthonormal systems. While requiring more smoothness properties than typical PAC formulations, this estimator is computationally efficient, easy to implement, and known to perform well in a number of practical applications. The sample sizes necessary for PAC learning of regressions or functions under sup norm cost are derived for a general orthonormal system. The result covers the widely used estimators based on Haar wavelets, trigonometric functions, and Daubechies wavelets.
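As a rough illustration, not code from the paper, the sketch below implements a Nadaraya-Watson-type regression estimator whose kernel is the projection kernel of the cosine orthonormal system on [0, 1], one common trigonometric construction of the kind mentioned above. The truncation level N and the function names are illustrative assumptions.

```python
import numpy as np

def cosine_basis(x, N):
    """Orthonormal cosine basis on [0, 1]: phi_0 = 1, phi_j = sqrt(2) cos(j pi x)."""
    j = np.arange(N + 1)
    phi = np.sqrt(2.0) * np.cos(np.pi * np.outer(x, j))
    phi[:, 0] = 1.0
    return phi                                   # shape (len(x), N + 1)

def nadaraya_watson(x_train, y_train, x_eval, N=10):
    """Nadaraya-Watson estimate with the projection kernel
    K_N(x, u) = sum_j phi_j(x) phi_j(u) built from the cosine system."""
    K = cosine_basis(x_eval, N) @ cosine_basis(x_train, N).T   # kernel matrix
    return (K @ y_train) / K.sum(axis=1)

# usage: recover a smooth regression function from noisy samples on [0, 1]
rng = np.random.default_rng(3)
x = rng.uniform(0, 1, 500)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.1, 500)
grid = np.linspace(0, 1, 5)
print(nadaraya_watson(x, y, grid, N=8))
```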
Global Chronic Total Occlusion Crossing Algorithm: JACC State-of-the-Art Review
The authors developed a global chronic total occlusion crossing algorithm following 10 steps: 1) dual angiography; 2) careful angiographic review focusing on proximal cap morphology, occlusion segment, distal vessel quality, and collateral circulation; 3) approaching proximal cap ambiguity using intravascular ultrasound, retrograde, and move-the-cap techniques; 4) approaching poor distal vessel quality using the retrograde approach and bifurcation at the distal cap by use of a dual-lumen catheter and intravascular ultrasound; 5) feasibility of retrograde crossing through grafts and septal and epicardial collateral vessels; 6) antegrade wiring strategies; 7) retrograde approach; 8) changing strategy when failing to achieve progress; 9) considering performing an investment procedure if crossing attempts fail; and 10) stopping when reaching high radiation or contrast dose or in case of long procedural time, occurrence of a serious complication, operator and patient fatigue, or lack of expertise or equipment. This algorithm can improve outcomes and expand discussion, research, and collaboration.