On Probability Estimation via Relative Frequencies and Discount
Probability estimation is an elementary building block of every statistical
data compression algorithm. In practice, probability estimation is often based
on relative letter frequencies, which are scaled down when their sum becomes too
large. Such algorithms are attractive in terms of memory requirements, running
time, and practical performance; however, they still lack a thorough theoretical
understanding. In this work we formulate a typical probability estimation
algorithm based on relative frequencies and frequency discount, Algorithm RFD.
Our main contribution is its theoretical analysis: we show that the code length
it requires above an arbitrary piecewise stationary model with bounded and
unbounded letter probabilities is small. This theoretically confirms the
recency effect of periodic frequency discount, which has often been observed
empirically.
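A minimal sketch of a frequency-based estimator with periodic discount, in the spirit of the Algorithm RFD described above (the class name, thresholds, and smoothing are illustrative assumptions, not the paper's exact algorithm):

```python
# Sketch of probability estimation via relative frequencies with discount.
# All parameter names and defaults are illustrative, not from the paper.

class FrequencyDiscountEstimator:
    def __init__(self, alphabet_size, threshold=64, discount=0.5, smoothing=1.0):
        self.counts = [0.0] * alphabet_size
        self.total = 0.0
        self.threshold = threshold    # rescale once the count sum exceeds this
        self.discount = discount      # factor applied to all counts at rescaling
        self.smoothing = smoothing    # additive smoothing, avoids zero probabilities
        self.k = alphabet_size

    def prob(self, symbol):
        # relative frequency with additive smoothing
        return (self.counts[symbol] + self.smoothing) / (
            self.total + self.smoothing * self.k)

    def update(self, symbol):
        self.counts[symbol] += 1
        self.total += 1
        if self.total > self.threshold:   # periodic frequency discount
            self.counts = [c * self.discount for c in self.counts]
            self.total = sum(self.counts)
```

Because old counts are repeatedly shrunk while recent ones are not, the estimate tracks the most recent statistics, which is the recency effect the abstract refers to.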
Divergence rates of Markov order estimators and their application to statistical estimation of stationary ergodic processes
Stationary ergodic processes with finite alphabets are estimated by finite
memory processes from a sample, an n-length realization of the process, where
the memory depth of the estimator process is also estimated from the sample
using penalized maximum likelihood (PML). Under some assumptions on the
continuity rate and the assumption of non-nullness, a rate of convergence in
d̄-distance is obtained, with explicit constants. The result requires an
analysis of the divergence of PML Markov order estimators for not necessarily
finite memory processes. This divergence problem is investigated in more
generality for three information criteria: the Bayesian information criterion
with generalized penalty term yielding the PML, and the normalized maximum
likelihood and the Krichevsky-Trofimov code lengths. Lower and upper bounds on
the estimated order are obtained. The notion of consistent Markov order
estimation is generalized for infinite memory processes using the concept of
oracle order estimates, and generalized consistency of the PML Markov order
estimator is presented.
Comment: Published at http://dx.doi.org/10.3150/12-BEJ468 in Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm).
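As a hedged illustration of penalized maximum likelihood order estimation as discussed above, the following sketch scores each candidate Markov order by its maximized log-likelihood plus a BIC-style penalty; the exact penalty and criterion analyzed in the paper may differ:

```python
import math
from collections import Counter

def pml_order(sample, alphabet, max_order):
    """BIC-style penalized maximum likelihood Markov order estimate.
    Illustrative sketch; not the paper's exact criterion."""
    n = len(sample)
    best_k, best_score = 0, float("inf")
    for k in range(max_order + 1):
        ctx_counts, trans_counts = Counter(), Counter()
        for i in range(k, n):
            ctx = tuple(sample[i - k:i])
            ctx_counts[ctx] += 1
            trans_counts[(ctx, sample[i])] += 1
        # maximized log-likelihood of a k-th order Markov chain
        loglik = sum(c * math.log(c / ctx_counts[ctx])
                     for (ctx, _), c in trans_counts.items())
        # BIC-style penalty: half the parameter count times log n
        penalty = 0.5 * (len(alphabet) ** k) * (len(alphabet) - 1) * math.log(n)
        score = -loglik + penalty
        if score < best_score:
            best_k, best_score = k, score
    return best_k
```

The penalty grows exponentially in the candidate order, which is what keeps the estimate from overfitting on samples from genuinely low-order (or i.i.d.) processes.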
Gaussian semi-parametric estimation of fractional cointegration
We analyse consistent estimation of the memory parameters of a nonstationary fractionally cointegrated vector time series. Assuming that the cointegrating relationship has substantially less memory than the observed series, we show that a multivariate Gaussian semi-parametric estimate, based on initial consistent estimates and possibly tapered observations, is asymptotically normal. The estimates of the memory parameters can rely either on original (for stationary errors) or on differenced residuals (for nonstationary errors), assuming only a convergence rate for a preliminary slope estimate. If this rate is fast enough, semi-parametric memory estimates are not affected by the use of residuals and retain the same asymptotic distribution as if the true cointegrating relationship were known. Only local conditions on the spectral densities around zero frequency for linear processes are assumed. We concentrate on a bivariate system but discuss multivariate generalizations and show the performance of the estimates with simulated and real data.
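The Gaussian semi-parametric (local Whittle) memory estimator underlying this kind of analysis can be sketched in its univariate form as follows; the grid search and bandwidth choice are illustrative assumptions, not the paper's multivariate tapered procedure:

```python
import numpy as np

def local_whittle_d(x, m):
    """Univariate local Whittle estimate of the memory parameter d.
    Illustrative sketch: minimizes the local Whittle objective over a
    grid in the stationary range using the first m Fourier frequencies."""
    n = len(x)
    lam = 2 * np.pi * np.arange(1, m + 1) / n      # Fourier frequencies
    fx = np.fft.fft(x)[1:m + 1]
    I = np.abs(fx) ** 2 / (2 * np.pi * n)          # periodogram ordinates

    def R(d):
        # concentrated local Whittle objective
        return np.log(np.mean(lam ** (2 * d) * I)) - 2 * d * np.mean(np.log(lam))

    grid = np.linspace(-0.49, 0.49, 197)           # crude grid search
    return grid[int(np.argmin([R(d) for d in grid]))]
```

Only the m lowest frequencies enter the objective, mirroring the abstract's point that only local conditions on the spectral density around zero frequency are needed.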
Randomized Sketches of Convex Programs with Sharp Guarantees
Random projection (RP) is a classical technique for reducing storage and
computational costs. We analyze RP-based approximations of convex programs, in
which the original optimization problem is approximated by the solution of a
lower-dimensional problem. Such dimensionality reduction is essential in
computation-limited settings, since the complexity of general convex
programming can be quite high (e.g., cubic for quadratic programs, and
substantially higher for semidefinite programs). In addition to computational
savings, random projection is also useful for reducing memory usage, and has
useful properties for privacy-sensitive optimization. We prove that the
approximation ratio of this procedure can be bounded in terms of the geometry
of the constraint set. For a broad class of random projections, including those
based on various sub-Gaussian distributions as well as randomized Hadamard and
Fourier transforms, the data matrix defining the cost function can be projected
down to the statistical dimension of the tangent cone of the constraints at the
original solution, which is often substantially smaller than the original
dimension. We illustrate consequences of our theory for various cases,
including unconstrained and ℓ1-constrained least squares, support vector
machines, low-rank matrix estimation, and discuss implications on
privacy-sensitive optimization and some connections with de-noising and
compressed sensing.
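A minimal sketch of the random-projection idea for unconstrained least squares: project the tall data matrix down with a Gaussian sketch and solve the smaller problem. The dimensions and sketch size below are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 2000, 20, 200          # n samples, d variables, m sketch rows (m << n)

# synthetic least-squares instance (illustrative)
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.1 * rng.standard_normal(n)

# Gaussian random projection, scaled so E[S^T S] = I
S = rng.standard_normal((m, n)) / np.sqrt(m)

# solve the sketched and the original problems
x_sketch = np.linalg.lstsq(S @ A, S @ b, rcond=None)[0]
x_exact = np.linalg.lstsq(A, b, rcond=None)[0]

# approximation ratio of the achieved residual (>= 1 by optimality of x_exact)
ratio = np.linalg.norm(A @ x_sketch - b) / np.linalg.norm(A @ x_exact - b)
```

The sketched problem has m rows instead of n, so the solve is much cheaper, while (per the theory above) the residual of the sketched solution stays close to optimal once m is of the order of the relevant statistical dimension.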