
    Adaptive density estimation for stationary processes

    We propose an algorithm to estimate the common density $s$ of a stationary process $X_1,\dots,X_n$. We suppose that the process is either $\beta$-mixing or $\tau$-mixing. We provide a model selection procedure based on a generalization of Mallows' $C_p$, and we prove oracle inequalities for the selected estimator under a few prior assumptions on the collection of models and on the mixing coefficients. We prove that our estimator is adaptive over a class of Besov spaces; namely, it achieves the same rates of convergence as in the i.i.d. framework.
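
    The abstract does not spell out the estimator family, so the following is a minimal sketch of Mallows-type penalized model selection for density estimation, assuming projection (histogram) estimators on [0, 1] and a penalty pen(m) = c * D_m / n; the constant c, which in the dependent setting would absorb the mixing coefficients, is a placeholder:

        import numpy as np

        def histogram_estimator(x, n_bins):
            """Projection estimator of the density on the space of
            piecewise-constant functions over a regular partition of [0, 1]."""
            counts, _ = np.histogram(x, bins=n_bins, range=(0.0, 1.0))
            h = 1.0 / n_bins
            return counts / (len(x) * h)  # bin heights of the estimated density

        def select_model(x, max_bins=64, c=2.0):
            """Penalized model selection: for histogram models the empirical
            contrast reduces to -||s_m||^2, so we minimize it plus c * D_m / n
            over the model dimension D_m (the number of bins)."""
            n = len(x)
            best, best_crit = None, np.inf
            for d in range(1, max_bins + 1):
                heights = histogram_estimator(x, d)
                contrast = -np.sum(heights ** 2) / d   # = -||s_m||^2 (bin width 1/d)
                crit = contrast + c * d / n            # penalized criterion
                if crit < best_crit:
                    best, best_crit = (d, heights), crit
            return best

        rng = np.random.default_rng(0)
        x = rng.beta(2.0, 5.0, size=2000)   # i.i.d. stand-in for the mixing process
        d, heights = select_model(x)
        print(f"selected {d} bins")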

    Rho-estimators for shape restricted density estimation

    The purpose of this paper is to pursue our study of ρ-estimators built from i.i.d. observations that we defined in Baraud et al. (2014). For a ρ-estimator based on some model S (which means that the estimator belongs to S) and a true distribution of the observations that also belongs to S, the risk (with squared Hellinger loss) is bounded by a quantity which can be viewed as a dimension function of the model and is often related to the “metric dimension” of this model, as defined in Birgé (2006). This is a minimax point of view, and it is well known that it is pessimistic. Typically, the bound is accurate for most points in the model but may be very pessimistic when the true distribution belongs to some specific part of it. This is the situation that we want to investigate here. For some models, like the set of decreasing densities on [0, 1], there exist specific points in the model that we shall call extremal and for which the risk is substantially smaller than the typical risk. Moreover, the risk at a non-extremal point of the model can be bounded by the sum of the risk bound at a well-chosen extremal point plus the square of its distance to this point. This implies that if the true density is close enough to an extremal point, the risk at this point may be smaller than the minimax risk on the model, and this actually remains true even if the true density does not belong to the model. The result is based on some refined bounds on the suprema of empirical processes that are established in Baraud (2016).
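
    In symbols, the phenomenon described above reads schematically as follows, with $h$ the Hellinger distance, $\hat s$ the ρ-estimator, $\bar s$ a well-chosen extremal point of the model, and $R(\bar s)$ the risk bound at that extremal point (a paraphrase of the abstract's statement with constants suppressed, not the paper's exact inequality):

        \mathbb{E}\big[h^2(s, \hat s)\big] \;\lesssim\; R(\bar s) \;+\; h^2(s, \bar s)

    When the true density $s$ is close enough to $\bar s$, the right-hand side can fall below the minimax risk over the whole model, and, as the abstract notes, $s$ need not even belong to the model.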

    Testing probability distributions underlying aggregated data

    In this paper, we analyze and study a hybrid model for testing and learning probability distributions. Here, in addition to samples, the testing algorithm is provided with one of two different types of oracles to the unknown distribution $D$ over $[n]$. More precisely, we define both the dual and cumulative dual access models, in which the algorithm $A$ can both sample from $D$ and, respectively, for any $i \in [n]$: query the probability mass $D(i)$ (query access); or get the total mass of $\{1,\dots,i\}$, i.e. $\sum_{j=1}^{i} D(j)$ (cumulative access). These two models, by generalizing the previously studied sampling and query oracle models, allow us to bypass the strong lower bounds established for a number of problems in these settings, while capturing several interesting aspects of these problems -- and providing new insight on the limitations of the models. Finally, we show that while the testing algorithms can be in most cases strictly more efficient, some tasks remain hard even with this additional power.
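
    The two access models are straightforward to state in code. Below is a minimal sketch of an oracle granting all three access types to a distribution $D$ over $[n]$; the class and method names are illustrative, not from the paper:

        import bisect
        import random

        class CumulativeDualOracle:
            """Toy oracle over a known distribution D on {1, ..., n}, exposing
            sampling, point-mass queries D(i), and cumulative queries
            sum_{j <= i} D(j), as in the dual / cumulative dual models."""

            def __init__(self, masses):
                assert abs(sum(masses) - 1.0) < 1e-9
                self.masses = masses
                self.cdf, acc = [], 0.0
                for p in masses:
                    acc += p
                    self.cdf.append(acc)

            def sample(self):
                """Standard sampling access: draw i ~ D by inverse CDF."""
                return bisect.bisect_left(self.cdf, random.random()) + 1

            def query(self, i):
                """Dual (query) access: the probability mass D(i)."""
                return self.masses[i - 1]

            def cquery(self, i):
                """Cumulative dual access: total mass of {1, ..., i}."""
                return self.cdf[i - 1]

        D = CumulativeDualOracle([0.1, 0.2, 0.3, 0.4])
        print(D.sample(), D.query(2), D.cquery(3))   # e.g. 4 0.2 0.6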

    Adaptive management for ecosystem services

    Management of natural resources for the production of ecosystem services, which are vital for human well-being, is necessary even when there is uncertainty regarding system response to management action. This uncertainty results from incomplete controllability, complex internal feedbacks and nonlinearities that often interfere with desired management outcomes, and insufficient understanding of nature and people. Adaptive management was developed to reduce such uncertainty. We present a framework for the application of adaptive management for ecosystem services that explicitly accounts for cross-scale tradeoffs in the production of ecosystem services. Our framework focuses on identifying key spatiotemporal scales (plot, patch, ecosystem, landscape, and region) that encompass dominant structures and processes in the system, and includes within- and cross-scale dynamics, ecosystem service tradeoffs, and management controllability within and across scales. Resilience theory recognizes that a limited set of ecological processes in a given system regulate ecosystem services, yet these processes themselves remain poorly understood. If management actions erode or remove these processes, the system may shift into an alternative state unlikely to support the production of desired services. Adaptive management provides a process to assess the underlying within- and cross-scale tradeoffs associated with production of ecosystem services while proceeding with management designed to meet the demands of a growing human population.

    Learning Poisson Binomial Distributions

    We consider a basic problem in unsupervised learning: learning an unknown Poisson Binomial Distribution. A Poisson Binomial Distribution (PBD) over $\{0,1,\dots,n\}$ is the distribution of a sum of $n$ independent Bernoulli random variables which may have arbitrary, potentially non-equal, expectations. These distributions were first studied by S. Poisson in 1837 and are a natural $n$-parameter generalization of the familiar Binomial Distribution. Surprisingly, prior to our work this basic learning problem was poorly understood, and known results for it were far from optimal. We essentially settle the complexity of the learning problem for this basic class of distributions. As our first main result, we give a highly efficient algorithm which learns to $\epsilon$-accuracy (with respect to the total variation distance) using $\tilde{O}(1/\epsilon^3)$ samples, independent of $n$. The running time of the algorithm is quasilinear in the size of its input data, i.e., $\tilde{O}(\log(n)/\epsilon^3)$ bit-operations. (Observe that each draw from the distribution is a $\log(n)$-bit string.) Our second main result is a proper learning algorithm that learns to $\epsilon$-accuracy using $\tilde{O}(1/\epsilon^2)$ samples, and runs in time $(1/\epsilon)^{\mathrm{poly}(\log(1/\epsilon))} \cdot \log n$. This is nearly optimal, since any algorithm for this problem must use $\Omega(1/\epsilon^2)$ samples. We also give positive and negative results for some extensions of this learning problem to weighted sums of independent Bernoulli random variables.
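
    As a toy illustration of one natural ingredient of such a learner, the sketch below fits a shifted Binomial to the sample mean and variance by moment matching. This mirrors only the heavy, binomial-like regime and is a hedged stand-in, not the paper's algorithm, which also handles sparse PBDs and selects among candidate hypotheses:

        import numpy as np

        def fit_shifted_binomial(samples):
            """Moment matching: for Binomial(k, q), mean = k q and
            variance = k q (1 - q); a PBD always has variance <= mean,
            so q = 1 - var/mean is well defined (up to sampling noise)."""
            mu, var = np.mean(samples), np.var(samples)
            q = min(max(1.0 - var / mu, 1e-12), 1.0) if mu > 0 else 0.5
            k = max(int(round(mu / q)), 1)
            shift = int(round(mu - k * q))   # recentre if the fit is offset
            return shift, k, q

        rng = np.random.default_rng(1)
        p = rng.uniform(0.6, 0.9, size=50)                    # Bernoulli means
        samples = (rng.random((5000, 50)) < p).sum(axis=1)    # draws from the PBD
        print(fit_shifted_binomial(samples))                  # roughly (0, 50, 0.75)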

    Low Complexity Regularization of Linear Inverse Problems

    Inverse problems and regularization theory form a central theme in contemporary signal processing, where the goal is to reconstruct an unknown signal from partial, indirect, and possibly noisy measurements of it. A now standard method for recovering the unknown signal is to solve a convex optimization problem that enforces some prior knowledge about its structure. This has proved efficient in many problems routinely encountered in imaging sciences, statistics and machine learning. This chapter delivers a review of recent advances in the field where the regularization prior promotes solutions conforming to some notion of simplicity/low-complexity. These priors encompass as popular examples sparsity and group sparsity (to capture the compressibility of natural signals and images), total variation and analysis sparsity (to promote piecewise regularity), and low-rank (as a natural extension of sparsity to matrix-valued data). Our aim is to provide a unified treatment of all these regularizations under a single umbrella, namely the theory of partial smoothness. This framework is very general and accommodates all the low-complexity regularizers just mentioned, as well as many others. Partial smoothness turns out to be the canonical way to encode low-dimensional models that can be linear spaces or more general smooth manifolds. This review is intended to serve as a one-stop shop toward the understanding of the theoretical properties of the so-regularized solutions. It covers a large spectrum including: (i) recovery guarantees and stability to noise, both in terms of $\ell^2$-stability and model (manifold) identification; (ii) sensitivity analysis to perturbations of the parameters involved (in particular the observations), with applications to unbiased risk estimation; (iii) convergence properties of the forward-backward proximal splitting scheme, which is particularly well suited to solve the corresponding large-scale regularized optimization problem.
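
    Point (iii) is easy to make concrete for the sparsity prior: for min_x 0.5 ||Ax - y||^2 + lam ||x||_1, the forward-backward scheme alternates a gradient step on the smooth data-fidelity term with the proximal operator of the $\ell^1$ norm, i.e. soft-thresholding. A minimal sketch (any other low-complexity regularizer would swap in its own proximal map):

        import numpy as np

        def soft_threshold(x, t):
            """Proximal operator of t * ||.||_1 (soft-thresholding)."""
            return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

        def forward_backward_l1(A, y, lam, n_iter=500):
            """Forward-backward splitting for 0.5 * ||Ax - y||^2 + lam * ||x||_1."""
            step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz const. of grad
            x = np.zeros(A.shape[1])
            for _ in range(n_iter):
                grad = A.T @ (A @ x - y)                          # forward step
                x = soft_threshold(x - step * grad, step * lam)   # backward step
            return x

        rng = np.random.default_rng(2)
        A = rng.standard_normal((40, 100))
        x0 = np.zeros(100)
        x0[[3, 17, 60]] = [2.0, -1.5, 1.0]                 # sparse ground truth
        y = A @ x0 + 0.01 * rng.standard_normal(40)
        x_hat = forward_backward_l1(A, y, lam=0.1)
        print(np.nonzero(np.abs(x_hat) > 0.1)[0])          # recovered support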

    Wavelet penalized likelihood estimation in generalized functional models

    The paper deals with generalized functional regression. The aim is to estimate the influence of covariates on observations drawn from an exponential-family distribution. The link considered has a semiparametric expression: if we are interested in a functional influence of some covariates, we allow others to be modeled linearly. We thus consider a generalized partially linear regression model with unknown regression coefficients and an unknown nonparametric function. We present a maximum penalized likelihood procedure to estimate the components of the model, introducing penalty-based wavelet estimators. Asymptotic rates of the estimates of both the parametric and the nonparametric part of the model are given, and quasi-minimax optimality is obtained under conditions usual in the literature. We establish in particular that the LASSO penalty leads to an adaptive estimation with respect to the regularity of the estimated function. An algorithm based on backfitting and Fisher scoring is also proposed for implementation. Simulations are used to illustrate the finite-sample behaviour, including a comparison with kernel- and spline-based methods.
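
    The LASSO-penalized wavelet step has a closed form: penalizing the wavelet coefficients by their $\ell^1$ norm amounts to soft-thresholding them. Below is a minimal sketch of that inner smoothing step for a Gaussian working response, with a hand-rolled orthonormal Haar transform standing in for the paper's wavelet bases; the surrounding backfitting/Fisher-scoring loop for the full generalized model is omitted:

        import numpy as np

        def haar_dwt(v):
            """Full orthonormal Haar decomposition of a length-2^J signal."""
            coeffs, a = [], v.astype(float)
            while len(a) > 1:
                coeffs.append((a[0::2] - a[1::2]) / np.sqrt(2.0))  # details
                a = (a[0::2] + a[1::2]) / np.sqrt(2.0)             # approximation
            coeffs.append(a)
            return coeffs

        def haar_idwt(coeffs):
            a = coeffs[-1]
            for d in reversed(coeffs[:-1]):
                out = np.empty(2 * len(a))
                out[0::2] = (a + d) / np.sqrt(2.0)
                out[1::2] = (a - d) / np.sqrt(2.0)
                a = out
            return a

        def wavelet_lasso_smooth(y, lam):
            """Soft-threshold the detail coefficients (the l1/LASSO penalty
            acting coordinatewise); the coarse scaling coefficient is kept."""
            coeffs = haar_dwt(y)
            details = [np.sign(d) * np.maximum(np.abs(d) - lam, 0.0)
                       for d in coeffs[:-1]]
            return haar_idwt(details + [coeffs[-1]])

        t = np.linspace(0.0, 1.0, 256)
        f = np.where(t < 0.5, np.sin(4 * np.pi * t), -1.0)   # piecewise signal
        y = f + 0.2 * np.random.default_rng(3).standard_normal(256)
        f_hat = wavelet_lasso_smooth(y, lam=0.5)             # denoised estimate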