2,736 research outputs found

    Adaptive Threshold Sampling and Estimation

    Sampling is a fundamental problem in both computer science and statistics. A number of issues arise when designing a method based on sampling. These include statistical considerations, such as constructing a good sampling design and ensuring there are good, tractable estimators for the quantities of interest, as well as computational considerations, such as designing fast algorithms for streaming data and ensuring the sample fits within memory constraints. Unfortunately, existing sampling methods are only able to address all of these issues in limited scenarios. We develop a framework that can be used to address these issues in a broad range of scenarios. In particular, it addresses the problem of drawing and using samples under a memory budget constraint. This problem can be challenging since the memory budget forces samples to be drawn non-independently and consequently makes computation of the resulting estimators difficult. At the core of the framework is the notion of a data-adaptive thresholding scheme, where the threshold effectively allows one to treat the non-independent sample as if it were drawn independently. We provide sufficient conditions for a thresholding scheme to allow this and provide ways to build and compose such schemes. Furthermore, we provide fast algorithms to efficiently sample under these thresholding schemes.
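
    A minimal sketch of one well-known scheme in this spirit is priority sampling (Duffield, Lund and Thorup), where each item's random priority competes for a fixed budget of k slots and the largest evicted priority serves as an adaptive threshold that corrects the estimator. The sketch below is illustrative only: the function names, the budget k, and the exponential weights are assumptions, not the paper's framework.

```python
import heapq
import random

def priority_sample(stream, k):
    """Keep at most k weighted items from a stream using an adaptive priority threshold.

    Each item (key, weight) draws a priority weight / U with U ~ Uniform(0, 1).
    The k largest priorities are retained; the largest priority ever evicted is
    the threshold tau (i.e. the (k+1)-th largest priority seen so far)."""
    heap = []   # min-heap of (priority, key, weight), size <= k
    tau = 0.0
    for key, weight in stream:
        item = (weight / random.random(), key, weight)
        if len(heap) < k:
            heapq.heappush(heap, item)
        else:
            evicted = heapq.heappushpop(heap, item)
            tau = max(tau, evicted[0])
    return list(heap), tau

def estimate_total(sample, tau):
    """Unbiased estimate of the total weight: each retained item contributes max(w, tau)."""
    return sum(max(weight, tau) for _, _, weight in sample)

if __name__ == "__main__":
    random.seed(0)
    stream = [(i, random.expovariate(1.0)) for i in range(10_000)]
    sample, tau = priority_sample(stream, k=100)
    print("estimated total:", estimate_total(sample, tau))
    print("true total:     ", sum(w for _, w in stream))
```

    The point of the sketch is the role of the threshold: items are not retained independently, yet treating each retained item as if it were included with probability min(1, w / tau) yields an unbiased estimate of the total.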

    Multiscale likelihood analysis and complexity penalized estimation

    We describe here a framework for a certain class of multiscale likelihood factorizations wherein, in analogy to a wavelet decomposition of an L^2 function, a given likelihood function has an alternative representation as a product of conditional densities reflecting information in both the data and the parameter vector localized in position and scale. The framework is developed as a set of sufficient conditions for the existence of such factorizations, formulated in analogy to those underlying a standard multiresolution analysis for wavelets, and hence can be viewed as a multiresolution analysis for likelihoods. We then consider the use of these factorizations in the task of nonparametric, complexity-penalized likelihood estimation. We study the risk properties of certain thresholding and partitioning estimators, and demonstrate their adaptivity and near-optimality, in a minimax sense over a broad range of function spaces, based on squared Hellinger distance as a loss function. In particular, our results provide an illustration of how properties of classical wavelet-based estimators can be obtained in a single, unified framework that includes models for continuous, count and categorical data types.
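
    A standard example of such a factorization is the Poisson case: for independent counts on a dyadic grid, the joint likelihood equals the Poisson likelihood of the grand total times a product of binomial likelihoods, one per node of the dyadic tree, for the left-child count given its parent count. The sketch below checks this identity numerically; the function names and the example data are assumptions used only for illustration.

```python
import math

def poisson_loglik(counts, lam):
    """Direct log-likelihood of independent Poisson counts with intensities lam."""
    return sum(x * math.log(l) - l - math.lgamma(x + 1) for x, l in zip(counts, lam))

def multiscale_loglik(counts, lam):
    """The same log-likelihood written as a multiscale factorization: a Poisson term
    for the grand total plus one binomial 'splitting' term per node of the dyadic tree."""
    n = len(counts)
    total_x, total_l = sum(counts), sum(lam)
    # root: the grand total is Poisson(total intensity)
    ll = total_x * math.log(total_l) - total_l - math.lgamma(total_x + 1)

    def split(lo, hi):
        nonlocal ll
        if hi - lo <= 1:
            return
        mid = (lo + hi) // 2
        x_parent, x_left = sum(counts[lo:hi]), sum(counts[lo:mid])
        p = sum(lam[lo:mid]) / sum(lam[lo:hi])
        # left count | parent count ~ Binomial(parent count, p)
        ll += (math.lgamma(x_parent + 1) - math.lgamma(x_left + 1)
               - math.lgamma(x_parent - x_left + 1)
               + x_left * math.log(p) + (x_parent - x_left) * math.log(1.0 - p))
        split(lo, mid)
        split(mid, hi)

    split(0, n)
    return ll

if __name__ == "__main__":
    counts = [3, 0, 5, 2, 1, 4, 0, 2]
    lam = [2.0, 1.0, 4.0, 2.5, 1.5, 3.0, 0.5, 2.0]
    print(poisson_loglik(counts, lam))      # the two values agree,
    print(multiscale_loglik(counts, lam))   # illustrating the factorization
```

    In this representation, thresholding or partitioning acts on the per-node splitting terms, which is what localizes information in position and scale.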

    A model of Poissonian interactions and detection of dependence

    This paper proposes a model of interactions between two point processes, governed by a reproduction function h, which is considered as the intensity of a Poisson process. In particular, we focus on the context of neuroscience, where one wishes to detect possible interactions in the cerebral activity associated with two neurons. To provide a mathematical answer to this specific problem of neurobiologists, we thus address the question of testing whether the intensity h is null. We construct a multiple testing procedure obtained by the aggregation of single tests based on a wavelet thresholding method. This test has good theoretical properties: it is possible to guarantee not only the level but also the power under some assumptions, and its uniform separation rate over weak Besov bodies is adaptive minimax. Simulations are then provided, showing the good practical behavior of our testing procedure.
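
    As a rough illustration of the ingredients involved (not the paper's aggregated procedure), one can histogram the delays between the two spike trains as a crude proxy for h, take Haar wavelet coefficients of that histogram, and compare the largest coefficient to a null quantile obtained by randomly shifting one train. Every name, default value, and the shift-based calibration below are assumptions made only for the sketch.

```python
import numpy as np

def delay_histogram(spikes_a, spikes_b, t_max, n_bins):
    """Histogram of the delays (b - a) in (0, t_max] over all spike pairs,
    a crude empirical proxy for the reproduction function h."""
    d = spikes_b[None, :] - spikes_a[:, None]
    d = d[(d > 0.0) & (d <= t_max)]
    hist, _ = np.histogram(d, bins=n_bins, range=(0.0, t_max))
    return hist.astype(float)

def haar_details(signal):
    """All Haar detail coefficients of a length-2^J signal (orthonormal DWT)."""
    details, s = [], signal.copy()
    while len(s) > 1:
        even, odd = s[0::2], s[1::2]
        details.append((even - odd) / np.sqrt(2.0))
        s = (even + odd) / np.sqrt(2.0)
    return np.concatenate(details)

def test_no_interaction(spikes_a, spikes_b, t_max=0.05, n_bins=64,
                        n_shifts=100, alpha=0.05, rng=None):
    """Reject 'h = 0' if the largest |Haar coefficient| of the delay histogram
    exceeds its (1 - alpha) null quantile, estimated by circularly shifting train b."""
    if rng is None:
        rng = np.random.default_rng(0)
    period = max(np.max(spikes_a), np.max(spikes_b))
    stat = np.max(np.abs(haar_details(delay_histogram(spikes_a, spikes_b, t_max, n_bins))))
    null = []
    for _ in range(n_shifts):
        shifted = np.sort((spikes_b + rng.uniform(0.0, period)) % period)
        null.append(np.max(np.abs(haar_details(
            delay_histogram(spikes_a, shifted, t_max, n_bins)))))
    threshold = np.quantile(null, 1.0 - alpha)
    return stat > threshold, stat, threshold

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    a = np.sort(rng.uniform(0.0, 10.0, 300))   # two independent spike trains: h = 0,
    b = np.sort(rng.uniform(0.0, 10.0, 300))   # so the test should (usually) not reject
    print(test_no_interaction(a, b, rng=rng))
```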

    On adaptive wavelet estimation of a class of weighted densities

    We investigate the estimation of a weighted density of the form g = w(F)f, where f denotes an unknown density, F the associated distribution function, and w a known (non-negative) weight. Such a class encompasses many examples, including those arising in order statistics or when g is related to the maximum or the minimum of N (random or fixed) independent and identically distributed (i.i.d.) random variables. We here construct a new adaptive non-parametric estimator for g based on a plug-in approach and the wavelet methodology. For a wide class of models, we prove that it attains fast rates of convergence under the L_p risk with p ≥ 1 (not only for p = 2, corresponding to the mean integrated squared error) over Besov balls. The theoretical findings are illustrated through several simulations.
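
    A minimal plug-in sketch in this spirit: estimate f with a thresholded wavelet (histogram-based) estimator, estimate F with the empirical distribution function, and set g-hat = w(F-hat) f-hat. It relies on PyWavelets and a standard universal threshold; the function names, threshold choice, and binning are assumptions, not the estimator analyzed in the paper.

```python
import numpy as np
import pywt

def wavelet_density(x, a, b, n_bins=256, wavelet="haar"):
    """Hard-thresholded wavelet estimate of the density f from a histogram on [a, b]."""
    hist, edges = np.histogram(x, bins=n_bins, range=(a, b), density=True)
    coeffs = pywt.wavedec(hist, wavelet)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745           # noise scale from finest details
    lam = sigma * np.sqrt(2.0 * np.log(n_bins))              # universal threshold
    coeffs = [coeffs[0]] + [pywt.threshold(c, lam, mode="hard") for c in coeffs[1:]]
    f_hat = np.clip(pywt.waverec(coeffs, wavelet)[:n_bins], 0.0, None)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, f_hat

def weighted_density(x, w, a, b, n_bins=256):
    """Plug-in estimate of g = w(F) f: empirical CDF for F, wavelet estimate for f."""
    centers, f_hat = wavelet_density(x, a, b, n_bins)
    F_hat = np.searchsorted(np.sort(x), centers, side="right") / len(x)
    return centers, w(F_hat) * f_hat

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.normal(size=5000)
    N = 5                                   # g = N F^(N-1) f: density of the max of N i.i.d. draws
    centers, g_hat = weighted_density(sample, lambda u: N * u ** (N - 1), -4.0, 4.0)
    print("estimated mode of the maximum:", centers[np.argmax(g_hat)])
```

    The order-statistics case in the usage example, w(u) = N u^(N-1), is one of the weights mentioned in the abstract; any other known non-negative weight can be plugged in the same way.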

    Super-resolution community detection for layer-aggregated multilayer networks

    Applied network science often involves preprocessing network data before applying a network-analysis method, and there is typically a theoretical disconnect between these steps. For example, it is common to aggregate time-varying network data into windows prior to analysis, and the tradeoffs of this preprocessing are not well understood. Focusing on the problem of detecting small communities in multilayer networks, we study the effects of layer aggregation by developing random-matrix theory for modularity matrices associated with layer-aggregated networks with N nodes and L layers, which are drawn from an ensemble of Erdős-Rényi networks. We study phase transitions in which eigenvectors localize onto communities (allowing their detection) and which occur for a given community provided its size surpasses a detectability limit K*. When layers are aggregated via a summation, we obtain K* ∝ O(√(NL)/T), where T is the number of layers across which the community persists. Interestingly, if T is allowed to vary with L, then summation-based layer aggregation enhances small-community detection even if the community persists across a vanishing fraction of layers, provided that T/L decays more slowly than O(L^{-1/2}). Moreover, we find that thresholding the summation can in some cases cause K* to decay exponentially, decreasing by orders of magnitude in a phenomenon we call super-resolution community detection. That is, layer aggregation with thresholding is a nonlinear data filter enabling detection of communities that are otherwise too small to detect. Importantly, different thresholds generally enhance the detectability of communities having different properties, illustrating that community detection can be obscured if one analyzes network data using a single threshold.
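
    The effect described above can be illustrated with a small numpy experiment: sum the layers (optionally binarizing at a threshold), form the modularity matrix B = A - k kᵀ/(2m), and check how much of the leading eigenvector's mass falls on a planted community that persists in only T of the L layers. The planted-clique construction and all parameter values below are illustrative assumptions, not the Erdős-Rényi ensemble analyzed in the paper.

```python
import numpy as np

def aggregate(layers, threshold=None):
    """Sum the layer adjacency matrices; optionally binarize at `threshold`."""
    A = np.sum(layers, axis=0)
    if threshold is not None:
        A = (A >= threshold).astype(float)
    return A

def leading_modularity_eigenvector(A):
    """Leading eigenvector of the modularity matrix B = A - k k^T / (2m)."""
    k = A.sum(axis=1)
    B = A - np.outer(k, k) / k.sum()
    return np.linalg.eigh(B)[1][:, -1]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N, L, T, K, p = 200, 50, 10, 8, 0.05    # community of size K persists in the first T of L layers
    layers = []
    for layer in range(L):
        A = np.triu((rng.random((N, N)) < p).astype(float), 1)
        A = A + A.T                                          # undirected Erdos-Renyi noise layer
        if layer < T:
            A[:K, :K] = 1.0 - np.eye(K)                      # planted (clique-like) community
        layers.append(A)
    for thr in (None, T):                                    # raw summation vs. thresholded summation
        v = leading_modularity_eigenvector(aggregate(layers, thr))
        print("threshold =", thr, " eigenvector mass on community:",
              round(float(np.sum(v[:K] ** 2)), 3))
```

    Thresholding at T keeps only edges present in at least T layers, which suppresses most of the noise while retaining the persistent community, mirroring (in a very simplified setting) why the detectability limit drops so sharply.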

    Automatic Kalman-filter-based wavelet shrinkage denoising of 1D stellar spectra

    We propose a non-parametric method to denoise 1D stellar spectra based on wavelet shrinkage followed by adaptive Kalman thresholding. Wavelet shrinkage denoising involves applying the discrete wavelet transform (DWT) to the input signal, 'shrinking' certain frequency components in the transform domain, and then applying the inverse DWT to the reduced components. The performance of this procedure is influenced by the choice of base wavelet, the number of decomposition levels, and the thresholding function. Typically, these parameters are chosen by 'trial and error', which can be strongly dependent on the properties of the data being denoised. We here introduce an adaptive Kalman-filter-based thresholding method that eliminates the need for choosing the number of decomposition levels. We use the 'Haar' wavelet basis, which we found to provide excellent filtering for 1D stellar spectra at a low computational cost. We introduce various levels of Poisson noise into synthetic PHOENIX spectra, and test the performance of several common denoising methods against our own. Ours proves superior in terms of noise suppression and peak-shape preservation. We expect it may also be of use in automatically and accurately filtering low signal-to-noise galaxy and quasar spectra obtained from surveys such as SDSS, Gaia, LSST, PESSTO, VANDELS, LEGA-C, and DESI.
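
    As a point of reference for the pipeline described above (DWT, shrinkage of detail coefficients, inverse DWT), here is a baseline wavelet-shrinkage denoiser using PyWavelets with a standard universal (VisuShrink-style) threshold standing in for the paper's Kalman-filter-based adaptive thresholding. The synthetic spectrum and all names below are assumptions made for the sketch.

```python
import numpy as np
import pywt

def wavelet_denoise(flux, wavelet="haar", mode="soft"):
    """Wavelet-shrinkage denoising of a 1D spectrum:
    DWT -> threshold the detail coefficients -> inverse DWT."""
    coeffs = pywt.wavedec(flux, wavelet)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745         # robust noise estimate (finest scale)
    lam = sigma * np.sqrt(2.0 * np.log(len(flux)))         # universal (VisuShrink) threshold
    coeffs = [coeffs[0]] + [pywt.threshold(c, lam, mode=mode) for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[:len(flux)]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    wave = np.linspace(4000.0, 7000.0, 4096)                          # wavelength grid (Angstrom)
    clean = 1.0 + 0.5 * np.exp(-0.5 * ((wave - 6563.0) / 5.0) ** 2)   # continuum + emission line
    noisy = rng.poisson(200.0 * clean) / 200.0                        # Poisson photon noise
    denoised = wavelet_denoise(noisy)
    print("rms error before:", np.std(noisy - clean), " after:", np.std(denoised - clean))
```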