Search CORE

6 research outputs found

Unsupervised Learning via Mixtures of Skewed Distributions with Hypercube Contours

Author: Browne Ryan P.; Franczak Brian C.; McNicholas Paul D.; Tortora Cristina
Publication venue: 'Elsevier BV'
Publication date: 17/09/2014

Mixture models whose components have skewed hypercube contours are developed via a generalization of the multivariate shifted asymmetric Laplace density. Specifically, we develop mixtures of multiple scaled shifted asymmetric Laplace distributions. The component densities have two unique features: they include a multivariate weight function, and the marginal distributions are also asymmetric Laplace. We use these mixtures of multiple scaled shifted asymmetric Laplace distributions for clustering applications, but they could equally well be used in the supervised or semi-supervised paradigms. The expectation-maximization algorithm is used for parameter estimation and the Bayesian information criterion is used for model selection. Simulated and real data sets are used to illustrate the approach and, in some cases, to visualize the skewed hypercube structure of the components.

arXiv.org e-Print Archive

Multivariate, Heteroscedastic Empirical Bayes via Nonparametric Maximum Likelihood

Author: Guntuboyina Adityanand; Sen Bodhisattva; Soloff Jake A.
Publication date: 29/12/2023

Multivariate, heteroscedastic errors complicate statistical inference in many large-scale denoising problems. Empirical Bayes is attractive in such settings, but standard parametric approaches rest on assumptions about the form of the prior distribution which can be hard to justify and which introduce unnecessary tuning parameters. We extend the nonparametric maximum likelihood estimator (NPMLE) for Gaussian location mixture densities to allow for multivariate, heteroscedastic errors. NPMLEs estimate an arbitrary prior by solving an infinite-dimensional, convex optimization problem; we show that this convex optimization problem can be tractably approximated by a finite-dimensional version. The empirical Bayes posterior means based on an NPMLE have low regret, meaning they closely target the oracle posterior means one would compute with the true prior in hand. We prove an oracle inequality implying that the empirical Bayes estimator performs at nearly the optimal level (up to logarithmic factors) for denoising without prior knowledge. We provide finite-sample bounds on the average Hellinger accuracy of an NPMLE for estimating the marginal densities of the observations. We also demonstrate the adaptive and nearly-optimal properties of NPMLEs for deconvolution. We apply our method to two denoising problems in astronomy, constructing a fully data-driven color-magnitude diagram of 1.4 million stars in the Milky Way and investigating the distribution of 19 chemical abundance ratios for 27 thousand stars in the red clump. We also apply our method to hierarchical linear models, illustrating the advantages of nonparametric shrinkage of regression coefficients on an education data set and on a microarray data set.
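
The finite-dimensional approximation mentioned in this abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: a one-dimensional simplification (the paper treats the multivariate case), a fixed grid of candidate atoms for the prior, and a plain EM fixed-point update for the mixing weights. The function names `fit_grid_npmle`, `posterior_means`, and `log_likelihood`, and the toy data, are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) evaluated at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def log_likelihood(weights, lik):
    """Marginal log-likelihood of the observations under the grid prior."""
    return sum(
        math.log(sum(w * l for w, l in zip(weights, row))) for row in lik
    )


def fit_grid_npmle(obs, sds, atoms, n_iter=200):
    """Estimate prior weights on a fixed grid of atoms (assumed setup).

    Each observation obs[i] is modelled as atom + N(0, sds[i]^2) noise,
    so the noise level may differ per observation (heteroscedastic).
    """
    # lik[i][j]: likelihood of observation i if its true mean were atoms[j].
    lik = [[normal_pdf(x, a, s) for a in atoms] for x, s in zip(obs, sds)]
    weights = [1.0 / len(atoms)] * len(atoms)
    for _ in range(n_iter):
        # Marginal density of each observation under the current weights.
        mix = [sum(w * l for w, l in zip(weights, row)) for row in lik]
        # EM update: average posterior probability assigned to each atom.
        weights = [
            w * sum(row[j] / m for row, m in zip(lik, mix)) / len(obs)
            for j, w in enumerate(weights)
        ]
    return weights, lik


def posterior_means(weights, lik, atoms):
    """Empirical Bayes posterior mean for each observation."""
    means = []
    for row in lik:
        post = [w * l for w, l in zip(weights, row)]
        total = sum(post)
        means.append(sum(p * a for p, a in zip(post, atoms)) / total)
    return means


# Toy data (invented for illustration): two loose clusters, unequal noise.
obs = [-2.1, -1.8, -2.4, 1.9, 2.2, 2.6, 0.1]
sds = [0.5, 1.0, 0.7, 0.5, 1.2, 0.8, 0.6]
atoms = [-3.0 + 0.5 * j for j in range(13)]

weights, lik = fit_grid_npmle(obs, sds, atoms)
shrunk = posterior_means(weights, lik, atoms)
```

With the atoms fixed, the problem is convex in the weights alone, which is why a simple fixed-point iteration is enough for a sketch like this one. A dedicated convex solver would be the more usual choice at scale.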

arXiv.org e-Print Archive

Subspace coverings with multiplicities

Author: Bishnoi Anurag; Boyadzhiyska Simona; Das Shagnik; Mészáros Tamás
Publication date: 28/01/2021

We study the problem of determining the minimum number $f(n,k,d)$ of affine subspaces of codimension $d$ that are required to cover all points of $\mathbb{F}_2^n\setminus \{\vec{0}\}$ at least $k$ times while covering the origin at most $k-1$ times. The case $k=1$ is a classic result of Jamison, which was independently obtained by Brouwer and Schrijver for $d = 1$. The value of $f(n,1,1)$ also follows from a well-known theorem of Alon and Füredi about coverings of finite grids in affine spaces over arbitrary fields. Here we determine the value of this function exactly in various ranges of the parameters. In particular, we prove that for $k \ge 2^{n-d-1}$ we have $f(n,k,d)=2^d k - \left\lfloor \frac{k}{2^{n-d}} \right\rfloor$, while for $n > 2^{2^d k-k-d+1}$ we have $f(n,k,d)= n + 2^dk-d-2$, and also study the transition between these two ranges. While previous work in this direction has primarily employed the polynomial method, we prove our results through more direct combinatorial and probabilistic arguments, and also exploit a connection to coding theory.
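
The two closed-form expressions quoted in this abstract are easy to evaluate directly. The sketch below is only an illustration, not code from the paper. The function names are hypothetical, and the restriction 1 <= d < n is an assumption made here to keep the exponents non-negative. The range conditions are applied literally as quoted, so any further hypotheses stated in the paper itself (for example on small k) are not encoded.

```python
def _check_parameters(n, k, d):
    # Assumed parameter domain for this sketch, not taken from the paper.
    if not (1 <= d < n and k >= 1):
        raise ValueError("this sketch assumes 1 <= d < n and k >= 1")


def f_large_multiplicity(n, k, d):
    """Quoted value 2^d * k - floor(k / 2^(n-d)), for k >= 2^(n-d-1)."""
    _check_parameters(n, k, d)
    if k < 2 ** (n - d - 1):
        raise ValueError("k is below the quoted range k >= 2^(n-d-1)")
    return 2 ** d * k - k // 2 ** (n - d)


def f_large_dimension(n, k, d):
    """Quoted value n + 2^d * k - d - 2, for n > 2^(2^d*k - k - d + 1)."""
    _check_parameters(n, k, d)
    if n <= 2 ** (2 ** d * k - k - d + 1):
        raise ValueError("n is below the quoted range")
    return n + 2 ** d * k - d - 2


print(f_large_multiplicity(3, 2, 1))  # 2*2 - floor(2/4) = 4
print(f_large_dimension(5, 2, 1))     # 5 + 4 - 1 - 2 = 6
```

Parameters in the transition between the two ranges raise `ValueError` in both functions, since the abstract gives no closed form there.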

arXiv.org e-Print Archive

Institutional Repository of the Freie Universität Berlin

TU Delft Repository

University of Birmingham Research Portal

Space programs summary no. 37-49, volume 3 for the period December 1, 1967 to January 30, 1968. Supporting research and advanced development


Space program research projects on systems analysis and engineering, telecommunications, guidance and control, propulsion, and data system

NASA Technical Reports Server