A Geometric Reduction Approach for Identity Testing of Reversible Markov Chains

Authors: Watanabe Shun, Wolfer Geoffrey
Publication date: 15/02/2023

We consider the problem of testing the identity of a reversible Markov chain against a reference from a single trajectory of observations. Employing the recently introduced notion of a lumping-congruent Markov embedding, we show that, at least in a mildly restricted setting, testing identity to a reversible chain reduces to testing identity to a symmetric chain over a larger state space, and we recover state-of-the-art sample complexity for the problem.

arXiv.org e-Print Archive

The minimax risk in testing the histogram of discrete distributions for uniformity under missing ball alternatives

Author: Kipnis Alon
Publication date: 29/05/2023

We consider the problem of testing the fit of a discrete sample of items from many categories to the uniform distribution over the categories. As a class of alternative hypotheses, we consider the removal of an $\ell_p$ ball of radius $\epsilon$ around the uniform rate sequence for $p \leq 2$. We deliver a sharp characterization of the asymptotic minimax risk when $\epsilon \to 0$ as the number of samples and number of dimensions go to infinity, for testing based on the occurrences' histogram (number of absent categories, singletons, collisions, ...). For example, for $p=1$ and in the limit of a small expected number of samples $n$ compared to the number of categories $N$ (aka "sub-linear" regime), the minimax risk $R^*_\epsilon$ asymptotes to $2 \bar{\Phi}\left(n \epsilon^2/\sqrt{8N}\right)$, with $\bar{\Phi}(x)$ the normal survival function. Empirical studies over a range of problem parameters show that this estimate is accurate in finite samples, and that our test is significantly better than the chi-squared test or a test that only uses collisions. Our analysis is based on the asymptotic normality of histogram ordinates, the equivalence of the minimax setting to a Bayesian one, and the reduction of a multi-dimensional optimization problem to a one-dimensional problem.
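
As a rough illustration of the closed-form limit quoted in the abstract (the p=1, sub-linear case), the expression 2 * SF(n * eps^2 / sqrt(8N)) can be evaluated numerically. The sketch below is not the authors' code: the function names `normal_survival` and `asymptotic_minimax_risk` and the sample values of n, N, and eps are assumptions chosen only for illustration, and the formula is an asymptotic statement, so the numbers are indicative at best.

```python
import math


def normal_survival(x: float) -> float:
    """Standard normal survival function, 1 - Phi(x), via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def asymptotic_minimax_risk(n: float, N: float, eps: float) -> float:
    """Evaluate 2 * SF(n * eps^2 / sqrt(8 * N)), the limit quoted in the abstract for p = 1.

    n   : expected number of samples (hypothetical input)
    N   : number of categories (hypothetical input)
    eps : radius of the removed l_1 ball (hypothetical input)
    """
    return 2.0 * normal_survival(n * eps ** 2 / math.sqrt(8.0 * N))


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    for n in (1_000, 5_000, 20_000):
        print(n, asymptotic_minimax_risk(n, N=100_000, eps=0.5))
```

With N and eps held fixed, the expression decreases as n grows, and it equals 1 when n * eps^2 is zero.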

arXiv.org e-Print Archive