Bayesian multivariate mixed-scale density estimation
Although continuous density estimation has received abundant attention in the
Bayesian nonparametrics literature, there is limited theory on multivariate
mixed-scale density estimation. In this note, we consider a general framework
to jointly model continuous, count and categorical variables under a
nonparametric prior, which is induced through rounding latent variables having
an unknown density with respect to Lebesgue measure. For the proposed class of
priors, we provide sufficient conditions for large support, strong consistency
and rates of posterior contraction. These conditions allow one to convert
sufficient conditions obtained in the setting of multivariate continuous
density estimation to the mixed-scale case. To illustrate the procedure, a
rounded multivariate nonparametric mixture of Gaussians is introduced and
applied to a crime and communities dataset.
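The rounding construction described in this abstract can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the authors' implementation: the two-component mixture weights, means, covariance, and the rounding thresholds are all hypothetical choices. It draws latent vectors from a Gaussian mixture, keeps the first coordinate continuous, rounds the second to a nonnegative count, and thresholds the third into three categories.

```python
import numpy as np

def sample_rounded_mixture(n, seed=0):
    """Draw mixed-scale data by rounding latent Gaussian-mixture variables.

    All numeric settings below are hypothetical and chosen for illustration.
    """
    rng = np.random.default_rng(seed)

    # Hypothetical two-component mixture over a 3-dimensional latent vector.
    weights = np.array([0.6, 0.4])
    means = np.array([[0.0, 1.5, -0.5],
                      [2.0, 4.0, 1.0]])
    cov = np.array([[1.0, 0.5, 0.3],
                    [0.5, 1.0, 0.2],
                    [0.3, 0.2, 1.0]])

    comp = rng.choice(len(weights), size=n, p=weights)
    latent = means[comp] + rng.multivariate_normal(np.zeros(3), cov, size=n)

    # Coordinate 0 stays continuous.
    continuous = latent[:, 0]
    # Coordinate 1 becomes a count: nonpositive values map to 0,
    # values in (j - 1, j] map to j (an assumed threshold scheme).
    counts = np.maximum(0, np.ceil(latent[:, 1])).astype(int)
    # Coordinate 2 becomes a category in {0, 1, 2} via assumed cut points 0 and 1.
    category = np.digitize(latent[:, 2], bins=[0.0, 1.0])

    return continuous, counts, category

continuous, counts, category = sample_rounded_mixture(1000)
```

Because the observed variables share one latent density, dependence between the continuous, count, and categorical coordinates is inherited from the latent covariance and the mixture structure.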
Recent advances in directional statistics
Mainstream statistical methodology is generally applicable to data observed
in Euclidean space. There are, however, numerous contexts of considerable
scientific interest in which the natural supports for the data under
consideration are Riemannian manifolds like the unit circle, torus, sphere and
their extensions. Typically, such data can be represented using one or more
directions, and directional statistics is the branch of statistics that deals
with their analysis. In this paper we provide a review of the many recent
developments in the field since the publication of Mardia and Jupp (1999),
still the most comprehensive text on directional statistics. Many of those
developments have been stimulated by interesting applications in fields as
diverse as astronomy, medicine, genetics, neurology, aeronautics, acoustics,
image analysis, text mining, environmetrics, and machine learning. We begin by
considering developments for the exploratory analysis of directional data
before progressing to distributional models, general approaches to inference,
hypothesis testing, regression, nonparametric curve estimation, methods for
dimension reduction, classification and clustering, and the modelling of time
series, spatial and spatio-temporal data. An overview of currently available
software for analysing directional data is also provided, and potential future
developments discussed.
Comment: 61 pages
Inconsistency of Pitman-Yor process mixtures for the number of components
In many applications, a finite mixture is a natural model, but it can be
difficult to choose an appropriate number of components. To circumvent this
choice, investigators are increasingly turning to Dirichlet process mixtures
(DPMs) and, more generally, Pitman-Yor process mixtures (PYMs). While these
models may be well-suited for Bayesian density estimation, many investigators
are using them for inferences about the number of components, by considering
the posterior on the number of components represented in the observed data. We
show that this posterior is not consistent; that is, on data from a finite
mixture, it does not concentrate at the true number of components. This result
applies to a large class of nonparametric mixtures, including DPMs and PYMs,
over a wide variety of families of component distributions, including
essentially all discrete families, as well as continuous exponential families
satisfying mild regularity conditions (such as multivariate Gaussians).
Comment: This is a general treatment of the problem discussed in our related
article, "A simple example of Dirichlet process mixture inconsistency for the
number of components", Miller and Harrison (2013) arXiv:1301.270
On approximating copulas by finite mixtures
Copulas are now frequently used to approximate or estimate multivariate
distributions because of their ability to take into account the multivariate
dependence of the variables while controlling the approximation properties of
the marginal densities. Copula-based multivariate models can often also be more
parsimonious than fitting a flexible multivariate model, such as a mixture of
normals model, directly to the data. However, to be effective, it is imperative
that the family of copula models considered is sufficiently flexible. Although
finite mixtures of copulas have been used to construct flexible families of
copulas, their approximation properties are not well understood and we show
that natural candidates such as mixtures of elliptical copulas and mixtures of
Archimedean copulas cannot approximate a general copula arbitrarily well. Our
article develops fundamental tools for approximating a general copula
arbitrarily well by a mixture and proposes a family of finite mixtures that can
do so. We illustrate empirically on a financial data set that our approach for
estimating a copula can be much more parsimonious and results in a better fit
than approximating the copula by a mixture of normal copulas.
Comment: 26 pages, 1 figure and 2 tables
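For readers unfamiliar with the benchmark named in this abstract, a finite mixture of normal (Gaussian) copulas can be sampled as sketched below. This is a minimal illustration under assumptions, not the family the article proposes: the function name, the bivariate restriction, and the example weights and correlations are all hypothetical.

```python
import math
import numpy as np

def sample_gaussian_copula_mixture(n, weights, correlations, seed=0):
    """Sample n points on the unit square from a mixture of bivariate Gaussian copulas.

    weights and correlations are illustrative inputs, one entry per component.
    """
    rng = np.random.default_rng(seed)
    weights = np.asarray(weights, dtype=float)

    # Pick a mixture component for each draw.
    comp = rng.choice(len(weights), size=n, p=weights)

    # Draw correlated standard normals from the chosen component.
    z = np.empty((n, 2))
    for k, rho in enumerate(correlations):
        idx = comp == k
        cov = np.array([[1.0, rho], [rho, 1.0]])
        z[idx] = rng.multivariate_normal(np.zeros(2), cov, size=int(idx.sum()))

    # Map each coordinate through the standard normal CDF so that
    # every margin is uniform on [0, 1].
    std_normal_cdf = np.vectorize(
        lambda t: 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))
    )
    return std_normal_cdf(z)

u = sample_gaussian_copula_mixture(4000, weights=[0.5, 0.5], correlations=[0.8, -0.6])
```

Since a convex combination of copulas is itself a copula, each column of `u` is marginally uniform, while the joint dependence blends the positive and negative correlation of the two components.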