222 research outputs found

Recent advances in directional statistics

Author: García-Portugués Eduardo
Pewsey Arthur
Publication date: 22/09/2020

Mainstream statistical methodology is generally applicable to data observed in Euclidean space. There are, however, numerous contexts of considerable scientific interest in which the natural supports for the data under consideration are Riemannian manifolds like the unit circle, torus, sphere and their extensions. Typically, such data can be represented using one or more directions, and directional statistics is the branch of statistics that deals with their analysis. In this paper we provide a review of the many recent developments in the field since the publication of Mardia and Jupp (1999), still the most comprehensive text on directional statistics. Many of those developments have been stimulated by interesting applications in fields as diverse as astronomy, medicine, genetics, neurology, aeronautics, acoustics, image analysis, text mining, environmetrics, and machine learning. We begin by considering developments for the exploratory analysis of directional data before progressing to distributional models, general approaches to inference, hypothesis testing, regression, nonparametric curve estimation, methods for dimension reduction, classification and clustering, and the modelling of time series, spatial and spatio-temporal data. An overview of currently available software for analysing directional data is also provided, and potential future developments discussed. Comment: 61 pages
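
As a small illustration of why Euclidean summaries can mislead on the circle, the sketch below compares the arithmetic mean of two compass angles with their mean direction (the angle of the summed unit vectors). The function name `circular_mean_deg` and the sample angles are assumptions made for this example; they are not taken from the paper.

```python
import math

def circular_mean_deg(angles_deg):
    """Mean direction of angles given in degrees, wrapped modulo 360."""
    # Sum the unit vectors (cos a, sin a) and take the angle of the resultant.
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(s, c)) % 360.0

# Two directions on either side of north (hypothetical sample data).
angles = [350.0, 20.0]

arithmetic = sum(angles) / len(angles)   # treats angles as points on a line
circular = circular_mean_deg(angles)     # treats angles as points on a circle

print(f"arithmetic mean: {arithmetic:.1f} degrees")
print(f"mean direction:  {circular:.1f} degrees")
```

The arithmetic mean points almost due south, while the mean direction stays close to north, between the two observations.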

arXiv.org e-Print Archive

Crossref

Universidad Carlos III de Madrid e-Archivo

Topic Modelling Meets Deep Neural Networks: A Survey

Author: Buntine Wray
Du Lan
Huynh Viet
Jin Yuan
Phung Dinh
Zhao He
Publication date: 01/01/2021

Topic modelling has been a successful technique for text analysis for almost twenty years. When topic modelling met deep neural networks, there emerged a new and increasingly popular research area, neural topic models, with over a hundred models developed and a wide range of applications in neural language understanding such as text generation, summarisation and language models. There is a need to summarise research developments and discuss open problems and future directions. In this paper, we provide a focused yet comprehensive overview of neural topic models for interested researchers in the AI community, to help them navigate and innovate in this fast-growing research area. To the best of our knowledge, ours is the first review focusing on this specific topic. Comment: A review on Neural Topic Models

arXiv.org e-Print Archive

Monash University Research Portal

Spatial and temporal predictions for positive vectors

Author: Graja Omar
Publication date: 23/11/2020

Predicting a given pixel from surrounding neighboring pixels is of great interest for several image processing tasks. To model images, many researchers use Gaussian distributions. However, some data are obviously non-Gaussian, such as image clutter and texture. In such cases, predictors are hard to derive. In this thesis, we analytically derive a new non-linear predictor based on an inverted Dirichlet mixture. The non-linear combination of the neighbouring pixels and the combination of the mixture parameters prove efficient in predicting pixels. To demonstrate the efficacy of our predictor, we use two challenging tasks, namely object detection and image restoration. We also develop a pixel prediction framework based on a finite generalized inverted Dirichlet (GID) mixture model that has proven its efficiency in several machine learning applications. We propose a GID optimal predictor, and we learn its parameters using a likelihood-based approach combined with the Newton-Raphson method. We demonstrate the efficiency of our proposed approach through a challenging application, namely image inpainting, and we compare the experimental results with related-work methods. Finally, we build a new time series state space model based on the inverted Dirichlet distribution. We use the power steady modeling approach and we derive an analytical expression of the model latent variable using the maximum a posteriori technique. We also approximate the predictive density using local variational inference, and we validate our model on the electricity consumption time series dataset of Germany. A comparison with the Generalized Dirichlet state space model is conducted, and the results demonstrate the merits of our approach in modeling continuous positive vectors.

Concordia University Research Repository

A Tutorial on Bayesian Nonparametric Models

Author: Blei David M.
Gershman Samuel J.
Publication date: 04/08/2011

A key problem in statistical modeling is model selection: how to choose a model at an appropriate level of complexity. This problem appears in many settings, most prominently in choosing the number of clusters in mixture models or the number of factors in factor analysis. In this tutorial we describe Bayesian nonparametric methods, a class of methods that side-steps this issue by allowing the data to determine the complexity of the model. This tutorial is a high-level introduction to Bayesian nonparametric methods and contains several examples of their application. Comment: 28 pages, 8 figures
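
One common way to see how the data can determine the number of clusters is the Chinese restaurant process, a standard Bayesian nonparametric prior over partitions. The abstract does not name a specific model, so choosing this process here is an assumption, as are the function name `crp_table_sizes` and the concentration value `alpha=1.0`. The sketch only simulates the prior; it does not perform posterior inference.

```python
import random

def crp_table_sizes(n_customers, alpha, seed=0):
    """Simulate a Chinese restaurant process and return the cluster sizes.

    With n customers already seated, the next one joins existing cluster k
    with probability sizes[k] / (n + alpha) and opens a new cluster with
    probability alpha / (n + alpha).
    """
    rng = random.Random(seed)
    sizes = []
    for n in range(n_customers):
        r = rng.uniform(0.0, n + alpha)
        acc = 0.0
        for k, size in enumerate(sizes):
            acc += size
            if r < acc:
                sizes[k] += 1      # join an existing cluster
                break
        else:
            sizes.append(1)        # open a new cluster
    return sizes

# The number of clusters is not fixed in advance; it grows with the sample.
for n in (10, 100, 1000):
    print(n, "observations ->", len(crp_table_sizes(n, alpha=1.0)), "clusters")
```

Larger values of the hypothetical `alpha` make new clusters more likely, so it acts as a prior on complexity rather than a hard choice of the number of clusters.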

arXiv.org e-Print Archive

Princeton University Open Access Repository

CiteSeerX

Expanded Technical Report: Mapping Ancient Forests: Bayesian Inference for Spatio-temporal Trends in Forest Composition Using the Fossil Pollen Proxy Record

Author: McLachlan Jason S
Paciorek Christopher J.
Publication venue: Collection of Biostatistics Research Archive
Publication date: 11/09/2008

Collection Of Biostatistics Research Archive

Semiparametric Bayesian Density Estimation with Disparate Data Sources: A Meta-Analysis of Global Childhood Undernutrition

Author: Ezzati Majid
Finucane Mariel M.
Paciorek Christopher J.
Stevens Gretchen A.
Publication venue: Informa UK Limited
Publication date: 28/06/2014

Undernutrition, resulting in restricted growth, and quantified here using height-for-age z-scores, is an important contributor to childhood morbidity and mortality. Since all levels of mild, moderate and severe undernutrition are of clinical and public health importance, it is of interest to estimate the shape of the z-scores' distributions. We present a finite normal mixture model that uses data on 4.3 million children to make annual country-specific estimates of these distributions for under-5-year-old children in the world's 141 low- and middle-income countries between 1985 and 2011. We incorporate individual-level data when available, as well as aggregated summary statistics from studies whose individual-level data could not be obtained. We place a hierarchical Bayesian probit stick-breaking model on the mixture weights. The model allows for nonlinear changes in time, and it borrows strength in time, in covariates, and within and across regional country clusters to make estimates where data are uncertain, sparse, or missing. This work addresses three important problems that often arise in the fields of public health surveillance and global health monitoring. First, data are always incomplete. Second, different data sources commonly use different reporting metrics. Last, distributions, and especially their tails, are often of substantive interest. Comment: 41 total pages, 6 figures, 1 table
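
To make the probit stick-breaking idea concrete, the sketch below turns K-1 real-valued latent scores into K mixture weights by passing each score through the standard normal CDF and breaking off that fraction of the remaining stick. This is a minimal sketch of the generic construction only: the function names and the fixed input scores are assumptions for illustration, and the hierarchical structure over time, covariates and regions described in the abstract is not reproduced.

```python
import math

def normal_cdf(x):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def probit_stick_breaking(alphas):
    """Map K-1 real-valued scores to K mixture weights that sum to 1.

    Component k receives the fraction Phi(alpha_k) of the stick left over
    by components 1..k-1; the final component takes whatever remains.
    """
    weights = []
    remaining = 1.0
    for a in alphas:
        v = normal_cdf(a)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    weights.append(remaining)
    return weights

# Hypothetical latent scores; in a full model these would be inferred.
w = probit_stick_breaking([0.0, 0.0])
print(w, "sum =", sum(w))
```

Because each score is unconstrained on the real line, regression or hierarchical terms can be placed on the scores while the resulting weights remain valid probabilities.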

arXiv.org e-Print Archive

CiteSeerX