Automatic Bayesian Density Analysis
Making sense of a dataset in an automatic and unsupervised fashion is a
challenging problem in statistics and AI. Classical approaches for exploratory
data analysis are usually not flexible enough to deal with the uncertainty
inherent to real-world data: they are often restricted to fixed latent
interaction models and homogeneous likelihoods; they are sensitive to missing,
corrupt and anomalous data; moreover, their expressiveness generally comes at
the price of intractable inference. As a result, supervision from statisticians
is usually needed to find the right model for the data. However, since domain
experts are not necessarily also experts in statistics, we propose Automatic
Bayesian Density Analysis (ABDA) to make exploratory data analysis accessible
at large. Specifically, ABDA allows for automatic and efficient missing value
estimation, statistical data type and likelihood discovery, anomaly detection
and dependency structure mining, on top of providing accurate density
estimation. Extensive empirical evidence shows that ABDA is a suitable tool for
automatic exploratory analysis of mixed continuous and discrete tabular data.
Comment: In proceedings of the Thirty-Third AAAI Conference on Artificial
Intelligence (AAAI-19)
Text Classification of Installation Support Contract Topic Models for Category Management
The Air Force Installation Contracting Agency manages nearly 18 percent of total Air Force spend, equating to approximately 57 billion dollars. To improve strategic sourcing, the organization is beginning to categorize installation-support spend and assign accountable portfolio managers to the respective spend categories. A critical task in this new strategic environment is the appropriate categorization of Air Force contracts into newly created, manageable spend categories. Current composite categories can be further distinguished into sub-categories by applying text analytics to the contract descriptions. Furthermore, once the new categories are established, future contracts must be classified into them in order to be strategically managed. This research proposes a methodological framework for using Latent Dirichlet Allocation to sculpt categories from the natural distribution of contract topics, and assesses the appropriateness of supervised learning classification algorithms such as Support Vector Machines, Random Forests, and Weighted K-Nearest Neighbors models for classifying future unseen contracts. The results suggest a significant improvement in modeled spend categories over the existing categories, facilitating more accurate classification of unseen contracts into their respective sub-categories.
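The classification step described above can be illustrated with a minimal, standard-library-only sketch of a weighted k-nearest-neighbors text classifier. Everything here is an assumption made for illustration: the toy contract descriptions, the category labels, the bag-of-words features, and the use of cosine similarity as the vote weight are hypothetical and are not taken from the study, which also derives its categories with Latent Dirichlet Allocation rather than by hand.

```python
import math
from collections import Counter, defaultdict


def bag_of_words(text):
    """Lowercase, whitespace-tokenized term counts."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def weighted_knn_predict(train, query, k=3):
    """Predict a category for `query` from (description, category) pairs.

    Each of the k most similar training descriptions votes for its
    category with a weight equal to its cosine similarity to the query.
    """
    q = bag_of_words(query)
    scored = sorted(
        ((cosine_similarity(bag_of_words(text), q), label) for text, label in train),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = defaultdict(float)
    for similarity, label in scored[:k]:
        votes[label] += similarity
    return max(votes, key=votes.get)


# Hypothetical toy data; not drawn from any real contract set.
TRAIN = [
    ("grounds maintenance and lawn mowing services", "grounds"),
    ("tree trimming and lawn care", "grounds"),
    ("custodial cleaning of office buildings", "custodial"),
    ("janitorial services and floor cleaning", "custodial"),
    ("hvac repair and boiler maintenance", "facilities"),
    ("roof repair for hangar buildings", "facilities"),
]

if __name__ == "__main__":
    print(weighted_knn_predict(TRAIN, "lawn mowing and tree care"))
```

Weighting votes by similarity lets a close match outweigh several distant ones, which is the main difference from plain majority-vote k-nearest neighbors.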
Action classification using a discriminative multilevel HDP-HMM
We classify human actions occurring in depth image sequences using features based on skeletal joint positions. The action classes are represented by a multi-level Hierarchical Dirichlet Process – Hidden Markov Model (HDP-HMM). The non-parametric HDP-HMM allows the inference of hidden states automatically from training data. The model parameters of each class are formulated as transformations from a shared base distribution, thus promoting the use of unlabelled examples during training and borrowing information across action classes. Further, the parameters are learnt in a discriminative way. We use a normalized gamma process representation of the HDP and margin-based likelihood functions for this purpose. We sample parameters from the complex posterior distribution induced by our discriminative likelihood function using elliptical slice sampling. Experiments with two different datasets show that action class models learnt using our technique produce good classification results.
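For readers unfamiliar with elliptical slice sampling, the following is a minimal sketch of the generic update for a parameter vector with a standard normal prior. It is not the authors' model: the identity prior covariance, the one-dimensional Gaussian toy likelihood, and the function names `elliptical_slice_step` and `run_chain` are assumptions chosen so that the example is self-contained and its target posterior is known in closed form.

```python
import math
import random


def elliptical_slice_step(f, log_lik, rng):
    """One elliptical slice sampling update, assuming a N(0, I) prior on f."""
    # Auxiliary draw from the prior defines an ellipse through the current state.
    nu = [rng.gauss(0.0, 1.0) for _ in f]
    # Log-likelihood threshold; 1 - random() lies in (0, 1], so the log is finite.
    log_y = log_lik(f) + math.log(1.0 - rng.random())
    # Initial angle and bracket.
    theta = rng.uniform(0.0, 2.0 * math.pi)
    lo, hi = theta - 2.0 * math.pi, theta
    while True:
        proposal = [
            fi * math.cos(theta) + ni * math.sin(theta) for fi, ni in zip(f, nu)
        ]
        if log_lik(proposal) > log_y:
            return proposal
        # Shrink the bracket towards theta = 0 (the current state) and retry.
        if theta < 0.0:
            lo = theta
        else:
            hi = theta
        theta = rng.uniform(lo, hi)


def run_chain(n_samples, seed=0):
    """Toy chain: prior f ~ N(0, 1), one observation y = 2 with unit noise.

    The exact posterior for this assumed setup is N(1, 0.5).
    """
    rng = random.Random(seed)

    def log_lik(f):
        return -0.5 * (2.0 - f[0]) ** 2

    f = [0.0]
    samples = []
    for _ in range(n_samples):
        f = elliptical_slice_step(f, log_lik, rng)
        samples.append(f[0])
    return samples


if __name__ == "__main__":
    draws = run_chain(5000)
    print(sum(draws) / len(draws))
```

The update has no step-size parameter to tune, and every iteration returns an accepted state, which makes it convenient for posteriors induced by non-standard likelihoods such as the margin-based one described in the abstract.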
Combinatorial clustering and the beta negative binomial process
We develop a Bayesian nonparametric approach to a general family of latent
class problems in which individuals can belong simultaneously to multiple
classes and where each class can be exhibited multiple times by an individual.
We introduce a combinatorial stochastic process known as the negative binomial
process (NBP) as an infinite-dimensional prior appropriate for such problems.
We show that the NBP is conjugate to the beta process, and we characterize the
posterior distribution under the beta-negative binomial process (BNBP) and
hierarchical models based on the BNBP (the HBNBP). We study the asymptotic
properties of the BNBP and develop a three-parameter extension of the BNBP that
exhibits power-law behavior. We derive MCMC algorithms for posterior inference
under the HBNBP, and we present experiments using these algorithms in the
domains of image segmentation, object recognition, and document analysis.
Comment: 56 pages, 4 figures, 6 tables
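The conjugacy mentioned in the abstract has a simple finite-dimensional analogue for a single weight b, sketched below. The parameterization of the negative binomial (count n, parameter r, success probability b) is an assumption made for this illustration; the abstract itself does not fix one.

```latex
\begin{align*}
b &\sim \mathrm{Beta}(\alpha, \beta), \qquad
p(n \mid b) = \frac{\Gamma(n + r)}{n!\,\Gamma(r)}\, b^{n} (1 - b)^{r},
  \quad n = 0, 1, 2, \dots \\
p(b \mid n) &\propto b^{\alpha - 1} (1 - b)^{\beta - 1} \cdot b^{n} (1 - b)^{r}
  = b^{\alpha + n - 1} (1 - b)^{\beta + r - 1} \\
\Rightarrow\quad b \mid n &\sim \mathrm{Beta}(\alpha + n,\; \beta + r)
\end{align*}
```

Under this assumed parameterization, observing a count n simply adds n to the first beta parameter and r to the second, so the posterior stays in the beta family.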