Search CORE

18,151 research outputs found

Automatic Bayesian Density Analysis

Author: Ghahramani Zoubin
Kersting Kristian
Molina Alejandro
Peharz Robert
Valera Isabel
Vergari Antonio
Publication venue
Publication date: 01/01/2019
Field of study

Making sense of a dataset in an automatic and unsupervised fashion is a challenging problem in statistics and AI. Classical approaches for {exploratory data analysis} are usually not flexible enough to deal with the uncertainty inherent to real-world data: they are often restricted to fixed latent interaction models and homogeneous likelihoods; they are sensitive to missing, corrupt and anomalous data; moreover, their expressiveness generally comes at the price of intractable inference. As a result, supervision from statisticians is usually needed to find the right model for the data. However, since domain experts are not necessarily also experts in statistics, we propose Automatic Bayesian Density Analysis (ABDA) to make exploratory data analysis accessible at large. Specifically, ABDA allows for automatic and efficient missing value estimation, statistical data type and likelihood discovery, anomaly detection and dependency structure mining, on top of providing accurate density estimation. Extensive empirical evidence shows that ABDA is a suitable tool for automatic exploratory analysis of mixed continuous and discrete tabular data.Comment: In proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19

arXiv.org e-Print Archive

TUbiblio

Pure OAI Repository

MPG.PuRe

Association for the Advancement of Artificial Intelligence: AAAI Publications

High-Dimensional Bayesian Geostatistics

Author: Banerjee Sudipto
Publication venue
Publication date: 01/01/2017
Field of study

With the growing capabilities of Geographic Information Systems (GIS) and user-friendly software, statisticians today routinely encounter geographically referenced data containing observations from a large number of spatial locations and time points. Over the last decade, hierarchical spatiotemporal process models have become widely deployed statistical tools for researchers to better understand the complex nature of spatial and temporal variability. However, fitting hierarchical spatiotemporal models often involves expensive matrix computations with complexity increasing in cubic order for the number of spatial locations and temporal points. This renders such models unfeasible for large data sets. This article offers a focused review of two methods for constructing well-defined highly scalable spatiotemporal stochastic processes. Both these processes can be used as "priors" for spatiotemporal random fields. The first approach constructs a low-rank process operating on a lower-dimensional subspace. The second approach constructs a Nearest-Neighbor Gaussian Process (NNGP) that ensures sparse precision matrices for its finite realizations. Both processes can be exploited as a scalable prior embedded within a rich hierarchical modeling framework to deliver full Bayesian inference. These approaches can be described as model-based solutions for big spatiotemporal datasets. The models ensure that the algorithmic complexity has

\sim n

floating point operations (flops), where

n

the number of spatial locations (per iteration). We compare these methods and provide some insight into their methodological underpinnings

arXiv.org e-Print Archive

Ezid

eScholarship - University of California

Multi-Target Prediction: A Unifying View on Problems and Methods

Multi-target prediction (MTP) is concerned with the simultaneous prediction of multiple target variables of diverse type. Due to its enormous application potential, it has developed into an active and rapidly expanding research field that combines several subfields of machine learning, including multivariate regression, multi-label classification, multi-task learning, dyadic prediction, zero-shot learning, network inference, and matrix completion. In this paper, we present a unifying view on MTP problems and methods. First, we formally discuss commonalities and differences between existing MTP problems. To this end, we introduce a general framework that covers the above subfields as special cases. As a second contribution, we provide a structured overview of MTP methods. This is accomplished by identifying a number of key properties, which distinguish such methods and determine their suitability for different types of problems. Finally, we also discuss a few challenges for future research

arXiv.org e-Print Archive

Crossref

Ghent University Academic Bibliography

Mind the Gap: Subspace based Hierarchical Domain Adaptation

Author: Namboodiri Vinay P.
Raj Anant
Tuytelaars Tinne
Publication venue
Publication date: 16/01/2015
Field of study

Domain adaptation techniques aim at adapting a classifier learnt on a source domain to work on the target domain. Exploiting the subspaces spanned by features of the source and target domains respectively is one approach that has been investigated towards solving this problem. These techniques normally assume the existence of a single subspace for the entire source / target domain. In this work, we consider the hierarchical organization of the data and consider multiple subspaces for the source and target domain based on the hierarchy. We evaluate different subspace based domain adaptation techniques under this setting and observe that using different subspaces based on the hierarchy yields consistent improvement over a non-hierarchical baselineComment: 4 pages in Second Workshop on Transfer and Multi-Task Learning: Theory meets Practice in NIPS 201

arXiv.org e-Print Archive

CiteSeerX

Bayesian modeling of networks in complex business intelligence problems

Author: Airoldi
Aldous
Azzalini
Banerjee
Bhattacharya
Bigelow
Dunson
Escobar
Fruchterman
Gershman
Griffiths
Hjort
Hoff
Hoff
Kaishev
Kamakura
Kamakura
Lau
Matiş
Medvedovic
Neal
Nowicki
Polson
Rousseau
Stephens
Thuring
Thuring
Verhoef
Publication venue: 'Wiley'
Publication date: 28/03/2016
Field of study

Complex network data problems are increasingly common in many fields of application. Our motivation is drawn from strategic marketing studies monitoring customer choices of specific products, along with co-subscription networks encoding multiple purchasing behavior. Data are available for several agencies within the same insurance company, and our goal is to efficiently exploit co-subscription networks to inform targeted advertising of cross-sell strategies to currently mono-product customers. We address this goal by developing a Bayesian hierarchical model, which clusters agencies according to common mono-product customer choices and co-subscription networks. Within each cluster, we efficiently model customer behavior via a cluster-dependent mixture of latent eigenmodels. This formulation provides key information on mono-product customer choices and multiple purchasing behavior within each cluster, informing targeted cross-sell strategies. We develop simple algorithms for tractable inference, and assess performance in simulations and an application to business intelligence

arXiv.org e-Print Archive

Archivio istituzionale della Ricerca - Bocconi

Crossref

Archivio istituzionale della ricerca - Università di Padova

A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning

Author: Brochu Eric
Cora Vlad M.
de Freitas Nando
Publication venue
Publication date: 01/01/2009
Field of study

We present a tutorial on Bayesian optimization, a method of finding the maximum of expensive cost functions. Bayesian optimization employs the Bayesian technique of setting a prior over the objective function and combining it with evidence to get a posterior function. This permits a utility-based selection of the next observation to make on the objective function, which must take into account both exploration (sampling from areas of high uncertainty) and exploitation (sampling areas likely to offer improvement over the current best observation). We also present two detailed extensions of Bayesian optimization, with experiments---active user modelling with preferences, and hierarchical reinforcement learning---and a discussion of the pros and cons of Bayesian optimization based on our experiences

arXiv.org e-Print Archive

CiteSeerX

Oxford University Research Archive