Automatic Bayesian Density Analysis
Making sense of a dataset in an automatic and unsupervised fashion is a
challenging problem in statistics and AI. Classical approaches for exploratory
data analysis are usually not flexible enough to deal with the uncertainty
inherent to real-world data: they are often restricted to fixed latent
interaction models and homogeneous likelihoods; they are sensitive to missing,
corrupt and anomalous data; moreover, their expressiveness generally comes at
the price of intractable inference. As a result, supervision from statisticians
is usually needed to find the right model for the data. However, since domain
experts are not necessarily also experts in statistics, we propose Automatic
Bayesian Density Analysis (ABDA) to make exploratory data analysis accessible
at large. Specifically, ABDA allows for automatic and efficient missing value
estimation, statistical data type and likelihood discovery, anomaly detection
and dependency structure mining, on top of providing accurate density
estimation. Extensive empirical evidence shows that ABDA is a suitable tool for
automatic exploratory analysis of mixed continuous and discrete tabular data.
Comment: In proceedings of the Thirty-Third AAAI Conference on Artificial
Intelligence (AAAI-19)
Leveraging Node Attributes for Incomplete Relational Data
Relational data are usually highly incomplete in practice, which inspires us
to leverage side information to improve the performance of community detection
and link prediction. This paper presents a Bayesian probabilistic approach that
incorporates various kinds of node attributes encoded in binary form in
relational models with Poisson likelihood. Our method works flexibly with both
directed and undirected relational networks. The inference can be done by
efficient Gibbs sampling which leverages sparsity of both networks and node
attributes. Extensive experiments show that our models achieve
state-of-the-art link prediction results, especially with highly incomplete
relational data.
Comment: Appearing in ICML 201
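To make the idea of a Poisson-likelihood relational model with binary node attributes concrete, here is a minimal sketch. It is not the paper's model: the multiplicative attribute weights, the function names, and the use of P(count > 0) as the link probability are all assumptions chosen for illustration, and the example covers only the undirected case.

```python
import math

def node_factors(base, attrs, weights):
    """Scale a node's base community memberships by its active binary attributes.

    base    -- K positive community memberships for the node
    attrs   -- binary attribute flags (0 or 1), one per attribute
    weights -- hypothetical positive multipliers, one row of K values per attribute
    """
    factors = list(base)
    for a, flag in enumerate(attrs):
        if flag:
            factors = [f * w for f, w in zip(factors, weights[a])]
    return factors

def poisson_rate(phi_i, phi_j):
    """Rate of the Poisson interaction count between two nodes (symmetric)."""
    return sum(a * b for a, b in zip(phi_i, phi_j))

def link_probability(rate):
    """Probability of at least one interaction when the count is Poisson(rate)."""
    return 1.0 - math.exp(-rate)

# Two communities, two binary attributes (all values are made up).
weights = [[2.0, 1.0], [1.0, 0.5]]
phi_a = node_factors([0.5, 0.1], [1, 0], weights)
phi_b = node_factors([0.4, 0.2], [1, 1], weights)
rate = poisson_rate(phi_a, phi_b)
print(rate, link_probability(rate))
```

In this sketch, nodes that share an active attribute get larger memberships in the communities that attribute favors, which raises the predicted rate between them even when few of their links are observed.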
Modeling Individual Cyclic Variation in Human Behavior
Cycles are fundamental to human health and behavior. However, modeling cycles
in time series data is challenging because in most cases the cycles are not
labeled or directly observed and need to be inferred from multidimensional
measurements taken over time. Here, we present CyHMMs, a cyclic hidden Markov
model method for detecting and modeling cycles in a collection of
multidimensional heterogeneous time series data. In contrast to previous cycle
modeling methods, CyHMMs deal with a number of challenges encountered in
modeling real-world cycles: they can model multivariate data with discrete and
continuous dimensions; they explicitly model and are robust to missing data;
and they can share information across individuals to model variation both
within and between individual time series. Experiments on synthetic and
real-world health-tracking data demonstrate that CyHMMs infer cycle lengths
more accurately than existing methods, with 58% lower error on simulated data
and 63% lower error on real-world data compared to the best-performing
baseline. CyHMMs can also perform functions which baselines cannot: they can
model the progression of individual features/symptoms over the course of the
cycle, identify the most variable features, and cluster individual time series
into groups with distinct characteristics. Applying CyHMMs to two real-world
health-tracking datasets -- of menstrual cycle symptoms and physical activity
tracking data -- yields important insights including which symptoms to expect
at each point during the cycle. We also find that people fall into several
groups with distinct cycle patterns, and that these groups differ along
dimensions not provided to the model. For example, by modeling missing data in
the menstrual cycles dataset, we are able to discover a medically relevant
group of birth control users even though information on birth control is not
given to the model.
Comment: Accepted at WWW 201
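The abstract does not spell out the CyHMM state structure, so the following is only a generic sketch of what "cyclic" can mean for a hidden Markov model: hidden states arranged in a ring, where each state either persists or advances to the next one. The function names and the single shared `p_stay` parameter are assumptions; emissions, missing data, and sharing across individuals are omitted.

```python
import random

def cyclic_transition_matrix(n_states, p_stay):
    """Ring-shaped transitions: stay with p_stay, else advance to the next state."""
    advance = 1.0 - p_stay
    matrix = [[0.0] * n_states for _ in range(n_states)]
    for s in range(n_states):
        matrix[s][s] = p_stay
        matrix[s][(s + 1) % n_states] += advance  # last state wraps to state 0
    return matrix

def expected_cycle_length(n_states, p_stay):
    """Each state's dwell time is geometric with mean 1 / (1 - p_stay)."""
    return n_states / (1.0 - p_stay)

def sample_states(matrix, n_steps, rng):
    """Sample a hidden-state path starting from state 0."""
    n_states = len(matrix)
    path = [0]
    for _ in range(n_steps - 1):
        row = matrix[path[-1]]
        path.append(rng.choices(range(n_states), weights=row)[0])
    return path

matrix = cyclic_transition_matrix(4, 0.75)
print(expected_cycle_length(4, 0.75))
print(sample_states(matrix, 20, random.Random(0)))
```

Under these assumptions, cycle length is controlled by the dwell probabilities, so inferring a cycle length from unlabeled data amounts to estimating the transition parameters.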