81,048 research outputs found
Iterative Random Forests to detect predictive and stable high-order interactions
Genomics has revolutionized biology, enabling the interrogation of whole
transcriptomes, genome-wide binding sites for proteins, and many other
molecular processes. However, individual genomic assays measure elements that
interact in vivo as components of larger molecular machines. Understanding how
these high-order interactions drive gene expression presents a substantial
statistical challenge. Building on Random Forests (RF), Random Intersection
Trees (RITs), and through extensive, biologically inspired simulations, we
developed the iterative Random Forest algorithm (iRF). iRF trains a
feature-weighted ensemble of decision trees to detect stable, high-order
interactions with the same order of computational cost as RF. We demonstrate the
utility of iRF for high-order interaction discovery in two prediction problems:
enhancer activity in the early Drosophila embryo and alternative splicing of
primary transcripts in human-derived cell lines. In Drosophila, among the 20
pairwise transcription factor interactions iRF identifies as stable (returned
in more than half of bootstrap replicates), 80% have been previously reported
as physical interactions. Moreover, novel third-order interactions, e.g.
between Zelda (Zld), Giant (Gt), and Twist (Twi), suggest high-order
relationships that are candidates for follow-up experiments. In human-derived
cells, iRF re-discovered a central role of H3K36me3 in chromatin-mediated
splicing regulation, and identified novel 5th and 6th order interactions,
indicative of multi-valent nucleosomes with specific roles in splicing
regulation. By decoupling the order of interactions from the computational cost
of identification, iRF opens new avenues of inquiry into the molecular
mechanisms underlying genome biology.
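The iterative, feature-weighted refitting at the core of iRF can be sketched as follows. This is a minimal illustration only: iRF proper weights the candidate features considered at each tree split, which scikit-learn does not expose, so this sketch approximates it by resampling feature columns in proportion to the previous iteration's Gini importances. All names and parameters here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=300, n_features=20, n_informative=4,
                           random_state=0)

# start from uniform feature weights, then iterate: fit RF, read importances,
# reweight features, refit (the "iterative" part of iRF)
weights = np.full(X.shape[1], 1.0 / X.shape[1])
for _ in range(3):
    # emulate feature weighting by sampling columns proportional to weight
    cols = rng.choice(X.shape[1], size=X.shape[1], p=weights)
    rf = RandomForestClassifier(n_estimators=100, random_state=0)
    rf.fit(X[:, cols], y)
    # map importances of the resampled columns back onto original features
    imp = np.zeros(X.shape[1])
    np.add.at(imp, cols, rf.feature_importances_)
    weights = imp / imp.sum()

# the highest-weight features are the candidates iRF would pass to
# Random Intersection Trees for high-order interaction search
top = np.argsort(weights)[::-1][:5]
```

In iRF, a Random Intersection Trees pass over the decision paths of the final forest then recovers the stable high-order interactions; the loop above only produces the stabilised feature weighting that makes that search tractable.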
VAE with a VampPrior
Many different methods to train deep generative models have been introduced
in the past. In this paper, we propose to extend the variational auto-encoder
(VAE) framework with a new type of prior which we call "Variational Mixture of
Posteriors" prior, or VampPrior for short. The VampPrior consists of a mixture
distribution (e.g., a mixture of Gaussians) with components given by
variational posteriors conditioned on learnable pseudo-inputs. We further
extend this prior to a two-layer hierarchical model and show that this
architecture with a coupled prior and posterior learns significantly better
models. The model also avoids the usual local optima issues related to useless
latent dimensions that plague VAEs. We provide empirical studies on six
datasets, namely, static and dynamic MNIST, OMNIGLOT, Caltech 101 Silhouettes,
Frey Faces and Histopathology patches, and show that applying the hierarchical
VampPrior delivers state-of-the-art results on all datasets in the unsupervised
permutation invariant setting, and the best results or results comparable to
SOTA methods for the approach with convolutional networks.
Comment: 16 pages, final version, AISTATS 201
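The VampPrior itself is a uniform mixture of the variational posteriors evaluated at K learnable pseudo-inputs, p(z) = (1/K) Σ_k q(z | u_k). A minimal PyTorch sketch of that density, assuming a Gaussian encoder; the toy encoder, latent size, and input dimension below are illustrative assumptions, not the paper's architecture:

```python
import math
import torch
import torch.nn as nn

class VampPrior(nn.Module):
    """Uniform mixture of variational posteriors at learnable pseudo-inputs."""
    def __init__(self, encoder, n_pseudo=50, input_dim=784):
        super().__init__()
        self.encoder = encoder  # maps x -> (mean, logvar) of q(z|x)
        # pseudo-inputs u_k are free parameters, trained with the rest of the VAE
        self.pseudo_inputs = nn.Parameter(torch.randn(n_pseudo, input_dim) * 0.01)

    def log_prob(self, z):
        mean, logvar = self.encoder(self.pseudo_inputs)   # each (K, D)
        z = z.unsqueeze(1)                                # (B, 1, D)
        # diagonal-Gaussian log density of z under each of the K components
        log_q = -0.5 * ((z - mean) ** 2 / logvar.exp() + logvar
                        + math.log(2 * math.pi)).sum(-1)  # (B, K)
        # uniform mixture: logsumexp over components minus log K
        return torch.logsumexp(log_q, dim=1) - math.log(log_q.shape[1])

# toy encoder: every input maps to a standard-normal posterior in 2-D
enc = lambda x: (x @ torch.zeros(x.shape[1], 2), torch.zeros(x.shape[0], 2))
prior = VampPrior(enc)
```

During training, `log_prob` replaces the standard-normal prior term in the ELBO, so gradients flow into both the encoder and the pseudo-inputs.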
Improving Task-Parameterised Movement Learning Generalisation with Frame-Weighted Trajectory Generation
Learning from Demonstration depends on a robot learner generalising its
learned model to unseen conditions, as it is not feasible for a person to
provide a demonstration set that accounts for all possible variations in
non-trivial tasks. While there are many learning methods that can handle
interpolation of observed data effectively, extrapolation from observed data
offers a much greater challenge. To address this problem of generalisation,
this paper proposes a modified Task-Parameterised Gaussian Mixture Regression
method that considers the relevance of task parameters during trajectory
generation, as determined by variance in the data. The benefits of the proposed
method are first explored using a simulated reaching task data set. Here it is
shown that the proposed method offers far-reaching, low-error extrapolation
abilities that are different in nature to existing learning methods. Data
collected from novice users for a real-world manipulation task is then
considered, where it is shown that the proposed method is able to effectively
reduce grasping performance errors and extrapolate to unseen
grasp targets under real-world conditions. These results indicate the proposed
method serves to benefit novice users by placing less reliance on the user to
provide high-quality demonstration data sets.
Comment: 8 pages, 6 figures, submitted to 2019 IEEE/RSJ International
Conference on Intelligent Robots and Systems (IROS)
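The frame-weighting idea can be sketched as a precision-weighted fusion step. In task-parameterised models, each task frame contributes a Gaussian prediction for the next trajectory point, and the standard TP-GMM fusion takes their product; the modification described above additionally down-weights frames whose demonstrations were highly variable. The exact weighting scheme below (inverse demonstration variance, normalised) is an illustrative assumption, not the paper's formula.

```python
import numpy as np

def fuse_frames(means, covs, frame_vars):
    """Fuse per-frame Gaussian predictions, weighting by frame relevance.

    means      : list of (D,) predicted means, one per task frame
    covs       : list of (D, D) predicted covariances, one per frame
    frame_vars : per-frame scalar demonstration variance (low = consistent)
    """
    # relevance weights: consistent frames (low variance) dominate
    w = 1.0 / np.asarray(frame_vars, dtype=float)
    w = w / w.sum()
    # weighted product of Gaussians: scale each precision by its frame weight
    precisions = [wi * np.linalg.inv(c) for wi, c in zip(w, covs)]
    P = np.sum(precisions, axis=0)
    cov = np.linalg.inv(P)
    mean = cov @ np.sum([pi @ m for pi, m in zip(precisions, means)], axis=0)
    return mean, cov

# two frames disagree; the low-variance frame pulls the fused point toward it
mean, cov = fuse_frames([np.zeros(2), np.ones(2)],
                        [np.eye(2), np.eye(2)],
                        frame_vars=[0.01, 1.0])
```

Because irrelevant frames get near-zero precision, the fused trajectory follows the frames that actually constrained the demonstrations, which is what enables extrapolation beyond the demonstrated region.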
Augment-and-Conquer Negative Binomial Processes
By developing data augmentation methods unique to the negative binomial (NB)
distribution, we unite seemingly disjoint count and mixture models under the NB
process framework. We develop fundamental properties of the models and derive
efficient Gibbs sampling inference. We show that the gamma-NB process can be
reduced to the hierarchical Dirichlet process with normalization, highlighting
its unique theoretical, structural and computational advantages. A variety of
NB processes with distinct sharing mechanisms are constructed and applied to
topic modeling, with connections to existing algorithms, showing the importance
of inferring both the NB dispersion and probability parameters.
Comment: Neural Information Processing Systems, NIPS 201
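One relationship these augmentation schemes build on is that a negative binomial count is a gamma-Poisson mixture: m ~ NB(r, p) can be drawn as λ ~ Gamma(r, p/(1-p)), m ~ Poisson(λ). A quick numerical check of that equivalence (parameter values chosen for illustration; note NumPy parameterises `negative_binomial` by the success probability 1-p):

```python
import numpy as np

rng = np.random.default_rng(1)
r, p, n = 3.0, 0.4, 200_000

# direct NB draws; NumPy's second argument is the success probability 1-p
direct = rng.negative_binomial(r, 1 - p, size=n)

# gamma-Poisson mixture: mix a Poisson rate through a gamma distribution
lam = rng.gamma(shape=r, scale=p / (1 - p), size=n)
mixed = rng.poisson(lam)

# both samplers target the same NB distribution with mean r*p/(1-p) = 2.0
```

It is this kind of conditional conjugacy (gamma for the rate, and analogous augmentations for the dispersion r) that yields the closed-form Gibbs updates the abstract refers to.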