Projected BNNs: Avoiding weight-space pathologies by learning latent representations of neural network weights
As machine learning systems are increasingly adopted for high-stakes decisions,
quantifying uncertainty over predictions becomes crucial. While modern neural
networks are making remarkable gains in terms of predictive accuracy,
characterizing uncertainty over the parameters of these models is challenging
because of the high dimensionality and complex correlations of the network
parameter space. This paper introduces a novel variational inference framework
for Bayesian neural networks that (1) encodes complex distributions in
high-dimensional parameter space with representations in a low-dimensional
latent space, and (2) performs inference efficiently on the low-dimensional
representations. Across a large array of synthetic and real-world datasets, we
show that our method improves uncertainty characterization and model
generalization when compared with methods that work directly in the parameter
space.
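The core idea lends itself to a compact illustration. Below is a minimal, hypothetical sketch (PyTorch, not the paper's code) of inference in a low-dimensional latent space: a variational posterior q(z) is kept over an 8-dimensional latent vector, and a small learned decoder projects each latent sample into the full 49-dimensional weight vector of a toy 1-16-1 regression network. All sizes, names, and the Gaussian likelihood are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Hypothetical sketch: variational inference over a low-dimensional latent z
# that a learned decoder projects into the full weight vector of a toy
# 1-16-1 regression MLP. All sizes and names are illustrative assumptions.

LATENT_DIM = 8                            # low-dimensional latent space
WEIGHT_DIM = 16 + 16 + 16 + 1             # flat weights of the 1-16-1 MLP

decoder = nn.Sequential(                  # projection: latent z -> weights w
    nn.Linear(LATENT_DIM, 64), nn.Tanh(), nn.Linear(64, WEIGHT_DIM),
)

# Variational posterior q(z) = N(mu, diag(sigma^2)) over the latent space.
mu = torch.zeros(LATENT_DIM, requires_grad=True)
log_sigma = torch.zeros(LATENT_DIM, requires_grad=True)

def forward_with_weights(x, w):
    """Run the 1-16-1 MLP with weights unpacked from the flat vector w."""
    W1, b1 = w[:16].view(16, 1), w[16:32]
    W2, b2 = w[32:48].view(1, 16), w[48:49]
    h = torch.tanh(x @ W1.t() + b1)
    return h @ W2.t() + b2

def negative_elbo(x, y):
    """One-sample ELBO estimate; inference happens in latent space only."""
    z = mu + log_sigma.exp() * torch.randn(LATENT_DIM)   # reparameterized sample
    pred = forward_with_weights(x, decoder(z))
    nll = ((pred - y) ** 2).mean()                       # Gaussian likelihood
    kl = 0.5 * (mu**2 + (2 * log_sigma).exp() - 2 * log_sigma - 1).sum()  # KL(q || N(0,I))
    return nll + kl / x.shape[0]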
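```

Training mu, log_sigma, and the decoder jointly with any stochastic optimizer then performs approximate inference over 8 latent dimensions rather than over 49 correlated raw weights.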
Learning Consistent Deep Generative Models from Sparse Data via Prediction Constraints
We develop a new framework for learning variational autoencoders and other
deep generative models that balances generative and discriminative goals. Our
framework optimizes model parameters to maximize a variational lower bound on
the likelihood of observed data, subject to a task-specific prediction
constraint that prevents model misspecification from leading to inaccurate
predictions. We further enforce a consistency constraint, derived naturally
from the generative model, that requires predictions on reconstructed data to
match those on the original data. We show that these two contributions --
prediction constraints and consistency constraints -- lead to promising image
classification performance, especially in the semi-supervised scenario where
category labels are sparse but unlabeled data is plentiful. Our approach
enables advances in generative modeling to directly boost semi-supervised
classification performance, an ability we demonstrate by augmenting deep
generative models with latent variables capturing spatial transformations.
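As a rough sketch of how the two constraints can enter a training objective (an illustrative PyTorch reconstruction, not the authors' implementation), the loss below combines a standard VAE negative ELBO with a weighted prediction term on labeled examples and a consistency term that ties the classifier's output on reconstructions to its output on the original input. The modules `encoder`, `decoder`, `classifier` and the two constraint weights are assumed placeholders.

```python
import torch
import torch.nn.functional as F

# Illustrative weights for the two relaxed constraints (not from the paper).
LAMBDA_PRED = 10.0
LAMBDA_CONSIST = 1.0

def pc_vae_loss(x, y, encoder, decoder, classifier, labeled):
    """Negative ELBO plus prediction and consistency penalties (sketch)."""
    mu, log_var = encoder(x)                                # q(z | x)
    z = mu + (0.5 * log_var).exp() * torch.randn_like(mu)   # reparameterize
    x_hat = decoder(z)

    recon = F.mse_loss(x_hat, x)                            # -E[log p(x|z)], up to constants
    kl = -0.5 * (1 + log_var - mu.pow(2) - log_var.exp()).sum(dim=-1).mean()
    loss = recon + kl                                       # negative ELBO

    logits = classifier(x)
    if labeled:                                             # prediction constraint
        loss = loss + LAMBDA_PRED * F.cross_entropy(logits, y)

    # Consistency constraint: predictions on the reconstruction x_hat
    # should match the (detached) predictions on the original x.
    p_orig = F.softmax(logits, dim=-1).detach()
    log_p_hat = F.log_softmax(classifier(x_hat), dim=-1)
    loss = loss + LAMBDA_CONSIST * F.kl_div(log_p_hat, p_orig, reduction="batchmean")
    return loss
```

In the semi-supervised setting this suggests, the prediction term applies only to labeled minibatches, while the ELBO and consistency terms use all data.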
Visual Interaction with Deep Learning Models through Collaborative Semantic Inference
Automation of tasks can have critical consequences when humans lose agency
over decision processes. Deep learning models are particularly susceptible
since current black-box approaches lack explainable reasoning. We argue that
both the visual interface and the model structure of deep learning systems need
to take interaction design into account. We propose a framework of
collaborative semantic inference (CSI) for the co-design of interactions and
models to enable visual collaboration between humans and algorithms. The
approach exposes the intermediate reasoning process of models, allowing
semantic interactions with the visual metaphors of a problem so that a user can
both understand and control parts of the model's reasoning process. We demonstrate the
feasibility of CSI with a co-designed case study of a document summarization
system.
Comment: IEEE VIS 2019 (VAST)
POPCORN: Partially Observed Prediction COnstrained ReiNforcement Learning
Many medical decision-making tasks can be framed as partially observed Markov
decision processes (POMDPs). However, prevailing two-stage approaches that
first learn a POMDP and then solve it often fail because the model that best
fits the data may not be well suited for planning. We introduce a new
optimization objective that (a) produces both high-performing policies and
high-quality generative models, even when some observations are irrelevant for
planning, and (b) does so in batch off-policy settings that are typical in
healthcare, where only retrospective data is available. We demonstrate our
approach on synthetic examples and a challenging medical decision-making
problem.
Comment: Accepted to AISTATS 2020, Palermo, Italy
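A skeletal sketch of the kind of joint objective described here, in Python. Every helper below is an unimplemented placeholder standing in for a design choice the abstract leaves open; none of this is the paper's actual code.

```python
# Skeletal sketch of a prediction-constrained POMDP objective.

LAMBDA = 100.0   # illustrative trade-off between data fit and policy value

def log_likelihood(model, trajectory):
    """Placeholder: sequence log-likelihood under the learned POMDP."""
    raise NotImplementedError

def plan(model):
    """Placeholder: solve the learned POMDP for a policy
    (e.g. with point-based value iteration)."""
    raise NotImplementedError

def off_policy_value(policy, trajectories):
    """Placeholder: batch off-policy estimate of the policy's return,
    e.g. weighted importance sampling over retrospective trajectories."""
    raise NotImplementedError

def popcorn_objective(model, trajectories):
    # Score the model on BOTH generative fit and the value of the policy
    # it induces, so observations irrelevant for planning cannot dominate.
    fit = sum(log_likelihood(model, t) for t in trajectories)
    value = off_policy_value(plan(model), trajectories)
    return fit + LAMBDA * value   # maximize jointly
```

The single weight LAMBDA trades generative fit against policy quality; how the likelihood, planner, and off-policy estimator are realized is left open in this sketch.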