Active Collaborative Filtering
Collaborative filtering (CF) allows the preferences of multiple users to be
pooled to make recommendations regarding unseen products. We consider in this
paper the problem of online and interactive CF: given the current ratings
associated with a user, what queries (new ratings) would most improve the
quality of the recommendations made? We cast this in terms of expected value of
information (EVOI); but the online computational cost of computing optimal
queries is prohibitive. We show how offline prototyping and computation of
bounds on EVOI can be used to dramatically reduce the required online
computation. The framework we develop is general, but we focus on derivations
and empirical study in the specific case of the multiple-cause vector
quantization model.
Comment: Appears in Proceedings of the Nineteenth Conference on Uncertainty in
Artificial Intelligence (UAI2003
Out-of-Sample Extension for Dimensionality Reduction of Noisy Time Series
This paper proposes an out-of-sample extension framework for a global
manifold learning algorithm (Isomap) that uses temporal information in
out-of-sample points in order to make the embedding more robust to noise and
artifacts. Given a set of noise-free training data and its embedding, the
proposed framework extends the embedding for a noisy time series. This is
achieved by adding a spatio-temporal compactness term to the optimization
objective of the embedding. To the best of our knowledge, this is the first
method for out-of-sample extension of manifold embeddings that leverages timing
information available for the extension set. Experimental results demonstrate
that our out-of-sample extension algorithm renders a more robust and accurate
embedding of sequentially ordered image data in the presence of various noise
and artifacts when compared to other timing-aware embeddings. Additionally, we
show that an out-of-sample extension framework based on the proposed algorithm
outperforms the state of the art in eye-gaze estimation
Generalized Bayesian Posterior Expectation Distillation for Deep Neural Networks
In this paper, we present a general framework for distilling expectations
with respect to the Bayesian posterior distribution of a deep neural network
classifier, extending prior work on the Bayesian Dark Knowledge framework. The
proposed framework takes as input "teacher" and student model architectures and
a general posterior expectation of interest. The distillation method performs
an online compression of the selected posterior expectation using iteratively
generated Monte Carlo samples. We focus on the posterior predictive
distribution and expected entropy as distillation targets. We investigate
several aspects of this framework including the impact of uncertainty and the
choice of student model architecture. We study methods for student model
architecture search from a speed-storage-accuracy perspective and evaluate
down-stream tasks leveraging entropy distillation including uncertainty ranking
and out-of-distribution detection.
Comment: Accepted at UAI '2
Learning Shallow Detection Cascades for Wearable Sensor-Based Mobile Health Applications
The field of mobile health aims to leverage recent advances in wearable
on-body sensing technology and smart phone computing capabilities to develop
systems that can monitor health states and deliver just-in-time adaptive
interventions. However, existing work has largely focused on analyzing
collected data in the off-line setting. In this paper, we propose a novel
approach to learning shallow detection cascades developed explicitly for use in
real-time wearable-phone or wearable-phone-cloud systems. We apply our
approach to the problem of cigarette smoking detection from a combination of
wrist-worn actigraphy data and respiration chest band data using two- and
three-stage cascades
Integrating Propositional and Relational Label Side Information for Hierarchical Zero-Shot Image Classification
Zero-shot learning (ZSL) is one of the most extreme forms of learning from
scarce labeled data. It enables predicting that images belong to classes for
which no labeled training instances are available. In this paper, we present a
new ZSL framework that leverages both label attribute side information and a
semantic label hierarchy. We present two methods, lifted zero-shot prediction
and a custom conditional random field (CRF) model, that integrate both forms of
side information. We propose benchmark tasks for this framework that focus on
making predictions across a range of semantic levels. We show that lifted
zero-shot prediction can dramatically outperform baseline methods when making
predictions within specified semantic levels, and that the probability
distribution provided by the CRF model can be leveraged to yield further
performance improvements when making unconstrained predictions over the
hierarchy
Integrating Physiological Time Series and Clinical Notes with Deep Learning for Improved ICU Mortality Prediction
Intensive Care Unit Electronic Health Records (ICU EHRs) store multimodal
data about patients including clinical notes, sparse and irregularly sampled
physiological time series, lab results, and more. To date, most methods
designed to learn predictive models from ICU EHR data have focused on a single
modality. In this paper, we leverage the recently proposed
interpolation-prediction deep learning architecture (Shukla and Marlin 2019) as
a basis for exploring how physiological time series data and clinical notes can
be integrated into a unified mortality prediction model. We study both early
and late fusion approaches and demonstrate how the relative predictive value of
clinical text and physiological data changes over time. Our results show that a
late fusion approach can provide a statistically significant improvement in
mortality prediction performance over using individual modalities in isolation.
Comment: Presented at ACM Conference on Health, Inference and Learning
(Workshop Track), 202
A Survey on Principles, Models and Methods for Learning from Irregularly Sampled Time Series
Irregularly sampled time series data arise naturally in many application
domains including biology, ecology, climate science, astronomy, and health.
Such data represent fundamental challenges to many classical models from
machine learning and statistics due to the presence of non-uniform intervals
between observations. However, there has been significant progress within the
machine learning community over the last decade on developing specialized
models and architectures for learning from irregularly sampled univariate and
multivariate time series data. In this survey, we first describe several axes
along which approaches to learning from irregularly sampled time series differ
including what data representations they are based on, what modeling primitives
they leverage to deal with the fundamental problem of irregular sampling, and
what inference tasks they are designed to perform. We then survey the recent
literature organized primarily along the axis of modeling primitives. We
describe approaches based on temporal discretization, interpolation,
recurrence, attention and structural invariance. We discuss similarities and
differences between approaches and highlight primary strengths and weaknesses.
Comment: Presented at NeurIPS 2020 Workshop: ML Retrospectives, Surveys &
Meta-Analyses (ML-RSA
Group Sparse Priors for Covariance Estimation
Recently it has become popular to learn sparse Gaussian graphical models
(GGMs) by imposing l1 or group l1,2 penalties on the elements of the precision
matrix. This penalized likelihood approach results in a tractable convex
optimization problem. In this paper, we reinterpret these results as performing
MAP estimation under a novel prior which we call the group l1 and l1,2
positive definite matrix distributions. This enables us to build a hierarchical
model in which the l1 regularization terms vary depending on which group the
entries are assigned to, which in turn allows us to learn block structured
sparse GGMs with unknown group assignments. Exact inference in this
hierarchical model is intractable, due to the need to compute the normalization
constant of these matrix distributions. However, we derive upper bounds on the
partition functions, which lets us use fast variational inference (optimizing a
lower bound on the joint posterior). We show that on two real world data sets
(motion capture and financial data), our method which infers the block
structure outperforms a method that uses a fixed block structure, which in turn
outperforms baseline methods that ignore block structure.
Comment: Appears in Proceedings of the Twenty-Fifth Conference on Uncertainty
in Artificial Intelligence (UAI2009
Modeling Irregularly Sampled Clinical Time Series
While the volume of electronic health records (EHR) data continues to grow,
it remains rare for hospital systems to capture dense physiological data
streams, even in the data-rich intensive care unit setting. Instead, typical
EHR records consist of sparse and irregularly observed multivariate time
series, which are well understood to present particularly challenging problems
for machine learning methods. In this paper, we present a new deep learning
architecture for addressing this problem based on the use of a semi-parametric
interpolation network followed by the application of a prediction network. The
interpolation network allows for information to be shared across multiple
dimensions during the interpolation stage, while any standard deep learning
model can be used for the prediction network. We investigate the performance of
this architecture on the problems of mortality and length of stay prediction.
Comment: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018
arXiv:cs/010120
Interpolation-Prediction Networks for Irregularly Sampled Time Series
In this paper, we present a new deep learning architecture for addressing the
problem of supervised learning with sparse and irregularly sampled multivariate
time series. The architecture is based on the use of a semi-parametric
interpolation network followed by the application of a prediction network. The
interpolation network allows for information to be shared across multiple
dimensions of a multivariate time series during the interpolation stage, while
any standard deep learning model can be used for the prediction network. This
work is motivated by the analysis of physiological time series data in
electronic health records, which are sparse, irregularly sampled, and
multivariate. We investigate the performance of this architecture on both
classification and regression tasks, showing that our approach outperforms a
range of baseline and recently proposed models.
Comment: International Conference on Learning Representations. arXiv admin
note: substantial text overlap with arXiv:1812.0053