Deep EHR: A Survey of Recent Advances in Deep Learning Techniques for Electronic Health Record (EHR) Analysis
The past decade has seen an explosion in the amount of digital information
stored in electronic health records (EHR). While primarily designed for
archiving patient clinical information and administrative healthcare tasks,
many researchers have found secondary use of these records for various clinical
informatics tasks. Over the same period, the machine learning community has
seen widespread advances in deep learning techniques, which also have been
successfully applied to the vast amount of EHR data. In this paper, we review
these deep EHR systems, examining architectures, technical aspects, and
clinical applications. We also identify shortcomings of current techniques and
discuss avenues of future research for EHR-based deep learning.
Comment: Accepted for publication in the Journal of Biomedical and Health
Informatics: http://ieeexplore.ieee.org/abstract/document/8086133
Bayesian nonparametric sparse VAR models
High dimensional vector autoregressive (VAR) models require a large number of
parameters to be estimated and may suffer from inferential problems. We propose a
new Bayesian nonparametric (BNP) Lasso prior (BNP-Lasso) for high-dimensional
VAR models that can improve estimation efficiency and prediction accuracy. Our
hierarchical prior overcomes overparametrization and overfitting issues by
clustering the VAR coefficients into groups and by shrinking the coefficients
of each group toward a common location. Clustering and shrinking effects
induced by the BNP-Lasso prior are well suited for the extraction of causal
networks from time series, since they account for some stylized facts in
real-world networks, namely sparsity, community structures and
heterogeneity in edge intensities. In order to fully capture the richness of
the data and to achieve a better understanding of financial and macroeconomic
risk, it is therefore crucial that the model used to extract the network
accounts for these stylized facts.
Comment: Forthcoming in the Journal of Econometrics. Revised version of the
paper "Bayesian nonparametric Seemingly Unrelated Regression Models".
Supplementary material available on request.
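The clustering-and-shrinking mechanism can be illustrated with a crude frequentist analogue, not the paper's BNP-Lasso prior: estimate a VAR(1) by OLS, group the coefficients with a simple one-dimensional k-means, and pull each coefficient toward its group's common location. The sizes, seed and shrinkage weight below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a small, stable VAR(1), x_t = A x_{t-1} + e_t, whose
# coefficients fall into a few groups (a "community" structure).
k, T = 6, 4000
A_true = np.zeros((k, k))
A_true[:3, :3] = 0.15          # one group of positive links
A_true[3:, 3:] = -0.10         # a second group with a common negative value
X = np.zeros((T, k))
for t in range(1, T):
    X[t] = A_true @ X[t - 1] + 0.1 * rng.standard_normal(k)

# Ordinary least squares estimate of the transition matrix
Y, Z = X[1:], X[:-1]
A_ols = np.linalg.lstsq(Z, Y, rcond=None)[0].T

# Crude analogue of clustering + shrinkage: 1-D k-means on the
# coefficient values, then pull each one toward its group mean.
coefs = A_ols.ravel()
centers = np.array([coefs.min(), 0.0, coefs.max()])  # three tentative groups
for _ in range(50):
    labels = np.argmin(np.abs(coefs[:, None] - centers[None, :]), axis=1)
    for j in range(3):
        if np.any(labels == j):
            centers[j] = coefs[labels == j].mean()
shrink = 0.8                   # pull 80% of the way to the group location
A_shrunk = ((1 - shrink) * coefs + shrink * centers[labels]).reshape(k, k)

err_ols = np.abs(A_ols - A_true).mean()
err_shrunk = np.abs(A_shrunk - A_true).mean()
```

On this toy example the group-shrunk estimate improves on plain OLS because the group means pool information across coefficients, which is the intuition behind the hierarchical prior.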
Automatic structure estimation of predictive models for symptom development
Online mental health treatment promises to meet the increasing demand
for mental health treatment at a lower cost than traditional treatment.
However, online treatment suffers from high drop-out rates, which might negate
its cost-effectiveness. Predictive models might aid in early identification
of deviating clients, allowing them to be targeted directly to prevent drop-out
and improve treatment outcomes. We propose a two-stage multi-objective
optimization process to automatically infer model structures based on
ecological momentary assessment for prediction of future symptom development.
The proposed multi-objective optimization approach results in a temporal-causal
network model with the best prediction performance for each concept. This
allows for the selection of a disorder-specific model structure based on the
envisioned field of application.
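The two-objective structure search can be sketched with a toy stand-in: candidate structures are masks over a linear autoregressive predictor (not the paper's temporal-causal network models), scored on prediction error and complexity, and the non-dominated ones form the Pareto front. The data and candidates below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)

# Toy time series with two genuine temporal influences (lags 1 and 2)
x = np.zeros(300)
for t in range(2, 300):
    x[t] = 0.5 * x[t - 1] + 0.3 * x[t - 2] + 0.1 * rng.standard_normal()

# Candidate structures: which lags a model is allowed to use
candidates = [(1, 0), (0, 1), (1, 1)]
scored = []
for mask in candidates:
    Z = np.column_stack([x[1:-1] * mask[0], x[:-2] * mask[1]])
    y = x[2:]
    coef = np.linalg.lstsq(Z, y, rcond=None)[0]   # fit the masked model
    err = np.mean((y - Z @ coef) ** 2)            # objective 1: prediction error
    scored.append((err, sum(mask)))               # objective 2: complexity

# Keep the Pareto front: structures not dominated in (error, complexity)
pareto = [a for a in scored
          if not any(b[0] <= a[0] and b[1] <= a[1] and b != a for b in scored)]
```

The front typically contains both the full model (lowest error) and a simpler one-lag model, and the final choice between them depends on the envisioned application, mirroring the selection step described above.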
Regularized estimation in sparse high-dimensional time series models
Many scientific and economic problems involve the analysis of
high-dimensional time series datasets. However, theoretical studies in
high-dimensional statistics to date rely primarily on the assumption of
independent and identically distributed (i.i.d.) samples. In this work, we
focus on stable Gaussian processes and investigate the theoretical properties
of ℓ1-regularized estimates in two important statistical problems in the
context of high-dimensional time series: (a) stochastic regression with
serially correlated errors and (b) transition matrix estimation in vector
autoregressive (VAR) models. We derive nonasymptotic upper bounds on the
estimation errors of the regularized estimates and establish that consistent
estimation under high-dimensional scaling is possible via
ℓ1-regularization for a large class of stable processes under sparsity
constraints. A key technical contribution of the work is to introduce a measure
of stability for stationary processes using their spectral properties that
provides insight into the effect of dependence on the accuracy of the
regularized estimates. With this proposed stability measure, we establish some
useful deviation bounds for dependent data, which can be used to study several
important regularized estimates in a time series setting.
Comment: Published at http://dx.doi.org/10.1214/15-AOS1315 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
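A minimal sketch of ℓ1-regularized transition matrix estimation for a VAR(1), using proximal gradient descent (ISTA) in numpy rather than the paper's theoretical machinery; the dimensions, penalty level and step rule are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate a sparse, stable VAR(1): x_t = A x_{t-1} + e_t,
# with support only on the diagonal of the transition matrix.
k, T = 10, 300
A_true = np.diag(np.full(k, 0.5))
X = np.zeros((T, k))
for t in range(1, T):
    X[t] = X[t - 1] @ A_true.T + rng.standard_normal(k)

Y, Z = X[1:], X[:-1]

def lasso_ista(Z, y, lam, n_iter=500):
    """Minimize (1/2n)||y - Zb||^2 + lam*||b||_1 by proximal gradient."""
    n, p = Z.shape
    step = n / np.linalg.norm(Z, 2) ** 2   # 1 / Lipschitz constant of the gradient
    b = np.zeros(p)
    for _ in range(n_iter):
        b = b - step * (Z.T @ (Z @ b - y) / n)
        b = np.sign(b) * np.maximum(np.abs(b) - step * lam, 0.0)  # soft-threshold
    return b

# Estimate the transition matrix row by row (one lasso per equation)
lam = 0.15
A_hat = np.vstack([lasso_ista(Z, Y[:, i], lam) for i in range(k)])

est_err = np.max(np.abs(A_hat - A_true))
sparsity = np.mean(A_hat == 0.0)
```

Despite the serial dependence in the rows of Z, the estimate recovers the sparse support and stays close to the truth, which is the qualitative content of the nonasymptotic bounds above.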
Connecting the Dots: Identifying Network Structure via Graph Signal Processing
Network topology inference is a prominent problem in Network Science. Most
graph signal processing (GSP) efforts to date assume that the underlying
network is known, and then analyze how the graph's algebraic and spectral
characteristics impact the properties of the graph signals of interest. Such an
assumption is often untenable beyond applications dealing with, e.g., directly
observable social and infrastructure networks; and typically adopted graph
construction schemes are largely informal, distinctly lacking an element of
validation. This tutorial offers an overview of graph learning methods
developed to bridge the aforementioned gap, by using information available from
graph signals to infer the underlying graph topology. Fairly mature statistical
approaches are surveyed first, where correlation analysis takes center stage
along with its connections to covariance selection and high-dimensional
regression for learning Gaussian graphical models. Recent GSP-based network
inference frameworks are also described, which postulate that the network
exists as a latent underlying structure, and that observations are generated as
a result of a network process defined in such a graph. A number of arguably
more nascent topics are also briefly outlined, including inference of dynamic
networks, nonlinear models of pairwise interaction, as well as extensions to
directed graphs and their relation to causal inference. All in all, this paper
introduces readers to challenges and opportunities for signal processing
research in emerging topic areas at the crossroads of modeling, prediction, and
control of complex behavior arising in networked systems that evolve over time.
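The "fairly mature statistical approaches" mentioned above can be sketched with covariance selection: if the signals follow a Gaussian graphical model whose precision matrix is supported on the graph, the topology can be read off an estimate of that precision matrix. Thresholding below stands in for a sparsity-promoting estimator such as the graphical lasso; the ring graph, filter strength and threshold are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Ground-truth graph: a 5-node ring (symmetric adjacency matrix)
n = 5
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
L = np.diag(A.sum(axis=1)) - A          # combinatorial graph Laplacian

# Gaussian graphical model: the precision matrix I + 2L is supported
# exactly on the graph, so signals have covariance (I + 2L)^{-1}.
Sigma = np.linalg.inv(np.eye(n) + 2.0 * L)
signals = np.linalg.cholesky(Sigma) @ rng.standard_normal((n, 5000))

# Covariance selection: estimate the precision matrix and read the
# edges off its off-diagonal support.
P = np.linalg.inv(np.cov(signals))
np.fill_diagonal(P, 0.0)
A_hat = (np.abs(P) > 1.0).astype(float)  # threshold is a tuning choice
```

With enough samples the support of the estimated precision matrix matches the ring exactly, illustrating why correlation-type analysis "takes center stage" as a baseline for topology inference.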
Sparse Bayesian vector autoregressions in huge dimensions
We develop a Bayesian vector autoregressive (VAR) model with multivariate
stochastic volatility that is capable of handling vast dimensional information
sets. Three features are introduced to permit reliable estimation of the model.
First, we assume that the reduced-form errors in the VAR feature a factor
stochastic volatility structure, allowing for conditional equation-by-equation
estimation. Second, we apply recently developed global-local shrinkage priors
to the VAR coefficients to cure the curse of dimensionality. Third, we utilize
recent innovations to efficiently sample from high-dimensional multivariate
Gaussian distributions. This makes simulation-based fully Bayesian inference
feasible when the dimensionality is large but the time series length is
moderate. We demonstrate the merits of our approach in an extensive simulation
study and apply the model to US macroeconomic data to evaluate its forecasting
capabilities.
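The conditional equation-by-equation strategy can be sketched in a much-simplified form: conditional on the volatility structure, each VAR equation is a separate regression, here given a conjugate normal shrinkage prior as a stand-in for the paper's global-local priors. The sizes and (assumed known) variances below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulate a stable VAR(1): x_t = A x_{t-1} + e_t
k, T = 8, 250
A_true = np.diag(np.full(k, 0.4))
X = np.zeros((T, k))
for t in range(1, T):
    X[t] = X[t - 1] @ A_true.T + rng.standard_normal(k)

Y, Z = X[1:], X[:-1]
sigma2, tau2 = 1.0, 0.1   # noise and prior variances (assumed known here)

# Equation-by-equation posterior mean under b_i ~ N(0, tau2 * I):
#   E[b_i | data] = (Z'Z + (sigma2/tau2) I)^{-1} Z' y_i
# Each of the k solves is independent, so the work parallelizes
# across equations even when k is large.
G = Z.T @ Z + (sigma2 / tau2) * np.eye(k)
A_hat = np.vstack([np.linalg.solve(G, Z.T @ Y[:, i]) for i in range(k)])
```

The point of the factorization in the paper is exactly this decoupling: the full system never has to be estimated jointly, which is what makes vast dimensional information sets tractable.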
Machine Learning Methods Economists Should Know About
We discuss the relevance of the recent Machine Learning (ML) literature for
economics and econometrics. First we discuss the differences in goals, methods
and settings between the ML literature and the traditional econometrics and
statistics literatures. Then we discuss some specific methods from the machine
learning literature that we view as important for empirical researchers in
economics. These include supervised learning methods for regression and
classification, unsupervised learning methods, as well as matrix completion
methods. Finally, we highlight newly developed methods at the intersection of
ML and econometrics, methods that typically perform better than either
off-the-shelf ML or more traditional econometric methods when applied to
particular classes of problems, problems that include causal inference for
average treatment effects, optimal policy estimation, and estimation of the
counterfactual effect of price changes in consumer choice models.
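Of the methods listed, matrix completion is easy to sketch concretely: fill in missing entries of an approximately low-rank matrix by alternating between a rank projection and data consistency (a hard-impute style scheme, simpler than the nuclear-norm methods usually cited; the sizes, rank and sampling rate are illustrative assumptions).

```python
import numpy as np

rng = np.random.default_rng(8)

# Rank-2 ground-truth matrix with ~60% of entries observed
n, m, r = 30, 20, 2
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, m))
mask = rng.random((n, m)) < 0.6          # True where an entry is observed

# Hard-impute: alternate a rank-r SVD projection with restoring
# the observed entries, starting from zeros in the missing slots.
X = np.where(mask, M, 0.0)
for _ in range(100):
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    X_low = (U[:, :r] * s[:r]) @ Vt[:r]  # best rank-r approximation
    X = np.where(mask, M, X_low)         # keep observed entries fixed

err = np.abs(X - M)[~mask].mean()        # error on the missing entries only
```

In economics applications (e.g., imputing counterfactual outcomes), the same idea applies with units as rows and time periods as columns.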
Lasso Guarantees for Time Series Estimation Under Subgaussian Tails and β-Mixing
Many theoretical results on estimation of high dimensional time series
require specifying an underlying data generating model (DGM). Instead,
following Wong et al. (2017), this paper relies only on (strict)
stationarity and a β-mixing condition to establish consistency of the lasso
when the data come from a β-mixing process with marginals having subgaussian
tails. Because of the general assumptions, the data can come from DGMs
different from standard time series models such as VAR or ARCH. When the true
DGM is not VAR, the lasso estimates correspond to those of the best linear
predictors using the past observations. We establish non-asymptotic
inequalities for estimation and prediction errors of the lasso estimates.
Together with Wong et al. (2017), we provide lasso guarantees that cover the full
spectrum of parameters in specifications of β-mixing subgaussian
time series. Applications of these results potentially extend to non-Gaussian,
non-Markovian and non-linear time series models, as the examples we provide
demonstrate. To prove our results, we derive a novel Hanson-Wright-type
concentration inequality for β-mixing subgaussian random vectors
that may be of independent interest.
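The claim that the lasso targets the best linear predictor when the DGM is not a VAR can be checked numerically: fit a (lightly penalized) lasso to data from a nonlinear AR(1) and compare it to the population best linear coefficient estimated from a long independent run. The particular nonlinearity and penalty below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

# A non-VAR data generating mechanism: a nonlinear AR(1),
#   x_t = 0.8 * tanh(x_{t-1}) + e_t
def simulate(T):
    x = np.zeros(T)
    for t in range(1, T):
        x[t] = 0.8 * np.tanh(x[t - 1]) + rng.standard_normal()
    return x

# One-predictor lasso has a closed form: soft-threshold the OLS slope
x = simulate(5000)
y, z = x[1:], x[:-1]
ols = (z @ y) / (z @ z)
lam = 0.01
b_lasso = np.sign(ols) * max(abs(ols) - lam / (z @ z / len(z)), 0.0)

# Population best *linear* one-step predictor, approximated from a
# long independent run of the same process
x_long = simulate(200000)
b_star = (x_long[:-1] @ x_long[1:]) / (x_long[:-1] @ x_long[:-1])
```

The lasso estimate converges to b_star, the best linear predictor coefficient, not to any parameter of the true nonlinear DGM, which is exactly the interpretation given above for misspecified models.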
Nonnegative Restricted Boltzmann Machines for Parts-based Representations Discovery and Predictive Model Stabilization
The success of any machine learning system depends critically on effective
representations of data. In many cases, it is desirable that a representation
scheme uncovers the parts-based, additive nature of the data. Of current
representation learning schemes, restricted Boltzmann machines (RBMs) have
proved to be highly effective in unsupervised settings. However, when it comes
to parts-based discovery, RBMs do not usually produce satisfactory results. We
enhance such capacity of RBMs by introducing nonnegativity into the model
weights, resulting in a variant called nonnegative restricted Boltzmann machine
(NRBM). The NRBM produces not only controllable decomposition of data into
interpretable parts but also offers a way to estimate the intrinsic nonlinear
dimensionality of data, and helps to stabilize linear predictive models. We
demonstrate the capacity of our model on applications such as handwritten digit
recognition, face recognition, document classification and patient readmission
prognosis. The decomposition quality on images is comparable with or better
than that produced by nonnegative matrix factorization (NMF), and the
thematic features uncovered from text are qualitatively interpretable in a
similar manner to that of the latent Dirichlet allocation (LDA). The stability
performance of feature selection on medical data is better than that of the RBM
and competitive with NMF. The learned features, when used for classification,
are more discriminative than those discovered by NMF and LDA, and comparable
with those of the RBM.
Genesis of Basic and Multi-Layer Echo State Network Recurrent Autoencoders for Efficient Data Representations
It is widely accepted that data representations strongly influence machine
learning tools: the better defined they are, the better the performance.
Feature extraction-based methods such as autoencoders
are conceived for finding more accurate data representations from the original
ones. They efficiently perform on a specific task in terms of 1) high accuracy,
2) large short term memory and 3) low execution time. Echo State Network (ESN)
is a recent kind of Recurrent Neural Network which exhibits very rich
dynamics thanks to its reservoir-based hidden layer. It is widely used in
dealing with complex non-linear problems and it has outperformed classical
approaches in a number of tasks including regression, classification, etc. In
this paper, the noticeable dynamism and the large memory provided by ESN and
the strength of Autoencoders in feature extraction are gathered within an ESN
Recurrent Autoencoder (ESN-RAE). To provide a sturdier alternative to
conventional reservoir-based networks, not only a single-layer basic ESN but
also a Multi-Layer ESN (ML-ESN-RAE) is used as an autoencoder. The new features,
once extracted from ESN's hidden layer, are applied to classification tasks.
The classification rates rise considerably compared to those obtained when
applying the original data features. An accuracy-based comparison is performed
between the proposed recurrent AEs and two variants of ELM feed-forward AEs
(basic and multi-layer) in both noise-free and noisy environments. The empirical
study reveals the main contribution of recurrent connections in improving the
classification performance results.
Comment: 13 pages, 9 figures
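A single-layer version of the idea can be sketched as follows: drive a fixed random reservoir with the input, then train only a linear (ridge) readout that reconstructs the input from the reservoir state, so the reservoir states serve as the extracted features. This is a generic ESN-autoencoder sketch under assumed sizes and scalings, not the authors' exact architecture.

```python
import numpy as np

rng = np.random.default_rng(6)

# Fixed random reservoir with the echo state property (spectral radius < 1)
n_in, n_res, T = 3, 50, 500
W_in = 0.5 * (rng.random((n_res, n_in)) - 0.5)      # input weights
W = rng.standard_normal((n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))     # rescale spectral radius

# Drive the reservoir with the input signal and collect its states
U = rng.standard_normal((T, n_in))
states = np.zeros((T, n_res))
x = np.zeros(n_res)
for t in range(T):
    x = np.tanh(W_in @ U[t] + W @ x)
    states[t] = x

# Train only the readout: ridge regression from states back to inputs
lam = 1e-2
W_out = np.linalg.solve(states.T @ states + lam * np.eye(n_res),
                        states.T @ U)
U_hat = states @ W_out
mse = np.mean((U - U_hat) ** 2)
```

Only W_out is learned; the reservoir stays fixed, which is why ESN training is cheap. In an ESN-RAE pipeline the reservoir states (or the readout features) would then be fed to a downstream classifier.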