Concrete Dropout
Dropout is used as a practical tool to obtain uncertainty estimates in large
vision models and reinforcement learning (RL) tasks. But to obtain
well-calibrated uncertainty estimates, a grid-search over the dropout
probabilities is necessary - a prohibitive operation with large models, and an
impossible one with RL. We propose a new dropout variant which gives improved
performance and better calibrated uncertainties. Relying on recent developments
in Bayesian deep learning, we use a continuous relaxation of dropout's discrete
masks. Together with a principled optimisation objective, this allows for
automatic tuning of the dropout probability in large models, and as a result
faster experimentation cycles. In RL this allows the agent to adapt its
uncertainty dynamically as more data is observed. We analyse the proposed
variant extensively on a range of tasks, and give insights into common practice
in the field where larger dropout probabilities are often used in deeper model
layers.
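The continuous relaxation of dropout's discrete masks can be sketched in a few lines. This is a minimal NumPy illustration of a sigmoid (Concrete-style) relaxation of Bernoulli masks, which is what makes the dropout probability differentiable and hence tunable by gradient descent; the function name, temperature value, and keep-mask convention here are our own choices, not taken from the paper:

```python
import numpy as np

def concrete_dropout_mask(p, shape, temperature=0.1, rng=None):
    """Continuous relaxation of a Bernoulli dropout mask.

    Instead of sampling hard 0/1 masks, draw uniform noise and pass it
    through a sigmoid, so the mask is differentiable with respect to the
    dropout probability p and p can be tuned by gradient descent.
    """
    rng = rng or np.random.default_rng()
    eps = 1e-7
    u = rng.uniform(eps, 1.0 - eps, size=shape)
    logit = (np.log(p + eps) - np.log(1.0 - p + eps)
             + np.log(u) - np.log(1.0 - u))
    z = 1.0 / (1.0 + np.exp(-logit / temperature))  # soft "drop" indicator
    return 1.0 - z  # keep-mask: close to 0 where a unit is dropped

mask = concrete_dropout_mask(p=0.2, shape=(1000,), rng=np.random.default_rng(0))
# With a low temperature the mask values concentrate near 0 and 1, and the
# fraction of (nearly) dropped units approaches p.
```

In a full implementation, p would be a trainable parameter regularized by the variational objective; the sketch only shows the reparameterized mask itself.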
Bayesian Policy Gradients via Alpha Divergence Dropout Inference
Policy gradient methods have had great success in solving continuous control
tasks, yet the stochastic nature of such problems makes deterministic value
estimation difficult. We propose an approach which instead estimates a
distribution by fitting the value function with a Bayesian Neural Network. We
optimize an α-divergence objective with Bayesian dropout approximation
to learn and estimate this distribution. We show that using the Monte Carlo
posterior mean of the Bayesian value function distribution, rather than a
deterministic network, improves stability and performance of policy gradient
methods in continuous control MuJoCo simulations.
Comment: Accepted to the Bayesian Deep Learning Workshop at NIPS 2017
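The Monte Carlo posterior mean mentioned above can be sketched as follows. This is an illustrative stand-in, not the authors' model: a tiny untrained two-layer value network with dropout kept active at evaluation time (MC dropout), whose stochastic forward passes are averaged:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny value network: one hidden layer with dropout kept
# active at evaluation time (MC dropout). The weights are random
# stand-ins for illustration, not trained parameters.
W1 = rng.normal(size=(3, 16))
W2 = rng.normal(size=(16, 1))

def value_mc(state, n_samples=200, p_drop=0.1):
    """Monte Carlo posterior mean (and std) of the value estimate:
    average many stochastic forward passes, resampling the dropout mask
    each time, instead of using a single deterministic pass."""
    h = np.maximum(state @ W1, 0.0)                  # hidden activations
    samples = []
    for _ in range(n_samples):
        mask = rng.random(h.shape) > p_drop          # Bernoulli keep-mask
        v = (h * mask / (1.0 - p_drop)) @ W2         # inverted-dropout scaling
        samples.append(v.item())
    samples = np.asarray(samples)
    return samples.mean(), samples.std()

mean_v, std_v = value_mc(np.array([0.1, -0.3, 0.5]))
# mean_v is the MC posterior mean fed to the policy gradient;
# std_v is a cheap uncertainty signal from the same passes.
```

The nonzero spread across passes is exactly what a single deterministic forward pass discards.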
Deductive semiparametric estimation in Double-Sampling Designs with application to PEPFAR
Non-ignorable dropout is common in studies with long follow-up time, and it
can bias study results unless handled carefully. A double-sampling design
allocates additional resources to pursue a subsample of the dropouts and find
out their outcomes, which can address potential biases due to non-ignorable
dropout. It is desirable to construct semiparametric estimators for the
double-sampling design because of their robustness properties. However,
obtaining such semiparametric estimators remains a challenge due to the
requirement of the analytic form of the efficient influence function (EIF), the
derivation of which can be ad hoc and difficult for the double-sampling design.
Recent work has shown how the derivation of EIF can be made deductive and
computerizable using the functional derivative representation of the EIF in
nonparametric models. This approach, however, requires deriving the mixture of
a continuous distribution and a point mass, which can itself be challenging for
complicated problems such as the double-sampling design. We propose
semiparametric estimators for the survival probability in double-sampling
designs by generalizing the deductive and computerizable estimation approach.
In particular, we propose to build the semiparametric estimators based on a
discretized support structure, which approximates the possibly continuous
observed data distribution and circumvents the derivation of the mixture
distribution. Our approach is deductive in the sense that it is expected to
produce semiparametric locally efficient estimators within finite steps without
knowledge of the EIF. We apply the proposed estimators to estimating the
mortality rate in a double-sampling design component of the President's
Emergency Plan for AIDS Relief (PEPFAR) program. We evaluate the impact of
double-sampling selection criteria on the mortality rate estimates.
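To make the double-sampling idea concrete, here is a toy simulation with a simple inverse-probability-weighted estimate. This is only an illustration of why pursuing a subsample of dropouts corrects the bias; it is not the authors' semiparametric estimator, and all numbers are invented:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Simulated cohort: 'dead' is the true outcome. It is observed directly
# for completers; dropouts are only observed if double-sampled.
dead = rng.random(n) < 0.15              # completer mortality 15%
dropout = rng.random(n) < 0.30           # 30% drop out ...
dead = np.where(dropout, rng.random(n) < 0.25, dead)  # ... with higher mortality

pursue_prob = 0.5                        # fraction of dropouts pursued
pursued = dropout & (rng.random(n) < pursue_prob)

# Naive estimate ignores dropouts entirely: biased when dropout is
# non-ignorable, as here.
naive = dead[~dropout].mean()

# Double-sampling estimate: completers contribute directly; pursued
# dropouts are up-weighted by 1/pursue_prob to represent all dropouts.
weights = np.where(~dropout, 1.0,
                   np.where(pursued, 1.0 / pursue_prob, 0.0))
double_sampled = (weights * dead).sum() / weights.sum()
# The weighted estimate recovers the true marginal mortality
# (0.7 * 0.15 + 0.3 * 0.25 = 0.18), while the naive one understates it.
```

The semiparametric estimators in the paper aim at the same target but with efficiency and robustness guarantees that a plain weighted mean lacks.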
Recurrent DNNs and their Ensembles on the TIMIT Phone Recognition Task
In this paper, we have investigated recurrent deep neural networks (DNNs) in
combination with regularization techniques such as dropout, zoneout, and a
regularization post-layer. As a benchmark, we chose the TIMIT phone recognition
task due to its popularity and broad availability in the community; it also
simulates a low-resource scenario relevant for minority languages. We further
prefer the phone recognition task because it is much more sensitive to acoustic
model quality than a large vocabulary continuous speech recognition task. In
recent years, recurrent DNNs have pushed down the error rates in automatic
speech recognition, but there has been no clear winner among the proposed
architectures. Dropout was used as the regularization technique in most cases,
but its combination with other regularization techniques, together with model
ensembles, was left unexplored. In our experiments, a simple ensemble of
recurrent DNNs performed best, achieving an average phone error rate (PER) over
10 experiments of 14.84% (minimum 14.69%) on the core test set, which is
slightly lower than the best published PER to date, to our knowledge. Finally,
in contrast to most papers, we publish open-source scripts to make the results
easy to replicate and to support further development.
Comment: Submitted to SPECOM 2018, 20th International Conference on Speech and Computer
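Zoneout, one of the regularization techniques investigated, can be sketched in a few lines. This is an illustrative NumPy version with hypothetical names; a real implementation applies the same per-unit choice inside the recurrent cell at every time step:

```python
import numpy as np

def zoneout_step(h_prev, h_new, z_prob, rng, training=True):
    """Zoneout regularization of an RNN hidden state: with probability
    z_prob, each unit keeps its previous value instead of taking the new
    one (a per-unit 'identity dropout' on state updates)."""
    if not training:
        # At evaluation time, use the expected update (cf. inverted dropout).
        return z_prob * h_prev + (1.0 - z_prob) * h_new
    keep_old = rng.random(h_new.shape) < z_prob   # units that stay frozen
    return np.where(keep_old, h_prev, h_new)

rng = np.random.default_rng(0)
h_prev = np.zeros(8)
h_new = np.ones(8)
h = zoneout_step(h_prev, h_new, z_prob=0.25, rng=rng)
# During training, roughly a quarter of the units stay at their previous
# value (0 here); the rest take the new value (1 here).
```

Unlike dropout, which zeroes activations, zoneout preserves them across time steps, which helps gradients flow through the recurrence.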
A shared-parameter continuous-time hidden Markov and survival model for longitudinal data with informative dropout
A shared-parameter approach for jointly modeling longitudinal and survival data is proposed. With respect to available approaches, it allows for time-varying random effects that affect both the longitudinal and the survival processes. The distribution of these random effects is modeled according to a continuous-time hidden Markov chain, so that transitions may occur at any time point. For maximum likelihood estimation, we propose an algorithm based on a discretization of time until censoring into an arbitrary number of time windows. The observed information matrix is used to obtain standard errors. We illustrate the approach by simulation, also with respect to the effect of the number of time windows on the precision of the estimates, and by an application to data on patients suffering from mildly dilated cardiomyopathy.
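In a continuous-time hidden Markov chain, the latent states evolve under a generator matrix Q, and the transition probabilities over a window of length t are P(t) = exp(Qt), which is what a time-discretization scheme evaluates per window. For two states this matrix exponential has a closed form, sketched below with illustrative rates (not values estimated from the paper's data):

```python
import numpy as np

def two_state_ctmc_transition(q12, q21, t):
    """Transition probability matrix P(t) of a two-state continuous-time
    Markov chain with generator Q = [[-q12, q12], [q21, -q21]].
    With s = q12 + q21, the closed form is
        P(t) = Pi + exp(-s * t) * (I - Pi),
    where each row of Pi is the stationary distribution (q21/s, q12/s).
    """
    s = q12 + q21
    pi = np.array([[q21, q12], [q21, q12]]) / s   # stationary rows
    return pi + np.exp(-s * t) * (np.eye(2) - pi)

P = two_state_ctmc_transition(q12=0.5, q21=1.0, t=2.0)
# Each row of P(t) is a probability distribution (sums to 1), and as t
# grows the rows converge to the stationary distribution.
```

One can check the formula by differentiating at t = 0, which recovers Q; with more latent states a generic matrix exponential routine would replace the closed form.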
Selection and pattern mixture models for modelling longitudinal data with dropout: An application study
Incomplete data are unavoidable in studies that involve data measured or observed longitudinally on individuals, regardless of how well they are designed. Dropout can potentially cause serious bias problems in the analysis of longitudinal data. In the presence of dropout, an appropriate strategy for analyzing such data requires the definition of a joint model for the dropout and measurement processes. This paper is primarily concerned with selection and pattern mixture models as modelling frameworks that can be used for sensitivity analysis to jointly model the distribution of the dropout process and the longitudinal measurement process. We demonstrate the application of these models for handling dropout in longitudinal data where the dependent variable is missing across time. We restrict attention to the situation in which outcomes are continuous. The primary objectives are to investigate the potential influence that dropout might exert on the dependent measurement process in the considered data, and to deal with incomplete sequences. We apply the methods to a data set arising from a serum cholesterol study. The results obtained from these methods are then compared to help gain additional insight into the serum cholesterol data and to assess the sensitivity of the assumptions made. Both models led to similar results when assessing significant effects, such as marginal treatment effects, giving additional confidence in the findings.
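The pattern-mixture idea, modeling the outcome separately within each dropout pattern and then mixing over the pattern probabilities, can be illustrated with a toy simulation. In a real analysis the dropout-pattern distributions would require identifying assumptions, since the dropouts' later outcomes are unobserved; here everything is simulated so the decomposition can be shown directly:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5_000

# Simulated outcome with two dropout patterns whose means differ
# (a non-ignorable dropout scenario): completers vs. early dropouts.
completer = rng.random(n) < 0.7
y = np.where(completer,
             rng.normal(5.0, 1.0, n),    # completers
             rng.normal(6.5, 1.0, n))    # dropouts

# Pattern-mixture decomposition: pattern-specific means, weighted by the
# estimated pattern probabilities, give the marginal mean.
p_complete = completer.mean()
marginal_mean = (p_complete * y[completer].mean()
                 + (1.0 - p_complete) * y[~completer].mean())
# Recovers the true marginal mean 0.7 * 5.0 + 0.3 * 6.5 = 5.45.
```

A selection model factorizes the same joint distribution the other way, as the outcome distribution times the dropout probability given the outcome; comparing the two factorizations is what drives the sensitivity analysis in the paper.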