Estimating the Maximum Expected Value: An Analysis of (Nested) Cross Validation and the Maximum Sample Average
We investigate the accuracy of the two most common estimators for the maximum
expected value of a general set of random variables: a generalization of the
maximum sample average, and cross validation. No unbiased estimator exists and
we show that it is non-trivial to select a good estimator without knowledge
about the distributions of the random variables. We investigate and bound the
bias and variance of the aforementioned estimators and prove consistency. The
variance of cross validation can be significantly reduced, but not without
risking a large bias. The bias and variance of different variants of cross
validation are shown to be very problem-dependent, and a wrong choice can lead
to very inaccurate estimates.
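A minimal sketch (not the paper's code) of the two biases discussed above, under illustrative Gaussian assumptions: the maximum sample average picks the largest of the sample means and so tends to overestimate the true maximum expected value, while a two-fold cross-validation estimator selects on one half of the data and evaluates on the other, which tends to underestimate it instead.

```python
import numpy as np

rng = np.random.default_rng(0)
means = np.array([0.0, 0.0, 0.1])   # true expected values; the maximum is 0.1
n, trials = 50, 20_000

msa_est, cv_est = [], []
for _ in range(trials):
    x = rng.normal(means, 1.0, size=(n, means.size))  # n i.i.d. draws per variable
    # Maximum sample average: max over the per-variable sample means (biased up).
    msa_est.append(x.mean(axis=0).max())
    # Two-fold cross validation: one half selects the apparently best variable,
    # the other half estimates its mean (biased down after selection).
    a, b = x[: n // 2], x[n // 2 :]
    cv_est.append(b[:, a.mean(axis=0).argmax()].mean())

print("true maximum:        ", means.max())
print("max sample average:  ", np.mean(msa_est))  # > 0.1 on average
print("cross validation:    ", np.mean(cv_est))   # < 0.1 on average
```

Which bias hurts more depends on the gaps between the true means and the noise level, which is exactly why the choice of estimator is problem-dependent.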
Bayesian Sampling Algorithms for the Sample Selection and Two-Part Models
This paper considers two models to deal with an outcome variable that contains a large fraction of zeros, such as individual expenditures on health care: a sample-selection model and a two-part model. The sample-selection model uses two possibly correlated processes to determine the outcome: a decision process and an outcome process; conditional on a favorable decision, the outcome is observed. The two-part model comprises uncorrelated decision and outcome processes. The paper addresses the issue of selecting between these two models. With a Gaussian specification of the likelihood, the models are nested and inference can focus on the correlation coefficient. Using a fully parametric Bayesian approach, I present sampling algorithms for the model parameters that are based on data augmentation. In addition to the sampler output of the correlation coefficient, a Bayes factor can be computed to distinguish between the models. The paper illustrates the methods and their potential pitfalls using simulated data sets. Keywords: Sample Selection, Data Augmentation, Gibbs Sampling
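A minimal sketch (assumptions, not the paper's code) of how the two models nest: the decision and outcome errors are jointly Gaussian with correlation rho, and setting rho = 0 recovers the two-part model, which is why inference can focus on the correlation coefficient. The function name `simulate` and the parameters `a` (decision intercept) and `b` (outcome mean) are illustrative.

```python
import numpy as np

def simulate(n, rho, a=0.2, b=1.0, seed=0):
    rng = np.random.default_rng(seed)
    cov = [[1.0, rho], [rho, 1.0]]                    # joint error covariance
    u, e = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
    decide = (a + u) > 0                              # latent decision process
    y = np.where(decide, b + e, 0.0)                  # outcome observed only if favorable
    return y, decide

y, d = simulate(10_000, rho=0.5)
print("fraction of zeros:", 1 - d.mean())
print("mean outcome | observed:", y[d].mean())  # exceeds b when rho > 0 (selection effect)
```

With rho = 0 the observed outcomes are centered on b; with rho > 0 selection shifts them upward, which is the feature the data-augmentation sampler and the Bayes factor are designed to detect.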
Deep Reinforcement Learning with Double Q-learning
The popular Q-learning algorithm is known to overestimate action values under
certain conditions. It was not previously known whether, in practice, such
overestimations are common, whether they harm performance, and whether they can
generally be prevented. In this paper, we answer all these questions
affirmatively. In particular, we first show that the recent DQN algorithm,
which combines Q-learning with a deep neural network, suffers from substantial
overestimations in some games in the Atari 2600 domain. We then show that the
idea behind the Double Q-learning algorithm, which was introduced in a tabular
setting, can be generalized to work with large-scale function approximation. We
propose a specific adaptation to the DQN algorithm and show that the resulting
algorithm not only reduces the observed overestimations, as hypothesized, but
that this also leads to much better performance on several games. Comment: AAAI 2016
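A minimal sketch of the Double DQN target under its usual formulation (the names `q_online_next` and `q_target_next` are illustrative, not from the paper's code). Standard DQN uses the target network both to select and to evaluate the next action, which inflates the estimate; Double DQN decouples the two, selecting with the online network and evaluating with the target network.

```python
import numpy as np

def double_dqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """rewards, dones: (batch,); q_*_next: (batch, n_actions) Q-values at s'."""
    best_actions = q_online_next.argmax(axis=1)        # selection: online network
    evaluated = q_target_next[np.arange(len(rewards)), best_actions]  # evaluation: target network
    return rewards + gamma * (1.0 - dones) * evaluated  # no bootstrap on terminal states

# The standard DQN target would instead use q_target_next.max(axis=1),
# coupling selection and evaluation and thereby overestimating.
```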
Effect of linear polarisability and local fields on surface SHG
A discrete dipole model has been developed to describe surface second harmonic generation (SHG) by centrosymmetric semiconductors. The double cell method, which enables the linear reflection problem to be solved numerically for semi-infinite systems, has been extended to the nonlinear case. It is shown that a single layer of nonlinear electric dipoles at the surface, together with nonlocal effects, suffices to describe the angle-of-incidence-dependent anisotropic SHG obtained from oxidised Si(001) wafers. The influence of the linear response turns out to be essential for understanding the anisotropic SHG process.
Multi-task Deep Reinforcement Learning with PopArt
The reinforcement learning community has made great strides in designing
algorithms capable of exceeding human performance on specific tasks. These
algorithms are mostly trained on one task at a time, with each new task requiring
a brand-new agent instance to be trained. This means the learning algorithm is general,
but each solution is not; each agent can only solve the one task it was trained
on. In this work, we study the problem of learning to master not one but
multiple sequential-decision tasks at once. A general issue in multi-task
learning is that a balance must be found between the needs of multiple tasks
competing for the limited resources of a single learning system. Many learning
algorithms can get distracted by certain tasks in the set of tasks to solve.
Such tasks appear more salient to the learning process, for instance because of
the density or magnitude of the in-task rewards. This causes the algorithm to
focus on those salient tasks at the expense of generality. We propose to
automatically adapt the contribution of each task to the agent's updates, so
that all tasks have a similar impact on the learning dynamics. This resulted in
state-of-the-art performance on learning to play all games in a set of 57
diverse Atari games. Excitingly, our method learned a single trained policy,
with a single set of weights, that exceeds median human performance. To our
knowledge, this was the first time a single agent surpassed human-level
performance on this multi-task domain. The same approach also demonstrated
state-of-the-art performance on a set of 30 tasks in the 3D reinforcement
learning platform DeepMind Lab.
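A minimal sketch of the PopArt idea (illustrative, not DeepMind's code): running statistics of the value targets are tracked per task, targets are normalized before the loss is computed, and the value head's weights are rescaled whenever the statistics change so that its unnormalized outputs are preserved. This keeps every task's contribution to the updates at a similar scale regardless of reward magnitude.

```python
import numpy as np

class PopArt:
    def __init__(self, beta=3e-4):
        self.mu, self.nu, self.beta = 0.0, 1.0, beta   # running first/second moments

    @property
    def sigma(self):
        return np.sqrt(max(self.nu - self.mu ** 2, 1e-8))

    def update(self, targets, w, b):
        """Update the statistics from a batch of value targets and rescale the
        value head (w, b) so that sigma_new * (w.x + b_new) + mu_new equals
        sigma_old * (w_old.x + b_old) + mu_old for every input x."""
        mu_old, sigma_old = self.mu, self.sigma
        self.mu = (1 - self.beta) * self.mu + self.beta * np.mean(targets)
        self.nu = (1 - self.beta) * self.nu + self.beta * np.mean(targets ** 2)
        w *= sigma_old / self.sigma                     # in-place weight rescale
        b[...] = (sigma_old * b + mu_old - self.mu) / self.sigma
        return w, b

    def normalize(self, targets):
        return (targets - self.mu) / self.sigma         # loss is computed in this space
```

The "preserving outputs" rescale is the key design choice: without it, every statistics update would abruptly change the agent's value predictions.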
BACKGROUND: The aim of the current work was to perform a clinical trial simulation (CTS) analysis to optimize a drug-drug interaction (DDI) study of vincristine in children who also received azole antifungals, taking into account the challenges of conducting clinical trials in this population, and to provide a motivating example of the application of CTS in the design of pediatric oncology clinical trials. PROCEDURE: A pharmacokinetic (PK) model for vincristine in children was used to simulate concentration-time profiles. A continuous model for body surface area versus age was defined based on pediatric growth curves. Informative sampling time windows were derived using D-optimal design. The CTS framework was used to evaluate the effect of different magnitudes of clearance inhibition (10%, 25%, or 40%), sample size (30-500), missing samples or sampling occasions, and the age distribution on the power to detect a significant inhibition effect and on the relative estimation error (REE) of the interaction effect. RESULTS: A minimum group-specific sample size of 38 patients, with a total sample size of 150 patients, was required to detect a clearance inhibition effect of 40% with 80% power, while for a lower magnitude of clearance inhibition a substantially larger sample size was required. However, for the majority of re-estimated drug effects, the inhibition effect could be estimated precisely (REE < 25%) even with smaller sample sizes and lower effect sizes. CONCLUSION: This work demonstrated the utility of CTS for the evaluation of PK clinical trial designs in the pediatric oncology population.
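A deliberately simplified sketch of the CTS power loop described above (the actual analysis uses a full vincristine PK model, D-optimal sampling windows, and model-based re-estimation; the function name and every numeric default here are illustrative assumptions). Each simulated trial draws individual clearances with between-subject variability, applies the inhibition effect to the azole group, and tests for a group difference; power is the fraction of trials in which the effect is detected.

```python
import numpy as np
from scipy import stats

def simulated_power(n_per_group, inhibition=0.40, bsv=0.4,
                    n_trials=2000, alpha=0.05, seed=0):
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_trials):
        # Log-clearance with between-subject variability; the inhibited group's
        # clearance is reduced by the assumed inhibition fraction.
        log_cl_ctrl = rng.normal(0.0, bsv, n_per_group)
        log_cl_inhib = rng.normal(np.log(1 - inhibition), bsv, n_per_group)
        p = stats.ttest_ind(log_cl_ctrl, log_cl_inhib).pvalue
        hits += p < alpha
    return hits / n_trials

print(simulated_power(38, inhibition=0.40))  # power under these simplified assumptions
```

A direct group comparison of known clearances is far more powerful than re-estimating the effect from sparse concentration samples, so this sketch illustrates the simulation loop rather than reproducing the paper's sample-size results.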
