Online Supplement to “Efficient Simulation Resource Sharing and Allocation for Selecting the Best”
This is the online supplement to the article by the same authors, "Efficient Simulation Resource Sharing and Allocation for Selecting the Best," published in IEEE Transactions on Automatic Control.
Online Appendix for “Gradient-Based Myopic Allocation Policy: An Efficient Sampling Procedure in a Low-Confidence Scenario”
This online appendix contains theoretical and numerical supplements: technical details and three additional numerical examples that could not fit in the main body because of the journal's page limit for technical notes.
The abstract for the main body is as follows:
In this note, we study a simulation optimization problem of selecting the alternative with the best performance from a finite set, a so-called ranking and selection problem, in a special low-confidence scenario. The most popular sampling allocation procedures in ranking and selection do not perform well in this scenario, because they all ignore certain induced correlations that significantly affect the probability of correct selection. We propose a gradient-based myopic allocation policy (G-MAP) that takes the induced correlations into account, reflecting a trade-off between the induced correlation and the two factors (mean and variance) found in the optimal computing budget allocation formula. Numerical experiments substantiate the efficiency of the new procedure in the low-confidence scenario. This work was supported in part by the National Science Foundation (NSF) under Grants CMMI-0856256, CMMI-1362303, and CMMI-1434419, by the National Natural Science Foundation of China (NSFC) under Grant 71571048, by the Air Force Office of Scientific Research (AFOSR) under Grant FA9550-15-10050, and by the Science and Technology Agency of Sichuan Province under Grant 2014GZX0002.
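For readers unfamiliar with the two factors mentioned in the abstract, the following is a minimal sketch of the classical asymptotic optimal computing budget allocation (OCBA) ratios, not of G-MAP itself, which the abstract does not specify. It assumes a larger mean is better, independent normal samples, a unique current best, and strictly positive standard deviations; the function name `ocba_fractions` is hypothetical.

```python
import math


def ocba_fractions(means, stds):
    """Return classical OCBA budget fractions (illustrative sketch, not G-MAP).

    Assumes larger mean is better, a unique best alternative, and stds > 0.
    """
    k = len(means)
    best = max(range(k), key=lambda i: means[i])

    weights = [0.0] * k
    for i in range(k):
        if i != best:
            gap = means[best] - means[i]  # mean factor: distance to the best
            weights[i] = (stds[i] / gap) ** 2  # variance factor: noise level

    # Allocation to the current best balances against all the others.
    weights[best] = stds[best] * math.sqrt(
        sum((weights[i] / stds[i]) ** 2 for i in range(k) if i != best)
    )

    total = sum(weights)
    return [w / total for w in weights]


fractions = ocba_fractions(means=[1.0, 2.0, 3.0], stds=[1.0, 1.0, 1.0])
```

In this sketch a non-best alternative receives more budget when its mean is closer to the best or when its variance is larger; induced correlations, which the abstract identifies as the missing ingredient in the low-confidence scenario, play no role here.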
Maximum Entropy Heterogeneous-Agent Mirror Learning
Multi-agent reinforcement learning (MARL) has been shown to be effective for
cooperative games in recent years. However, existing state-of-the-art methods
face challenges related to sample inefficiency, brittleness regarding
hyperparameters, and the risk of converging to a suboptimal Nash Equilibrium.
To resolve these issues, in this paper, we propose a novel theoretical
framework, named Maximum Entropy Heterogeneous-Agent Mirror Learning (MEHAML),
that leverages the maximum entropy principle to design maximum entropy MARL
actor-critic algorithms. We prove that algorithms derived from the MEHAML
framework enjoy the desired properties of the monotonic improvement of the
joint maximum entropy objective and the convergence to quantal response
equilibrium (QRE). The practicality of MEHAML is demonstrated by developing a
MEHAML extension of the widely used RL algorithm, HASAC (for soft
actor-critic), which shows significant improvements in exploration and
robustness on three challenging benchmarks: Multi-Agent MuJoCo, StarCraftII,
and Google Research Football. Our results show that HASAC outperforms strong
baseline methods such as HATD3, HAPPO, QMIX, and MAPPO, thereby establishing
the new state of the art. See our project page at
https://sites.google.com/view/mehaml
Application of Perturbation Analysis to the Design and Analysis of Control Charts
The design of control charts in statistical quality control addresses the optimal selection of design parameters, such as the sampling frequency and the control limits, and includes sensitivity analysis with respect to system parameters, such as the various process parameters and the economic costs of sampling. The advent of more complicated control chart schemes has necessitated the use of Monte Carlo simulation in the design process, particularly in the evaluation of performance measures such as average run length. In this paper, we apply perturbation analysis to derive gradient estimators that can be used in gradient-based optimization algorithms and in sensitivity analysis when Monte Carlo simulation is employed. We illustrate the technique on a simple Shewhart control chart and on a more complicated control chart that includes the exponentially weighted moving average control chart as a special case.
Sensitivity Analysis for Monte Carlo Simulation of Option Pricing
Corrections to published article; additional tables for numerical results.
Monte Carlo simulation is one alternative for analyzing options markets
when the assumptions of simpler analytical models are violated.
We introduce techniques for the sensitivity analysis of option pricing
which can be efficiently carried out in the simulation.
In particular, using these techniques,
a single run of the simulation would often provide not only
an estimate of the option value
but also estimates of the sensitivities of the option value to
various parameters of the model.
Both European and American options are considered,
starting with simple
analytically tractable models to present the idea and
proceeding to more complicated examples.
We then propose an approach for
the pricing of options with early exercise features by
incorporating the gradient estimates in
an iterative stochastic approximation algorithm.
The procedure is illustrated in a simple example estimating
the option value of an American call.
Numerical results indicate that the additional computational
effort, beyond that required
to estimate a European option, is relatively small.
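The idea that a single simulation run can yield both the option value and a sensitivity can be illustrated with a minimal sketch. It assumes a European call under geometric Brownian motion (the Black-Scholes model) and uses the standard pathwise derivative estimator for delta, the sensitivity to the initial stock price. The model, parameter values, and function names are assumptions made for illustration and are not taken from the article.

```python
import math
import random


def simulate_call_value_and_delta(s0, strike, rate, sigma, maturity,
                                  n_paths, seed=0):
    """Estimate a European call's value and delta from the same sample paths."""
    rng = random.Random(seed)
    discount = math.exp(-rate * maturity)
    drift = (rate - 0.5 * sigma ** 2) * maturity
    vol = sigma * math.sqrt(maturity)

    value_sum = 0.0
    delta_sum = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        s_t = s0 * math.exp(drift + vol * z)  # terminal stock price
        if s_t > strike:
            value_sum += discount * (s_t - strike)  # discounted payoff
            # Pathwise derivative of the payoff w.r.t. s0: dS_T/ds0 = S_T/s0.
            delta_sum += discount * s_t / s0
    return value_sum / n_paths, delta_sum / n_paths


def black_scholes_call(s0, strike, rate, sigma, maturity):
    """Closed-form value and delta, used here only as a check."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(s0 / strike)
          + (rate + 0.5 * sigma ** 2) * maturity) / (sigma * sqrt_t)
    d2 = d1 - sigma * sqrt_t

    def norm_cdf(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    value = s0 * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)
    return value, norm_cdf(d1)


mc_value, mc_delta = simulate_call_value_and_delta(
    s0=100.0, strike=100.0, rate=0.05, sigma=0.2, maturity=1.0, n_paths=100_000)
bs_value, bs_delta = black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0)
```

Both estimates reuse the same random draws, so the sensitivity costs almost nothing beyond the valuation itself. The closed-form comparison is available only in this simple case; for the more complicated models the abstract mentions, no such check would generally exist.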
A New Insight into the Role of CART in Cocaine Reward: Involvement of CaMKII and Inhibitory G-Protein Coupled Receptor Signaling
Cocaine- and amphetamine-regulated transcript (CART) peptides are neuropeptides that are expressed in brain regions associated with reward, such as the nucleus accumbens (NAc), and play a role in cocaine reward. Injection of CART into the NAc can inhibit the behavioral effects of cocaine, and injecting CART into the ventral tegmental area (VTA) reduces cocaine-seeking behavior. However, the exact mechanism of these effects is not clear. Recent research has demonstrated that Ca2+/calmodulin-dependent protein kinase II (CaMKII) and inhibitory G-protein coupled receptor (GPCR) signaling are involved in the mechanism of the effect of CART on cocaine reward. Hence, we review the role of CaMKII and inhibitory GPCR signaling in the effect of CART on cocaine reward and provide a new insight into the mechanism of that effect. In this article, we will first review the biological function of CART and discuss the role of CART in cocaine reward. Then, we will focus on the role of CaMKII and inhibitory GPCR signaling in cocaine reward. Furthermore, we will discuss how CaMKII and inhibitory GPCR signaling are involved in the mechanistic action of CART in cocaine reward. Finally, we will provide our opinions regarding the future directions of research on the role of CaMKII and inhibitory GPCR signaling in the effect of CART on cocaine reward.