1,484 research outputs found
A Branching Process for Convergent Product Optimization
We consider a discrete-time branching process in which the population consists of k types of convergent products. For each product, an action is chosen that affects its lifetime, the number and types of its functions, and the profit received. The problem of maximizing the expected profit is shown to be equivalent to a generalized Markov decision problem in which the transition matrices are non-negative but not necessarily substochastic.
Nonparametric Infinite Horizon Kullback-Leibler Stochastic Control
We present two nonparametric approaches to Kullback-Leibler (KL) control, or
linearly-solvable Markov decision problem (LMDP) based on Gaussian processes
(GP) and Nyström approximation. Compared to recently developed parametric
methods, the proposed data-driven frameworks feature accurate function
approximation and efficient on-line operations. Theoretically, we derive the
mathematical connection of KL control based on dynamic programming with earlier
work in control theory which relies on information theoretic dualities for the
infinite time horizon case. Algorithmically, we give explicit optimal control
policies in nonparametric forms, and propose on-line update schemes with
budgeted computational costs. Numerical results demonstrate the effectiveness
and usefulness of the proposed frameworks.
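As a rough illustration of the LMDP setting mentioned in this abstract, the sketch below solves a small tabular, average-cost problem by power iteration on the desirability function z = exp(-v). It is not the nonparametric GP or Nyström method of the paper. The passive dynamics `P`, the state costs `q`, and the function name are assumptions chosen purely for illustration.

```python
import numpy as np

def solve_lmdp_average_cost(P, q, iters=10_000, tol=1e-12):
    """Tabular average-cost LMDP sketch (hypothetical helper).

    P : (n, n) row-stochastic passive dynamics (assumed irreducible, aperiodic)
    q : (n,) state costs
    Solves lam * z = diag(exp(-q)) @ P @ z for the principal eigenpair.
    """
    M = np.diag(np.exp(-q)) @ P
    z = np.ones(len(q)) / np.sqrt(len(q))  # unit-norm starting vector
    lam = 1.0
    for _ in range(iters):
        z_new = M @ z
        lam = np.linalg.norm(z_new)        # eigenvalue estimate (z has unit norm)
        z_new = z_new / lam
        converged = np.max(np.abs(z_new - z)) < tol
        z = z_new
        if converged:
            break
    # Optimal controlled transitions: reweight passive dynamics by desirability.
    U = P * z[None, :]
    U = U / U.sum(axis=1, keepdims=True)
    avg_cost = -np.log(lam)
    return z, avg_cost, U

# Illustrative 3-state chain; state 0 is cheapest, state 2 most expensive.
P = np.array([[0.50, 0.50, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.50, 0.50]])
q = np.array([0.0, 1.0, 2.0])
z, avg_cost, U = solve_lmdp_average_cost(P, q)
print("desirability:", z)
print("average cost:", avg_cost)
print("controlled transitions:\n", U)
```

In this toy setting the controlled transitions `U` shift probability toward states with higher desirability, which is the sense in which the control is "linearly solvable": only a linear eigenproblem has to be solved.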
Competition and Post-Transplant Outcomes in Cadaveric Liver Transplantation under the MELD Scoring System
Previous researchers have modelled the decision to accept a donor organ for transplantation as a Markov decision problem, the solution to which is often a control-limit optimal policy: accept any organ whose match quality exceeds some health-dependent threshold; otherwise, wait for another. When competing transplant centers vie for the same organs, the decision rule changes relative to no competition; the relative size of competing centers affects the decision rules as well. Using center-specific graft and patient survival-rate data for cadaveric adult livers in the United States, we have found empirical evidence supporting these predictions.
Keywords: liver transplantation, competition, optimal stopping
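The control-limit idea described in this abstract can be sketched with a toy accept-or-wait model solved by value iteration. This is only an illustration under invented assumptions, not the model or data used in the paper: the reward tables, the health transition matrix, the organ-quality distribution, and the function name are all hypothetical.

```python
import numpy as np

def control_limit_thresholds(accept_reward, wait_reward, health_P,
                             quality_pmf, discount=0.95, iters=2000):
    """Toy accept-or-wait problem (all inputs are illustrative assumptions).

    accept_reward[h, k] : reward for accepting an organ of quality k in health h
    wait_reward[h]      : one-period reward for declining in health h
    health_P[h, h2]     : health transition probabilities while waiting
    quality_pmf[k]      : probability that the next offer has quality k
    """
    n_health, _ = accept_reward.shape
    V = np.zeros(n_health)  # value before the current offer is observed
    for _ in range(iters):
        cont = wait_reward + discount * (health_P @ V)   # value of waiting
        V = np.maximum(accept_reward, cont[:, None]) @ quality_pmf
    cont = wait_reward + discount * (health_P @ V)
    accept = accept_reward >= cont[:, None]              # accept[h, k]
    # Lowest acceptable quality per health state (None if nothing is accepted).
    thresholds = [int(np.argmax(row)) if row.any() else None for row in accept]
    return accept, thresholds

# Hypothetical numbers: 3 health states (0 = healthiest), 4 quality levels.
accept_reward = np.array([[4.0, 8.0, 12.0, 16.0],
                          [3.0, 6.0,  9.0, 12.0],
                          [2.0, 4.0,  6.0,  8.0]])
wait_reward = np.array([1.0, 0.6, 0.2])
health_P = np.array([[0.8, 0.2, 0.0],
                     [0.0, 0.7, 0.3],
                     [0.0, 0.0, 1.0]])
quality_pmf = np.array([0.4, 0.3, 0.2, 0.1])

accept, thresholds = control_limit_thresholds(
    accept_reward, wait_reward, health_P, quality_pmf)
print("accept region (rows = health, cols = quality):\n", accept)
print("health-dependent thresholds:", thresholds)
```

Because the acceptance reward in this sketch increases with quality, each health state's acceptance region is an upper interval of quality levels, which is the control-limit structure: one threshold per health state.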
Asynchronous Gossip for Averaging and Spectral Ranking
We consider two variants of the classical gossip algorithm. The first variant
is a version of asynchronous stochastic approximation. We highlight a
fundamental difficulty associated with the classical asynchronous gossip
scheme, viz., that it may not converge to a desired average, and suggest an
alternative scheme based on reinforcement learning that has guaranteed
convergence to the desired average. We then discuss a potential application to
a wireless network setting with simultaneous link activation constraints. The
second variant is a gossip algorithm for distributed computation of the
Perron-Frobenius eigenvector of a nonnegative matrix. While the first variant
draws upon a reinforcement learning algorithm for an average cost controlled
Markov decision problem, the second variant draws upon a reinforcement learning
algorithm for risk-sensitive control. We then discuss potential applications of
the second variant to ranking schemes, reputation networks, and principal
component analysis.
Comment: 14 pages, 7 figures. Minor revisio
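To make the averaging issue in this abstract concrete, the sketch below contrasts two toy gossip updates on a fully connected network. It does not reproduce the asynchronous stochastic approximation or reinforcement learning schemes of the paper. The function name, the `symmetric` flag, and the initial values are assumptions for illustration. In the symmetric update both nodes move to their pairwise mean, so the sum of all values is preserved. In the one-sided update only one node moves, so the sum can drift and the consensus value need not equal the initial average.

```python
import random

def gossip(values, steps, symmetric=True, seed=0):
    """Toy gossip on a complete graph (hypothetical helper).

    symmetric=True : both selected nodes adopt their pairwise mean (sum preserved)
    symmetric=False: only the first selected node adopts the mean (sum may drift)
    """
    rng = random.Random(seed)
    x = list(values)
    n = len(x)
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)   # two distinct nodes
        avg = 0.5 * (x[i] + x[j])
        x[i] = avg
        if symmetric:
            x[j] = avg
    return x

init = [1.0, 2.0, 3.0, 4.0, 10.0]
target = sum(init) / len(init)

pairwise = gossip(init, steps=5000, symmetric=True)
one_sided = gossip(init, steps=5000, symmetric=False)

print("true average:       ", target)
print("symmetric consensus:", pairwise[0])
print("one-sided consensus:", one_sided[0])
```

Both variants reach consensus in this toy setting, but only the symmetric one is guaranteed to agree on the true average, because it is the only one whose update leaves the network-wide sum unchanged.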