1,484 research outputs found

    A Branching Process for Convergent Product Optimization

    We consider a discrete-time branching process in which the population consists of k types of convergent products. For each product, an action is chosen that affects its lifetime, the number and types of its functions, and the profit received. The problem of maximizing the expected profit is shown to be equivalent to a generalized Markov decision problem in which the transition matrices are non-negative but not necessarily substochastic.

    Nonparametric Infinite Horizon Kullback-Leibler Stochastic Control

    We present two nonparametric approaches to Kullback-Leibler (KL) control, or the linearly-solvable Markov decision problem (LMDP), based on Gaussian processes (GP) and the Nyström approximation. Compared to recently developed parametric methods, the proposed data-driven frameworks feature accurate function approximation and efficient on-line operations. Theoretically, we derive the mathematical connection of KL control based on dynamic programming with earlier work in control theory which relies on information theoretic dualities for the infinite time horizon case. Algorithmically, we give explicit optimal control policies in nonparametric forms, and propose on-line update schemes with budgeted computational costs. Numerical results demonstrate the effectiveness and usefulness of the proposed frameworks.

    Competition and Post-Transplant Outcomes in Cadaveric Liver Transplantation under the MELD Scoring System

    Previous researchers have modelled the decision to accept a donor organ for transplantation as a Markov decision problem, the solution to which is often a control-limit optimal policy: accept any organ whose match quality exceeds some health-dependent threshold; otherwise, wait for another. When competing transplant centers vie for the same organs, the decision rule changes relative to no competition; the relative size of competing centers affects the decision rules as well. Using center-specific graft and patient survival-rate data for cadaveric adult livers in the United States, we have found empirical evidence supporting these predictions. Keywords: liver transplantation, competition, optimal stopping

    Asynchronous Gossip for Averaging and Spectral Ranking

    We consider two variants of the classical gossip algorithm. The first variant is a version of asynchronous stochastic approximation. We highlight a fundamental difficulty associated with the classical asynchronous gossip scheme, viz., that it may not converge to a desired average, and suggest an alternative scheme based on reinforcement learning that has guaranteed convergence to the desired average. We then discuss a potential application to a wireless network setting with simultaneous link activation constraints. The second variant is a gossip algorithm for distributed computation of the Perron-Frobenius eigenvector of a nonnegative matrix. While the first variant draws upon a reinforcement learning algorithm for an average cost controlled Markov decision problem, the second variant draws upon a reinforcement learning algorithm for risk-sensitive control. We then discuss potential applications of the second variant to ranking schemes, reputation networks, and principal component analysis. Comment: 14 pages, 7 figures. Minor revisio