1,484 research outputs found

A Branching Process for Convergent Product Optimization

Author: Aghasi Ermia
Momeni Mansour
Shah Hoseini Mohammad Ali
Publication venue: 'Scholink Co, Ltd.'
Publication date: 18/07/2017

We consider a discrete-time branching process in which the population consists of k types of convergent products. For each product, an action is chosen that affects its lifetime, the number and types of its functions, and the profit received. The problem of maximizing the expected profit is shown to be equivalent to a generalized Markov decision problem in which the transition matrices are non-negative but not necessarily substochastic.

Scholink Journals

Nonparametric Infinite Horizon Kullback-Leibler Stochastic Control

Author: Pan Yunpeng
Theodorou Evangelos
Publication date: 15/06/2016

We present two nonparametric approaches to Kullback-Leibler (KL) control, or linearly-solvable Markov decision problem (LMDP) based on Gaussian processes (GP) and Nyström approximation. Compared to recently developed parametric methods, the proposed data-driven frameworks feature accurate function approximation and efficient on-line operations. Theoretically, we derive the mathematical connection of KL control based on dynamic programming with earlier work in control theory which relies on information theoretic dualities for the infinite time horizon case. Algorithmically, we give explicit optimal control policies in nonparametric forms, and propose on-line update schemes with budgeted computational costs. Numerical results demonstrate the effectiveness and usefulness of the proposed frameworks.

arXiv.org e-Print Archive

CiteSeerX

Competition and Post-Transplant Outcomes in Cadaveric Liver Transplantation under the MELD Scoring System

Author: Halldorson Jeffrey B.
Paarsch Harry J.
Roberts John P.
Segre Alberto M.

Previous researchers have modelled the decision to accept a donor organ for transplantation as a Markov decision problem, the solution to which is often a control-limit optimal policy: accept any organ whose match quality exceeds some health-dependent threshold; otherwise, wait for another. When competing transplant centers vie for the same organs, the decision rule changes relative to no competition; the relative size of competing centers affects the decision rules as well. Using center-specific graft and patient survival-rate data for cadaveric adult livers in the United States, we have found empirical evidence supporting these predictions.

Keywords: liver transplantation, competition, optimal stopping
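The control-limit rule described in this abstract can be illustrated with a minimal sketch. The function name, the numeric thresholds, and the assumption that sicker patients accept lower-quality organs are all hypothetical choices made for illustration; none of them come from the paper.

```python
def control_limit_policy(match_quality, health, thresholds):
    """Accept an organ only if its match quality exceeds the
    threshold associated with the patient's current health state."""
    return "accept" if match_quality > thresholds[health] else "wait"


# Hypothetical thresholds indexed by health state (0 = healthiest).
# Assumption: a sicker patient is willing to accept a lower-quality match.
thresholds = {0: 0.8, 1: 0.6, 2: 0.3}

# The same organ (match quality 0.5) offered to patients in each health state.
decisions = [control_limit_policy(0.5, h, thresholds) for h in sorted(thresholds)]
print(decisions)
```

Only the patient in the sickest state accepts the organ here, because 0.5 exceeds only the lowest threshold.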

Research Papers in Economics

Asynchronous Gossip for Averaging and Spectral Ranking

Author: Borkar Vivek S.
Makhijani Rahul
Sundaresan Rajesh
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2014

We consider two variants of the classical gossip algorithm. The first variant is a version of asynchronous stochastic approximation. We highlight a fundamental difficulty associated with the classical asynchronous gossip scheme, viz., that it may not converge to a desired average, and suggest an alternative scheme based on reinforcement learning that has guaranteed convergence to the desired average. We then discuss a potential application to a wireless network setting with simultaneous link activation constraints. The second variant is a gossip algorithm for distributed computation of the Perron-Frobenius eigenvector of a nonnegative matrix. While the first variant draws upon a reinforcement learning algorithm for an average cost controlled Markov decision problem, the second variant draws upon a reinforcement learning algorithm for risk-sensitive control. We then discuss potential applications of the second variant to ranking schemes, reputation networks, and principal component analysis.

Comment: 14 pages, 7 figures.
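For orientation, the textbook symmetric pairwise gossip scheme for averaging can be sketched as follows. This is not the asynchronous stochastic-approximation variant or the reinforcement-learning scheme from the paper; the graph, initial values, and function name are assumptions chosen for illustration. Because both endpoints of the chosen link are updated together, the sum of the values is preserved at every step.

```python
import random


def pairwise_gossip(values, edges, steps, seed=0):
    """Repeatedly pick a random link and replace both endpoint
    values with their mean. Each update preserves the overall sum."""
    rng = random.Random(seed)
    x = list(values)
    for _ in range(steps):
        i, j = rng.choice(edges)
        x[i] = x[j] = (x[i] + x[j]) / 2.0
    return x


# Hypothetical 4-node path graph with arbitrary initial values.
values = [1.0, 5.0, 3.0, 7.0]
edges = [(0, 1), (1, 2), (2, 3)]

result = pairwise_gossip(values, edges, steps=2000)
target = sum(values) / len(values)
print(result, target)
```

On a connected graph, every node's value approaches the initial average under this symmetric update.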

arXiv.org e-Print Archive

Open Access Repository of IISc Research Publications

Dspace at IIT Bombay