Optimal Release of Inventory Using Online Auctions: The Two Item Case
In this paper we analyze policies for optimally disposing of inventory using online auctions. We assume a seller has a fixed number of items to sell using a sequence of possibly overlapping single-item auctions. The decision the seller must make is when to start each auction. The decision involves a trade-off between a holding cost for each period an item remains unsold and a higher expected final price the fewer simultaneous auctions are underway. Consequently, the seller must trade off the expected marginal gain for the ongoing auctions against the expected marginal cost of further deferring the release of the unreleased items. We formulate the problem as a discrete-time Markov Decision Problem and consider two cases. In the first case we assume the auctions are guaranteed to be successful, while in the second case we assume there is a positive probability that an auction receives no bids. We consider these two cases because they require different analyses. We derive conditions that ensure the optimal release policy is a control limit policy in the current price of the ongoing auctions, and provide several illustrations of the results. The paper focuses on the two-item case, which has sufficient complexity to raise challenging questions.
A Mitochondrial Health Index Sensitive to Mood and Caregiving Stress.
BACKGROUND: Chronic life stress, such as the stress of caregiving, can promote pathophysiology, but the underlying cellular mechanisms are not well understood. Chronic stress may induce recalibrations in mitochondria leading to changes either in mitochondrial content per cell, or in mitochondrial functional capacity (i.e., quality). METHODS: Here we present a functional index of mitochondrial health (MHI) for human leukocytes that can distinguish between these two possibilities. The MHI integrates nuclear and mitochondrial DNA-encoded respiratory chain enzymatic activities and mitochondrial DNA copy number. We then use the MHI to test the hypothesis that daily emotional states and caregiving stress influence mitochondrial function by comparing healthy mothers of a child with an autism spectrum disorder (high-stress caregivers, n = 46) with mothers of a neurotypical child (control group, n = 45). RESULTS: The MHI outperformed individual mitochondrial function measures. Elevated positive mood at night was associated with higher MHI, and nightly positive mood was also a mediator of the association between caregiving and MHI. Moreover, MHI was correlated to positive mood on the days preceding, but not following the blood draw, suggesting for the first time in humans that mitochondria may respond to proximate emotional states within days. Correspondingly, the caregiver group, which had higher perceived stress and lower positive and greater negative daily affect, exhibited lower MHI. This effect was not explained by a mismatch between nuclear and mitochondrial genomes. CONCLUSIONS: Daily mood and chronic caregiving stress are associated with mitochondrial functional capacity. Mitochondrial health may represent a nexus between psychological stress and health.
Learning Risk Preferences in Markov Decision Processes: an Application to the Fourth Down Decision in Football
For decades, National Football League (NFL) coaches' observed fourth down
decisions have been largely inconsistent with prescriptions based on
statistical models. In this paper, we develop a framework to explain this
discrepancy using a novel inverse optimization approach. We model the fourth
down decision and the subsequent sequence of plays in a game as a Markov
decision process (MDP), the dynamics of which we estimate from NFL play-by-play
data from the 2014 through 2022 seasons. We assume that coaches' observed
decisions are optimal but that the risk preferences governing their decisions
are unknown. This yields a novel inverse decision problem for which the
optimality criterion, or risk measure, of the MDP is the estimand. Using the
quantile function to parameterize risk, we estimate which quantile-optimal
policy yields the coaches' observed decisions as minimally suboptimal. In
general, we find that coaches' fourth-down behavior is consistent with
optimizing low quantiles of the next-state value distribution, which
corresponds to conservative risk preferences. We also find that coaches exhibit
higher risk tolerances when making decisions in the opponent's half of the
field than in their own, and that league average fourth down risk tolerances
have increased over the seasons in our data. Comment: 33 pages, 9 figures
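To make the quantile criterion in the abstract above concrete, here is a minimal sketch of choosing an action by a quantile of its next-state value distribution. It covers only the forward decision rule for a single state, not the paper's inverse optimization or its MDP. The action names, the values, and the probabilities are all hypothetical and are not taken from the paper or from NFL data.

```python
def quantile(outcomes, tau):
    """Lower tau-quantile of a discrete distribution given as (value, prob) pairs."""
    cumulative = 0.0
    for value, prob in sorted(outcomes):
        cumulative += prob
        # Small tolerance guards against floating-point round-off in the sum.
        if cumulative >= tau - 1e-12:
            return value
    return max(value for value, _ in outcomes)


def quantile_optimal_action(actions, tau):
    """Pick the action whose next-state value distribution has the largest tau-quantile."""
    return max(actions, key=lambda name: quantile(actions[name], tau))


# Hypothetical next-state value distributions for one fourth-down situation.
ACTIONS = {
    "go_for_it": [(-2.0, 0.5), (3.0, 0.5)],  # high variance
    "punt": [(-0.5, 0.7), (0.5, 0.3)],       # low variance
}

if __name__ == "__main__":
    for tau in (0.2, 0.8):
        print(tau, quantile_optimal_action(ACTIONS, tau))
```

Under these made-up numbers, a low quantile level favors the low-variance action and a high level favors the high-variance one, which is the sense in which low quantiles correspond to conservative risk preferences.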
Optimal Strategies in Infinite-state Stochastic Reachability Games
We consider perfect-information reachability stochastic games for 2 players
on infinite graphs. We identify a subclass of such games, and prove two
interesting properties of it: first, Player Max always has optimal strategies
in games from this subclass, and second, these games are strongly determined.
The subclass is defined by the property that the set of all values can only
have one accumulation point -- 0. Our results nicely mirror recent results for
finitely-branching games, where, on the contrary, Player Min always has optimal
strategies. However, our proof methods are substantially different, because the
roles of the players are not symmetric. We also do not restrict the branching
of the games. Finally, we apply our results in the context of recently studied
One-Counter stochastic games.
Decision Problems for Nash Equilibria in Stochastic Games
We analyse the computational complexity of finding Nash equilibria in
stochastic multiplayer games with ω-regular objectives. While the
existence of an equilibrium whose payoff falls into a certain interval may be
undecidable, we single out several decidable restrictions of the problem.
First, restricting the search space to stationary, or pure stationary,
equilibria results in problems that are typically contained in PSPACE and NP,
respectively. Second, we show that the existence of an equilibrium with a
binary payoff (i.e. an equilibrium where each player either wins or loses with
probability 1) is decidable. We also establish that the existence of a Nash
equilibrium with a certain binary payoff entails the existence of an
equilibrium with the same payoff in pure, finite-state strategies. Comment: 22 pages, revised version
Solving Stochastic Büchi Games on Infinite Arenas with a Finite Attractor
We consider games played on an infinite probabilistic arena where the first
player aims at satisfying generalized Büchi objectives almost surely, i.e.,
with probability one. We provide a fixpoint characterization of the winning
sets and associated winning strategies in the case where the arena satisfies
the finite-attractor property. From this we directly deduce the decidability of
these games on probabilistic lossy channel systems. Comment: In Proceedings QAPL 2013, arXiv:1306.241
Computing Distances between Probabilistic Automata
We present relaxed notions of simulation and bisimulation on Probabilistic
Automata (PA), that allow some error epsilon. When epsilon is zero we retrieve
the usual notions of bisimulation and simulation on PAs. We give logical
characterisations of these notions by choosing suitable logics which differ
from the elementary ones, L with negation and L without negation, by the modal
operator. Using flow networks, we show how to compute the relations in PTIME.
This allows the definition of an efficiently computable non-discounted distance
between the states of a PA. A natural modification of this distance is
introduced, to obtain a discounted distance, which weakens the influence of
long term transitions. We compare our notions of distance to others previously
defined and illustrate our approach on various examples. We also show that our
distance is not expansive with respect to process algebra operators. Although L
without negation is a suitable logic to characterise epsilon-(bi)simulation on
deterministic PAs, it is not for general PAs; interestingly, we prove that it
does characterise weaker notions, called a priori epsilon-(bi)simulation, which
we prove to be NP-difficult to decide. Comment: In Proceedings QAPL 2011, arXiv:1107.074
Comparing Path Dependence and Spatial Targeting of Land Use in Implementing Climate Change Responses
Peer reviewed. Publisher PDF
Approximating the termination value of one-counter MDPs and stochastic games
One-counter MDPs (OC-MDPs) and one-counter simple stochastic games (OC-SSGs) are 1-player, and 2-player turn-based zero-sum, stochastic games played on the transition graph of classic one-counter automata (equivalently, pushdown automata with a 1-letter stack alphabet). A key objective for the analysis and verification of these games is the termination objective, where the players aim to maximize (minimize, respectively) the probability of hitting counter value 0, starting at a given control state and given counter value. Recently, we studied qualitative decision problems ("is the optimal termination value equal to 1?") for OC-MDPs (and OC-SSGs) and showed them to be decidable in polynomial time (in NP intersection coNP, respectively). However, quantitative decision and approximation problems ("is the optimal termination value at least p", or "approximate the termination value within epsilon") are far more challenging. This is so in part because optimal strategies may not exist, and because even when they do exist they can have a highly non-trivial structure. It thus remained open even whether any of these quantitative termination problems are computable. In this paper we show that all quantitative approximation problems for the termination value for OC-MDPs and OC-SSGs are computable. Specifically, given an OC-SSG, and given epsilon>0, we can compute a value v that approximates the value of the OC-SSG termination game within additive error epsilon, and furthermore we can compute epsilon-optimal strategies for both players in the game. A key ingredient in our proofs is a subtle martingale, derived from solving certain linear programs that we can associate with a maximizing OC-MDP. An application of Azuma's inequality on these martingales yields a computable bound for the "wealth" at which a "rich person's strategy" becomes epsilon-optimal for OC-MDPs
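As a rough illustration of the termination objective described above, the following sketch under-approximates the maximal probability of reaching counter value 0 in a toy one-counter MDP. This is not the paper's algorithm, which relies on linear programs and a martingale bound to determine how large a counter value must be considered. The model here is a hypothetical simplification: a single control state, where each action moves the counter down by 1 with some probability and up by 1 otherwise, and every counter value above `cap` is treated as never terminating.

```python
def approx_termination_values(down_probs, cap, sweeps):
    """Lower bounds on the maximal termination probability for counters 0..cap.

    down_probs: one entry per (hypothetical) action, giving the probability
                that the counter drops by 1; otherwise it rises by 1.
    cap:        counter values above cap are pessimistically given value 0.
    sweeps:     number of value-iteration sweeps, started from below.
    """
    values = [0.0] * (cap + 2)  # index cap + 1 stays 0: the truncation boundary
    values[0] = 1.0             # counter value 0 means the run has terminated
    for _ in range(sweeps):
        new_values = values[:]
        for n in range(1, cap + 1):
            # The maximizing player picks the best action at each counter value.
            new_values[n] = max(
                d * values[n - 1] + (1.0 - d) * values[n + 1] for d in down_probs
            )
        values = new_values
    return values[: cap + 1]


if __name__ == "__main__":
    vals = approx_termination_values([0.4, 0.25], cap=60, sweeps=5000)
    print(vals[1], vals[2])
```

For this toy model the exact value is known in closed form: with best down-probability d < 1/2, the termination probability from counter value n is (d / (1 - d))^n, so the printed numbers should be close to 2/3 and 4/9. The truncation gives no a priori error guarantee in general, and supplying such a bound is the kind of result the abstract describes.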