24,893 research outputs found
Risk Aversion in Finite Markov Decision Processes Using Total Cost Criteria and Average Value at Risk
In this paper we present an algorithm to compute risk averse policies in
Markov Decision Processes (MDP) when the total cost criterion is used together
with the average value at risk (AVaR) metric. Risk averse policies are needed
when large deviations from the expected behavior may have detrimental effects,
and conventional MDP algorithms usually ignore this aspect. We provide
conditions for the structure of the underlying MDP ensuring that approximations
for the exact problem can be derived and solved efficiently. Our findings are
novel inasmuch as average value at risk has not previously been considered in
association with the total cost criterion. Our method is demonstrated in a
rapid deployment scenario, whereby a robot is tasked with the objective of
reaching a target location within a temporal deadline where increased speed is
associated with increased probability of failure. We demonstrate that the
proposed algorithm not only produces a risk averse policy reducing the
probability of exceeding the expected temporal deadline, but also provides the
statistical distribution of costs, thus offering a valuable analysis tool
Sensor Scheduling for Energy-Efficient Target Tracking in Sensor Networks
In this paper we study the problem of tracking an object moving randomly
through a network of wireless sensors. Our objective is to devise strategies
for scheduling the sensors to optimize the tradeoff between tracking
performance and energy consumption. We cast the scheduling problem as a
Partially Observable Markov Decision Process (POMDP), where the control actions
correspond to the set of sensors to activate at each time step. Using a
bottom-up approach, we consider different sensing, motion and cost models with
increasing levels of difficulty. At the first level, the sensing regions of the
different sensors do not overlap and the target is only observed within the
sensing range of an active sensor. Then, we consider sensors with overlapping
sensing range such that the tracking error, and hence the actions of the
different sensors, are tightly coupled. Finally, we consider scenarios wherein
the target locations and sensors' observations assume values on continuous
spaces. Exact solutions are generally intractable even for the simplest models
due to the dimensionality of the information and action spaces. Hence, we
devise approximate solution techniques, and in some cases derive lower bounds
on the optimal tradeoff curves. The generated scheduling policies, albeit
suboptimal, often provide close-to-optimal energy-tracking tradeoffs
Non-Zero Sum Games for Reactive Synthesis
In this invited contribution, we summarize new solution concepts useful for
the synthesis of reactive systems that we have introduced in several recent
publications. These solution concepts are developed in the context of non-zero
sum games played on graphs. They are part of the contributions obtained in the
inVEST project funded by the European Research Council.Comment: LATA'16 invited pape
- …