Optimal admission policies for small star networks
In this thesis, stationary admission policies for small Symmetric Star telecommunication networks with two types of calls requesting access are considered. Arrivals form independent Poisson streams on each route, routing is fixed, and call holding times are exponentially distributed. Rewards are earned for carrying calls and future returns are discounted at a fixed rate. The operation of the network is viewed as a Markov Decision Process, and we solve the optimality equation for this network model numerically for a range of small examples using the policy improvement algorithm of Dynamic Programming. The optimal policies we study accept or reject traffic requests so as to maximise the Total Expected Discounted Reward. Our Star networks are in some respects the simplest networks more complex than single links in isolation, but even so only very small examples can be treated numerically. From those examples we find evidence suggesting that, despite their complexity, optimal policies have some interesting properties. Admission Price policies are also investigated in this thesis. These policies are not optimal, but they are believed to be asymptotically optimal for large networks. We investigate whether such policies perform well for small networks and suggest that they do. A reduced state-space model is also considered, in which a call on a 2-link route, once accepted, is split into two independent calls on the links involved; this greatly reduces the size of the state space. We present properties of the optimal policies and the Admission Price policies and conclude that they are very good for the examples considered. Finally, we look at Asymmetric Star networks with different numbers of circuits per link and different exponential holding times, and investigate properties of the optimal policies as well as Admission Price policies for such networks.
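The policy improvement (policy iteration) algorithm the thesis applies can be sketched in a few lines on a generic finite MDP. The two-state model below is purely illustrative (it is not the star-network model), and the discount factor, transitions, and rewards are made-up values.

```python
# Policy iteration sketch on a tiny illustrative MDP (NOT the star-network
# model from the thesis; states, transitions and rewards are made up).

GAMMA = 0.9  # fixed discount rate

# P[s][a] = list of (next_state, probability); R[s][a] = immediate reward
P = {
    0: {0: [(0, 1.0)],           1: [(1, 1.0)]},
    1: {0: [(0, 0.5), (1, 0.5)], 1: [(1, 1.0)]},
}
R = {0: {0: 0.0, 1: 1.0}, 1: {0: 2.0, 1: 0.0}}

def evaluate(policy, sweeps=500):
    """Iterative policy evaluation: V(s) = R(s, pi(s)) + gamma * E[V(s')]."""
    V = {s: 0.0 for s in P}
    for _ in range(sweeps):
        V = {s: R[s][policy[s]]
                + GAMMA * sum(p * V[t] for t, p in P[s][policy[s]])
             for s in P}
    return V

def policy_iteration():
    """Alternate evaluation and greedy improvement until the policy is stable."""
    policy = {s: 0 for s in P}
    while True:
        V = evaluate(policy)
        improved = {s: max(P[s], key=lambda a: R[s][a]
                           + GAMMA * sum(p * V[t] for t, p in P[s][a]))
                    for s in P}
        if improved == policy:
            return policy, V
        policy = improved
```

Exact policy evaluation would solve a linear system instead of sweeping; the fixed-point sweep keeps the sketch dependency-free.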
A High Reliability Asymptotic Approach for Packet Inter-Delivery Time Optimization in Cyber-Physical Systems
In cyber-physical systems such as automobiles, measurement data from sensor
nodes should be delivered to other consumer nodes such as actuators in a
regular fashion. But, in practical systems over unreliable media such as
wireless, it is a significant challenge to guarantee small enough
inter-delivery times for different clients with heterogeneous channel
conditions and inter-delivery requirements. In this paper, we design scheduling
policies aiming at satisfying the inter-delivery requirements of such clients.
We formulate the problem as a risk-sensitive Markov Decision Process (MDP).
Although the resulting problem involves an infinite state space, we first prove
that there is an equivalent MDP involving only a finite number of states. Then
we prove the existence of a stationary optimal policy and establish an
algorithm to compute it in a finite number of steps.
However, the bane of this and many similar problems is the resulting
complexity, and, in an attempt to make fundamental progress, we further propose
a new high reliability asymptotic approach. In essence, this approach considers
the scenario when the channel failure probabilities for different clients are
of the same order, and asymptotically approach zero. We thus proceed to
determine the asymptotically optimal policy: in a two-client scenario, we show
that the asymptotically optimal policy is a "modified least time-to-go" policy,
which is intuitively appealing and easily implementable; in the general
multi-client scenario, we are led to an SN policy, and we develop an algorithm
of low computational complexity to obtain it. Simulation results show that the
resulting policies perform well even in the pre-asymptotic regime with moderate
failure probabilities.
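The least time-to-go idea behind the asymptotically optimal two-client policy can be illustrated in a slotted model where one client is served per slot and client i must receive a packet at least every d[i] slots. The paper's "modified" rule adds refinements not reproduced here; this is a sketch of the base rule only.

```python
# Plain least time-to-go scheduling sketch (the "modified" variant in the
# paper adds further structure; this illustrates only the base rule).

def least_time_to_go(deadlines, elapsed):
    """Serve the client whose inter-delivery deadline is nearest.

    deadlines[i]: required inter-delivery time of client i, in slots
    elapsed[i]:   slots since client i's last successful delivery
    """
    return min(range(len(deadlines)),
               key=lambda i: deadlines[i] - elapsed[i])
```

For deadlines (5, 3) with elapsed times (1, 2), client 1 has only one slot of slack left and is served first.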
Minimizing the impact of EV charging on the electricity distribution network
The main objective of this paper is to design electric vehicle (EV) charging
policies which minimize the impact of charging on the electricity distribution
network (DN). More precisely, the considered cost function results from a
linear combination of two parts: a cost with memory and a memoryless cost. In
this paper, the first component is identified to be the transformer ageing
while the second one corresponds to distribution Joule losses. First, we
formulate the problem as a non-trivial discrete-time optimal control problem
with finite time horizon. It is non-trivial because of the presence of
saturation constraints and a non-quadratic cost. It turns out that the system
state, which is the transformer hot-spot (HS) temperature here, can be
expressed as a function of the sequence of control variables; the cost function
is then seen to be convex in the control for typical values for the model
parameters. The problem of interest thus becomes a standard optimization
problem. While the corresponding problem can be solved by using available
numerical routines, three distributed charging policies are provided. The
motivation is threefold: to decrease the computational complexity; to model the
important scenario where the charging profile is chosen by the EV itself; to
circumvent the allocation problem which arises with the proposed formulation.
Remarkably, the performance loss induced by decentralization is verified to be
small through simulations. Numerical results show the importance of the choice
of the charging policies. For instance, the gain in terms of transformer
lifetime can be very significant when implementing advanced charging policies
instead of plug-and-charge policies. The impact of the accuracy of the non-EV
demand forecasting is equally assessed.
Comment: 6 pages, 3 figures; keywords: electric vehicle charging, electricity distribution network, optimal control, distributed policies, game theory
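Minimizing the distribution Joule losses alone reduces to a classic valley-filling problem: charge most in the slots with the lowest non-EV load. The sketch below solves the convex program min over x of the sum of (x_t + b_t)^2 subject to a total energy requirement and per-slot power limits, by bisection on the water level. The transformer-ageing (hot-spot temperature) cost from the paper is deliberately omitted, and all symbols here are illustrative.

```python
# Valley-filling sketch for EV charging: minimize sum_t (x_t + b_t)^2
# s.t. sum_t x_t = energy and 0 <= x_t <= p_max, where b_t is the non-EV
# load. Omits the transformer-ageing cost from the paper; assumes the
# energy need is feasible, i.e. energy <= len(non_ev) * p_max.

def charge_profile(non_ev, energy, p_max, tol=1e-9):
    """Bisect on the water level mu so that x_t = clip(mu - b_t, 0, p_max)
    delivers exactly `energy` in total."""
    def total(mu):
        return sum(min(max(mu - b, 0.0), p_max) for b in non_ev)
    lo, hi = min(non_ev), max(non_ev) + p_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if total(mid) < energy:
            lo = mid
        else:
            hi = mid
    mu = 0.5 * (lo + hi)
    return [min(max(mu - b, 0.0), p_max) for b in non_ev]
```

With non-EV load (3, 1, 2), an energy need of 2, and a power limit of 2, all charging lands in the two least-loaded slots.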
Near-Optimal Sample Complexity Bounds for Constrained MDPs
In contrast to the advances in characterizing the sample complexity for
solving Markov decision processes (MDPs), the optimal statistical complexity
for solving constrained MDPs (CMDPs) remains unknown. We resolve this question
by providing minimax upper and lower bounds on the sample complexity for
learning near-optimal policies in a discounted CMDP with access to a generative
model (simulator). In particular, we design a model-based algorithm that
addresses two settings: (i) relaxed feasibility, where small constraint
violations are allowed, and (ii) strict feasibility, where the output policy is
required to satisfy the constraint. For (i), we prove that our algorithm
returns an ε-optimal policy with probability 1 − δ, by making
Õ(SA log(1/δ) / ((1 − γ)³ ε²)) queries to the generative model, thus matching
the sample complexity for unconstrained MDPs. For (ii), we show that the
algorithm's sample complexity is upper-bounded by
Õ(SA log(1/δ) / ((1 − γ)⁵ ε² ζ²)), where ζ is the problem-dependent Slater
constant that characterizes the size of the feasible region. Finally, we prove
a matching lower-bound for the strict feasibility setting, thus obtaining the
first near minimax optimal bounds for discounted CMDPs. Our results show that
learning CMDPs is as easy as MDPs when small constraint violations are allowed,
but inherently more difficult when we demand zero constraint violation.
Comment: NeurIPS'22
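The generative-model (simulator) access pattern that model-based algorithms of this kind rely on can be sketched as follows; the CMDP planning step that consumes the empirical model is abstracted away, and the function names are illustrative.

```python
# Sketch of generative-model access: query the simulator N times per
# state-action pair to build an empirical transition model, then plan
# in that model (the constrained planning step is omitted here).

from collections import Counter

def empirical_model(simulator, states, actions, n_queries):
    """simulator(s, a) -> one sampled next state.
    Returns P_hat[s][a] = {next_state: empirical probability}."""
    P_hat = {s: {} for s in states}
    for s in states:
        for a in actions:
            counts = Counter(simulator(s, a) for _ in range(n_queries))
            P_hat[s][a] = {t: c / n_queries for t, c in counts.items()}
    return P_hat
```

With a deterministic simulator the empirical model recovers the transition kernel exactly; with a stochastic one, the per-pair query count n_queries controls the estimation error.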