
    Optimal admission policies for small star networks

    In this thesis, stationary admission policies are considered for small Symmetric Star telecommunication networks in which two types of calls request access. Arrivals form independent Poisson streams on each route, the routing is fixed, and call holding times are exponentially distributed. Rewards are earned for carrying calls and future returns are discounted at a fixed rate. The operation of the network is viewed as a Markov Decision Process, and we solve the optimality equation for this network model numerically for a range of small examples using the policy improvement algorithm of Dynamic Programming. The optimal policies we study accept or reject traffic requests so as to maximise the Total Expected Discounted Reward. Our Star networks are in some respects the simplest networks more complex than single links in isolation, but even so only very small examples can be treated numerically. From those examples we find evidence suggesting that, despite their complexity, optimal policies have some interesting properties. Admission Price policies are also investigated in this thesis. These policies are not optimal, but they are believed to be asymptotically optimal for large networks; we investigate whether such policies perform well for small networks and suggest that they do. A reduced state-space model is also considered, in which a call on a 2-link route, once accepted, is split into two independent calls on the links involved. This greatly reduces the size of the state space. We present properties of the optimal policies and the Admission Price policies and conclude that they are very good for the examples considered. Finally, we look at Asymmetric Star networks with different numbers of circuits per link and different exponential holding times, and investigate properties of the optimal policies as well as Admission Price policies for such networks.
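The policy improvement loop used in the thesis can be sketched on the simplest relative of these models: a single link with C circuits and two call classes competing for admission. This is a hedged toy, not the thesis's star-network model; all parameters (rates, rewards, discount) are illustrative.

```python
# Policy iteration for a toy admission-control MDP: one link, C circuits,
# two Poisson call classes with different rewards, exponential holding times,
# discounted rewards. Uniformization turns the continuous-time chain into a
# discrete-time contraction. Illustrative parameters, not from the thesis.
C = 5              # circuits on the link
lam = (1.0, 1.0)   # Poisson arrival rates for classes 1 and 2
mu = 1.0           # service rate (mean holding time 1/mu)
r = (10.0, 1.0)    # reward earned per accepted call
delta = 0.1        # continuous-time discount rate
Lam = lam[0] + lam[1] + C * mu   # uniformization rate

def evaluate(policy, tol=1e-10):
    """Discounted value of a fixed admission policy, by successive approximation."""
    V = [0.0] * (C + 1)
    while True:
        Vn = []
        for n in range(C + 1):      # n = calls currently in progress
            total = 0.0
            for k in (0, 1):        # arrival of a class-k call
                if n < C and policy[k][n]:
                    total += lam[k] * (r[k] + V[n + 1])
                else:
                    total += lam[k] * V[n]
            if n > 0:
                total += n * mu * V[n - 1]      # a call completes
            total += (C - n) * mu * V[n]        # fictitious uniformization jumps
            Vn.append(total / (Lam + delta))
        if max(abs(a - b) for a, b in zip(V, Vn)) < tol:
            return Vn
        V = Vn

def improve(V):
    """Greedy step: accept class k in state n iff r_k + V(n+1) > V(n)."""
    return tuple([n < C and r[k] + V[n + 1] > V[n] for n in range(C + 1)]
                 for k in (0, 1))

policy = tuple([n < C for n in range(C + 1)] for _ in range(2))  # start: accept all
for _ in range(50):    # policy iteration converges in a few sweeps here
    V = evaluate(policy)
    new = improve(V)
    if new == policy:
        break
    policy = new
```

Because the greedy test compares each class's reward with the same marginal value V(n) - V(n+1), any state where the cheap class is accepted must also accept the expensive class, which is the threshold-like structure the thesis's small examples exhibit.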

    A High Reliability Asymptotic Approach for Packet Inter-Delivery Time Optimization in Cyber-Physical Systems

    In cyber-physical systems such as automobiles, measurement data from sensor nodes should be delivered to other consumer nodes, such as actuators, in a regular fashion. However, in practical systems over unreliable media such as wireless, it is a significant challenge to guarantee small enough inter-delivery times for different clients with heterogeneous channel conditions and inter-delivery requirements. In this paper, we design scheduling policies aiming at satisfying the inter-delivery requirements of such clients. We formulate the problem as a risk-sensitive Markov Decision Process (MDP). Although the resulting problem involves an infinite state space, we first prove that there is an equivalent MDP involving only a finite number of states. Then we prove the existence of a stationary optimal policy and establish an algorithm to compute it in a finite number of steps. However, the bane of this and many similar problems is the resulting complexity, and, in an attempt to make fundamental progress, we further propose a new high reliability asymptotic approach. In essence, this approach considers the scenario where the channel failure probabilities for different clients are of the same order and asymptotically approach zero. We thus proceed to determine the asymptotically optimal policy: in a two-client scenario, we show that the asymptotically optimal policy is a "modified least time-to-go" policy, which is intuitively appealing and easily implementable; in the general multi-client scenario, we are led to an SN policy, and we develop an algorithm of low computational complexity to obtain it. Simulation results show that the resulting policies perform well even in the pre-asymptotic regime with moderate failure probabilities.
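The "least time-to-go" idea can be illustrated with a small simulation: each client tracks the slots remaining until its inter-delivery requirement is breached, and the scheduler serves the client closest to a breach. This is a simplified toy, not the paper's exact modified policy or its SN generalization; the requirements and failure probabilities below are made up.

```python
# Toy slotted simulation: one transmission per slot over unreliable channels,
# serving the client whose inter-delivery deadline is nearest (least time-to-go).
import random

def simulate(tau, eps, horizon=10_000, seed=0):
    """tau[i]: inter-delivery requirement in slots; eps[i]: channel failure prob."""
    rng = random.Random(seed)
    since = [0] * len(tau)        # slots since client i's last successful delivery
    violations = [0] * len(tau)   # slots spent past the requirement
    for _ in range(horizon):
        # serve the client with least time-to-go, i.e. smallest tau[i] - since[i]
        i = min(range(len(tau)), key=lambda k: tau[k] - since[k])
        success = rng.random() >= eps[i]
        for k in range(len(tau)):
            since[k] += 1
        if success:
            since[i] = 0          # delivery resets the inter-delivery clock
        for k in range(len(tau)):
            if since[k] > tau[k]:
                violations[k] += 1
    return violations

v = simulate(tau=(5, 8), eps=(0.1, 0.3))
```

In the paper's high-reliability regime the failure probabilities eps would be scaled jointly toward zero; here they are fixed at moderate values, matching the pre-asymptotic setting of the simulations the abstract mentions.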

    Minimizing the impact of EV charging on the electricity distribution network

    The main objective of this paper is to design electric vehicle (EV) charging policies which minimize the impact of charging on the electricity distribution network (DN). More precisely, the considered cost function is a linear combination of two parts: a cost with memory and a memoryless cost. Here, the first component is identified with transformer ageing while the second corresponds to distribution Joule losses. First, we formulate the problem as a non-trivial discrete-time optimal control problem with a finite time horizon; it is non-trivial because of the presence of saturation constraints and a non-quadratic cost. It turns out that the system state, which here is the transformer hot-spot (HS) temperature, can be expressed as a function of the sequence of control variables; the cost function is then seen to be convex in the control for typical values of the model parameters. The problem of interest thus becomes a standard optimization problem. While this problem can be solved using available numerical routines, three distributed charging policies are also provided. The motivation is threefold: to decrease the computational complexity; to model the important scenario where the charging profile is chosen by the EV itself; and to circumvent the allocation problem which arises with the proposed formulation. Remarkably, the performance loss induced by decentralization is verified through simulations to be small. Numerical results show the importance of the choice of charging policy. For instance, the gain in terms of transformer lifetime can be very significant when implementing advanced charging policies instead of plug-and-charge policies. The impact of the accuracy of the non-EV demand forecasting is assessed as well.
    Comment: 6 pages, 3 figures; keywords: electric vehicle charging, electricity distribution network, optimal control, distributed policies, game theory
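The structure described above, a memoryless quadratic loss plus an ageing cost driven by a temperature state that is affine in the controls, with saturation constraints, can be sketched as a small convex program solved by projected gradient descent. All dynamics, constants, and the soft energy-target penalty below are illustrative placeholders, not the paper's model.

```python
# Convex sketch: choose a charging profile u over T slots to trade off Joule
# losses (memoryless, quadratic in total load) against transformer ageing
# (cost with memory, exponential in the hot-spot temperature state).
import math

T = 24
d = [0.5 + 0.4 * math.sin(2 * math.pi * t / T) for t in range(T)]  # non-EV load
u_max, E = 1.0, 6.0       # per-slot charging cap and total energy target
a, b = 0.9, 0.3           # hot-spot dynamics: x_{t+1} = a*x_t + b*(d_t + u_t)
gamma, theta = 0.05, 1.0  # ageing weight and temperature scale
kappa = 2.0               # soft penalty keeping total delivered energy near E

def temperatures(u):
    x, xs = 0.0, []
    for t in range(T):
        xs.append(x)
        x = a * x + b * (d[t] + u[t])
    return xs

def cost(u):
    xs = temperatures(u)
    joule = sum((d[t] + u[t]) ** 2 for t in range(T))       # memoryless part
    ageing = gamma * sum(math.exp(x / theta) for x in xs)   # part with memory
    return joule + ageing + kappa * (sum(u) - E) ** 2

def grad(u):
    xs = temperatures(u)
    g = []
    for s in range(T):
        gs = 2 * (d[s] + u[s]) + 2 * kappa * (sum(u) - E)
        # chain rule through the state: dx_t/du_s = b * a**(t - 1 - s) for t > s
        for t in range(s + 1, T):
            gs += gamma * math.exp(xs[t] / theta) / theta * b * a ** (t - 1 - s)
        g.append(gs)
    return g

u0 = [E / T] * T
u = u0[:]
for _ in range(500):   # projected gradient: the projection is clipping to [0, u_max]
    u = [min(u_max, max(0.0, ui - 0.001 * gi)) for ui, gi in zip(u, grad(u))]
```

Since the temperature is affine in u and the exponential is convex and increasing, the total cost is convex, which is what makes the saturation-constrained problem a standard optimization problem as the abstract notes.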

    Near-Optimal Sample Complexity Bounds for Constrained MDPs

    In contrast to the advances in characterizing the sample complexity of solving Markov decision processes (MDPs), the optimal statistical complexity of solving constrained MDPs (CMDPs) remains unknown. We resolve this question by providing minimax upper and lower bounds on the sample complexity for learning near-optimal policies in a discounted CMDP with access to a generative model (simulator). In particular, we design a model-based algorithm that addresses two settings: (i) relaxed feasibility, where small constraint violations are allowed, and (ii) strict feasibility, where the output policy is required to satisfy the constraint. For (i), we prove that our algorithm returns an $\epsilon$-optimal policy with probability $1 - \delta$ by making $\tilde{O}\left(\frac{S A \log(1/\delta)}{(1 - \gamma)^3 \epsilon^2}\right)$ queries to the generative model, thus matching the sample complexity for unconstrained MDPs. For (ii), we show that the algorithm's sample complexity is upper-bounded by $\tilde{O}\left(\frac{S A \log(1/\delta)}{(1 - \gamma)^5 \epsilon^2 \zeta^2}\right)$, where $\zeta$ is the problem-dependent Slater constant that characterizes the size of the feasible region. Finally, we prove a matching lower bound for the strict feasibility setting, thus obtaining the first near minimax optimal bounds for discounted CMDPs. Our results show that learning CMDPs is as easy as MDPs when small constraint violations are allowed, but inherently more difficult when we demand zero constraint violation.
    Comment: NeurIPS'2
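Ignoring constants and the logarithmic factors hidden by the $\tilde{O}$ notation, the gap between the two feasibility regimes can be made concrete by plugging in illustrative values; the parameter choices below are arbitrary.

```python
# Compare the two sample-complexity bounds (O-tilde constants and log factors
# dropped). The strict-feasibility bound carries an extra (1-gamma)^-2 zeta^-2.
import math

def relaxed_bound(S, A, delta, gamma, eps):
    # relaxed feasibility: S*A*log(1/delta) / ((1-gamma)^3 * eps^2)
    return S * A * math.log(1 / delta) / ((1 - gamma) ** 3 * eps ** 2)

def strict_bound(S, A, delta, gamma, eps, zeta):
    # strict feasibility: S*A*log(1/delta) / ((1-gamma)^5 * eps^2 * zeta^2)
    return S * A * math.log(1 / delta) / ((1 - gamma) ** 5 * eps ** 2 * zeta ** 2)

S, A, delta, gamma, eps, zeta = 100, 10, 0.05, 0.99, 0.1, 0.1
ratio = strict_bound(S, A, delta, gamma, eps, zeta) / relaxed_bound(S, A, delta, gamma, eps)
# the common factors cancel, so the ratio is 1 / ((1 - gamma)**2 * zeta**2)
```

With an effective horizon of 100 (gamma = 0.99) and a Slater constant of 0.1, demanding zero constraint violation inflates the bound by a factor of a million, which is the "inherently more difficult" gap the abstract describes.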