Optimal Control for Generalized Network-Flow Problems
We consider the problem of throughput-optimal packet dissemination, in the presence of an arbitrary mix of unicast, broadcast, multicast, and anycast traffic, in an arbitrary wireless network. We propose an online dynamic policy, called Universal Max-Weight (UMW), which solves the problem efficiently. To the best of our knowledge, UMW is the first known throughput-optimal policy of such versatility in the context of generalized network flow problems. Conceptually, the UMW policy is derived by relaxing the precedence constraints associated with multi-hop routing and then solving a min-cost routing and max-weight scheduling problem on a virtual network of queues. When specialized to the unicast setting, the UMW policy yields a throughput-optimal cycle-free routing and link scheduling policy. This is in contrast with the well-known throughput-optimal back-pressure (BP) policy, which allows for packet cycling, resulting in excessive latency. Extensive simulation results show that the proposed UMW policy incurs a substantially smaller delay as compared with the BP policy. The proof of throughput-optimality of the UMW policy combines ideas from the stochastic Lyapunov theory with a sample path argument from adversarial queueing theory and may be of independent theoretical interest.
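The max-weight scheduling step mentioned in this abstract can be illustrated with a minimal sketch. This is the generic textbook max-weight rule, not the UMW policy itself: the function name `max_weight_schedule`, the link labels, and the explicit list of feasible activation sets are all assumptions made for illustration, and the abstract does not specify how the virtual queues are maintained.

```python
def max_weight_schedule(queue_lengths, link_rates, feasible_activations):
    """Pick the feasible set of links with the largest total weight.

    queue_lengths: dict link -> current (virtual) queue length (hypothetical input)
    link_rates: dict link -> service rate when the link is activated
    feasible_activations: iterable of tuples of links that may be active together
    """
    def weight(activation):
        # Weight of an activation = sum over its links of queue length * rate.
        return sum(queue_lengths[link] * link_rates[link] for link in activation)

    return max(feasible_activations, key=weight)


# Toy instance: link e1 interferes with e2 and e3, which can run together.
queues = {"e1": 5, "e2": 2, "e3": 4}
rates = {"e1": 1, "e2": 1, "e3": 1}
activations = [("e1",), ("e2", "e3")]
chosen = max_weight_schedule(queues, rates, activations)
```

Enumerating activation sets explicitly only works for tiny networks; in general the maximization is a combinatorial problem over the interference constraints.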
Selfish Optimization in Computer Networks
This paper describes two applications of decentralized (Pareto) optimization to problems of computer communication networks. The first application is to develop a generalized principle for optimality of multi-hop broadcast channel access schemes. The second application is to decentralized flow-control in fixed virtual-circuit networks (e.g., SNA) using power maximization as the performance index. The decentralized approach to optimum network behavior yields, among other results, characterization of fair global objective functions, and optimal decentralized greedy network control algorithms. The main conclusion of this paper is that Pareto-optimality methods can be successfully used to develop optimal decentralized behavior algorithms where a centralized approach is (sometimes provably) not applicable.
Reinforcement Learning with Non-Cumulative Objective
In reinforcement learning, the objective is almost always defined as a
cumulative function over the rewards along the process. However, there
are many optimal control and reinforcement learning problems in various
application fields, especially in communications and networking, where the
objectives are not naturally expressed as summations of the rewards. In this
paper, we recognize the prevalence of non-cumulative objectives in various
problems, and propose a modification to existing algorithms for optimizing such
objectives. Specifically, we dive into the fundamental building block for many
optimal control and reinforcement learning algorithms: the Bellman optimality
equation. To optimize a non-cumulative objective, we replace the original
summation operation in the Bellman update rule with a generalized operation
corresponding to the objective. Furthermore, we provide sufficient conditions
on the form of the generalized operation as well as assumptions on the Markov
decision process under which the globally optimal convergence of the
generalized Bellman updates can be guaranteed. We demonstrate the idea
experimentally with the bottleneck objective, i.e., the objectives determined
by the minimum reward along the process, on classical optimal control and
reinforcement learning tasks, as well as on two network routing problems on
maximizing the flow rates.
Comment: 13 pages, 6 figures. To appear in IEEE Transactions on Machine
Learning in Communications and Networking (TMLCN)
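As a rough illustration of the generalized Bellman update described in this abstract, the sketch below replaces the usual sum with a `min` for the bottleneck objective on a small deterministic routing graph. It is a simplified sketch, not the paper's algorithm: the function name `bottleneck_value_iteration`, the graph, and the choice of a deterministic, undiscounted setting are assumptions made for illustration.

```python
import math


def bottleneck_value_iteration(edges, target):
    """Value iteration with min in place of the sum in the Bellman update.

    edges: dict node -> list of (next_node, capacity); capacity plays the role of reward.
    Returns, per node, the best achievable minimum capacity on a path to target.
    """
    nodes = set(edges) | {nxt for outs in edges.values() for nxt, _ in outs} | {target}
    value = {node: 0.0 for node in nodes}
    value[target] = math.inf  # no further bottleneck once the target is reached

    for _ in range(len(nodes)):
        updated = dict(value)
        for node, outs in edges.items():
            if node == target:
                continue
            # Generalized update: V(s) = max_a min(r(s, a), V(s')).
            updated[node] = max(
                (min(cap, value[nxt]) for nxt, cap in outs), default=0.0
            )
        if updated == value:  # fixed point reached
            break
        value = updated
    return value


# Toy routing graph: two paths from s to t with bottlenecks 2.0 and 3.0.
graph = {
    "s": [("a", 5.0), ("b", 3.0)],
    "a": [("t", 2.0)],
    "b": [("t", 4.0)],
}
values = bottleneck_value_iteration(graph, "t")
```

Here the path through `b` is preferred even though its first link is narrower, because its minimum capacity along the whole path is larger.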
Optimal pricing control in distribution networks with time-varying supply and demand
This paper studies the problem of optimal flow control in dynamic inventory
systems. A dynamic optimal distribution problem, including time-varying supply
and demand, capacity constraints on the transportation lines, and convex flow
cost functions of Legendre-type, is formalized and solved. The time-varying
optimal flow is characterized in terms of the time-varying dual variables of a
corresponding network optimization problem. A dynamic feedback controller is
proposed that regulates the flows asymptotically to the optimal flows and
achieves in addition a balancing of all storage levels.
Comment: Submitted to 21st International Symposium on Mathematical Theory of
Networks and Systems (MTNS) in December 201