Learning Algorithms for Minimizing Queue Length Regret
We consider a system consisting of a single transmitter/receiver pair and
channels over which they may communicate. Packets randomly arrive to the
transmitter's queue and wait to be successfully sent to the receiver. The
transmitter may attempt a frame transmission on one channel at a time, where
each frame includes a packet if one is in the queue. For each channel, an
attempted transmission is successful with an unknown probability. The
transmitter's objective is to quickly identify the best channel to minimize the
number of packets in the queue over time slots. To analyze system
performance, we introduce queue length regret, which is the expected difference
between the total queue length of a learning policy and a controller that knows
the rates a priori. One approach to designing a transmission policy would be
to apply algorithms from the literature that solve the closely-related
stochastic multi-armed bandit problem. These policies would focus on maximizing
the number of successful frame transmissions over time. However, we show that
these methods have queue length regret. On the other hand, we
show that there exists a set of queue-length based policies that can obtain
order optimal queue length regret. We use our theoretical analysis to
devise heuristic methods that are shown to perform well in simulation.
Comment: 28 pages, 11 figures
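The queue length regret described in this abstract can be estimated numerically. The following is a minimal sketch, not the authors' method: it assumes Bernoulli arrivals with rate `lam`, Bernoulli channel successes with probabilities `mu`, an arrival-before-service order within each slot, and UCB1 as a stand-in for a standard bandit policy. All function names are hypothetical.

```python
import math
import random


def simulate_queue(choose_channel, update, mu, lam, T, rng):
    """Return the sum of queue lengths over T slots for one sample path."""
    q = 0
    total = 0
    for t in range(1, T + 1):
        # Assumption: at most one packet arrives per slot, before service.
        if rng.random() < lam:
            q += 1
        ch = choose_channel(t)
        # A frame is sent every slot, so the policy learns even when q == 0.
        success = rng.random() < mu[ch]
        update(ch, success)
        if success and q > 0:
            q -= 1
        total += q
    return total


def make_ucb1(num_channels):
    """UCB1 policy (illustrative stand-in for a bandit algorithm)."""
    counts = [0] * num_channels
    wins = [0] * num_channels

    def choose(t):
        # Try every channel once before using the index.
        for ch in range(num_channels):
            if counts[ch] == 0:
                return ch
        return max(
            range(num_channels),
            key=lambda ch: wins[ch] / counts[ch]
            + math.sqrt(2 * math.log(t) / counts[ch]),
        )

    def update(ch, success):
        counts[ch] += 1
        wins[ch] += int(success)

    return choose, update


def make_oracle(mu):
    """Controller that knows the rates and always uses the best channel."""
    best = max(range(len(mu)), key=lambda ch: mu[ch])
    return (lambda t: best), (lambda ch, success: None)


def estimate_queue_length_regret(mu, lam, T, runs, seed=0):
    """Average difference in total queue length: learner minus oracle."""
    rng = random.Random(seed)
    diff = 0
    for _ in range(runs):
        choose, update = make_ucb1(len(mu))
        learner = simulate_queue(choose, update, mu, lam, T, rng)
        choose, update = make_oracle(mu)
        oracle = simulate_queue(choose, update, mu, lam, T, rng)
        diff += learner - oracle
    return diff / runs
```

For example, `estimate_queue_length_regret([0.5, 0.9], 0.4, 2000, 50)` returns a Monte Carlo estimate for two channels. Because the learner and the oracle use independent sample paths here, the estimate is noisy and may even be negative for a small number of runs.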
Minimizing Queue Length Regret Under Adversarial Network Models
Stochastic models have been dominant in network optimization theory for over two decades, due to their analytical tractability. However, these models fail to capture non-stationary or even adversarial network dynamics, which are of increasing importance for modeling the behavior of networks under malicious attacks or characterizing short-term transient behavior. In this paper, we focus on minimizing queue length regret under adversarial network models, which measures the finite-time queue length difference between a causal policy and an “oracle” that knows the future. Two adversarial network models are developed to characterize the adversary’s behavior. We provide lower bounds on queue length regret under these adversary models and analyze the performance of two control policies (i.e., the MaxWeight policy and the Tracking Algorithm). We further characterize the stability region under adversarial network models, and show that both the MaxWeight policy and the Tracking Algorithm are throughput-optimal even in adversarial settings.
National Science Foundation (U.S.) (Grant CNS-1524317); United States. Defense Advanced Research Projects Agency. Information Innovation Office (Contract HR0011-15-C-0097)
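As a rough illustration of the MaxWeight policy named in this abstract, the sketch below uses the textbook single-hop form: in each slot, pick the feasible service vector that maximizes the queue-weighted service. The abstract does not specify the network model, so the set of feasible service vectors, the queue update rule, and the function names are assumptions made for the example.

```python
def max_weight(queues, feasible_service_vectors):
    """Pick the service vector maximizing sum_i queues[i] * service[i]."""
    return max(
        feasible_service_vectors,
        key=lambda s: sum(q * r for q, r in zip(queues, s)),
    )


def step(queues, arrivals, service):
    """Assumed update: serve first (never below zero), then add arrivals."""
    return [max(q - s, 0) + a for q, s, a in zip(queues, service, arrivals)]
```

With queues `[3, 1]` and feasible vectors `[(1, 0), (0, 1)]`, `max_weight` selects `(1, 0)`, serving the longer queue. Ties go to the first maximizing vector in the list.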