Fast-Convergent Learning-aided Control in Energy Harvesting Networks
In this paper, we present a novel learning-aided energy management scheme
for multihop energy harvesting networks. Different from prior
works on this problem, our algorithm explicitly incorporates information
learning into system control via a step called \emph{perturbed dual learning}.
The scheme does not require any statistical information of the system
dynamics for implementation, and efficiently resolves the challenging energy
outage problem. We show that the scheme achieves a near-optimal
utility-delay tradeoff with bounded energy buffers. More interestingly, it
possesses a \emph{convergence time} that is much faster than that of
pure queue-based techniques or of approaches
that rely purely on learning the system statistics. This fast convergence
property makes the scheme more adaptive and efficient in resource
allocation in dynamic environments. The design and analysis of the scheme
demonstrate how system control algorithms can be augmented by learning and what
the benefits are. The methodology and algorithm can also be applied to similar
problems, e.g., processing networks, where nodes require a nonzero amount of
content to support their actions.
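The energy outage problem mentioned above comes from a simple constraint: a node can only spend energy it has already harvested and stored. The following is a minimal sketch of that constraint alone, not the scheme from the paper. The function name, the harvest-after-spend ordering, and the finite `capacity` are all assumptions made for illustration.

```python
def simulate_energy_buffer(harvest, demand, capacity):
    """Track one node's energy buffer over time (hypothetical toy model).

    harvest[t]: energy harvested in slot t (assumed to arrive after spending).
    demand[t]:  energy the node would like to spend in slot t.
    capacity:   assumed finite buffer size.
    Returns the energy actually spent per slot and the final buffer level.
    """
    level = 0.0
    spent = []
    for h, d in zip(harvest, demand):
        # Energy-availability constraint: never spend more than is stored.
        use = min(d, level)
        # Store the new harvest, discarding anything beyond the capacity.
        level = min(level - use + h, capacity)
        spent.append(use)
    return spent, level
```

For example, `simulate_energy_buffer([2, 0, 3], [1, 1, 5], 4)` spends nothing in the first slot because the buffer starts empty, and spends only one unit in the last slot even though five are requested.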
When Backpressure Meets Predictive Scheduling
Motivated by the increasing popularity of learning and predicting human user
behavior in communication and computing systems, in this paper, we investigate
the fundamental benefit of predictive scheduling, i.e., predicting and
pre-serving arrivals, in controlled queueing systems. Based on a lookahead
window prediction model, we first establish a novel equivalence between the
predictive queueing system with a \emph{fully-efficient} scheduling scheme and
an equivalent queueing system without prediction. This connection allows us to
analytically demonstrate that predictive scheduling necessarily improves system
delay performance and can drive it to zero with increasing prediction power. We
then propose the \textsf{Predictive Backpressure (PBP)} algorithm for achieving
optimal utility performance in such predictive systems. \textsf{PBP}
efficiently incorporates prediction into stochastic system control and avoids
the great complication due to the exponential state space growth in the
prediction window size. We show that \textsf{PBP} can achieve a utility
performance that is arbitrarily close to the optimal,
while guaranteeing that the system delay distribution is a
\emph{shifted-to-the-left} version of that under the original Backpressure
algorithm. Hence, the average packet delay under \textsf{PBP} is strictly
better than that under Backpressure, and vanishes with increasing prediction
window size. This implies that the resulting utility-delay tradeoff with
predictive scheduling beats the known optimal tradeoff for systems without prediction.
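To make the idea of pre-serving arrivals concrete, here is a toy single-queue simulation. It is a sketch under strong assumptions (one FIFO queue, a fixed integer service rate, and perfect knowledge of arrivals inside the lookahead window) and it is not the \textsf{PBP} algorithm. The function name `average_delay` and its parameters are hypothetical.

```python
def average_delay(arrivals, rate, window):
    """Average packet delay in a toy FIFO queue with lookahead pre-service.

    arrivals[t]: number of packets arriving in slot t.
    rate:        packets the server can serve per slot (must be >= 1).
    window:      lookahead size; a packet arriving at slot a may be
                 pre-served in any slot t with a <= t + window.
    A packet served before or at its arrival slot has delay 0.
    """
    if rate < 1:
        raise ValueError("rate must be at least 1")
    arrival_times = [t for t, n in enumerate(arrivals) for _ in range(n)]
    delays = []
    nxt = 0  # index of the next unserved packet in FIFO order
    t = 0
    while nxt < len(arrival_times):
        served = 0
        while (served < rate and nxt < len(arrival_times)
               and arrival_times[nxt] <= t + window):
            delays.append(max(0, t - arrival_times[nxt]))
            nxt += 1
            served += 1
        t += 1
    return sum(delays) / len(delays) if delays else 0.0


if __name__ == "__main__":
    bursty = [0, 0, 3, 0, 0, 2]
    for w in (0, 1, 2):
        print(w, average_delay(bursty, rate=1, window=w))
```

In this toy trace the server is idle before each burst, so a larger window lets it use those idle slots and the average delay shrinks as the window grows.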
The Value-of-Information in Matching with Queues
We consider the problem of \emph{optimal matching with queues} in dynamic
systems and investigate the value-of-information. In such systems, the
operators match tasks and resources stored in queues, with the objective of
maximizing the system utility of the matching reward profile, minus the average
matching cost. This problem appears in many practical systems and the main
challenges are the no-underflow constraints, and the lack of matching-reward
information and system dynamics statistics. We develop two online matching
algorithms: Learning-aided Reward optimAl Matching (LRAM) and
Dual-LRAM, to effectively resolve both challenges.
Both algorithms are equipped with a learning module for estimating the
matching-reward information, while Dual-LRAM incorporates an additional
module for learning the system dynamics. We show that both algorithms achieve
close-to-optimal utility performance, while Dual-LRAM achieves a faster
convergence speed and a better delay compared to LRAM, with performance bounds
stated in terms of the maximum estimation errors for
reward and system dynamics. Our results reveal that information of different
system components can play very different roles in algorithm performance and
provide a systematic way for designing joint learning-control algorithms for
dynamic systems.
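The no-underflow constraint named in this abstract means a matching may only consume items that are actually present in the queues. The sketch below illustrates just that check. It is not either of the paper's algorithms, and the function name `try_match`, the dictionary-based queues, and the `recipe` format are assumptions made for illustration.

```python
def try_match(queues, recipe):
    """Attempt one matching (hypothetical helper).

    queues: dict mapping a queue name to its current item count.
    recipe: dict mapping a queue name to the number of items this
            matching consumes from that queue.
    Returns True and updates `queues` in place if every queue holds
    enough items; otherwise returns False and leaves `queues` unchanged.
    """
    # No-underflow check: never consume items that are not there.
    if any(queues.get(name, 0) < need for name, need in recipe.items()):
        return False
    for name, need in recipe.items():
        queues[name] -= need
    return True
```

With `queues = {"task": 2, "resource": 1}` and `recipe = {"task": 1, "resource": 1}`, the first call succeeds and the second is rejected because the resource queue is empty.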