Learning-aided Stochastic Network Optimization with Imperfect State Prediction
We investigate the problem of stochastic network optimization in the presence
of imperfect state prediction and non-stationarity. Based on a novel
distribution-accuracy curve prediction model, we develop the predictive
learning-aided control (PLC) algorithm, which jointly utilizes historic and
predicted network state information for decision making. PLC is an online
algorithm that requires zero a priori system statistical information, and
consists of three key components, namely sequential distribution estimation and
change detection, dual learning, and online queue-based control.
Specifically, we show that PLC simultaneously achieves good long-term
performance, short-term queue size reduction, accurate change detection, and
fast algorithm convergence. In particular, for stationary networks, PLC
achieves a near-optimal $[O(\epsilon), O(\log^2(1/\epsilon))]$ utility-delay
tradeoff. For non-stationary networks, PLC obtains an
$[O(\epsilon), O(\log^2(1/\epsilon) + \min(\epsilon^{c/2-1}, e_w/\epsilon))]$
utility-backlog tradeoff for distributions that last
$\Theta(\max(\epsilon^{-c}, e_w^{-2})\,\epsilon^{-1-a})$ time, where $e_w$
is the prediction accuracy and $a > 0$ is a constant (the
Backpressure algorithm \cite{neelynowbook} requires an $O(\epsilon^{-2})$ length
for the same utility performance with a larger backlog). Moreover, PLC detects
distribution changes $O(w)$ slots faster with high probability ($w$ is the
prediction size) and achieves an $O(\min(\epsilon^{-1+c/2}, e_w/\epsilon) +
\log^2(1/\epsilon))$ convergence time. Our results demonstrate
that state prediction (even imperfect) can help (i) achieve faster detection
and convergence, and (ii) obtain better utility-delay tradeoffs.
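As a rough illustration of how the three components could fit together, the toy
single-queue sketch below (Python) runs distribution estimation with change
detection, a crude stand-in for dual learning, and a perturbed queue-based
admission rule; the window sizes, threshold, perturbation rule, and admission
test are all illustrative assumptions, not the paper's PLC specification.

    import random
    from collections import deque

    HISTORY = 200        # history window for distribution estimation (assumed)
    PREDICT = 20         # prediction window size w (assumed)
    THRESHOLD = 0.15     # change-detection threshold (assumed)
    V = 50.0             # utility-delay tradeoff parameter (assumed)

    def change_detected(history, predicted):
        # Flag a change when historical and predicted empirical means diverge.
        if not history or not predicted:
            return False
        return abs(sum(history) / len(history)
                   - sum(predicted) / len(predicted)) > THRESHOLD

    def dual_learning(samples):
        # Stand-in for dual learning: derive a queue perturbation from the
        # empirical statistics instead of solving the full dual problem.
        return V * sum(samples) / len(samples) if samples else 0.0

    queue, history = 0.0, deque(maxlen=HISTORY)
    for t in range(1000):
        rate = 0.8 if t < 500 else 1.2                 # distribution shift at t = 500
        arrival = random.uniform(0.0, rate)
        predicted = [random.uniform(0.0, rate) for _ in range(PREDICT)]
        if change_detected(history, predicted):
            history.clear()                            # drop stale samples
        history.append(arrival)
        theta = dual_learning(history)                 # learned perturbation
        # Queue-based control: admit only if the perturbed backlog is small.
        admit = arrival if queue - theta < V else 0.0
        queue = max(queue + admit - 0.9, 0.0)          # service rate 0.9
    print(f"final queue backlog: {queue:.2f}")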
Dynamic Control of Tunable Sub-optimal Algorithms for Scheduling of Time-varying Wireless Networks
It is well known that for ergodic channel processes the Generalized
Max-Weight Matching (GMWM) scheduling policy stabilizes the network for any
supportable arrival rate vector within the network capacity region. This
policy, however, often requires the solution of an NP-hard optimization
problem. This has motivated many researchers to develop sub-optimal algorithms
that approximate the GMWM policy in selecting schedule vectors. One implicit
assumption commonly shared in this context is that during the algorithm
runtime, the channel states remain effectively unchanged. This assumption may
not hold as the time needed to select near-optimal schedule vectors usually
increases quickly with the network size. In this paper, we incorporate channel
variations and the time-efficiency of sub-optimal algorithms into the scheduler
design, to dynamically tune the algorithm runtime considering the tradeoff
between algorithm efficiency and its robustness to changing channel states.
Specifically, we propose a Dynamic Control Policy (DCP) that operates on top of
a given sub-optimal algorithm, and dynamically, but on a large time scale,
adjusts the time given to the algorithm according to queue backlogs and channel
correlations. This policy does not require knowledge of the structure of the
given sub-optimal algorithm, and with low overhead can be implemented in a
distributed manner. Using a novel Lyapunov analysis, we characterize the
throughput stability region induced by DCP and show that our characterization
can be tight. We also show that the throughput stability region of DCP is at
least as large as that of any other static policy. Finally, we provide two case
studies to gain further intuition into the performance of DCP.
Comment: Submitted for journal consideration. A shorter version was presented
in IEEE IWQoS 200
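To make the runtime-tuning idea concrete, the Python sketch below pairs a toy
local-search scheduler, whose solution quality improves with iterations, with a
DCP-flavoured rule that grows the iteration budget with total backlog but caps
it at an assumed channel coherence time; the scheduler, network model, and
square-root rule are assumptions for illustration, not DCP itself.

    import random

    COHERENCE_SLOTS = 8   # assumed channel coherence time, in slots

    def greedy_schedule(weights, iterations):
        # Toy sub-optimal scheduler: random search over on/off link vectors;
        # more iterations get closer to the max-weight schedule.
        n = len(weights)
        best, best_w = [0] * n, 0.0
        for _ in range(iterations):
            cand = [random.randint(0, 1) for _ in range(n)]
            w = sum(wi * xi for wi, xi in zip(weights, cand))
            if w > best_w:
                best, best_w = cand, w
        return best

    def pick_runtime(total_backlog):
        # DCP-flavoured rule (assumed form): spend more time when backlogs
        # are large, but never outlive the channel coherence time.
        return min(max(1, int(total_backlog ** 0.5)), COHERENCE_SLOTS)

    queues = [5.0] * 4
    for slot in range(100):
        channels = [random.uniform(0.5, 1.5) for _ in queues]  # time-varying rates
        weights = [q * c for q, c in zip(queues, channels)]    # backlog-weighted rates
        schedule = greedy_schedule(weights, pick_runtime(sum(queues)))
        for i, on in enumerate(schedule):
            queues[i] = max(queues[i] + random.uniform(0.0, 1.0)
                            - on * channels[i], 0.0)
    print([round(q, 1) for q in queues])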
High stable and accurate vehicle selection scheme based on federated edge learning in vehicular networks
Federated edge learning (FEEL) for vehicular networks is considered a
promising technology for reducing the computation workload while keeping
users' data private. In a FEEL system, vehicles upload data to the edge
server, which trains on the vehicles' data to update the local model and then
returns the result to the vehicles, so that the original data is never shared.
However, the cache queue at the edge server is limited and the channel between
the edge server and each vehicle is a time-varying wireless channel, which
makes it challenging to select a suitable number of vehicles to upload data so
that the cache queue in the edge server stays stable while the learning
accuracy is maximized. Moreover, selecting vehicles with different resource
statuses to upload data affects the total amount of data involved in training,
which in turn affects the model accuracy. In this paper, we propose a vehicle
selection scheme that maximizes the learning accuracy while ensuring the
stability of the cache queue, where the statuses of all the vehicles in the
coverage of the edge server are taken into account. The performance of this
scheme is evaluated through simulation experiments, which indicate that our
proposed scheme outperforms the known benchmark scheme.
Comment: This paper has been submitted to China Communication
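One way to read the selection problem is as a drift-plus-penalty tradeoff
between accuracy gain and cache-queue congestion. The Python sketch below
admits a vehicle whenever an assumed concave accuracy gain outweighs the
congestion cost of its upload; the utility, statuses, and constants are
illustrative, not the paper's scheme.

    import math
    import random

    V = 10.0          # accuracy-vs-stability tradeoff parameter (assumed)
    CAPACITY = 12.0   # per-round service of the edge cache queue (assumed)

    def select_vehicles(statuses, queue):
        # Admit a vehicle when its (concave) accuracy gain outweighs the
        # congestion cost its upload adds to the cache queue.
        chosen = []
        for vid, data in statuses:
            if V * math.log(1.0 + data) - queue * data > 0:
                chosen.append((vid, data))
        return chosen

    queue = 0.0
    for rnd in range(200):
        # Each vehicle's status here is just the data volume it could upload.
        statuses = [(vid, random.uniform(0.5, 2.0)) for vid in range(10)]
        chosen = select_vehicles(statuses, queue)
        uploaded = sum(data for _, data in chosen)
        queue = max(queue + uploaded - CAPACITY, 0.0)
    print(f"last round selected {len(chosen)} of 10 vehicles, queue {queue:.1f}")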
The Value-of-Information in Matching with Queues
We consider the problem of \emph{optimal matching with queues} in dynamic
systems and investigate the value-of-information. In such systems, the
operators match tasks and resources stored in queues, with the objective of
maximizing the system utility of the matching reward profile, minus the average
matching cost. This problem appears in many practical systems and the main
challenges are the no-underflow constraints, and the lack of matching-reward
information and system dynamics statistics. We develop two online matching
algorithms: Learning-aided Reward optimAl Matching (LRAM) and
Dual-LRAM (DRAM) to effectively resolve both challenges.
Both algorithms are equipped with a learning module for estimating the
matching-reward information, while DRAM incorporates an additional
module for learning the system dynamics. We show that both algorithms achieve
an $O(\epsilon)$ close-to-optimal utility performance for any
$\epsilon > 0$, while DRAM achieves a faster convergence speed and a
better delay compared to LRAM (the exact delay and convergence-time bounds
depend on the maximum estimation errors for the reward and the system
dynamics). Our results reveal that information of different
system components can play very different roles in algorithm performance and
provide a systematic way for designing joint learning-control algorithms for
dynamic systems.
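The hedged Python sketch below shows one way a reward-learning module and a
queue-aware matching rule could combine: reward estimates are running means of
noisy samples, and a greedy match by estimated reward only decrements positive
queues, so the no-underflow constraints hold throughout. The data model, update
rule, and cost are assumptions, not LRAM or DRAM themselves.

    import random

    rewards_est = {}   # running reward estimate per (task, resource) pair
    counts = {}

    def observe(pair, sample):
        # Learning module: keep a running mean of noisy reward samples.
        counts[pair] = counts.get(pair, 0) + 1
        old = rewards_est.get(pair, 0.0)
        rewards_est[pair] = old + (sample - old) / counts[pair]

    def match(task_q, res_q, cost):
        # Greedy matching by estimated reward; queues are only decremented
        # while they are positive, so the no-underflow constraints hold.
        for t, r in sorted(rewards_est, key=rewards_est.get, reverse=True):
            while task_q[t] > 0 and res_q[r] > 0 and rewards_est[(t, r)] > cost:
                task_q[t] -= 1
                res_q[r] -= 1

    task_q, res_q = {0: 0, 1: 0}, {0: 0, 1: 0}
    for step in range(500):
        task_q[random.randint(0, 1)] += 1                      # task arrival
        res_q[random.randint(0, 1)] += 1                       # resource arrival
        for pair in [(i, j) for i in task_q for j in res_q]:
            observe(pair, random.uniform(0.5, 1.5) + pair[0])  # noisy reward sample
        match(task_q, res_q, cost=0.5)
    print(task_q, res_q)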
Fast-Convergent Learning-aided Control in Energy Harvesting Networks
In this paper, we present a novel learning-aided energy management scheme
(LEM) for multihop energy harvesting networks. Different from prior
works on this problem, our algorithm explicitly incorporates information
learning into system control via a step called \emph{perturbed dual learning}.
LEM does not require any statistical information of the system
dynamics for implementation, and efficiently resolves the challenging energy
outage problem. We show that LEM achieves the near-optimal
$[O(\epsilon), O(\log^2(1/\epsilon))]$ utility-delay tradeoff with an
$O(\log^2(1/\epsilon))$ energy buffer size. More interestingly, LEM
possesses a \emph{convergence time} much faster than the $\Theta(1/\epsilon)$
time of pure queue-based techniques or the $\Theta(1/\epsilon^2)$ time of
approaches that rely purely on learning the system statistics. This fast
convergence property makes LEM more adaptive and efficient in resource
allocation in dynamic environments. The design and analysis of LEM
demonstrate how system control algorithms can be augmented by learning and what
the benefits are. The methodology and algorithm can also be applied to similar
problems, e.g., processing networks, where nodes require a nonzero amount of
content to support their actions.
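As a hedged sketch of the perturbed-dual-learning idea on a single node, the
Python snippet below learns an energy margin from observed harvesting samples
and only transmits while the battery stays above that margin, which avoids
energy outage by construction; the margin rule and all constants are
assumptions rather than LEM's actual design.

    import random

    BATTERY_CAP = 30.0   # battery capacity (assumed)

    def perturbed_dual_learning(samples):
        # Stand-in for perturbed dual learning: set the energy threshold from
        # the empirical mean of harvested energy rather than by slow queue drift.
        return 2.0 * sum(samples) / len(samples)

    battery, data_q, samples = 5.0, 0.0, []
    for t in range(1000):
        harvest = random.uniform(0.0, 1.0)
        samples.append(harvest)
        battery = min(battery + harvest, BATTERY_CAP)
        threshold = perturbed_dual_learning(samples)   # learned energy margin
        # Transmit one unit only while the battery stays above the learned
        # margin, which rules out energy outage by construction.
        if data_q >= 1.0 and battery - 1.0 >= threshold:
            battery -= 1.0
            data_q -= 1.0
        data_q += random.uniform(0.0, 0.4)             # data arrivals
    print(f"battery {battery:.1f}, data queue {data_q:.1f}")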