    Learning-aided Stochastic Network Optimization with Imperfect State Prediction

    We investigate the problem of stochastic network optimization in the presence of imperfect state prediction and non-stationarity. Based on a novel distribution-accuracy curve prediction model, we develop the predictive learning-aided control (PLC) algorithm, which jointly utilizes historic and predicted network state information for decision making. PLC is an online algorithm that requires no a priori statistical information about the system, and consists of three key components, namely sequential distribution estimation and change detection, dual learning, and online queue-based control. Specifically, we show that PLC simultaneously achieves good long-term performance, short-term queue size reduction, accurate change detection, and fast algorithm convergence. In particular, for stationary networks, PLC achieves a near-optimal $[O(\epsilon), O(\log(1/\epsilon)^2)]$ utility-delay tradeoff. For non-stationary networks, PLC obtains an $[O(\epsilon), O(\log^2(1/\epsilon) + \min(\epsilon^{c/2-1}, e_w/\epsilon))]$ utility-backlog tradeoff for distributions that last $\Theta(\frac{\max(\epsilon^{-c}, e_w^{-2})}{\epsilon^{1+a}})$ time, where $e_w$ is the prediction accuracy and $a=\Theta(1)>0$ is a constant (the Backpressure algorithm \cite{neelynowbook} requires an $O(\epsilon^{-2})$ length for the same utility performance with a larger backlog). Moreover, PLC detects distribution change $O(w)$ slots faster with high probability ($w$ is the prediction size) and achieves an $O(\min(\epsilon^{-1+c/2}, e_w/\epsilon)+\log^2(1/\epsilon))$ convergence time. Our results demonstrate that state prediction (even imperfect) can help (i) achieve faster detection and convergence, and (ii) obtain better utility-delay tradeoffs.

    Dynamic Control of Tunable Sub-optimal Algorithms for Scheduling of Time-varying Wireless Networks

    It is well known that for ergodic channel processes the Generalized Max-Weight Matching (GMWM) scheduling policy stabilizes the network for any supportable arrival rate vector within the network capacity region. This policy, however, often requires the solution of an NP-hard optimization problem. This has motivated many researchers to develop sub-optimal algorithms that approximate the GMWM policy in selecting schedule vectors. One implicit assumption commonly shared in this context is that the channel states remain effectively unchanged during the algorithm runtime. This assumption may not hold, as the time needed to select near-optimal schedule vectors usually increases quickly with the network size. In this paper, we incorporate channel variations and the time-efficiency of sub-optimal algorithms into the scheduler design, to dynamically tune the algorithm runtime considering the tradeoff between algorithm efficiency and its robustness to changing channel states. Specifically, we propose a Dynamic Control Policy (DCP) that operates on top of a given sub-optimal algorithm and dynamically, but on a large time scale, adjusts the time given to the algorithm according to queue backlog and channel correlations. This policy does not require knowledge of the structure of the given sub-optimal algorithm, and can be implemented in a distributed manner with low overhead. Using a novel Lyapunov analysis, we characterize the throughput stability region induced by DCP and show that our characterization can be tight. We also show that the throughput stability region of DCP is at least as large as that of any other static policy. Finally, we provide two case studies to gain further intuition into the performance of DCP. Comment: Submitted for journal consideration. A shorter version was presented in IEEE IWQoS 200
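As a rough illustration of the max-weight idea this abstract builds on, the sketch below picks, from an explicitly enumerated set of feasible schedules, the one that maximizes the sum of queue backlog times channel rate over the activated links. It is not the paper's GMWM or DCP implementation: the function name `max_weight_schedule`, the three-link interference example, and all numeric values are assumptions made for illustration. Enumerating every feasible schedule grows exponentially with network size in general, which is the cost that motivates the sub-optimal algorithms discussed in the abstract.

```python
from typing import Sequence, Tuple


def max_weight_schedule(
    queues: Sequence[float],
    rates: Sequence[float],
    feasible_schedules: Sequence[Tuple[int, ...]],
) -> Tuple[int, ...]:
    """Return the feasible schedule maximizing sum_i queues[i] * rates[i] * s[i].

    Hypothetical helper for illustration: each schedule is a 0/1 tuple that
    marks which links are activated in the current slot.
    """
    def weight(schedule: Tuple[int, ...]) -> float:
        return sum(q * r * s for q, r, s in zip(queues, rates, schedule))

    return max(feasible_schedules, key=weight)


# Assumed toy network: three links, where links 0 and 1 interfere
# and therefore never appear together in a feasible schedule.
queues = [5, 3, 4]          # current queue backlogs (illustrative values)
rates = [1.0, 2.0, 1.0]     # current channel rates (illustrative values)
feasible = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 0, 1), (0, 1, 1)]

best = max_weight_schedule(queues, rates, feasible)
print(best)
```

In this toy setting the weights of the five schedules are 5, 6, 4, 9 and 10, so the schedule activating links 1 and 2 is selected.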

    High stable and accurate vehicle selection scheme based on federated edge learning in vehicular networks

    Federated edge learning (FEEL) for vehicular networks is considered a promising technology for reducing the computation workload while preserving user privacy. In the FEEL system, vehicles upload data to the edge servers, which train on the vehicles' data to update local models and then return the result to the vehicles to avoid sharing the original data. However, the cache queue at the edge is limited and the channel between the edge server and each vehicle is a time-varying wireless channel, which makes it challenging to select a suitable number of vehicles to upload data so as to keep the cache queue in the edge server stable and maximize the learning accuracy. Moreover, selecting vehicles with different resource statuses to update data affects the total amount of data involved in training, which further affects the model accuracy. In this paper, we propose a vehicle selection scheme that maximizes the learning accuracy while ensuring the stability of the cache queue, taking into account the statuses of all vehicles in the coverage of the edge server. The performance of this scheme is evaluated through simulation experiments, which indicate that our proposed scheme can perform better than the known benchmark scheme. Comment: This paper has been submitted to China Communication

    The Value-of-Information in Matching with Queues

    We consider the problem of \emph{optimal matching with queues} in dynamic systems and investigate the value-of-information. In such systems, the operators match tasks and resources stored in queues, with the objective of maximizing the system utility of the matching reward profile, minus the average matching cost. This problem appears in many practical systems and the main challenges are the no-underflow constraints, and the lack of matching-reward information and system dynamics statistics. We develop two online matching algorithms: Learning-aided Reward optimAl Matching ($\mathtt{LRAM}$) and Dual-$\mathtt{LRAM}$ ($\mathtt{DRAM}$) to effectively resolve both challenges. Both algorithms are equipped with a learning module for estimating the matching-reward information, while $\mathtt{DRAM}$ incorporates an additional module for learning the system dynamics. We show that both algorithms achieve an $O(\epsilon+\delta_r)$ close-to-optimal utility performance for any $\epsilon>0$, while $\mathtt{DRAM}$ achieves a faster convergence speed and a better delay compared to $\mathtt{LRAM}$, i.e., $O(\delta_z/\epsilon + \log(1/\epsilon)^2)$ delay and $O(\delta_z/\epsilon)$ convergence under $\mathtt{DRAM}$ compared to $O(1/\epsilon)$ delay and convergence under $\mathtt{LRAM}$ ($\delta_r$ and $\delta_z$ are maximum estimation errors for reward and system dynamics). Our results reveal that information of different system components can play very different roles in algorithm performance and provide a systematic way for designing joint learning-control algorithms for dynamic systems.

    Fast-Convergent Learning-aided Control in Energy Harvesting Networks

    In this paper, we present a novel learning-aided energy management scheme ($\mathtt{LEM}$) for multihop energy harvesting networks. Different from prior works on this problem, our algorithm explicitly incorporates information learning into system control via a step called \emph{perturbed dual learning}. $\mathtt{LEM}$ does not require any statistical information of the system dynamics for implementation, and efficiently resolves the challenging energy outage problem. We show that $\mathtt{LEM}$ achieves the near-optimal $[O(\epsilon), O(\log(1/\epsilon)^2)]$ utility-delay tradeoff with $O(1/\epsilon^{1-c/2})$ energy buffers ($c\in(0,1)$). More interestingly, $\mathtt{LEM}$ possesses a \emph{convergence time} of $O(1/\epsilon^{1-c/2} + 1/\epsilon^c)$, which is much faster than the $\Theta(1/\epsilon)$ time of pure queue-based techniques or the $\Theta(1/\epsilon^2)$ time of approaches that rely purely on learning the system statistics. This fast convergence property makes $\mathtt{LEM}$ more adaptive and efficient in resource allocation in dynamic environments. The design and analysis of $\mathtt{LEM}$ demonstrate how system control algorithms can be augmented by learning and what the benefits are. The methodology and algorithm can also be applied to similar problems, e.g., processing networks, where nodes require a nonzero amount of content to support their actions.