3,593 research outputs found
Convergence Analysis of Mixed Timescale Cross-Layer Stochastic Optimization
This paper considers a cross-layer optimization problem driven by
multi-timescale stochastic exogenous processes in wireless communication
networks. Due to the hierarchical information structure in a wireless network,
a mixed timescale stochastic iterative algorithm is proposed to track the
time-varying optimal solution of the cross-layer optimization problem, where
the variables are partitioned into short-term controls updated in a faster
timescale, and long-term controls updated in a slower timescale. We focus on
establishing a convergence analysis framework for such multi-timescale
algorithms, which is difficult due to the timescale separation of the algorithm
and the time-varying nature of the exogenous processes. To cope with this
challenge, we model the algorithm dynamics using stochastic differential
equations (SDEs) and show that the study of the algorithm convergence is
equivalent to the study of the stochastic stability of a virtual stochastic
dynamic system (VSDS). Leveraging the techniques of Lyapunov stability, we
derive a sufficient condition for the algorithm stability and a tracking error
bound in terms of the parameters of the multi-timescale exogenous processes.
Based on these results, an adaptive compensation algorithm is proposed to
enhance the tracking performance. Finally, we illustrate the framework by an
application example in wireless heterogeneous networks.
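
The split into short-term controls on a fast timescale and long-term controls on a slow timescale can be illustrated with a minimal two-timescale stochastic approximation sketch. Everything here is an assumption made for illustration and is not taken from the paper: the toy objective f(x, y) = (x - y)^2 + (y - 1)^2, the step sizes, the Gaussian gradient noise, and the function name `two_timescale`.

```python
import random


def two_timescale(steps=5000, fast_step=0.1, slow_step=0.01, noise=0.1, seed=0):
    """Minimize the toy objective (x - y)**2 + (y - 1)**2 from noisy gradients.

    x plays the role of a short-term control (large step size) and
    y the role of a long-term control (small step size).
    """
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    for _ in range(steps):
        # Fast timescale: x chases its current optimum x*(y) = y.
        grad_x = 2.0 * (x - y) + rng.gauss(0.0, noise)
        x -= fast_step * grad_x
        # Slow timescale: y moves with a step ten times smaller,
        # so it sees x as nearly converged.
        grad_y = -2.0 * (x - y) + 2.0 * (y - 1.0) + rng.gauss(0.0, noise)
        y -= slow_step * grad_y
    return x, y


x_final, y_final = two_timescale()
print(x_final, y_final)  # both should end near the minimizer (1, 1)
```

The ratio `slow_step / fast_step` controls the timescale separation in this sketch; the paper's SDE-based analysis and tracking error bound are not reproduced here.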
Scheduling and Power Control for Wireless Multicast Systems via Deep Reinforcement Learning
Multicasting in wireless systems is a natural way to exploit the redundancy
in user requests in a Content Centric Network. Power control and optimal
scheduling can significantly improve the wireless multicast network's
performance under fading. However, the model-based approaches for power control
and scheduling studied earlier do not scale to large state spaces or
changing system dynamics. In this paper, we use deep reinforcement learning,
approximating the Q-function with a deep neural network,
to obtain a power control policy that matches the optimal policy for a small
network. We show that a power control policy can be learnt for reasonably large
systems via this approach. Further, we use multi-timescale stochastic
optimization to maintain the average power constraint. We demonstrate that a
slight modification of the learning algorithm allows tracking of time-varying
system statistics. Finally, we extend the multi-timescale approach to
simultaneously learn the optimal queueing strategy along with power control. We
demonstrate scalability, tracking and cross-layer optimization capabilities of
our algorithms via simulations. The proposed multi-timescale approach can be
used in general large state space dynamical systems with multiple objectives
and constraints, and may be of independent interest.
Comment: arXiv admin note: substantial text overlap with arXiv:1910.0530
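
One common way to hold an average power constraint with a slower-timescale update is a Lagrange multiplier adjusted by stochastic approximation. The sketch below is only an illustration under stated assumptions, not the paper's algorithm: the deep Q-network is replaced by a closed-form per-slot decision (water-filling for the rate log(1 + h*p)), the fading gain `h` is assumed unit-mean exponential, and the name `track_power_budget` and all constants are hypothetical.

```python
import random


def track_power_budget(budget=1.0, steps=20000, dual_step=0.01, seed=0):
    """Primal-dual sketch: fast per-slot power choice, slow multiplier update."""
    rng = random.Random(seed)
    lam = 1.0  # Lagrange multiplier (price of power)
    powers = []
    for _ in range(steps):
        h = rng.expovariate(1.0)  # assumed fading power gain for this slot
        # Fast decision: maximize log(1 + h*p) - lam*p over p >= 0,
        # which gives p = 1/lam - 1/h when h > lam and 0 otherwise.
        p = 1.0 / lam - 1.0 / h if h > lam else 0.0
        # Slow timescale: raise the price when power exceeds the budget,
        # lower it otherwise; the floor keeps lam strictly positive.
        lam = max(1e-3, lam + dual_step * (p - budget))
        powers.append(p)
    tail = powers[steps // 2:]  # discard the first half as burn-in
    return lam, sum(tail) / len(tail)


lam_final, avg_power = track_power_budget()
print(lam_final, avg_power)  # avg_power should sit close to the budget
```

In a learning-based version, the closed-form decision would be replaced by the learnt policy, with the multiplier still updated at the smaller step size.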
Flatter, faster: scaling momentum for optimal speedup of SGD
Commonly used optimization algorithms often show a trade-off between good
generalization and fast training times. For instance, stochastic gradient
descent (SGD) tends to have good generalization; however, adaptive gradient
methods have superior training times. Momentum can help accelerate training
with SGD, but so far there has been no principled way to select the momentum
hyperparameter. Here we study training dynamics arising from the interplay
between SGD with label noise and momentum in the training of overparametrized
neural networks. We find that scaling the momentum hyperparameter $1-\beta$
with the learning rate to the power of $2/3$ maximally accelerates training,
without sacrificing generalization. To analytically derive this result we
develop an architecture-independent framework, where the main assumption is the
existence of a degenerate manifold of global minimizers, as is natural in
overparametrized models. Training dynamics display the emergence of two
characteristic timescales that are well-separated for generic values of the
hyperparameters. The maximum acceleration of training is reached when these two
timescales meet, which in turn determines the scaling limit we propose. We
confirm our scaling rule for synthetic regression problems (matrix sensing and
teacher-student paradigm) and classification for realistic datasets (ResNet-18
on CIFAR10, 6-layer MLP on FashionMNIST), suggesting the robustness of our
scaling rule to variations in architectures and datasets.
Comment: v2: expanded introduction section, corrected minor typos. v1: 12+13
pages, 3 figures
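
A rule that ties the momentum hyperparameter to the learning rate through a power law can be written as a small helper. The sketch below is hedged throughout: the exponent is left as a parameter, and its default of 2/3, the prefactor `scale`, the function names, and the noiseless one-dimensional quadratic used as a test problem are all assumptions of this illustration rather than the paper's setup (which involves label noise and overparametrized networks).

```python
def momentum_for_lr(lr, scale=1.0, power=2.0 / 3.0):
    """Return beta such that 1 - beta = scale * lr**power (assumed rule)."""
    return 1.0 - scale * lr ** power


def heavy_ball(lr, steps=2000, w0=1.0):
    """Heavy-ball gradient descent on the toy loss 0.5 * w**2."""
    beta = momentum_for_lr(lr)
    w, v = w0, 0.0
    for _ in range(steps):
        grad = w             # gradient of 0.5 * w**2
        v = beta * v + grad  # momentum buffer
        w -= lr * v          # parameter update
    return beta, w


beta, w_final = heavy_ball(lr=0.01)
print(beta, w_final)
```

With this convention a smaller learning rate gives a momentum closer to 1, so the two hyperparameters are tuned jointly rather than independently.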
A streamwise-constant model of turbulent pipe flow
A streamwise-constant model is presented to investigate the basic mechanisms
responsible for the change in mean flow occurring during pipe flow transition.
Using a single forced momentum balance equation, we show that the shape of the
velocity profile is robust to changes in the forcing profile and that both
linear non-normal and nonlinear effects are required to capture the change in
mean flow associated with transition to turbulence. The particularly simple
form of the model allows for the study of the momentum transfer directly by
inspection of the equations. The distribution of the high- and low-speed
streaks over the cross-section of the pipe produced by our model is remarkably
similar to one observed in the velocity field near the trailing edge of the
puff structures present in pipe flow transition. Under stochastic forcing, the
model exhibits a quasi-periodic self-sustaining cycle characterized by the
creation and subsequent decay of "streamwise-constant puffs", so-called due to
the good agreement between the temporal evolution of their velocity field and
the projection of the velocity field associated with three-dimensional puffs in
a frame of reference moving at the bulk velocity. We establish that the flow
dynamics are relatively insensitive to the regeneration mechanisms invoked to
produce near-wall streamwise vortices and that using small, unstructured
background disturbances to regenerate the streamwise vortices is sufficient to
capture the formation of the high- and low-speed streaks and their segregation
leading to the blunting of the velocity profile characteristic of turbulent
pipe flow.
Learning for Cross-layer Resource Allocation in the Framework of Cognitive Wireless Networks
The framework of cognitive wireless networks is expected to endow wireless devices with a cognition-intelligence ability with which they can efficiently learn and respond to the dynamic wireless environment. In this dissertation, we focus on the problem of developing cognitive network control mechanisms without knowing in advance an accurate network model. We study a series of cross-layer resource allocation problems in cognitive wireless networks. Based on model-free learning, optimization and game theory, we propose a framework of self-organized, adaptive strategy learning for wireless devices to (implicitly) build the understanding of the network dynamics through trial-and-error.
The work of this dissertation is divided into three parts. In the first part, we investigate a distributed, single-agent decision-making problem for real-time video streaming over a time-varying wireless channel between a single pair of transmitter and receiver. By modeling the joint source-channel resource allocation process for video streaming as a constrained Markov decision process, we propose a reinforcement learning scheme to search for the optimal transmission policy without the need to know in advance the details of network dynamics.
In the second part of this work, we extend our study from the single-agent to a multi-agent decision-making scenario, and study the energy-efficient power allocation problems in a two-tier, underlay heterogeneous network and in a self-sustainable green network. For the heterogeneous network, we propose a stochastic learning algorithm based on repeated games to allow individual macro- or femto-users to find a Stackelberg equilibrium without flooding the network with local action information. For the self-sustainable green network, we propose a combinatorial auction mechanism that allows mobile stations to adaptively choose the optimal base station and sub-carrier group for transmission only from local payoff and transmission strategy information.
In the third part of this work, we study a cross-layer routing problem in an interweaved Cognitive Radio Network (CRN), where an accurate network model is not available and the secondary users that are distributed within the CRN only have access to local action/utility information. In order to develop a spectrum-aware routing mechanism that is robust against potential insider attackers, we model the uncoordinated interaction between CRN nodes in the dynamic wireless environment as a stochastic game. Through decomposition of the stochastic routing game, we propose two stochastic learning algorithms based on a group of repeated stage games for the secondary users to learn the best-response strategies without the need for information flooding.