576 research outputs found
Sample Complexity and Overparameterization Bounds for Temporal Difference Learning with Neural Network Approximation
In this paper, we study the dynamics of temporal difference learning with
neural network-based value function approximation over a general state space,
namely, \emph{Neural TD learning}. We consider two algorithms used in
practice, projection-free and max-norm regularized Neural TD learning, and establish the
first convergence bounds for these algorithms. An interesting observation from
our results is that max-norm regularization can dramatically improve the
performance of TD learning algorithms, both in terms of sample complexity and
overparameterization. In particular, we prove that max-norm regularization
appears to be more effective than $\ell_2$-regularization, again both in terms
of sample complexity and overparameterization. The results in this work rely on
a novel Lyapunov drift analysis of the network parameters as a stopped and
controlled random process.
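To make the two schemes concrete, here is a minimal sketch (my own illustration, not the paper's construction) of semi-gradient TD(0) with a one-hidden-layer ReLU value network on a toy random-walk Markov reward process. "Max-norm regularization" is read here as rescaling each hidden unit's incoming weight vector back into a fixed-radius ball after every update; the environment, step size, and radius are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, width, gamma, lr, radius = 5, 32, 0.9, 0.05, 2.0

# One-hidden-layer ReLU value network over one-hot state features.
W1 = rng.normal(0.0, 1.0 / np.sqrt(n_states), (width, n_states))
w2 = rng.normal(0.0, 1.0 / np.sqrt(width), width)

def features(s):
    x = np.zeros(n_states)
    x[s] = 1.0
    return x

def value(s):
    return w2 @ np.maximum(W1 @ features(s), 0.0)

s = 0
for _ in range(2000):
    s_next = (s + rng.choice([-1, 1])) % n_states  # random-walk transition
    r = 1.0 if s_next == 0 else 0.0                # reward for reaching state 0
    delta = r + gamma * value(s_next) - value(s)   # TD error
    # Semi-gradient TD(0) update of both layers.
    x = features(s)
    h = W1 @ x
    mask = (h > 0.0).astype(float)
    W1 += lr * delta * np.outer(w2 * mask, x)
    w2 += lr * delta * np.maximum(h, 0.0)
    # Max-norm step: rescale each hidden unit's incoming weight vector
    # (and the output weights) back into a ball of the chosen radius.
    norms = np.linalg.norm(W1, axis=1, keepdims=True)
    W1 *= np.minimum(1.0, radius / np.maximum(norms, 1e-12))
    w2 *= min(1.0, radius / max(np.linalg.norm(w2), 1e-12))
    s = s_next

print(np.linalg.norm(W1, axis=1).max() <= radius + 1e-9)
```

The projection keeps the iterates inside a bounded set regardless of the TD errors encountered, which is the property the drift analysis in the abstract exploits.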
Disentangling feature and lazy training in deep neural networks
Two distinct limits for deep learning have been derived as the network width
$h \to \infty$, depending on how the weights of the last layer scale
with $h$. In the Neural Tangent Kernel (NTK) limit, the dynamics becomes linear
in the weights and is described by a frozen kernel $\Theta$. By contrast, in
the Mean-Field limit, the dynamics can be expressed in terms of the
distribution of the parameters associated with a neuron, which follows a partial
differential equation. In this work we consider deep networks where the weights
in the last layer scale as $\alpha h^{-1/2}$ at initialization. By varying $\alpha$
and $h$, we probe the crossover between the two limits. We observe the
previously identified regimes of lazy training and feature training. In the
lazy-training regime, the dynamics is almost linear and the NTK barely changes
after initialization. The feature-training regime includes the mean-field
formulation as a limiting case and is characterized by a kernel that evolves in
time, and learns some features. We perform numerical experiments on MNIST,
Fashion-MNIST, EMNIST and CIFAR10 and consider various architectures. We find
that (i) The two regimes are separated by an $\alpha^*$ that scales as
$h^{-1/2}$. (ii) Network architecture and data structure play an important role
in determining which regime is better: in our tests, fully-connected networks
perform generally better in the lazy-training regime, unlike convolutional
networks. (iii) In both regimes, the fluctuations induced on the
learned function by initial conditions decay as $h^{-1/2}$,
leading to a performance that increases with $h$. The same improvement can also
be obtained at an intermediate width by ensemble-averaging several networks.
(iv) In the feature-training regime we identify a time scale
$t_1 \sim \sqrt{h}\,\alpha$, such that for $t \ll t_1$ the dynamics is linear.
Comment: minor revision
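The lazy-versus-feature crossover described above can be probed with a toy numerical experiment (my own illustration, not the paper's setup): a two-layer ReLU network whose output is scaled by a factor alpha/sqrt(h), trained by gradient descent with the learning rate rescaled by 1/alpha^2 so that function-space steps stay comparable. At large alpha the hidden weights barely move (lazy training); at small alpha they move substantially (feature training). All sizes and step counts are arbitrary.

```python
import numpy as np

def relative_weight_motion(alpha, h=200, d=10, n=20, steps=200):
    """Train f(x) = (alpha/sqrt(h)) * w2 . ReLU(W1 x), minus its value at
    initialization, and return how far the hidden weights moved relative
    to their initial norm."""
    rng = np.random.default_rng(1)            # same init for every alpha
    X = rng.normal(size=(n, d)) / np.sqrt(d)
    y = rng.normal(size=n)
    W1 = rng.normal(size=(h, d))
    w2 = rng.normal(size=h)
    W1_0 = W1.copy()
    scale = alpha / np.sqrt(h)
    f0 = scale * (np.maximum(X @ W1_0.T, 0.0) @ w2)  # output at init
    lr = 1.0 / alpha**2    # keeps the function-space step size comparable
    for _ in range(steps):
        H = np.maximum(X @ W1.T, 0.0)          # (n, h) hidden activations
        err = scale * (H @ w2) - f0 - y        # residual on training set
        gW1 = ((H > 0) * err[:, None] * w2).T @ X * scale / n
        gw2 = H.T @ err * scale / n
        W1 -= lr * gW1
        w2 -= lr * gw2
    return np.linalg.norm(W1 - W1_0) / np.linalg.norm(W1_0)

for a in (0.1, 1.0, 10.0):
    print(a, relative_weight_motion(a))
```

The monotone decrease of the printed relative motion with alpha is the crossover in miniature: identical data and initialization, but qualitatively different parameter trajectories.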
Aligned and oblique dynamics in recurrent neural networks
The relation between neural activity and behaviorally relevant variables is
at the heart of neuroscience research. When strong, this relation is termed a
neural representation. There is increasing evidence, however, for partial
dissociations between activity in an area and relevant external variables.
While many explanations have been proposed, a theoretical framework for the
relationship between external and internal variables is lacking. Here, we
utilize recurrent neural networks (RNNs) to explore the question of when and
how neural dynamics and the network's output are related from a geometrical
point of view. We find that RNNs can operate in two regimes: dynamics can
either be aligned with the directions that generate output variables, or
oblique to them. We show that the magnitude of the readout weights can serve as
a control knob between the regimes. Importantly, these regimes are functionally
distinct. Oblique networks are more heterogeneous and suppress noise in their
output directions. They are furthermore more robust to perturbations along the
output directions. Finally, we show that the two regimes can be dissociated in
neural recordings. Altogether, our results open a new perspective for
interpreting neural activity by relating network dynamics to their output.
Comment: 52 pages, 29 figures, submitted for review
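The geometric quantity at issue can be sketched directly (a hypothetical illustration, not the paper's models): simulate a random, untrained rate RNN and measure the fraction of its activity variance that lies along a unit readout direction. A random readout sits near the oblique extreme (alignment close to the chance level 1/N); values near 1 would indicate aligned dynamics. The network size, gain, and integration scheme are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T, dt = 100, 500, 0.1

J = rng.normal(0.0, 1.5 / np.sqrt(N), (N, N))    # random recurrent weights
w_out = rng.normal(size=N)
w_out /= np.linalg.norm(w_out)                    # unit readout direction

# Simulate rate dynamics x' = -x + J tanh(x) with forward-Euler steps.
x = rng.normal(size=N)
X = np.empty((T, N))
for t in range(T):
    x = x + dt * (-x + J @ np.tanh(x))
    X[t] = x

X = X - X.mean(axis=0)
var_along_readout = np.var(X @ w_out)      # activity variance along w_out
var_total = X.var(axis=0).sum()            # total activity variance
alignment = var_along_readout / var_total  # near 1: aligned; near 1/N: oblique
print(alignment)
```

In this framing, the abstract's "control knob" is the readout magnitude: training with small readout weights pushes alignment up, while large readout weights allow the bulk of the dynamics to stay oblique to the output direction.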
Thirty Years of Machine Learning: The Road to Pareto-Optimal Wireless Networks
Future wireless networks have substantial potential to support a broad range of
complex, compelling applications in both military and civilian fields, in which
users enjoy high-rate, low-latency, low-cost, and reliable information
services. Achieving this ambitious goal requires new radio
techniques for adaptive learning and intelligent decision making because of the
complex heterogeneous nature of the network structures and wireless services.
Machine learning (ML) algorithms have achieved great success in supporting big
data analytics, efficient parameter estimation, and interactive decision making.
Hence, in this article, we review the thirty-year history of ML by elaborating
on supervised learning, unsupervised learning, reinforcement learning and deep
learning. Furthermore, we investigate their employment in the compelling
applications of wireless networks, including heterogeneous networks (HetNets),
cognitive radios (CR), Internet of things (IoT), machine to machine networks
(M2M), and so on. This article aims to assist readers in clarifying the
motivation and methodology of the various ML algorithms, so as to invoke them
for hitherto unexplored services as well as scenarios of future wireless
networks.
Comment: 46 pages, 22 figures