Shaping the learning landscape in neural networks around wide flat minima
Learning in Deep Neural Networks (DNN) takes place by minimizing a non-convex
high-dimensional loss function, typically by a stochastic gradient descent
(SGD) strategy. The learning process is observed to find good
minimizers without getting stuck in local critical points, and such
minimizers are often satisfactory at avoiding overfitting. How these two
features can be kept under control in nonlinear devices composed of millions of
tunable connections is a profound and far reaching open question. In this paper
we study basic non-convex one- and two-layer neural network models which learn
random patterns, and derive a number of basic geometrical and algorithmic
features which suggest some answers. We first show that the error loss function
presents few extremely wide flat minima (WFM) which coexist with narrower
minima and critical points. We then show that the minimizers of the
cross-entropy loss function overlap with the WFM of the error loss. We also
show examples of learning devices for which WFM do not exist. From the
algorithmic perspective we derive entropy driven greedy and message passing
algorithms which focus their search on wide flat regions of minimizers. In the
case of SGD and cross-entropy loss, we show that a slow reduction of the norm
of the weights along the learning process also leads to WFM. We corroborate the
results by a numerical study of the correlations between the volumes of the
minimizers, their Hessian and their generalization performance on real data.
Comment: 37 pages (16 main text), 10 figures (7 main text
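The contrast between wide flat minima and narrower ones can be illustrated with a simple perturbation probe. This is a minimal sketch, not the entropy-based measures or algorithms of the paper: it assumes flatness can be approximated by the average increase in loss under small Gaussian perturbations of the weights. The names `sharpness_probe` and `quadratic`, and the toy quadratic losses, are hypothetical and chosen only for illustration.

```python
import random


def sharpness_probe(loss, w, sigma=0.1, n_samples=200, seed=0):
    """Average increase in loss when w is perturbed by Gaussian noise.

    A small value suggests a wide, flat region around w; a large value
    suggests a narrow one. This is an illustrative proxy only.
    """
    rng = random.Random(seed)
    base = loss(w)
    total = 0.0
    for _ in range(n_samples):
        perturbed = [wi + rng.gauss(0.0, sigma) for wi in w]
        total += loss(perturbed) - base
    return total / n_samples


def quadratic(curvature):
    """Toy loss with a single minimum at the origin (assumed for the demo)."""
    return lambda w: curvature * sum(wi * wi for wi in w)


w_star = [0.0, 0.0, 0.0]
wide = sharpness_probe(quadratic(0.1), w_star)     # low curvature: flat minimum
narrow = sharpness_probe(quadratic(10.0), w_star)  # high curvature: sharp minimum
print(f"wide: {wide:.5f}  narrow: {narrow:.5f}")
```

Under these assumptions the probe returns a smaller value for the low-curvature loss, which is the qualitative behaviour one would expect of a wide flat minimum.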
Computational core and fixed-point organisation in Boolean networks
In this paper, we analyse large random Boolean networks in terms of a
constraint satisfaction problem. We first develop an algorithmic scheme that
prunes simple logical cascades and under-determined variables,
thereby returning the computational core of the network. Second, we apply the
cavity method to analyse the number and organisation of fixed points. We find in
particular a phase transition between an easy and a complex regulatory phase,
the latter one being characterised by the existence of an exponential number of
macroscopically separated fixed-point clusters. The different techniques
developed are reinterpreted as algorithms for the analysis of single Boolean
networks, and they are applied to analysis and in silico experiments on the
gene-regulatory networks of baker's yeast (Saccharomyces cerevisiae) and the
segment-polarity genes of the fruit fly Drosophila melanogaster.
Comment: 29 pages, 18 figures, version accepted for publication in JSTA
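To make the notion of a fixed point concrete, the following sketch enumerates the fixed points of a small Boolean network by brute force. It is not the pruning scheme or the cavity method described in the abstract, and it only scales to a handful of nodes. The function names, the truth-table encoding and the random network generator are assumptions made for illustration.

```python
import itertools
import random


def random_boolean_network(n, k, seed=0):
    """Build a random network: each node reads k distinct inputs
    and applies a random truth table (hypothetical encoding)."""
    rng = random.Random(seed)
    inputs = [rng.sample(range(n), k) for _ in range(n)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k)] for _ in range(n)]
    return inputs, tables


def step(state, inputs, tables):
    """Synchronously update every node from its inputs."""
    new_state = []
    for node_inputs, table in zip(inputs, tables):
        index = 0
        for j in node_inputs:
            index = (index << 1) | state[j]  # input values form the table index
        new_state.append(table[index])
    return tuple(new_state)


def fixed_points(inputs, tables):
    """Return all states s with step(s) == s (exhaustive over 2^n states)."""
    n = len(inputs)
    return [s for s in itertools.product((0, 1), repeat=n)
            if step(s, inputs, tables) == s]


# Two nodes that copy each other: the fixed points are (0, 0) and (1, 1).
copy_inputs = [[1], [0]]
copy_tables = [[0, 1], [0, 1]]
print(fixed_points(copy_inputs, copy_tables))

net_inputs, net_tables = random_boolean_network(n=8, k=2, seed=1)
print(len(fixed_points(net_inputs, net_tables)), "fixed points in a random 8-node network")
```

Exhaustive enumeration costs 2^n evaluations, which is why large networks call for the kind of analytical and message-passing tools the abstract refers to.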
Deploy-As-You-Go Wireless Relay Placement: An Optimal Sequential Decision Approach using the Multi-Relay Channel Model
We use information theoretic achievable rate formulas for the multi-relay
channel to study the problem of as-you-go deployment of relay nodes. The
achievable rate formulas are for full-duplex radios at the relays and for
decode-and-forward relaying. Deployment is done along the straight line joining
a source node and a sink node at an unknown distance from the source. The
problem is for a deployment agent to walk from the source to the sink,
deploying relays as he walks, given that the distance to the sink is
exponentially distributed with known mean. As a precursor, we apply the
multi-relay channel achievable rate formula to obtain the optimal power
allocation to relays placed along a line, at fixed locations. This permits us
to obtain the optimal placement of a given number of nodes when the distance
between the source and sink is given. Numerical work suggests that, at low
attenuation, the relays are mostly clustered near the source in order to be
able to cooperate, whereas at high attenuation they are uniformly placed and
work as repeaters. We also prove that the effect of path-loss can be entirely
mitigated if a large enough number of relays are placed uniformly between the
source and the sink. The structure of the optimal power allocation for a given
placement of the nodes then motivates us to formulate the problem of as-you-go
placement of relays along a line of exponentially distributed length, and with
the exponential path-loss model, so as to minimize a cost function that is
additive over hops. The hop cost trades off a capacity limiting term, motivated
from the optimal power allocation solution, against the cost of adding a relay
node. We formulate the problem as a total cost Markov decision process,
establish results for the value function, and provide insights into the
placement policy and the performance of the deployed network via numerical
exploration.
Comment: 21 pages. arXiv admin note: substantial text overlap with
arXiv:1204.432
Sequential Decision Algorithms for Measurement-Based Impromptu Deployment of a Wireless Relay Network along a Line
We are motivated by the need, in some applications, for impromptu or
as-you-go deployment of wireless sensor networks. A person walks along a line,
starting from a sink node (e.g., a base-station), and proceeds towards a source
node (e.g., a sensor) which is at an a priori unknown location. At equally
spaced locations, he makes link quality measurements to the previous relay, and
deploys relays at some of these locations, with the aim of connecting the source
to the sink by a multihop wireless path. In this paper, we consider two
approaches for impromptu deployment: (i) the deployment agent can only move
forward (which we call a pure as-you-go approach), and (ii) the deployment
agent can make measurements over several consecutive steps before selecting a
placement location among them (which we call an explore-forward approach). We
consider a light traffic regime, and formulate the problem as a Markov decision
process, where the trade-off is among the power used by the nodes, the outage
probabilities in the links, and the number of relays placed per unit distance.
We obtain the structures of the optimal policies for the pure as-you-go
approach as well as for the explore-forward approach. We also consider natural
heuristic algorithms, for comparison. Numerical examples show that the
explore-forward approach significantly outperforms the pure as-you-go approach.
Next, we propose two learning algorithms for the explore-forward approach,
based on Stochastic Approximation, which asymptotically converge to the set of
optimal policies, without using any knowledge of the radio propagation model.
We demonstrate numerically that the learning algorithms can converge (as
deployment progresses) to the set of optimal policies reasonably fast and,
hence, can be practical, model-free algorithms for deployment over large
regions.
Comment: 29 pages. arXiv admin note: text overlap with arXiv:1308.068
Controlling overestimation of error covariance in ensemble Kalman filters with sparse observations: A variance limiting Kalman filter
We consider the problem of an ensemble Kalman filter when only partial
observations are available. In particular we consider the situation where the
observational space consists of variables which are directly observable with
known observational error, and of variables for which only the climatic
variance and mean are given. To limit the variance of the latter poorly
resolved variables we derive a variance limiting Kalman filter (VLKF) in a
variational setting. We analyze the variance limiting Kalman filter for a
simple linear toy model and determine its range of optimal performance. We
explore the variance limiting Kalman filter in an ensemble transform setting
for the Lorenz-96 system, and show that incorporating the information of the
variance of some unobservable variables can improve the skill and also
increase the stability of the data assimilation procedure.
Comment: 32 pages, 11 figure
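For readers unfamiliar with the Lorenz-96 system used as the test bed, its standard form is dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F with cyclic indices. The sketch below implements that tendency and a fourth-order Runge-Kutta step. It does not implement the variance limiting Kalman filter itself; the forcing F = 8, the 40 variables and the step size are common choices assumed here, not values taken from the abstract.

```python
def lorenz96_tendency(x, forcing=8.0):
    """Right-hand side of the Lorenz-96 equations with cyclic indices."""
    n = len(x)
    # Negative indices wrap around in Python, which handles x[i-1] and x[i-2].
    return [(x[(i + 1) % n] - x[i - 2]) * x[i - 1] - x[i] + forcing
            for i in range(n)]


def rk4_step(x, dt, forcing=8.0):
    """Advance the state by one classical Runge-Kutta (RK4) step."""
    def shifted(state, slope, scale):
        return [s + scale * k for s, k in zip(state, slope)]

    k1 = lorenz96_tendency(x, forcing)
    k2 = lorenz96_tendency(shifted(x, k1, dt / 2), forcing)
    k3 = lorenz96_tendency(shifted(x, k2, dt / 2), forcing)
    k4 = lorenz96_tendency(shifted(x, k3, dt), forcing)
    return [xi + dt / 6 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


# Start from the uniform state x_i = F and perturb one variable slightly.
state = [8.0] * 40
state[0] += 0.01
for _ in range(100):
    state = rk4_step(state, dt=0.01)
print(state[:3])
```

The uniform state x_i = F is an equilibrium of the equations, so a small perturbation of it is a conventional way to start a trajectory.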