Big Neural Networks Waste Capacity
This article exposes the failure of some big neural networks to leverage
added capacity to reduce underfitting. Past research suggests diminishing
returns when increasing the size of neural networks. Our experiments on
ImageNet LSVRC-2010 show that this may be due to the fact that there are highly
diminishing returns for capacity in terms of training error, leading to
underfitting. This suggests that the optimization method - first-order gradient
descent - fails in this regime. Directly attacking this problem, either through
the optimization method or the choice of parametrization, may make it possible
to improve the generalization error on large datasets, for which a large
capacity is required.
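The claimed effect - training error that keeps falling with capacity, but ever more slowly - can be illustrated on a toy problem. The sketch below is an assumed setup (nested random-feature regression, not the paper's ImageNet experiments): it fits models of growing width by least squares and records the training error at each size.

```python
import numpy as np

# Toy demonstration (assumed setup): nested random-ReLU-feature models of
# growing width, fit to a fixed training set by linear least squares.
rng = np.random.default_rng(0)
n, d = 200, 10
X = rng.normal(size=(n, d))
y = np.sin(X @ rng.normal(size=d)) + 0.1 * rng.normal(size=n)

W_full = rng.normal(size=(d, 256))      # shared feature bank => nested models

def train_error(width):
    H = np.maximum(X @ W_full[:, :width], 0.0)   # first `width` random features
    coef, *_ = np.linalg.lstsq(H, y, rcond=None)
    return float(np.mean((y - H @ coef) ** 2))

widths = [4, 16, 64, 256]
errs = [train_error(w) for w in widths]
print(errs)  # training error shrinks with capacity, but with diminishing returns
```

Because the feature sets are nested, training error can only decrease with width, yet each quadrupling of capacity typically buys a smaller reduction than the last - the diminishing-returns pattern the abstract describes.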
Evaluating Overfit and Underfit in Models of Network Community Structure
A common data mining task on networks is community detection, which seeks an
unsupervised decomposition of a network into structural groups based on
statistical regularities in the network's connectivity. Although many methods
exist, the No Free Lunch theorem for community detection implies that each
makes some kind of tradeoff, and no algorithm can be optimal on all inputs.
Thus, different algorithms will over- or underfit on different inputs, finding
more, fewer, or just different communities than is optimal, and evaluation
methods that use a metadata partition as a ground truth will produce misleading
conclusions about general accuracy. Here, we present a broad evaluation of
over- and underfitting in community detection, comparing the behavior of 16
state-of-the-art community detection algorithms on a novel and structurally
diverse corpus of 406 real-world networks. We find that (i) algorithms vary
widely both in the number of communities they find and in their corresponding
composition, given the same input, (ii) algorithms can be clustered into
distinct high-level groups based on similarities of their outputs on real-world
networks, and (iii) these differences induce wide variation in accuracy on link
prediction and link description tasks. We introduce a new diagnostic for
evaluating overfitting and underfitting in practice, and use it to roughly
divide community detection methods into general and specialized learning
algorithms. Across methods and inputs, Bayesian techniques based on the
stochastic block model and a minimum description length approach to
regularization represent the best general learning approach, but can be
outperformed under specific circumstances. These results introduce both a
theoretically principled approach to evaluating over- and underfitting in
models of network community structure and a realistic benchmark by which new
methods may be evaluated and compared. Comment: 22 pages, 13 figures, 3 tables
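As a hedged illustration of the link-prediction diagnostic idea - a toy two-block network and a plain spectral bisection, not any of the paper's 16 algorithms or its 406-network corpus - one can hold out edges, detect communities on the remainder, and check whether held-out edges fall within communities more often than non-edges do.

```python
import numpy as np

# Toy two-block network (assumed parameters) and a plain spectral bisection
# standing in for a community-detection algorithm.
rng = np.random.default_rng(1)
n = 60
truth = np.repeat([0, 1], n // 2)
p_in, p_out = 0.6, 0.02
A = np.zeros((n, n), dtype=int)
for i in range(n):
    for j in range(i + 1, n):
        if rng.random() < (p_in if truth[i] == truth[j] else p_out):
            A[i, j] = A[j, i] = 1

# Hold out ~20% of the edges; detect communities on the remaining graph.
edges = np.argwhere(np.triu(A, 1))
held = edges[rng.random(len(edges)) < 0.2]
A_train = A.copy()
for i, j in held:
    A_train[i, j] = A_train[j, i] = 0

vecs = np.linalg.eigh(A_train.astype(float))[1]
comm = vecs[:, -2] > 0                  # sign of the 2nd-leading eigenvector

def within_frac(pairs):
    # fraction of node pairs that land in the same detected community
    return float(np.mean([comm[i] == comm[j] for i, j in pairs]))

non_edges = [(i, j) for i in range(n) for j in range(i + 1, n) if A[i, j] == 0]
within_held, within_non = within_frac(held), within_frac(non_edges)
print(within_held, within_non)  # held-out edges mostly fall within communities
```

A partition that predicts held-out links well is neither over- nor underfitting the connectivity; a gap that shrinks toward zero would flag a partition carrying little link information.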
Weak-lensing shear estimates with general adaptive moments, and studies of bias by pixellation, PSF distortions, and noise
In weak gravitational lensing, weighted quadrupole moments of the brightness
profile in galaxy images are a common way to estimate gravitational shear. We
employ general adaptive moments (GLAM) to study causes of shear bias on a
fundamental level and for a practical definition of an image ellipticity. The
GLAM ellipticity has useful properties for any chosen weight profile: the
weighted ellipticity is identical to that of isophotes of elliptical images,
and in the absence of noise and pixellation it is always an unbiased estimator of
reduced shear. We show that moment-based techniques, adaptive or unweighted,
are similar to a model-based approach in the sense that they can be seen as an
imperfect fit of an elliptical profile to the image. Due to residuals in the
fit, moment-based estimates of ellipticities are prone to underfitting bias
when inferred from observed images. The estimation is fundamentally limited
mainly by pixellation which destroys information on the original, pre-seeing
image. We give an optimized estimator for the pre-seeing GLAM ellipticity and
quantify its bias for noise-free images. To deal with pixel noise, we consider
a Bayesian approach where the posterior of the GLAM ellipticity can be
inconsistent with the true ellipticity if we do not properly account for our
ignorance about fit residuals. This underfitting bias is S/N-independent but
changes with the pre-seeing brightness profile and the correlation or
heterogeneity of pixel noise over the post-seeing image. Furthermore, when
inferring a constant ellipticity or, more relevantly, constant shear from a
source sample with a distribution of intrinsic properties (sizes, centroid
positions, intrinsic shapes), an additional, now noise-dependent bias arises
towards low S/N if incorrect priors for the intrinsic properties are used. We
discuss the origin of this prior bias. Comment: 18 pages; 5 figures; accepted by A&A after major revision, especially
of Sect. 3.3, which corrects the previous discussion on the bias by
marginalization
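A minimal sketch of the underlying measurement, under an assumed toy setup (a noise-free, axis-aligned elliptical Gaussian, not the paper's GLAM weight or its bias analysis): compute unweighted quadrupole moments and form the chi-ellipticity, which for axis ratio q = b/a should equal (1 - q^2)/(1 + q^2).

```python
import numpy as np

# Assumed toy image: a noise-free, axis-aligned elliptical Gaussian with
# semi-axes a, b (in pixels), point-sampled on a 128x128 grid.
n = 128
yy, xx = np.mgrid[:n, :n].astype(float)
xc = yc = (n - 1) / 2.0
a, b = 8.0, 4.0
u, v = xx - xc, yy - yc
img = np.exp(-0.5 * ((u / a) ** 2 + (v / b) ** 2))

# Unweighted quadrupole moments Q_ij about the centroid.
flux = img.sum()
Qxx = (img * u * u).sum() / flux
Qyy = (img * v * v).sum() / flux
Qxy = (img * u * v).sum() / flux

# chi-ellipticity (polarization): chi = (Qxx - Qyy + 2i*Qxy) / (Qxx + Qyy)
chi = (Qxx - Qyy + 2j * Qxy) / (Qxx + Qyy)
q = b / a
expected = (1 - q ** 2) / (1 + q ** 2)
print(chi.real, expected)  # both close to 0.6 for q = 0.5
```

With a non-trivial weight profile, pixel-integrated (rather than point-sampled) images, or noise, the measured chi is biased away from this value - effects of the kind the paper quantifies.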
Improving PSF modelling for weak gravitational lensing using new methods in model selection
A simple theoretical framework for the description and interpretation of
spatially correlated modelling residuals is presented, and the resulting tools
are found to provide a useful aid to model selection in the context of weak
gravitational lensing. The description is focused upon the specific problem of
modelling the spatial variation of a telescope point spread function (PSF)
across the instrument field of view, a crucial stage in lensing data analysis,
but the technique may be used to rank competing models wherever data are
described empirically. As such it may, with further development, provide useful
extra information when used in combination with existing model selection
techniques such as the Akaike and Bayesian Information Criteria, or the
Bayesian evidence. Two independent diagnostic correlation functions are
described and the interpretation of these functions demonstrated using a
simulated PSF anisotropy field. The efficacy of these diagnostic functions as
an aid to the correct choice of empirical model is then demonstrated by
analyzing results for a suite of Monte Carlo simulations of random PSF fields
with varying degrees of spatial structure, and it is shown how the diagnostic
functions can be related to requirements for precision cosmic shear
measurement. The limitations of the technique, and opportunities for
improvements and applications to fields other than weak gravitational lensing,
are discussed. Comment: 18 pages, 12 figures. Modified to match version accepted for
publication in MNRAS
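The core idea - spatially correlated residuals flag an underfitting PSF model - can be sketched in one dimension. This is an assumed toy setup, not the paper's diagnostic correlation functions: fit polynomials of two different orders to a smooth "anisotropy" trend plus noise and compare the lag-1 autocorrelation of the residuals.

```python
import numpy as np

# Assumed toy setup: a smooth quadratic "anisotropy" trend across the field,
# plus white measurement noise; an underfitting linear model is compared with
# an adequate quadratic one via the lag-1 autocorrelation of the residuals.
rng = np.random.default_rng(2)
x = np.linspace(-1.0, 1.0, 400)
e_obs = 0.05 * x ** 2 - 0.02 * x + 0.005 * rng.normal(size=x.size)

def lag1_residual_corr(degree):
    fit = np.polyval(np.polyfit(x, e_obs, degree), x)
    r = e_obs - fit
    r = r - r.mean()
    return float(np.sum(r[:-1] * r[1:]) / np.sum(r * r))

corr_under = lag1_residual_corr(1)  # too-simple model: residuals keep the trend
corr_ok = lag1_residual_corr(2)     # adequate model: residuals ~ white noise
print(corr_under, corr_ok)
```

The underfitting model leaves the unmodelled trend in its residuals, so neighbouring residuals are strongly correlated; the adequate model leaves residuals consistent with white noise. Ranking candidate models by such residual correlations is the model-selection aid the abstract describes.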
Performance Evaluation of Channel Decoding With Deep Neural Networks
With the demand for high data rates and low latency in fifth generation (5G)
networks, the deep neural network decoder (NND) has become a promising candidate due to its
capability of one-shot decoding and parallel computing. In this paper, three
types of NND, i.e., multi-layer perceptron (MLP), convolutional neural network
(CNN) and recurrent neural network (RNN), are proposed with the same parameter
magnitude. The performance of these deep neural networks is evaluated through
extensive simulation. Numerical results show that RNN has the best decoding
performance, yet at the price of the highest computational overhead. Moreover,
we find there exists a saturation length for each type of neural network, which
is caused by their restricted learning abilities. Comment: 6 pages, 11 figures, LaTeX; typos corrected; IEEE ICC 2018 to appear
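For context, the mapping such a decoder learns can be written down exactly for a small code. The sketch below is an assumed toy baseline, not the paper's MLP/CNN/RNN decoders: one-shot maximum-likelihood decoding of the (7,4) Hamming code under BPSK and additive Gaussian noise - the received-vector-to-bits function an NND is trained to approximate.

```python
import numpy as np

# Assumed toy baseline: exhaustive maximum-likelihood (nearest-codeword)
# decoding of the (7,4) Hamming code under BPSK modulation and AWGN.
G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 1, 0, 1],
              [0, 0, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])
msgs = np.array([[(i >> k) & 1 for k in range(4)] for i in range(16)])
book = msgs @ G % 2                     # all 16 codewords
tx = 1.0 - 2.0 * book                   # BPSK: bit 0 -> +1, bit 1 -> -1

rng = np.random.default_rng(3)
m = msgs[rng.integers(16, size=1000)]   # random messages
s = 1.0 - 2.0 * (m @ G % 2)
y = s + 0.5 * rng.normal(size=s.shape)  # AWGN with an assumed noise level

# One-shot decision: pick the nearest codeword in Euclidean distance.
dist = ((y[:, None, :] - tx[None, :, :]) ** 2).sum(axis=2)
m_hat = msgs[dist.argmin(axis=1)]
ber = float((m_hat != m).mean())
print(ber)  # bit error rate, small at this noise level
```

Exhaustive search scales exponentially in code length, which is why a trained network that approximates this map in a single parallel forward pass is attractive for longer codes.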