
    Big Neural Networks Waste Capacity

    This article exposes the failure of some big neural networks to leverage added capacity to reduce underfitting. Past research suggests diminishing returns when increasing the size of neural networks. Our experiments on ImageNet LSVRC-2010 show that this may be due to highly diminishing returns for capacity in terms of training error, leading to underfitting. This suggests that the optimization method, first-order gradient descent, fails in this regime. Directly attacking this problem, through either the optimization method or the choice of parametrization, may make it possible to improve the generalization error on large datasets, for which a large capacity is required.
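The capacity/training-error pattern the abstract describes can be illustrated with a toy stand-in (not the paper's experiment): in nested least-squares polynomial models, added capacity keeps lowering training error, but each increment buys less.

```python
# Toy illustration (not the paper's ImageNet experiment): training error
# vs. model capacity, with polynomial degree standing in for network size.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 200)
y = np.sin(3 * x) + 0.1 * rng.standard_normal(200)  # noisy 1-D target

def train_error(degree):
    """Mean squared *training* error of a least-squares polynomial fit."""
    coeffs = np.polyfit(x, y, degree)
    return np.mean((y - np.polyval(coeffs, x)) ** 2)

# More capacity (higher degree) never hurts the training error, but each
# increment buys a smaller reduction -- diminishing returns.
errors = [train_error(d) for d in (1, 3, 5, 9, 13)]
```

For nested least-squares models the training error is guaranteed non-increasing in capacity; the paper's point is that for big networks trained by first-order gradient descent, even the training error stops improving much.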

    Evaluating Overfit and Underfit in Models of Network Community Structure

    A common data mining task on networks is community detection, which seeks an unsupervised decomposition of a network into structural groups based on statistical regularities in the network's connectivity. Although many methods exist, the No Free Lunch theorem for community detection implies that each makes some kind of tradeoff, and no algorithm can be optimal on all inputs. Thus, different algorithms will over- or underfit on different inputs, finding more, fewer, or just different communities than is optimal, and evaluation methods that use a metadata partition as a ground truth will produce misleading conclusions about general accuracy. Here, we present a broad evaluation of over- and underfitting in community detection, comparing the behavior of 16 state-of-the-art community detection algorithms on a novel and structurally diverse corpus of 406 real-world networks. We find that (i) algorithms vary widely both in the number of communities they find and in their corresponding composition, given the same input, (ii) algorithms can be clustered into distinct high-level groups based on similarities of their outputs on real-world networks, and (iii) these differences induce wide variation in accuracy on link prediction and link description tasks. We introduce a new diagnostic for evaluating overfitting and underfitting in practice, and use it to roughly divide community detection methods into general and specialized learning algorithms. Across methods and inputs, Bayesian techniques based on the stochastic block model and a minimum description length approach to regularization represent the best general learning approach, but can be outperformed under specific circumstances. These results introduce both a theoretically principled approach to evaluate over- and underfitting in models of network community structure and a realistic benchmark by which new methods may be evaluated and compared. Comment: 22 pages, 13 figures, 3 tables
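The disagreement between algorithms shows up even in a toy setting. The sketch below is hypothetical (not drawn from the paper's corpus or its 16 algorithms): two simple detection rules applied to the same small graph return different community counts, one underfitting relative to the other.

```python
# Hypothetical toy (not from the paper): two detection rules on one graph.
# Two 5-node cliques joined by a single bridge edge (4, 5).
edges = [(i, j) for i in range(5) for j in range(i + 1, 5)]
edges += [(i, j) for i in range(5, 10) for j in range(i + 1, 10)]
edges += [(4, 5)]
nodes = range(10)

def components(nodes, edges):
    """One community per connected component of the given edge set."""
    neighbors = {n: set() for n in nodes}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen, comps = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            n = stack.pop()
            if n not in comp:
                comp.add(n)
                stack.extend(neighbors[n] - comp)
        seen |= comp
        comps.append(comp)
    return comps

# Rule A: each connected component is one community -> underfits (finds 1).
rule_a = components(nodes, edges)

# Rule B: first drop edges whose endpoints share no common neighbour
# (a crude bridge detector), then take components -> recovers the 2 cliques.
adj = {n: set() for n in nodes}
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)
strong = [(a, b) for a, b in edges if (adj[a] & adj[b]) - {a, b}]
rule_b = components(nodes, strong)

print(len(rule_a), len(rule_b))  # 1 2
```

Both outputs are valid partitions of the same graph, yet they differ in number and composition of communities; the paper quantifies exactly this kind of disagreement, and its accuracy consequences, at scale.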

    Weak-lensing shear estimates with general adaptive moments, and studies of bias by pixellation, PSF distortions, and noise

    In weak gravitational lensing, weighted quadrupole moments of the brightness profile in galaxy images are a common way to estimate gravitational shear. We employ general adaptive moments (GLAM) to study causes of shear bias on a fundamental level and for a practical definition of an image ellipticity. The GLAM ellipticity has useful properties for any chosen weight profile: the weighted ellipticity is identical to that of isophotes of elliptical images, and in the absence of noise and pixellation it is always an unbiased estimator of reduced shear. We show that moment-based techniques, adaptive or unweighted, are similar to a model-based approach in the sense that they can be seen as an imperfect fit of an elliptical profile to the image. Due to residuals in the fit, moment-based estimates of ellipticities are prone to underfitting bias when inferred from observed images. The estimation is fundamentally limited mainly by pixellation, which destroys information on the original, pre-seeing image. We give an optimized estimator for the pre-seeing GLAM ellipticity and quantify its bias for noise-free images. To deal with pixel noise, we consider a Bayesian approach where the posterior of the GLAM ellipticity can be inconsistent with the true ellipticity if we do not properly account for our ignorance about fit residuals. This underfitting bias is S/N-independent but changes with the pre-seeing brightness profile and the correlation or heterogeneity of pixel noise over the post-seeing image. Furthermore, when inferring a constant ellipticity or, more relevantly, constant shear from a source sample with a distribution of intrinsic properties (sizes, centroid positions, intrinsic shapes), an additional, now noise-dependent bias arises at low S/N if incorrect priors for the intrinsic properties are used. We discuss the origin of this prior bias. Comment: 18 pages; 5 figures; accepted by A&A after major revision, especially of Sect. 3.3, which corrects the previous discussion on the bias by marginalization
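For readers unfamiliar with moment-based shape measurement, the following is a generic sketch of weighted quadrupole moments and the standard complex ellipticity chi = (Q11 - Q22 + 2i*Q12)/(Q11 + Q22); it is an illustrative implementation, not the paper's GLAM estimator or its weight choice.

```python
# Generic moment-based ellipticity (illustrative; not the GLAM estimator).
import numpy as np

def quadrupole_ellipticity(image, weight=None):
    """Complex ellipticity chi from (weighted) quadrupole moments."""
    if weight is None:
        weight = np.ones_like(image)
    w = weight * image
    y, x = np.mgrid[: image.shape[0], : image.shape[1]].astype(float)
    norm = w.sum()
    xbar, ybar = (w * x).sum() / norm, (w * y).sum() / norm
    q11 = (w * (x - xbar) ** 2).sum() / norm   # second moment along x
    q22 = (w * (y - ybar) ** 2).sum() / norm   # second moment along y
    q12 = (w * (x - xbar) * (y - ybar)).sum() / norm
    return (q11 - q22 + 2j * q12) / (q11 + q22)

# Noise-free elliptical Gaussian, sigma_x = 5 > sigma_y = 3:
# expect chi ~ (25 - 9) / (25 + 9) ~ 0.47, with zero imaginary part.
y, x = np.mgrid[:64, :64].astype(float)
img = np.exp(-((x - 32) ** 2 / (2 * 5.0 ** 2) + (y - 32) ** 2 / (2 * 3.0 ** 2)))
chi = quadrupole_ellipticity(img)
```

On this noise-free, well-sampled image the moments recover the analytic ellipticity almost exactly; the biases the paper studies arise precisely when pixellation, noise, and a non-trivial weight break this ideal case.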

    Improving PSF modelling for weak gravitational lensing using new methods in model selection

    A simple theoretical framework for the description and interpretation of spatially correlated modelling residuals is presented, and the resulting tools are found to provide a useful aid to model selection in the context of weak gravitational lensing. The description is focused upon the specific problem of modelling the spatial variation of a telescope point spread function (PSF) across the instrument field of view, a crucial stage in lensing data analysis, but the technique may be used to rank competing models wherever data are described empirically. As such it may, with further development, provide useful extra information when used in combination with existing model selection techniques such as the Akaike and Bayesian Information Criteria, or the Bayesian evidence. Two independent diagnostic correlation functions are described and the interpretation of these functions demonstrated using a simulated PSF anisotropy field. The efficacy of these diagnostic functions as an aid to the correct choice of empirical model is then demonstrated by analyzing results for a suite of Monte Carlo simulations of random PSF fields with varying degrees of spatial structure, and it is shown how the diagnostic functions can be related to requirements for precision cosmic shear measurement. The limitations of the technique, and opportunities for improvements and applications to fields other than weak gravitational lensing, are discussed. Comment: 18 pages, 12 figures. Modified to match version accepted for publication in MNRAS
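The model-selection problem the paper addresses can be sketched in miniature. The example below uses the Akaike Information Criterion on a simulated 1-D PSF anisotropy trend; it is an assumed toy setup, not the paper's diagnostic correlation functions, and the noise level and quadratic trend are invented for illustration.

```python
# Toy model selection for a PSF variation model (assumed setup, not the
# paper's method): pick the polynomial order via AIC on a simulated trend.
import numpy as np

rng = np.random.default_rng(1)
pos = np.linspace(0, 1, 120)                   # position across the field
true_e = 0.02 + 0.05 * pos - 0.04 * pos ** 2   # true quadratic anisotropy
obs_e = true_e + 0.002 * rng.standard_normal(pos.size)

def aic(degree):
    """AIC for a least-squares polynomial model of the PSF trend."""
    k = degree + 1                              # number of fitted parameters
    resid = obs_e - np.polyval(np.polyfit(pos, obs_e, degree), pos)
    n = pos.size
    # Gaussian log-likelihood up to a constant: n * log(RSS / n) + 2k.
    return n * np.log(np.mean(resid ** 2)) + 2 * k

scores = {d: aic(d) for d in range(6)}
best = min(scores, key=scores.get)
# Underfitting (degree < 2) leaves coherent structure in the residuals;
# overfitting (degree > 2) is penalized by the 2k term.
```

The paper's contribution is complementary to criteria like this: its diagnostic correlation functions detect the *spatially correlated* residual structure that a scalar criterion summarizes away.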

    Performance Evaluation of Channel Decoding With Deep Neural Networks

    With the demand for high data rates and low latency in fifth generation (5G) systems, the deep neural network decoder (NND) has become a promising candidate due to its capability of one-shot decoding and parallel computing. In this paper, three types of NND, i.e., multi-layer perceptron (MLP), convolutional neural network (CNN) and recurrent neural network (RNN), are proposed with the same parameter magnitude. The performance of these deep neural networks is evaluated through extensive simulation. Numerical results show that RNN has the best decoding performance, yet at the price of the highest computational overhead. Moreover, we find that there exists a saturation length for each type of neural network, which is caused by their restricted learning abilities. Comment: 6 pages, 11 figures, LaTeX; typos corrected; IEEE ICC 2018, to appear
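The training data for such a decoder can be sketched as follows. This is a hypothetical pipeline (not the paper's codes or network architectures): 4-bit messages are encoded with a systematic (7,4) Hamming code, BPSK modulated, and passed through an AWGN channel; the NND would then learn the map from noisy channel output back to message bits.

```python
# Hypothetical NND training-data pipeline: (7,4) Hamming + BPSK + AWGN.
import numpy as np

rng = np.random.default_rng(0)

# Systematic (7,4) Hamming generator matrix G = [I | P].
P = np.array([[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]])
G = np.hstack([np.eye(4, dtype=int), P])

def make_batch(n, snr_db):
    """Return (noisy channel output, message bits) training pairs."""
    msgs = rng.integers(0, 2, size=(n, 4))
    codewords = msgs @ G % 2
    symbols = 1.0 - 2.0 * codewords             # BPSK: bit 0 -> +1, bit 1 -> -1
    sigma = np.sqrt(0.5 / 10 ** (snr_db / 10))  # noise std for the chosen SNR
    received = symbols + sigma * rng.standard_normal(symbols.shape)
    return received, msgs

x, y = make_batch(1024, snr_db=4.0)
# x: (1024, 7) noisy symbols; y: (1024, 4) message bits to be predicted.
```

One-shot decoding then means a single forward pass from the 7 received symbols to the 4 message bits, in contrast to iterative classical decoders; the saturation-length finding in the paper concerns how this breaks down as the code length grows.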