
    Generalization Error in Deep Learning

    Deep learning models have recently shown strong performance in various fields such as computer vision, speech recognition, speech translation, and natural language processing. However, despite this state-of-the-art performance, the source of their generalization ability remains largely unclear. An important question is therefore what makes deep neural networks able to generalize well from the training set to new data. In this article, we provide an overview of the existing theory and bounds for the characterization of the generalization error of deep neural networks, combining both classical and more recent theoretical and empirical results.
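    As one illustration of the kind of bound surveyed in such work (a classical uniform-convergence result, not a result specific to this article): for a hypothesis class H with loss values in [0,1], with probability at least 1 - delta over an i.i.d. sample of size n,

    \[
      L(h) \;\le\; \hat{L}_n(h) \;+\; 2\,\mathfrak{R}_n(\mathcal{H}) \;+\; \sqrt{\frac{\ln(1/\delta)}{2n}}
      \qquad \text{for all } h \in \mathcal{H},
    \]

    where \hat{L}_n is the empirical risk and \mathfrak{R}_n(\mathcal{H}) is the Rademacher complexity of the class. For deep networks, much of the literature of this kind studies how such complexity terms scale with depth, width, and weight norms.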

    Structural Parameters, Tight Bounds, and Approximation for (k,r)-Center

    In (k,r)-Center we are given a (possibly edge-weighted) graph and are asked to select at most k vertices (centers), so that all other vertices are at distance at most r from a center. In this paper we provide a number of tight fine-grained bounds on the complexity of this problem with respect to various standard graph parameters. Specifically:
    - For any r >= 1, we show an algorithm that solves the problem in O*((3r+1)^cw) time, where cw is the clique-width of the input graph, as well as a tight SETH lower bound matching this algorithm's performance. As a corollary, for r = 1, this closes the gap that previously existed on the complexity of Dominating Set parameterized by cw.
    - We strengthen previously known FPT lower bounds, by showing that (k,r)-Center is W[1]-hard parameterized by the input graph's vertex cover (if edge weights are allowed), or feedback vertex set, even if k is an additional parameter. Our reductions imply tight ETH-based lower bounds. Finally, we devise an algorithm parameterized by vertex cover for unweighted graphs.
    - We show that the complexity of the problem parameterized by tree-depth is 2^Theta(td^2), by showing an algorithm of this complexity and a tight ETH-based lower bound.
    We complement these mostly negative results by providing FPT approximation schemes parameterized by clique-width or treewidth which work efficiently independently of the values of k and r. In particular, we give algorithms which, for any epsilon > 0, run in time O*((tw/epsilon)^O(tw)) and O*((cw/epsilon)^O(cw)), respectively, and return a (k,(1+epsilon)r)-center if a (k,r)-center exists, thus circumventing the problem's W-hardness.
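    A minimal sketch of the problem definition (not an algorithm from the paper): the Python snippet below checks whether a candidate set of at most k centers covers every vertex of an unweighted graph within distance r, using multi-source BFS. The adjacency-list representation and all names (distances_from, verify_kr_center) are illustrative assumptions.

    from collections import deque

    def distances_from(graph, sources):
        """Multi-source BFS: distance from the nearest source to every vertex."""
        dist = {v: None for v in graph}
        queue = deque()
        for s in sources:
            dist[s] = 0
            queue.append(s)
        while queue:
            u = queue.popleft()
            for w in graph[u]:
                if dist[w] is None:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        return dist

    def verify_kr_center(graph, centers, k, r):
        """Check |centers| <= k and that every vertex lies within distance r of a center."""
        if len(centers) > k:
            return False
        dist = distances_from(graph, centers)
        return all(d is not None and d <= r for d in dist.values())

    # Example: on a path with 5 vertices, {1, 3} is a (2, 1)-center.
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
    print(verify_kr_center(path, {1, 3}, k=2, r=1))  # True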

    On Some Integrated Approaches to Inference

    We present arguments for the formulation of a unified approach to different standard continuous inference methods from partial information. It is claimed that an explicit partition of information into a priori information (prior knowledge) and a posteriori information (data) is an important way of standardizing inference approaches, so that they can be compared on a normative scale and so that notions of optimal algorithms become farther-reaching. The inference methods considered include neural network approaches, information-based complexity, and Monte Carlo, spline, and regularization methods. The model is an extension of currently used continuous complexity models, with a class of algorithms in the form of optimization methods, in which an optimization functional (involving the data) is minimized. This extends the family of current approaches in continuous complexity theory, which include the use of interpolatory algorithms in worst and average case settings.
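    A minimal sketch (not the paper's model): one concrete instance of an "optimization functional involving the data" being minimized is Tikhonov/ridge regularization, where the a priori information is the model class and the penalty weight, and the a posteriori information is the observed data. All names below are illustrative assumptions.

    import numpy as np

    def ridge_fit(A, b, lam):
        """Minimize ||A x - b||^2 + lam * ||x||^2 via the normal equations."""
        n = A.shape[1]
        return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ b)

    # Example: noisy linear observations (the data); lam encodes prior knowledge.
    rng = np.random.default_rng(0)
    A = rng.normal(size=(50, 3))
    x_true = np.array([1.0, -2.0, 0.5])
    b = A @ x_true + 0.1 * rng.normal(size=50)
    print(ridge_fit(A, b, lam=0.1))  # close to x_true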