
    Generalization Error in Deep Learning

    Deep learning models have recently shown strong performance in various fields such as computer vision, speech recognition, speech translation, and natural language processing. However, despite this state-of-the-art performance, the source of their generalization ability remains largely unclear. An important question is therefore what makes deep neural networks able to generalize well from the training set to new data. In this article, we provide an overview of the existing theory and bounds for the characterization of the generalization error of deep neural networks, combining both classical and more recent theoretical and empirical results.
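    As one illustration of the kind of bound surveyed in such work (a classical uniform-convergence result, not a result specific to this article): for a hypothesis class H with loss values in [0,1], with probability at least 1 - delta over an i.i.d. sample of size n,

    \[
      L(h) \;\le\; \hat{L}_n(h) \;+\; 2\,\mathfrak{R}_n(\mathcal{H}) \;+\; \sqrt{\frac{\ln(1/\delta)}{2n}}
      \qquad \text{for all } h \in \mathcal{H},
    \]

    where \hat{L}_n is the empirical risk and \mathfrak{R}_n(\mathcal{H}) is the Rademacher complexity of the class. For deep networks, much of the literature of this kind studies how such complexity terms scale with depth, width, and weight norms.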

    Structural Parameters, Tight Bounds, and Approximation for (k,r)-Center

    In (k,r)-Center we are given a (possibly edge-weighted) graph and are asked to select at most k vertices (centers), so that all other vertices are at distance at most r from a center. In this paper we provide a number of tight fine-grained bounds on the complexity of this problem with respect to various standard graph parameters. Specifically:
    - For any r >= 1, we show an algorithm that solves the problem in O*((3r+1)^cw) time, where cw is the clique-width of the input graph, as well as a tight SETH lower bound matching this algorithm's performance. As a corollary, for r = 1, this closes the gap that previously existed on the complexity of Dominating Set parameterized by cw.
    - We strengthen previously known FPT lower bounds, by showing that (k,r)-Center is W[1]-hard parameterized by the input graph's vertex cover (if edge weights are allowed), or feedback vertex set, even if k is an additional parameter. Our reductions imply tight ETH-based lower bounds. Finally, we devise an algorithm parameterized by vertex cover for unweighted graphs.
    - We show that the complexity of the problem parameterized by tree-depth is 2^Theta(td^2), by showing an algorithm of this complexity and a tight ETH-based lower bound.
    We complement these mostly negative results by providing FPT approximation schemes parameterized by clique-width or treewidth which work efficiently independently of the values of k and r. In particular, we give algorithms which, for any epsilon > 0, run in time O*((tw/epsilon)^O(tw)) and O*((cw/epsilon)^O(cw)), respectively, and return a (k,(1+epsilon)r)-center if a (k,r)-center exists, thus circumventing the problem's W-hardness.
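    A minimal sketch of the problem definition (not an algorithm from the paper): the Python snippet below checks whether a candidate set of at most k centers covers every vertex of an unweighted graph within distance r, using multi-source BFS. The adjacency-list representation and all names (distances_from, verify_kr_center) are illustrative assumptions.

    from collections import deque

    def distances_from(graph, sources):
        """Multi-source BFS: distance from the nearest source to every vertex."""
        dist = {v: None for v in graph}
        queue = deque()
        for s in sources:
            dist[s] = 0
            queue.append(s)
        while queue:
            u = queue.popleft()
            for w in graph[u]:
                if dist[w] is None:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        return dist

    def verify_kr_center(graph, centers, k, r):
        """Check |centers| <= k and that every vertex lies within distance r of a center."""
        if len(centers) > k:
            return False
        dist = distances_from(graph, centers)
        return all(d is not None and d <= r for d in dist.values())

    # Example: on a path with 5 vertices, {1, 3} is a (2, 1)-center.
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
    print(verify_kr_center(path, {1, 3}, k=2, r=1))  # True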

    On Some Integrated Approaches to Inference

    We present arguments for the formulation of a unified approach to different standard continuous inference methods from partial information. It is claimed that an explicit partition of information into a priori information (prior knowledge) and a posteriori information (data) is an important way of standardizing inference approaches, so that they can be compared on a normative scale and so that notions of optimal algorithms become farther-reaching. The inference methods considered include neural network approaches, information-based complexity, and Monte Carlo, spline, and regularization methods. The model is an extension of currently used continuous complexity models, with a class of algorithms in the form of optimization methods, in which an optimization functional (involving the data) is minimized. This extends the family of current approaches in continuous complexity theory, which include the use of interpolatory algorithms in worst and average case settings.
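    A minimal sketch (not the paper's model): one concrete instance of an "optimization functional involving the data" being minimized is Tikhonov/ridge regularization, where the a priori information is the model class and the penalty weight, and the a posteriori information is the observed data. All names below are illustrative assumptions.

    import numpy as np

    def ridge_fit(A, b, lam):
        """Minimize ||A x - b||^2 + lam * ||x||^2 via the normal equations."""
        n = A.shape[1]
        return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ b)

    # Example: noisy linear observations (the data); lam encodes prior knowledge.
    rng = np.random.default_rng(0)
    A = rng.normal(size=(50, 3))
    x_true = np.array([1.0, -2.0, 0.5])
    b = A @ x_true + 0.1 * rng.normal(size=50)
    print(ridge_fit(A, b, lam=0.1))  # close to x_true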