199 research outputs found

Training a 3-node neural network is NP-complete

Author: Avrim L. Blum
Ronald L. Rivest
Publication venue: 'Elsevier BV'

Incremental learning with respect to new incoming input attributes

Author: Guan SU
Li SC
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2001

Neural networks often operate in dynamic environments where training patterns or input attributes (features) are introduced into the current domain incrementally. This paper considers the situation where a new set of input attributes must be added to an existing neural network. The conventional method is to discard the existing network and redesign one from scratch, which wastes the old knowledge and the previous effort. To reduce computational time, improve generalization accuracy, and enhance the intelligence of the learned models, we present the ILIA algorithms (namely ILIA1, ILIA2, ILIA3, ILIA4 and ILIA5), which are capable of Incremental Learning in terms of Input Attributes. With the ILIA algorithms, when new input attributes are introduced into the original problem, the existing neural network can be retained while a new sub-network is constructed and trained incrementally. The new sub-network and the old one are later merged to form a new network for the changed problem. In addition, the ILIA algorithms can decide whether the new input attributes are relevant to the output and consistent with the existing input attributes, and suggest accepting or rejecting them accordingly. Experimental results show that the ILIA algorithms are efficient and effective for both classification and regression problems.
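
The retain-and-extend idea in this abstract can be illustrated with a minimal sketch. It is not any of the ILIA1 to ILIA5 algorithms, whose details the abstract does not give. It assumes single linear units in place of the networks, fits the new sub-model to the residual of the frozen old model, and uses a hypothetical relative-gain threshold (`min_gain`) as the accept/reject rule. All function and parameter names are illustrative.

```python
from statistics import mean

def fit_linear(xs, ys):
    """Least-squares fit of y ~ w*x + b for a single input attribute."""
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    w = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var if var else 0.0
    return w, my - w * mx

def mse(preds, ys):
    """Mean squared error between predictions and targets."""
    return mean((p - y) ** 2 for p, y in zip(preds, ys))

def extend_with_attribute(old_model, old_xs, new_xs, ys, min_gain=0.05):
    """Keep old_model frozen, fit a sub-model on the new attribute to the
    residual, and accept the merge only if the error drops by at least
    min_gain (relative). The threshold rule is an assumption of this sketch."""
    w_old, b_old = old_model
    old_preds = [w_old * x + b_old for x in old_xs]
    residuals = [y - p for y, p in zip(ys, old_preds)]
    w_new, b_new = fit_linear(new_xs, residuals)
    # Merged model: old prediction plus the new sub-model's correction.
    merged = [p + w_new * x + b_new for p, x in zip(old_preds, new_xs)]
    err_old, err_new = mse(old_preds, ys), mse(merged, ys)
    accept = err_old > 0 and (err_old - err_new) / err_old >= min_gain
    return accept, (w_new, b_new), err_old, err_new

# Toy data: the target depends on both the old and the new attribute.
old_xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
new_xs = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
ys = [2 * a + 3 * b for a, b in zip(old_xs, new_xs)]

old_model = fit_linear(old_xs, ys)  # trained before the new attribute existed
accept, sub_model, err_old, err_new = extend_with_attribute(
    old_model, old_xs, new_xs, ys)

# A constant attribute carries no information and should be rejected.
reject, _, _, _ = extend_with_attribute(old_model, old_xs, [1.0] * 6, ys)
```

On this toy data the informative attribute is accepted and the constant one is rejected. Because the old model stays frozen, the merged error is lower but not zero; retraining everything from scratch would fit better here, at the cost the abstract describes.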

CiteSeerX

Brunel University Research Archive

ScholarBank@NUS

BPGrad: Towards Global Optimality in Deep Learning via Branch and Pruning

Author: Wang Guanghui
Wu Yuanwei
Zhang Ziming
Publication date: 18/11/2017

Understanding global optimality in deep learning (DL) has been attracting more and more attention recently. Conventional DL solvers, however, were not designed to seek such global optimality. In this paper we propose a novel approximation algorithm, BPGrad, for optimizing deep models globally via branch and pruning. BPGrad is based on the assumption of Lipschitz continuity in DL; as a result, it can adaptively determine the step size for the current gradient given the history of previous updates, such that theoretically no smaller step can achieve global optimality. We prove that, by repeating this branch-and-pruning procedure, we can locate the global optimum within a finite number of iterations. Empirically, we also propose an efficient DL solver based on BPGrad, which outperforms conventional DL solvers such as Adagrad, Adadelta, RMSProp, and Adam on object recognition, detection, and segmentation tasks.
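
The step-size claim rests on a short Lipschitz argument, sketched below. The symbols are introduced here for illustration and are not taken from the abstract: the objective f is assumed to be L-Lipschitz, x_t is the current iterate, and c < f(x_t) is a hypothetical target value. The exact update rule used by BPGrad is not given in the abstract and is not reproduced here.

```latex
\[
|f(x) - f(x_t)| \le L\,\lVert x - x_t \rVert
\quad\Longrightarrow\quad
f(x) \le c \ \text{ only if } \ \lVert x - x_t \rVert \ge \frac{f(x_t) - c}{L}.
\]
```

Every point closer to x_t than (f(x_t) - c)/L therefore has an objective value above c, so that whole ball can be pruned and a shorter step cannot reach the target.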

arXiv.org e-Print Archive

Crossref

Elimination of All Bad Local Minima in Deep Learning

Author: Kaelbling Leslie Pack
Kawaguchi Kenji
Publication date: 15/01/2020

In this paper, we theoretically prove that adding one special neuron per output unit eliminates all suboptimal local minima of any deep neural network, for multi-class classification, binary classification, and regression with an arbitrary loss function, under practical assumptions. At every local minimum of any deep neural network with these added neurons, the set of parameters of the original neural network (without added neurons) is guaranteed to be a global minimum of the original neural network. The effects of the added neurons are proven to automatically vanish at every local minimum. Moreover, we provide a novel theoretical characterization of a failure mode of eliminating suboptimal local minima via an additional theorem and several examples. This paper also introduces a novel proof technique based on the perturbable gradient basis (PGB) necessary condition of local minima, which provides new insight into the elimination of local minima and can be applied to analyze various models and transformations of objective functions beyond the elimination of local minima.

Comment: Accepted to appear in AISTATS 202
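
The abstract does not say what form the added neuron takes. As a rough sketch only, the code below assumes an exponential unit `a * exp(w.x + b)` added to a single output, with a quadratic penalty `lam * a**2` on its output weight, which is one construction used in this line of work. The squared-error loss, the function names, and the value of `lam` are all hypothetical choices for illustration.

```python
import math

def augmented_output(f_out, x, a, w, b):
    """Original output plus one added exponential unit: f(x) + a * exp(w.x + b)."""
    return f_out + a * math.exp(sum(wi * xi for wi, xi in zip(w, x)) + b)

def original_loss(data, f):
    """Mean squared error of the original model f on (x, y) pairs."""
    return sum((f(x) - y) ** 2 for x, y in data) / len(data)

def augmented_loss(data, f, a, w, b, lam=0.1):
    """Mean squared error of the augmented model plus the penalty lam * a**2."""
    err = sum((augmented_output(f(x), x, a, w, b) - y) ** 2
              for x, y in data) / len(data)
    return err + lam * a ** 2

# Toy one-output "network" and data, for illustration only.
f = lambda x: 0.5 * x[0] - 0.25 * x[1]
data = [([1.0, 2.0], 0.3), ([0.0, 1.0], -0.1), ([2.0, 0.5], 1.0)]

# With a = 0 the added unit contributes nothing, so the two losses coincide.
loss_plain = original_loss(data, f)
loss_aug_at_zero = augmented_loss(data, f, a=0.0, w=[0.3, -0.2], b=0.1)
```

Setting `a = 0` switches the added unit off, which is the sense in which its effect can vanish while the original parameters are left unchanged.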

arXiv.org e-Print Archive

DSpace@MIT