Online Adaptive Methods, Universality and Acceleration

Authors: Cevher Volkan, Levy Kfir Y., Yurtsever Alp
Publication date: 04/07/2018

We present a novel method for convex unconstrained optimization that, without any modifications, ensures: (i) an accelerated convergence rate for smooth objectives, (ii) a standard convergence rate in the general (non-smooth) setting, and (iii) a standard convergence rate in the stochastic optimization setting. To the best of our knowledge, this is the first method that simultaneously applies to all of the above settings. At the heart of our method is an adaptive learning rate rule that employs importance weights, in the spirit of adaptive online learning algorithms, combined with an update that linearly couples two sequences. An empirical examination of our method demonstrates its applicability to the above-mentioned scenarios and corroborates our theoretical findings.
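
As a rough illustration of what an adaptive learning rate rule can look like, the sketch below runs plain gradient descent with the well-known AdaGrad-norm step size, which shrinks as squared gradient norms accumulate. It is not the method of the paper: it omits the importance weights and the linear coupling of two sequences that the abstract describes. The function name `adagrad_norm_descent` and the parameters `diameter` and `eps` are assumptions made for this example.

```python
import math


def adagrad_norm_descent(grad, x0, diameter=1.0, eps=1e-12, steps=200):
    """Gradient descent with an AdaGrad-norm step size (illustrative only).

    grad     -- function returning the gradient (list of floats) at a point
    x0       -- starting point (list of floats)
    diameter -- assumed scale of the problem, used as the step-size numerator
    eps      -- small constant that keeps the square root strictly positive
    """
    x = list(x0)
    sum_sq = eps  # running sum of squared gradient norms
    for _ in range(steps):
        g = grad(x)
        sum_sq += sum(gi * gi for gi in g)
        eta = diameter / math.sqrt(sum_sq)  # step size adapts to past gradients
        x = [xi - eta * gi for xi, gi in zip(x, g)]
    return x


# Hypothetical smooth test objective: f(x) = 0.5 * ||x - target||^2
target = [1.0, -2.0]


def quadratic_grad(x):
    return [xi - ti for xi, ti in zip(x, target)]


solution = adagrad_norm_descent(quadratic_grad, [0.0, 0.0])
```

No step size is tuned by hand here: the rule adjusts itself from the observed gradients, which is the general idea behind adaptive methods of this kind.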

Infoscience - École polytechnique fédérale de Lausanne

Artificial Neural Network Pruning to Extract Knowledge

Author: Mirkes Evgeny M
Publication date: 13/05/2020

Artificial Neural Networks (NN) are widely used for solving complex problems, from medical diagnostics to face recognition. Despite notable successes, the main disadvantages of NN are also well known: the risk of overfitting, lack of explainability (inability to extract algorithms from trained NN), and high consumption of computing resources. Determining the appropriate NN structure for each problem can help overcome these difficulties: a NN that is too poor cannot be trained successfully, while a NN that is too rich gives unexplainable results and may have a high chance of overfitting. Reducing the precision of NN parameters simplifies the implementation of these NN, saves computing resources, and makes the NN skills more transparent. This paper lists the basic NN simplification problems and controlled pruning procedures to solve these problems. All the described pruning procedures can be implemented in one framework. The developed procedures, in particular, find the optimal structure of NN for each task, measure the influence of each input signal and NN parameter, and provide a detailed verbal description of the algorithms and skills of NN. The described methods are illustrated by a simple example: the generation of explicit algorithms for predicting the results of the US presidential election.
Comment: IJCNN 202
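
The two simplification ideas mentioned in the abstract, removing parameters and reducing their precision, can be sketched on a flat list of weights. The example swaps in simple magnitude-based pruning and rounding to a fixed grid; the paper's own controlled pruning procedures are not specified in this listing and may differ. The names `prune_by_magnitude` and `reduce_precision` and the sample weights are assumptions made for this example.

```python
def prune_by_magnitude(weights, fraction):
    """Zero out the given fraction of weights with the smallest absolute value."""
    k = int(len(weights) * fraction)  # number of weights to remove
    # Indices ordered from least to most influential, using |w| as a crude proxy
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:k]:
        pruned[i] = 0.0
    return pruned


def reduce_precision(weights, step):
    """Round every weight to the nearest multiple of `step`."""
    return [round(w / step) * step for w in weights]


# Hypothetical weights of a single neuron
weights = [0.9, -0.05, 0.3, -0.7]
sparse = prune_by_magnitude(weights, 0.5)   # drop the two smallest weights
coarse = reduce_precision(sparse, 0.5)      # keep only multiples of 0.5
```

Absolute value is only a rough stand-in for the influence of a parameter; a procedure that measures the actual effect of each input signal and parameter on the output, as the abstract describes, would rank them differently.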

arXiv.org e-Print Archive