
    Second-Order Optimization for Non-Convex Machine Learning: An Empirical Study

    While first-order optimization methods such as stochastic gradient descent (SGD) are popular in machine learning (ML), they come with well-known deficiencies, including relatively slow convergence, sensitivity to the settings of hyper-parameters such as the learning rate, stagnation at high training errors, and difficulty in escaping flat regions and saddle points. These issues are particularly acute in highly non-convex settings such as those arising in neural networks. Motivated by this, there has been recent interest in second-order methods that aim to alleviate these shortcomings by capturing curvature information. In this paper, we report detailed empirical evaluations of a class of Newton-type methods, namely sub-sampled variants of trust region (TR) and adaptive regularization with cubics (ARC) algorithms, for non-convex ML problems. In doing so, we demonstrate that these methods are not only computationally competitive with hand-tuned SGD with momentum, obtaining comparable or better generalization performance, but also highly robust to hyper-parameter settings. Further, in contrast to SGD with momentum, we show that the manner in which these Newton-type methods employ curvature information allows them to seamlessly escape flat regions and saddle points.
    Comment: 21 pages, 11 figures. Restructured the paper and added experiments.
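    For reference, the subproblems these two Newton-type methods solve at each iterate are the classical ones; the notation below ($x_t$, $g_t$, $H_t$, $\Delta_t$, $\sigma_t$, $\mathcal{S}_t$) is standard usage assumed here rather than taken from the abstract, and in the sub-sampled variants the gradient may be sub-sampled as well:

        % Standard TR and ARC subproblems with a sub-sampled Hessian
        % (notation assumed, not the paper's own)
        \begin{align*}
          \text{TR:}  \quad & s_t \in \arg\min_{\|s\| \le \Delta_t} \; g_t^\top s + \tfrac{1}{2}\, s^\top H_t s, \\
          \text{ARC:} \quad & s_t \in \arg\min_{s} \; g_t^\top s + \tfrac{1}{2}\, s^\top H_t s + \tfrac{\sigma_t}{3}\, \|s\|^3, \\
          & \text{where } g_t = \nabla f(x_t), \qquad H_t = \frac{1}{|\mathcal{S}_t|} \sum_{i \in \mathcal{S}_t} \nabla^2 f_i(x_t).
        \end{align*}

    The TR radius $\Delta_t$ and the cubic weight $\sigma_t$ are adapted based on how well the local model predicted the actual decrease in $f$, which is why such methods tend to be far less sensitive to hyper-parameter settings than a fixed learning rate.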

    Reversible Recursive Instance-level Object Segmentation

    In this work, we propose a novel Reversible Recursive Instance-level Object Segmentation (R2-IOS) framework to address the challenging instance-level object segmentation task. R2-IOS consists of a reversible proposal refinement sub-network that predicts bounding box offsets for refining the object proposal locations, and an instance-level segmentation sub-network that generates the foreground mask of the dominant object instance in each proposal. By being recursive, R2-IOS iteratively optimizes the two sub-networks during joint training, in which the refined object proposals and improved segmentation predictions are alternately fed into each other to progressively increase the network capabilities. By being reversible, the proposal refinement sub-network adaptively determines an optimal number of refinement iterations required for each proposal during both training and testing. Furthermore, to handle multiple overlapping instances within a proposal, an instance-aware denoising autoencoder is introduced into the segmentation sub-network to distinguish the dominant object from other distracting instances. Extensive experiments on the challenging PASCAL VOC 2012 benchmark demonstrate the superiority of R2-IOS over other state-of-the-art methods. In particular, the $\text{AP}^r$ over the 20 classes at 0.5 IoU reaches 66.7\%, significantly outperforming the 58.7\% of PFN~\cite{PFN} and the 46.3\% of~\cite{liu2015multi}.
    Comment: 9 pages
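    To make the recursive/reversible control flow concrete, the following is a minimal Python sketch. The two sub-network functions are hypothetical stand-ins (in the paper they are jointly trained networks); only the alternation between refinement and segmentation and the per-proposal adaptive stopping gate follow the abstract.

        # Minimal sketch of the R2-IOS inference loop described above.
        # refine_proposal and segment_instance are hypothetical placeholders,
        # NOT the paper's sub-networks; only the control flow is illustrated.
        import numpy as np

        def refine_proposal(box, mask):
            """Placeholder proposal-refinement sub-network: returns a shifted
            box plus a gate deciding whether another iteration is worthwhile.
            (The real sub-network would condition on proposal features.)"""
            offsets = 0.1 * (np.random.rand(4) - 0.5)  # dummy bounding-box offsets
            keep_refining = np.random.rand() > 0.5     # dummy 'reversible' stop gate
            return box + offsets, keep_refining

        def segment_instance(box):
            """Placeholder segmentation sub-network: returns a foreground mask
            for the dominant instance inside the proposal."""
            return (np.random.rand(8, 8) > 0.5).astype(np.float32)

        def r2ios_inference(box, max_iters=4):
            """Alternate refinement and segmentation, feeding each output into
            the other; the gate adaptively stops refinement per proposal."""
            mask = segment_instance(box)
            for _ in range(max_iters):
                box, keep = refine_proposal(box, mask)
                mask = segment_instance(box)
                if not keep:  # adaptive number of refinement iterations
                    break
            return box, mask

        box, mask = r2ios_inference(np.array([10.0, 10.0, 50.0, 50.0]))
        print(box, mask.mean())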