A Robust Adaptive Stochastic Gradient Method for Deep Learning
Stochastic gradient algorithms have been the main focus of large-scale optimization
and have led to important successes in the recent advancement of deep
learning algorithms. The convergence of SGD depends on the careful choice of
learning rate and the amount of the noise in stochastic estimates of the
gradients. In this paper, we propose an adaptive learning rate algorithm, which
utilizes stochastic curvature information of the loss function for
automatically tuning the learning rates. The information about the element-wise
curvature of the loss function is estimated from the local statistics of the
stochastic first order gradients. We further propose a new variance reduction
technique to speed up the convergence. In our experiments with deep neural
networks, we obtained better performance compared to the popular stochastic
gradient algorithms. Comment: IJCNN 2017 accepted paper; an extension of our paper "ADASECANT:
Robust Adaptive Secant Method for Stochastic Gradient
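
The idea of estimating element-wise curvature from first-order gradient statistics can be illustrated with a minimal sketch. This is not the paper's algorithm: the function name `secant_sgd`, the moving-average scheme, and all constants are assumptions made for illustration, and the variance reduction step mentioned in the abstract is omitted. The sketch uses the secant relation (curvature is roughly the change in gradient divided by the change in parameter) to set a per-element step size.

```python
import numpy as np

def secant_sgd(grad_fn, theta0, steps=200, decay=0.9, eps=1e-8, lr0=0.01, max_lr=1.0):
    """Illustrative SGD with per-element step sizes from a secant curvature estimate.

    Hypothetical sketch only; it does not reproduce the update rules of the paper.
    """
    theta = np.asarray(theta0, dtype=float).copy()
    g_prev = grad_fn(theta)
    delta = -lr0 * g_prev              # plain gradient step to obtain a first secant pair
    theta = theta + delta
    avg_dtheta = np.zeros_like(theta)  # moving average of |change in parameter|
    avg_dg = np.zeros_like(theta)      # moving average of |change in gradient|
    for _ in range(steps):
        g = grad_fn(theta)
        dg = g - g_prev                # gradient change caused by the last step
        avg_dtheta = decay * avg_dtheta + (1.0 - decay) * np.abs(delta)
        avg_dg = decay * avg_dg + (1.0 - decay) * np.abs(dg)
        # Secant estimate: curvature ~ |dg| / |dtheta|, so step size ~ |dtheta| / |dg|.
        lr = np.minimum(avg_dtheta / (avg_dg + eps), max_lr)
        delta = -lr * g
        theta = theta + delta
        g_prev = g
    return theta
```

On a noise-free diagonal quadratic, the estimated step size approaches the inverse curvature of each coordinate, so badly scaled coordinates converge at similar speed. With noisy gradients the ratio is only a rough estimate, which is presumably why the abstract pairs the method with variance reduction.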
Optimistic Robust Optimization With Applications To Machine Learning
Robust Optimization has traditionally taken a pessimistic, or worst-case,
viewpoint of uncertainty, motivated by a desire to find sets of optimal
policies that maintain feasibility under a variety of operating conditions. In
this paper, we explore an optimistic, or best-case, view of uncertainty and show
that it can be a fruitful approach. We show that these techniques can be used
to address a wide variety of problems. First, we apply our methods in the
context of robust linear programming, providing a method for reducing
conservatism in intuitive ways that encode economically realistic modeling
assumptions. Second, we look at problems in machine learning and find that this
approach is strongly connected to the existing literature. Specifically, we
provide a new interpretation for popular sparsity-inducing non-convex
regularization schemes. Additionally, we show that successful approaches for
dealing with outliers and noise can be interpreted as optimistic robust
optimization problems. Although many of the problems resulting from our
approach are non-convex, we find that DCA or DCA-like optimization approaches
can be intuitive and efficient.
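
The contrast between the two viewpoints can be written generically as follows. The symbols (decision variable x in a feasible set X, uncertain parameter u in an uncertainty set U, objective f) are assumptions introduced here for illustration and do not come from the abstract.

```latex
\begin{align}
\text{pessimistic (worst case):}\quad & \min_{x \in X} \; \max_{u \in U} \; f(x, u) \\
\text{optimistic (best case):}\quad  & \min_{x \in X} \; \min_{u \in U} \; f(x, u)
\end{align}
```

Even when f(x, u) is convex in x for every fixed u, the inner minimum over u is a pointwise minimum of convex functions and is generally not convex in x, which is consistent with the abstract's remark that many of the resulting problems are non-convex.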
Generative Modeling through the Semi-dual Formulation of Unbalanced Optimal Transport
The Optimal Transport (OT) problem seeks a transport map that bridges two
distributions while minimizing a given cost function. In this regard, OT
between a tractable prior distribution and data has been utilized for generative
modeling tasks. However, OT-based methods are susceptible to outliers and face
optimization challenges during training. In this paper, we propose a novel
generative model based on the semi-dual formulation of Unbalanced Optimal
Transport (UOT). Unlike OT, UOT relaxes the hard constraint on distribution
matching. This approach provides better robustness against outliers, stability
during training, and faster convergence. We validate these properties
empirically through experiments. Moreover, we study the theoretical upper bound
on the divergence between distributions in UOT. Our model outperforms existing
OT-based generative models, achieving FID scores of 2.97 on CIFAR-10 and 5.80
on CelebA-HQ-256. Comment: 23 pages, 15 figures
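
One common way to write the relaxation described above is shown below, using the coupling (Kantorovich) form rather than the transport-map form. The notation is an assumption made for illustration: μ and ν are the source and target distributions, c is the cost, π_0 and π_1 are the marginals of the coupling π, and D_Ψ denotes a Csiszár divergence. The abstract does not state which divergences the paper uses, and the semi-dual form it builds on is not reproduced here.

```latex
\begin{align}
\text{OT:}\quad  & \inf_{\pi \in \Pi(\mu, \nu)} \int c(x, y)\, d\pi(x, y) \\
\text{UOT:}\quad & \inf_{\pi \in \mathcal{M}_+(\mathcal{X} \times \mathcal{Y})}
  \int c(x, y)\, d\pi(x, y)
  + D_{\Psi_1}(\pi_0 \,\|\, \mu) + D_{\Psi_2}(\pi_1 \,\|\, \nu)
\end{align}
```

In OT the marginals of π must equal μ and ν exactly. In UOT that hard constraint is replaced by divergence penalties, so mass far from the bulk of the data can be down-weighted at a finite cost.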
Optimal Pose and Shape Estimation for Category-level 3D Object Perception
We consider a category-level perception problem, where one is given 3D sensor
data picturing an object of a given category (e.g. a car), and has to
reconstruct the pose and shape of the object despite intra-class variability
(i.e. different car models have different shapes). We consider an active shape
model, where -- for an object category -- we are given a library of potential
CAD models describing objects in that category, and we adopt a standard
formulation where pose and shape estimation are formulated as a non-convex
optimization. Our first contribution is to provide the first certifiably
optimal solver for pose and shape estimation. In particular, we show that
rotation estimation can be decoupled from the estimation of the object
translation and shape, and we demonstrate that (i) the optimal object rotation
can be computed via a tight (small-size) semidefinite relaxation, and (ii) the
translation and shape parameters can be computed in closed form given the
rotation. Our second contribution is to add an outlier rejection layer to our
solver, hence making it robust to a large number of misdetections. Towards this
goal, we wrap our optimal solver in a robust estimation scheme based on
graduated non-convexity. To further enhance robustness to outliers, we also
develop the first graph-theoretic formulation to prune outliers in
category-level perception, which removes outliers via convex hull and maximum
clique computations; the resulting approach is robust to 70%-90% outliers. Our
third contribution is an extensive experimental evaluation. Besides providing
an ablation study on a simulated dataset and on the PASCAL3D+ dataset, we
combine our solver with a deep-learned keypoint detector, and show that the
resulting approach improves over the state of the art in vehicle pose
estimation in the ApolloScape datasets.
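
Graduated non-convexity, the robust estimation scheme mentioned above, can be sketched on a much simpler problem. The example below swaps the pose and shape solver for a weighted least-squares line fit and uses a Geman-McClure surrogate cost. The function name `gnc_line_fit`, the inlier threshold `c`, and the annealing factor are assumptions chosen for illustration, not values from the paper.

```python
import numpy as np

def gnc_line_fit(x, y, c=0.5, mu_factor=1.4, max_iters=100):
    """Robust fit of y ~ a*x + b using graduated non-convexity (illustrative sketch).

    Alternates a weighted least-squares solve with a closed-form weight update
    while annealing mu from a nearly convex surrogate down to the robust cost.
    """
    A = np.column_stack([x, np.ones_like(x)])
    # Start from the ordinary (non-robust) least-squares solution.
    params = np.linalg.lstsq(A, y, rcond=None)[0]
    r2 = (A @ params - y) ** 2
    # A large initial mu makes the surrogate cost close to plain least squares.
    mu = max(2.0 * r2.max() / c**2, 1.0)
    w = np.ones(len(x))
    for _ in range(max_iters):
        # Weight update for the Geman-McClure surrogate at the current mu.
        w = (mu * c**2 / (r2 + mu * c**2)) ** 2
        # Weighted least squares: scale rows by the square root of the weights.
        sw = np.sqrt(w)
        params = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)[0]
        r2 = (A @ params - y) ** 2
        if mu <= 1.0:
            break                          # mu = 1 corresponds to the original robust cost
        mu = max(mu / mu_factor, 1.0)      # gradually increase the non-convexity
    return params, w
```

The returned weights act as a soft inlier indicator: measurements with weights near zero were effectively rejected. In the setting of the abstract, the weighted least-squares step would be replaced by the certifiably optimal pose and shape solver.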