17 research outputs found
Surrogate Losses for Online Learning of Stepsizes in Stochastic Non-Convex Optimization
Stochastic Gradient Descent (SGD) has played a central role in machine
learning. However, it requires a carefully hand-picked stepsize for fast
convergence, which is notoriously tedious and time-consuming to tune. Over the
last several years, a plethora of adaptive gradient-based algorithms have
emerged to ameliorate this problem. They have proved efficient in reducing the
labor of tuning in practice, but many of them lack theoretical guarantees even in
the convex setting. In this paper, we propose new surrogate losses to cast the
problem of learning the optimal stepsizes for the stochastic optimization of a
non-convex smooth objective function as an online convex optimization
problem. This allows the use of no-regret online algorithms to compute optimal
stepsizes on the fly. In turn, this results in an SGD algorithm with self-tuned
stepsizes that guarantees convergence rates that are automatically adaptive to
the level of noise
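
The idea of learning the stepsize with an online algorithm can be illustrated with a minimal sketch. The abstract does not state the surrogate loss, so the one below is an assumption: it comes from the standard smoothness bound f(x - eta*g) <= f(x) - eta*<grad f(x), g> + (M/2)*eta^2*|g|^2, with the true gradient replaced by a second, independent stochastic gradient g'. The toy objective, the function name `sgd_with_learned_stepsize`, and the `meta_lr` parameter are all hypothetical, and projected online gradient descent stands in for whichever no-regret algorithm the paper uses.

```python
import random


def sgd_with_learned_stepsize(x0=5.0, M=1.0, noise=0.1, steps=200, seed=0):
    """SGD on the toy objective f(x) = 0.5 * M * x**2 with an online-learned stepsize.

    Assumed surrogate (not taken from the paper):
        l_t(eta) = -eta * g * g' + 0.5 * M * eta**2 * g**2
    which is convex in eta.
    """
    rng = random.Random(seed)
    x, eta = x0, 0.0
    meta_lr = 0.01  # hypothetical learning rate of the stepsize learner
    for _ in range(steps):
        # Two independent stochastic gradients at the current point.
        g = M * x + rng.gauss(0.0, noise)
        g_prime = M * x + rng.gauss(0.0, noise)
        # Ordinary SGD step with the current stepsize.
        x -= eta * g
        # Gradient of the surrogate loss with respect to eta.
        d_eta = -g * g_prime + M * eta * g * g
        # Projected online gradient descent on eta, kept inside [0, 1/M].
        eta = min(max(eta - meta_lr * d_eta, 0.0), 1.0 / M)
    return x, eta


x_final, eta_final = sgd_with_learned_stepsize()
print(x_final, eta_final)
```

In this sketch the stepsize starts at zero and grows while the two gradient samples agree, then settles once noise dominates the gradient signal.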
Online Meta-Critic Learning for Off-Policy Actor-Critic Methods
Off-Policy Actor-Critic (Off-PAC) methods have proven successful in a variety
of continuous control tasks. Normally, the critic's action-value function is
updated using temporal-difference, and the critic in turn provides a loss for
the actor that trains it to take actions with higher expected return. In this
paper, we introduce a novel and flexible meta-critic that observes the learning
process and meta-learns an additional loss for the actor that accelerates and
improves actor-critic learning. Compared to the vanilla critic, the meta-critic
network is explicitly trained to accelerate the learning process; and compared
to existing meta-learning algorithms, meta-critic is rapidly learned online for
a single task, rather than slowly over a family of tasks. Crucially, our
meta-critic framework is designed for off-policy based learners, which
currently provide state-of-the-art reinforcement learning sample efficiency. We
demonstrate that online meta-critic learning leads to improvements in a variety
of continuous control environments when combined with contemporary Off-PAC
methods DDPG, TD3 and the state-of-the-art SAC. Comment: NeurIPS 202
Stateless actor-critic for instance segmentation with high-level priors
Instance segmentation is an important computer vision problem which remains
challenging despite impressive recent advances due to deep learning-based
methods. Given sufficient training data, fully supervised methods can yield
excellent performance, but annotation of ground-truth data remains a major
bottleneck, especially for biomedical applications where it has to be performed
by domain experts. The amount of labels required can be drastically reduced by
using rules derived from prior knowledge to guide the segmentation. However,
these rules are in general not differentiable and thus cannot be used with
existing methods. Here, we relax this requirement by using stateless
actor-critic reinforcement learning, which enables non-differentiable rewards. We
formulate the instance segmentation problem as graph partitioning, and the
actor-critic predicts the edge weights driven by the rewards, which are based on the
conformity of segmented instances to high-level priors on object shape,
position or size. The experiments on toy and real datasets demonstrate that we
can achieve excellent performance without any direct supervision based only on
a rich set of priors
Gradient-based Bi-level Optimization for Deep Learning: A Survey
Bi-level optimization, especially the gradient-based category, has been
widely used in the deep learning community including hyperparameter
optimization and meta-knowledge extraction. Bi-level optimization embeds one
problem within another and the gradient-based category solves the outer-level
task by computing the hypergradient, which is much more efficient than
classical methods such as the evolutionary algorithm. In this survey, we first
give a formal definition of the gradient-based bi-level optimization. Next, we
delineate criteria to determine if a research problem is apt for bi-level
optimization and provide a practical guide on structuring such problems into a
bi-level optimization framework, a feature particularly beneficial for those
new to this domain. More specifically, there are two formulations: the
single-task formulation to optimize hyperparameters such as regularization
parameters and the distilled data, and the multi-task formulation to extract
meta-knowledge such as the model initialization. With a bi-level formulation,
we then discuss four bi-level optimization solvers to update the outer variable
including explicit gradient update, proxy update, implicit function update, and
closed-form update. Finally, we wrap up the survey by highlighting two
prospective future directions: (1) Effective Data Optimization for Science
examined through the lens of task formulation. (2) Accurate Explicit Proxy
Update analyzed from an optimization standpoint. Comment: AI4Science; Bi-level Optimization; Hyperparameter Optimization; Meta
Learning; Implicit Functio
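
As a rough illustration of solving the outer-level task through a hypergradient, the sketch below differentiates through an unrolled inner gradient-descent loop. That is one common reading of an explicit gradient update, not necessarily the survey's formulation. The toy ridge problem, the constants `a` and `b`, and the function names `unrolled_hypergradient` and `tune_regularizer` are assumptions made for this example. The inner problem fits a scalar weight under a regularization parameter `lam`, and the outer problem tunes `lam` against a validation target.

```python
def unrolled_hypergradient(lam, a=2.0, b=1.0, inner_lr=0.1, inner_steps=100):
    """Return (w_K, dL_val/dlam) for a toy single-task bi-level problem.

    Inner: w*(lam) = argmin_w 0.5 * (w - a)**2 + 0.5 * lam * w**2
    Outer: L_val(lam) = 0.5 * (w*(lam) - b)**2
    """
    w, dw_dlam = 0.0, 0.0
    for _ in range(inner_steps):
        # Gradient of the inner (training) objective with respect to w.
        inner_grad = (w - a) + lam * w
        # Total derivative of that gradient with respect to lam (chain rule).
        dgrad_dlam = (1.0 + lam) * dw_dlam + w
        # Update the weight and its sensitivity to lam together.
        w, dw_dlam = w - inner_lr * inner_grad, dw_dlam - inner_lr * dgrad_dlam
    # Chain rule through the validation loss gives the hypergradient.
    return w, (w - b) * dw_dlam


def tune_regularizer(lam=0.0, outer_lr=0.2, outer_steps=300):
    """Gradient descent on the outer variable using the unrolled hypergradient."""
    for _ in range(outer_steps):
        _, hypergrad = unrolled_hypergradient(lam)
        lam -= outer_lr * hypergrad
    return lam


lam_opt = tune_regularizer()
print(lam_opt)
```

For this toy problem the inner solution has the closed form w*(lam) = a / (1 + lam), so the outer loop should approach lam = a / b - 1, which makes the sketch easy to check by hand.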
Adaptive partial scanning transmission electron microscopy with reinforcement learning
Compressed sensing can decrease scanning transmission electron microscopy electron dose and scan time with minimal information loss. Traditionally, sparse scans used in compressed sensing sample a static set of probing locations. In contrast, we present a prototype for a contiguous sparse scan system that piecewise adapts scan paths to specimens as they are scanned. Sampling directions for scan segments are chosen by a recurrent neural network based on previously observed scan segments. The recurrent actor is trained by reinforcement learning to cooperate with a feedforward convolutional neural network that completes sparse scans. This paper presents our learning policy, experiments, and example partial scans, and discusses future research directions. Source code, pretrained models, and training data are openly accessible at https://github.com/Jeffrey-Ede/adaptive-scans
Automatic Designs in Deep Neural Networks
To train a Deep Neural Network (DNN) that performs well for a task, many design steps are taken, including data design, model design and loss design. Although remarkable progress has been made in all these domains of designing DNNs, the unexplored design space of each component is still vast. This motivates research into automated techniques that lift some of the heavy work from human researchers when exploring the design space.
The automated designs can help human researchers to make massive or challenging design choices and reduce the expertise required from human researchers.
Much effort has been made towards automated designs of DNNs, including synthetic data generation, automated data augmentation, neural architecture search and so on.
Despite the huge effort, the automation of DNN designs is still far from complete.
This thesis contributes in two ways: identifying new problems in the DNN design pipeline that can be solved automatically, and proposing new solutions to problems that have been explored by automated designs.
The first part of this thesis presents two problems that were usually solved with manual designs but can benefit from automated designs. To tackle the problem of inefficient computation due to using a static DNN architecture for different inputs, some manual efforts have been made to use different networks for different inputs as needed, such as cascade models. We propose an automated dynamic inference framework that can cut this manual effort and automatically choose different architectures for different inputs during inference.
To tackle the problem of designing differentiable loss functions for non-differentiable performance metrics, researchers usually design the loss manually for each individual task.
We propose a unified loss framework that reduces the amount of manual design of losses in different tasks.
The second part of this thesis discusses developing new techniques in domains where automated design has been shown to be effective.
In the synthetic data generation domain, we propose a novel method to automatically generate synthetic data for small-data object detection. The generated synthetic data can supplement the limited annotated real data of small-data object detection tasks, such as rare disease detection.
In the architecture search domain, we propose an architecture search method customized for generative adversarial networks (GANs). GANs are known to be unstable to train, and the proposed method can stabilize their training during the architecture search process.
PHD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/163208/1/llanlan_1.pd