Self-Paced Multi-Task Learning
In this paper, we propose a novel multi-task learning (MTL) framework, called
Self-Paced Multi-Task Learning (SPMTL). Unlike previous works that treat
all tasks and instances equally during training, SPMTL attempts to jointly learn
the tasks by taking into consideration the complexities of both tasks and
instances. This is inspired by the cognitive process of the human brain, which
often learns from easy to hard. We construct a compact SPMTL formulation by
proposing a new task-oriented regularizer that jointly prioritizes the tasks
and the instances. The formulation can thus be interpreted as a self-paced learner for MTL.
A simple yet effective algorithm is designed for optimizing the proposed
objective function. An error bound for a simplified formulation is also
analyzed theoretically. Experimental results on toy and real-world datasets
demonstrate the effectiveness of the proposed approach compared to
state-of-the-art methods.
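As a rough, hypothetical illustration of the easy-to-hard idea behind SPMTL (not the paper's actual task-oriented regularizer, which is optimized jointly with the model), the following Python sketch assigns hard 0/1 self-paced weights at both the task level and the instance level; the function names and the thresholding scheme are assumptions made here for illustration only.

```python
import numpy as np

def spl_weights(losses, lam):
    """Hard self-paced weighting: weight 1 for 'easy' samples whose loss
    falls below the age parameter lam, weight 0 otherwise."""
    return (np.asarray(losses) < lam).astype(float)

def self_paced_mtl_step(task_losses, lam_task, lam_inst):
    """One illustrative reweighting step over tasks and their instances.

    task_losses : list of 1-D arrays, one per task, with per-instance losses.
    lam_task, lam_inst : age parameters that pace tasks and instances.
    A task is deemed easy by its mean loss; within easy tasks, instances
    are deemed easy by their own loss (hypothetical scheme).
    """
    mean_losses = np.array([l.mean() for l in task_losses])
    task_w = spl_weights(mean_losses, lam_task)
    inst_w = [w * spl_weights(l, lam_inst)
              for w, l in zip(task_w, task_losses)]
    return task_w, inst_w

# Growing the age parameters over iterations admits harder tasks/instances.
losses = [np.array([0.2, 0.9, 0.4]), np.array([1.5, 1.8])]
print(self_paced_mtl_step(losses, lam_task=1.0, lam_inst=0.5))
```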
Data Optimization in Deep Learning: A Survey
Large-scale, high-quality data are considered an essential factor for the
successful application of many deep learning techniques. Meanwhile, numerous
real-world deep learning tasks still have to contend with a lack of
sufficient high-quality data. Additionally, issues such as model
robustness, fairness, and trustworthiness are closely related to the training
data. Consequently, a large number of studies in the existing literature have
focused on the data aspect of deep learning tasks. Some typical data
optimization techniques include data augmentation, logit perturbation, sample
weighting, and data condensation. These techniques usually come from different
deep learning divisions and their theoretical inspirations or heuristic
motivations may seem unrelated to each other. This study aims to organize a
wide range of existing data optimization methodologies for deep learning from
the literature and to construct a comprehensive taxonomy for them. The
constructed taxonomy covers diverse splitting dimensions, with detailed
sub-taxonomies built for each dimension. On the basis of the taxonomy,
connections among the extensive data optimization methods for deep learning
are drawn in terms of four aspects, and we discuss several promising and
interesting future directions. The constructed taxonomy and the revealed
connections can foster a better understanding of existing methods and inform
the design of novel data optimization techniques.
Furthermore, our aspiration for this survey is to promote data optimization as
an independent subdivision of deep learning. A curated, up-to-date list of
resources related to data optimization in deep learning is available at
\url{https://github.com/YaoRujing/Data-Optimization}.
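To make a few of the technique families named above concrete, here is a minimal NumPy sketch (not taken from the survey; all helper names and parameter choices are hypothetical) showing toy instances of data augmentation, logit perturbation, and sample weighting.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(batch, noise_std=0.1):
    """Data augmentation: one trivial member of the family -- perturb
    inputs with Gaussian noise to enlarge the effective training set."""
    return batch + rng.normal(0.0, noise_std, size=batch.shape)

def perturb_logits(logits, eps=0.05):
    """Logit perturbation: inject a small random shift into the model's
    logits before the loss is computed."""
    return logits + rng.normal(0.0, eps, size=logits.shape)

def weighted_cross_entropy(logits, labels, weights):
    """Sample weighting: scale each example's loss by a per-example weight."""
    logits = logits - logits.max(axis=1, keepdims=True)           # stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    per_example = -log_probs[np.arange(len(labels)), labels]
    return float((weights * per_example).mean())

# Toy batch: 4 examples, 8 features, 3 classes.
x = rng.normal(size=(4, 8))
x_aug = augment(x)                          # augmented inputs
logits = rng.normal(size=(4, 3))            # stand-in model outputs
labels = np.array([0, 2, 1, 0])
weights = np.array([1.0, 0.5, 1.0, 2.0])    # e.g., up-weight clean/rare samples
print(weighted_cross_entropy(perturb_logits(logits), labels, weights))
```

Data condensation, the fourth family, distills a large training set into a much smaller synthetic one and does not reduce to a one-liner, so it is omitted from this sketch.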
GAGA: Deciphering Age-path of Generalized Self-paced Regularizer
Self-paced learning (SPL) is an important machine learning paradigm
that mimics the cognitive process of humans and animals. The SPL regime
involves a self-paced regularizer and a gradually increasing age parameter;
the age parameter plays a key role in SPL, yet where to optimally terminate
its growth remains non-trivial to determine. A natural idea is to compute the
solution path w.r.t. the age parameter (i.e., the age-path). However, current
age-path algorithms are either limited to the simplest regularizer or lack
solid theoretical understanding as well as computational efficiency. To
address this challenge,
we propose a novel \underline{G}eneralized \underline{Ag}e-path
\underline{A}lgorithm (GAGA) for SPL with various self-paced regularizers based
on ordinary differential equations (ODEs) and sets control, which can learn the
entire solution spectrum w.r.t. a range of age parameters. To the best of our
knowledge, GAGA is the first exact path-following algorithm tackling the
age-path for general self-paced regularizers. Finally, the algorithmic steps
for classic SVM and Lasso are described in detail. We demonstrate the
performance of GAGA on real-world datasets and find considerable speedups of
our algorithm over competing baselines.
Comment: 33 pages. Published as a conference paper at NeurIPS 2022.