Conditional Random Fields and Support Vector Machines: A Hybrid Approach
We propose a novel hybrid loss for multiclass and structured prediction
problems that is a convex combination of log loss for Conditional Random Fields
(CRFs) and a multiclass hinge loss for Support Vector Machines (SVMs). We
provide a sufficient condition for when the hybrid loss is Fisher consistent
for classification. This condition depends on a measure of dominance between
labels - specifically, the gap in per-observation probabilities between the
most likely labels. We also prove that Fisher consistency is necessary for
parametric consistency when learning models such as CRFs.
We demonstrate empirically that the hybrid loss typically performs at least
as well as, and often better than, both of its constituent losses on a variety
of tasks. In doing so we also provide an empirical comparison of the efficacy
of probabilistic and margin-based approaches to multiclass and structured
prediction and the effects of label dominance on these results.
Comment: 16 pages, 3 figures
A Hybrid Loss for Multiclass and Structured Prediction
We propose a novel hybrid loss for multiclass and structured prediction
problems that is a convex combination of a log loss for Conditional Random
Fields (CRFs) and a multiclass hinge loss for Support Vector Machines (SVMs).
We provide a sufficient condition for when the hybrid loss is Fisher consistent
for classification. This condition depends on a measure of dominance between
labels--specifically, the gap between the probabilities of the best label and
the second best label. We also prove Fisher consistency is necessary for
parametric consistency when learning models such as CRFs. We demonstrate
empirically that the hybrid loss typically performs at least as well as, and often
better than, both of its constituent losses on a variety of tasks, such as
human action recognition. In doing so we also provide an empirical comparison
of the efficacy of probabilistic and margin-based approaches to multiclass and
structured prediction.
Comment: 12 pages, 5 figures. arXiv admin note: substantial text overlap with arXiv:1009.334
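The hybrid loss the two abstracts above describe is a convex combination of a CRF-style log loss and a multiclass hinge loss. A minimal sketch for the flat multiclass case is below; the function names, the Crammer-Singer form of the hinge, and the toy scores are illustrative assumptions, not the papers' implementation.

```python
import numpy as np

def log_loss(scores, y):
    """CRF-style log loss: -log p(y | x) under a softmax over class scores."""
    m = scores.max()                       # log-sum-exp for numerical stability
    logz = m + np.log(np.exp(scores - m).sum())
    return logz - scores[y]

def multiclass_hinge(scores, y):
    """Crammer-Singer multiclass hinge loss (an illustrative choice of hinge)."""
    margins = scores - scores[y] + 1.0
    margins[y] = 0.0
    return max(0.0, margins.max())

def hybrid_loss(scores, y, alpha=0.5):
    """Convex combination of the two constituent losses, 0 <= alpha <= 1."""
    return alpha * log_loss(scores, y) + (1.0 - alpha) * multiclass_hinge(scores, y)

scores = np.array([2.0, 0.5, -1.0])        # toy class scores for one observation
print(hybrid_loss(scores, 0, alpha=0.5))
```

Setting `alpha=1` recovers the pure CRF objective and `alpha=0` the pure SVM objective, which is what makes the empirical comparison between the two endpoints and their mixtures possible.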
Complementary-Label Learning for Arbitrary Losses and Models
In contrast to the standard classification paradigm where the true class is
given to each training pattern, complementary-label learning only uses training
patterns each equipped with a complementary label, which only specifies one of
the classes that the pattern does not belong to. The goal of this paper is to
derive a novel framework of complementary-label learning with an unbiased
estimator of the classification risk, for arbitrary losses and models; all
existing methods have failed to achieve this goal. Not only is this beneficial
for the learning stage, it also makes model/hyper-parameter selection (through
cross-validation) possible without the need for any ordinarily labeled
validation data, while using any linear/non-linear models or convex/non-convex
loss functions. We further improve the risk estimator by a non-negative
correction and gradient ascent trick, and demonstrate its superiority through
experiments.
Comment: accepted to ICML 2019 (Added errata on Nov. 19, 2019)
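The key idea of an unbiased complementary-label risk can be sketched as follows, assuming each complementary label is drawn uniformly from the K-1 wrong classes; the function names are illustrative and this is one standard form of the rewrite, not necessarily the paper's exact estimator.

```python
import numpy as np

def softmax_ce(scores, k):
    """Cross-entropy of softmax(scores) against class k."""
    m = scores.max()
    logz = m + np.log(np.exp(scores - m).sum())
    return logz - scores[k]

def complementary_risk(all_scores, comp_labels, num_classes):
    """Risk rewritten to use only complementary labels. Individual terms can
    be negative, which is why a non-negative correction helps in training."""
    total = 0.0
    for scores, ybar in zip(all_scores, comp_labels):
        sum_all = sum(softmax_ce(scores, k) for k in range(num_classes))
        total += -(num_classes - 1) * softmax_ce(scores, ybar) + sum_all
    return total / len(all_scores)

# Sanity check of unbiasedness: averaging the estimator over all possible
# complementary labels of one example recovers the ordinary loss on the true label.
scores, y, K = np.array([1.0, -0.5, 0.3]), 0, 3
avg = np.mean([complementary_risk([scores], [ybar], K) for ybar in range(K) if ybar != y])
print(avg, softmax_ce(scores, y))
```

Because the estimator depends only on generic per-class losses and model scores, it applies to arbitrary losses and models, which is the property the abstract emphasizes.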
Boosting in the presence of outliers: adaptive classification with non-convex loss functions
This paper examines the role and efficiency of non-convex loss functions
for binary classification problems. In particular, we investigate how to design
a simple and effective boosting algorithm that is robust to outliers in the
data. Whether a particular non-convex loss improves prediction accuracy
depends on the diminishing tail properties of the gradient of the loss (the
ability of the loss to adapt efficiently to outlying data), the local convexity
properties of the loss, and the proportion of contaminated data. To exploit
these properties, we propose a new family of non-convex losses named -robust
losses. Moreover, we present a new boosting framework, Arch Boost, designed to
augment existing work so that the corresponding classification algorithm is
significantly more adaptable to unknown data contamination. Along with the Arch Boosting
framework, the non-convex losses lead to a new class of boosting algorithms,
named adaptive robust boosting (ARB). Furthermore, we present theoretical
examples that demonstrate the robustness properties of the proposed algorithms.
In particular, we develop a new breakdown point analysis and a new influence
function analysis that demonstrate gains in robustness. Moreover, we present
new theoretical results, based only on local curvatures, which may be used to
establish statistical and optimization properties of the proposed Arch boosting
algorithms with highly non-convex loss functions. Extensive numerical
calculations are used to illustrate these theoretical properties and reveal
advantages over existing boosting methods when the data exhibit a number of
outliers.
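The "diminishing tail" property of a gradient can be seen by comparing a convex loss with a bounded non-convex one. The sketch below uses the Savage loss as an illustrative bounded non-convex margin loss (an assumption; the paper proposes its own family): its gradient vanishes for large negative margins, so grossly misclassified outliers stop driving the updates, whereas the logistic gradient does not.

```python
import numpy as np

def logistic_loss(margin):
    """Convex logistic loss; grows linearly for large negative margins."""
    return np.log1p(np.exp(-margin))

def savage_loss(margin):
    """A bounded, non-convex margin loss. Its gradient vanishes as the margin
    goes to -inf, so outlying points lose influence (the diminishing tail)."""
    return 1.0 / (1.0 + np.exp(margin)) ** 2

margins = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
print(savage_loss(margins))    # bounded above by 1
print(logistic_loss(margins))  # unbounded for negative margins
```

The trade-off the abstract analyzes is visible here: the bounded loss buys robustness at the price of non-convexity, which is why local curvature, rather than global convexity, drives the theoretical analysis.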
Collaborative Learning for Weakly Supervised Object Detection
Weakly supervised object detection has recently received much attention,
since it only requires image-level labels instead of the bounding-box labels
consumed in strongly supervised learning. Nevertheless, the savings in labeling
cost usually come at the expense of model accuracy. In this paper, we propose a
simple but effective weakly supervised collaborative learning framework to
resolve this problem, which trains a weakly supervised learner and a strongly
supervised learner jointly by enforcing partial feature sharing and prediction
consistency. For object detection, taking WSDDN-like architecture as weakly
supervised detector sub-network and Faster-RCNN-like architecture as strongly
supervised detector sub-network, we propose an end-to-end Weakly Supervised
Collaborative Detection Network. As there is no strong supervision available to
train the Faster-RCNN-like sub-network, a new prediction consistency loss is
defined to enforce consistency of predictions between the two sub-networks as
well as within the Faster-RCNN-like sub-networks. At the same time, the two
detectors are designed to partially share features to further guarantee the
model consistency at perceptual level. Extensive experiments on PASCAL VOC 2007
and 2012 data sets have demonstrated the effectiveness of the proposed
framework.
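A prediction consistency loss of the kind described can be sketched as a symmetric penalty between the two branches' class probabilities for the same regions. The L2 form and the toy probabilities below are illustrative assumptions; the paper's exact formulation may differ.

```python
import numpy as np

def prediction_consistency_loss(probs_weak, probs_strong):
    """Penalty on disagreement between the weakly supervised branch and the
    strongly supervised branch over the same image regions (rows = regions,
    columns = classes). Illustrative L2 form."""
    return float(np.mean((probs_weak - probs_strong) ** 2))

# Toy per-region class probabilities from the two detector sub-networks
probs_weak = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])
probs_strong = np.array([[0.6, 0.3, 0.1], [0.2, 0.7, 0.1]])
print(prediction_consistency_loss(probs_weak, probs_strong))
```

Since no box-level ground truth exists for the strongly supervised branch, this agreement term is what supplies its training signal, with partial feature sharing enforcing consistency at the representation level as well.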
CyCADA: Cycle-Consistent Adversarial Domain Adaptation
Domain adaptation is critical for success in new, unseen environments.
Adversarial adaptation models applied in feature spaces discover domain
invariant representations, but are difficult to visualize and sometimes fail to
capture pixel-level and low-level domain shifts. Recent work has shown that
generative adversarial networks combined with cycle-consistency constraints are
surprisingly effective at mapping images between domains, even without the use
of aligned image pairs. We propose a novel discriminatively-trained
Cycle-Consistent Adversarial Domain Adaptation model. CyCADA adapts
representations at both the pixel-level and feature-level, enforces
cycle-consistency while leveraging a task loss, and does not require aligned
pairs. Our model can be applied in a variety of visual recognition and
prediction settings. We show new state-of-the-art results across multiple
adaptation tasks, including digit classification and semantic segmentation of
road scenes, demonstrating transfer from synthetic to real-world domains.
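The cycle-consistency constraint at the heart of the method can be sketched as an L1 reconstruction penalty: mapping source to target and back should return the input, so no aligned image pairs are needed. The toy linear "generators" below stand in for the networks and are purely illustrative.

```python
import numpy as np

def cycle_consistency_loss(x_source, g_st, g_ts):
    """L1 cycle loss: source -> target -> source should reconstruct the input."""
    reconstructed = g_ts(g_st(x_source))
    return float(np.abs(reconstructed - x_source).mean())

# Toy 'generators': an invertible linear map standing in for the CNNs
g_st = lambda x: 2.0 * x + 1.0
g_ts = lambda x: (x - 1.0) / 2.0

x = np.random.rand(8, 8)                    # stand-in for a source image
print(cycle_consistency_loss(x, g_st, g_ts))  # ~0 for an exact inverse
```

In the full model this term is combined with adversarial losses at both the pixel and feature levels and with a task loss, so that the mapping preserves the semantics needed for the downstream task rather than just pixel statistics.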
Consistent Multilabel Ranking through Univariate Losses
We consider the problem of rank loss minimization in the setting of
multilabel classification, which is usually tackled by means of convex
surrogate losses defined on pairs of labels. Very recently, this approach was
put into question by a negative result showing that commonly used pairwise
surrogate losses, such as exponential and logistic losses, are inconsistent. In
this paper, we show a positive result which is arguably surprising in light of
the previous one: the simpler univariate variants of exponential and logistic
surrogates (i.e., defined on single labels) are consistent for rank loss
minimization. Instead of directly proving convergence, we give a much stronger
result by deriving regret bounds and convergence rates. The proposed losses
suggest efficient and scalable algorithms, which are tested experimentally.
Comment: Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012)
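The contrast between pairwise and univariate surrogates can be sketched directly: the univariate logistic surrogate is a sum of per-label binary losses, yet ranking labels by the resulting scores targets the pairwise rank loss. The toy labels and scores below are illustrative.

```python
import numpy as np

def univariate_logistic_surrogate(scores, labels):
    """Sum of per-label (univariate) logistic losses; labels in {-1, +1}.
    Each term involves a single label, unlike pairwise surrogates."""
    return float(np.log1p(np.exp(-labels * scores)).sum())

def rank_loss(scores, labels):
    """Number of (relevant, irrelevant) label pairs ranked in the wrong
    order, counting ties as half an error."""
    pos = scores[labels == 1]
    neg = scores[labels == -1]
    return float(sum((p < n) + 0.5 * (p == n) for p in pos for n in neg))

labels = np.array([1, -1, 1, -1])
scores = np.array([2.0, -1.0, 0.5, 1.5])
print(univariate_logistic_surrogate(scores, labels))
print(rank_loss(scores, labels))  # one misordered (relevant, irrelevant) pair
```

Lowering the surrogate by fixing the misordered label also lowers the rank loss here, which is the direction of the consistency result the abstract proves in general via regret bounds.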
Robust and Efficient Boosting Method using the Conditional Risk
Well-known for its simplicity and effectiveness in classification, AdaBoost,
however, suffers from overfitting when class-conditional distributions have
significant overlap. Moreover, it is very sensitive to noise that appears in
the labels. This article tackles the above limitations simultaneously via
optimizing a modified loss function (i.e., the conditional risk). The proposed
approach has the following two advantages. (1) It is able to directly take into
account label uncertainty with an associated label confidence. (2) It
introduces a "trustworthiness" measure on training samples via the Bayesian
risk rule, and hence the resulting classifier tends to have finite sample
performance that is superior to that of the original AdaBoost when there is a
large overlap between class conditional distributions. Theoretical properties
of the proposed method are investigated. Extensive experimental results using
synthetic data and real-world data sets from UCI machine learning repository
are provided. The empirical study shows the high competitiveness of the
proposed method in prediction accuracy and robustness when compared with the
original AdaBoost and several existing robust AdaBoost algorithms.
Comment: 14 pages, 2 figures and 5 tables
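One way to picture how a label-confidence measure enters a boosting loop is to let it temper the exponential reweighting, so low-confidence (possibly mislabeled) examples are up-weighted less aggressively. This is an illustrative sketch under that assumption, not the paper's exact conditional-risk update; all names and toy data are hypothetical.

```python
import numpy as np

def adaboost_confidence(X, y, confidence, stumps, rounds=10):
    """AdaBoost-style loop where each example carries a label-confidence
    weight in [0, 1] that tempers its reweighting, mitigating label noise."""
    w = np.ones(len(y)) / len(y)
    ensemble = []
    for _ in range(rounds):
        errs = [float(np.sum(w * (h(X) != y))) for h in stumps]
        best = int(np.argmin(errs))
        err = max(errs[best], 1e-12)
        if err >= 0.5:
            break
        alpha = 0.5 * np.log((1.0 - err) / err)
        ensemble.append((alpha, stumps[best]))
        correct = np.where(stumps[best](X) == y, 1.0, -1.0)
        w *= np.exp(-alpha * confidence * correct)  # confidence tempers the update
        w /= w.sum()
    return ensemble

def predict(ensemble, X):
    return np.sign(sum(a * h(X) for a, h in ensemble))

X = np.array([-2.0, -1.0, 1.0, 2.0])
y = np.array([-1.0, -1.0, 1.0, 1.0])
confidence = np.array([1.0, 1.0, 1.0, 0.5])  # the last label is less trusted
stumps = [lambda X, t=t: np.where(X > t, 1.0, -1.0) for t in (-1.5, 0.0, 1.5)]
model = adaboost_confidence(X, y, confidence, stumps)
print(predict(model, X))
```

With `confidence` identically 1 the update reduces to the standard AdaBoost exponential reweighting, so the modification only changes behavior where labels are uncertain.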
Fast Weakly Supervised Action Segmentation Using Mutual Consistency
Action segmentation is the task of predicting the actions for each frame of a
video. As obtaining the full annotation of videos for action segmentation is
expensive, weakly supervised approaches that can learn only from transcripts
are appealing. In this paper, we propose a novel end-to-end approach for weakly
supervised action segmentation based on a two-branch neural network. The two
branches of our network predict two redundant but different representations for
action segmentation and we propose a novel mutual consistency (MuCon) loss that
enforces the consistency of the two redundant representations. Using the MuCon
loss together with a loss for transcript prediction, our proposed approach
achieves the accuracy of state-of-the-art approaches while being substantially
faster to train and during inference. The MuCon loss proves beneficial even in
the fully supervised setting.
Comment: Accepted for publication at TPAMI (IEEE Transactions on Pattern Analysis and Machine Intelligence) in 2021. First two authors contributed equally.
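A mutual consistency penalty between two redundant representations can be sketched by expanding a segment-wise (class, length) prediction into frame-wise labels and penalizing disagreement with the frame-wise branch. The squared-error form and the toy data below are illustrative assumptions, not the paper's exact MuCon loss.

```python
import numpy as np

def expand_segments(segments, num_frames, num_classes):
    """Turn a (class, length) segment representation into frame-wise one-hot
    labels; segment lengths are rescaled to cover the whole video."""
    lengths = np.array([l for _, l in segments], dtype=float)
    lengths = np.round(lengths / lengths.sum() * num_frames).astype(int)
    lengths[-1] = num_frames - lengths[:-1].sum()  # absorb rounding drift
    frame_probs = np.zeros((num_frames, num_classes))
    t = 0
    for (c, _), l in zip(segments, lengths):
        frame_probs[t:t + l, c] = 1.0
        t += l
    return frame_probs

def mutual_consistency_loss(frame_probs, segments, num_classes):
    """Penalty on disagreement between the frame-wise branch and the
    segment-wise branch at the frame level (illustrative squared error)."""
    target = expand_segments(segments, len(frame_probs), num_classes)
    return float(np.mean((frame_probs - target) ** 2))

segments = [(0, 2.0), (1, 2.0)]                # (action class, predicted length)
frame_probs = expand_segments(segments, 4, 2)  # a perfectly consistent branch
print(mutual_consistency_loss(frame_probs, segments, 2))  # 0.0
```

Because both branches describe the same labeling, this agreement term provides a training signal even when only the transcript, not frame-level annotation, is available.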
Temporal Cycle-Consistency Learning
We introduce a self-supervised representation learning method based on the
task of temporal alignment between videos. The method trains a network using
temporal cycle consistency (TCC), a differentiable cycle-consistency loss that
can be used to find correspondences across time in multiple videos. The
resulting per-frame embeddings can be used to align videos by simply matching
frames using the nearest-neighbors in the learned embedding space.
To evaluate the power of the embeddings, we densely label the Pouring and
Penn Action video datasets for action phases. We show that (i) the learned
embeddings enable few-shot classification of these action phases, significantly
reducing the supervised training requirements; and (ii) TCC is complementary to
other methods of self-supervised learning in videos, such as Shuffle and Learn
and Time-Contrastive Networks. The embeddings are also used for a number of
applications based on alignment (dense temporal correspondence) between video
pairs, including transfer of metadata of synchronized modalities between videos
(sounds, temporal semantic labels), synchronized playback of multiple videos,
and anomaly detection. Project webpage:
https://sites.google.com/view/temporal-cycle-consistency
Comment: Accepted at CVPR 2019. Project webpage: https://sites.google.com/view/temporal-cycle-consistency
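The hard (non-differentiable) version of the cycle-consistency idea is easy to state: a frame of video A maps to its nearest neighbor in video B, and that frame's nearest neighbor in A should be the starting frame. The sketch below checks this with toy embeddings; TCC trains with a soft, differentiable relaxation of the same criterion, and all names and data here are illustrative.

```python
import numpy as np

def nearest(query, bank):
    """Index of the nearest embedding in `bank` (Euclidean distance)."""
    return int(np.argmin(np.linalg.norm(bank - query, axis=1)))

def cycle_consistent_fraction(emb_a, emb_b):
    """Fraction of frames of video A that cycle back to themselves:
    A_i -> nearest frame in B -> nearest frame in A should land on i."""
    hits = 0
    for i, e in enumerate(emb_a):
        j = nearest(e, emb_b)
        hits += (nearest(emb_b[j], emb_a) == i)
    return hits / len(emb_a)

# Toy per-frame embeddings of two roughly aligned videos
emb_a = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
emb_b = emb_a + 0.05
print(cycle_consistent_fraction(emb_a, emb_b))  # 1.0 for well-aligned embeddings
```

Once embeddings satisfy this property, aligning two videos reduces to nearest-neighbor matching of frames in the embedding space, which is exactly how the learned representations are used for the alignment applications listed above.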