35 research outputs found
CASENet: Deep Category-Aware Semantic Edge Detection
Boundary and edge cues are highly beneficial in improving a wide variety of
vision tasks such as semantic segmentation, object recognition, stereo, and
object proposal generation. Recently, the problem of edge detection has been
revisited and significant progress has been made with deep learning. While
classical edge detection is a challenging binary problem in itself, the
category-aware semantic edge detection by nature is an even more challenging
multi-label problem. We model the problem such that each edge pixel can be
associated with more than one class as they appear in contours or junctions
belonging to two or more semantic classes. To this end, we propose a novel
end-to-end deep semantic edge learning architecture based on ResNet and a new
skip-layer architecture where category-wise edge activations at the top
convolution layer share and are fused with the same set of bottom layer
features. We then propose a multi-label loss function to supervise the fused
activations. We show that our proposed architecture benefits this problem with
better performance, and we outperform the current state-of-the-art semantic
edge detection methods by a large margin on standard data sets such as SBD and
Cityscapes.
Comment: Accepted to CVPR 2017
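As a rough illustration of the multi-label formulation above, the Python sketch below computes an independent, class-balanced sigmoid cross-entropy per category and pixel, so a single edge pixel can carry positive labels for several classes. The tensor shapes, the PyTorch framing, and the balancing weight beta are illustrative assumptions, not the exact CASENet recipe.

import torch
import torch.nn.functional as F

def multilabel_edge_loss(fused_logits, edge_labels):
    """Illustrative multi-label edge loss.

    fused_logits: (N, K, H, W) fused category-wise edge activations.
    edge_labels:  (N, K, H, W) binary maps; a pixel may be 1 for several
                  classes, since contours can border two or more regions.
    """
    # Re-weight positives to counter the scarcity of edge pixels
    # (an assumed balancing scheme, not necessarily the paper's exact one).
    num_pos = edge_labels.sum()
    num_neg = edge_labels.numel() - num_pos
    beta = num_neg / (num_pos + num_neg)
    weights = torch.where(edge_labels > 0.5, beta, 1.0 - beta)
    # Independent sigmoid cross-entropy per class and pixel: this is what
    # makes the problem multi-label rather than multi-class.
    return F.binary_cross_entropy_with_logits(
        fused_logits, edge_labels, weight=weights)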
KCRC-LCD: Discriminative Kernel Collaborative Representation with Locality Constrained Dictionary for Visual Categorization
We consider the image classification problem via kernel collaborative
representation classification with locality constrained dictionary (KCRC-LCD).
Specifically, we propose a kernel collaborative representation classification
(KCRC) approach in which the kernel method is used to improve the discrimination
ability of collaborative representation classification (CRC). We then measure
the similarities between the query and atoms in the global dictionary in order
to construct a locality constrained dictionary (LCD) for KCRC. In addition, we
discuss several similarity measure approaches in LCD and further present a
simple yet effective unified similarity measure whose superiority is validated
in experiments. There are several appealing aspects associated with LCD. First,
LCD can be nicely incorporated under the framework of KCRC. The LCD similarity
measure can be kernelized under KCRC, which theoretically links CRC and LCD
under the kernel method. Second, KCRC-LCD becomes more scalable to both the
training set size and the feature dimension. An illustrative example shows that
KCRC can perfectly classify data with a certain distribution on which
conventional CRC fails completely. Comprehensive experiments on many public datasets also show that
KCRC-LCD is a robust discriminative classifier with both excellent performance
and good scalability, comparable to or outperforming many other
state-of-the-art approaches.
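To make the kernel collaborative representation step concrete, here is a minimal NumPy sketch of plain KCRC (without the locality-constrained dictionary): the query is coded against the whole dictionary in feature space and assigned to the class with the smallest kernel-space reconstruction residual. The RBF kernel, the regularization value, and the function names are illustrative assumptions, not the paper's exact setup.

import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Gram matrix of the RBF kernel k(a, b) = exp(-gamma * ||a - b||^2).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kcrc_predict(X, labels, y, lam=1e-2, gamma=1.0):
    """Classify query y by kernel collaborative representation.

    X:      (n, d) dictionary atoms (training samples).
    labels: (n,) class label of each atom.
    y:      (d,) query sample.
    """
    K = rbf_kernel(X, X, gamma)                   # atom-atom Gram matrix
    k_y = rbf_kernel(X, y[None, :], gamma)[:, 0]  # atom-query kernel vector
    # Collaborative coding in feature space: (K + lam*I) alpha = k(., y).
    alpha = np.linalg.solve(K + lam * np.eye(len(X)), k_y)
    best, best_res = None, np.inf
    for c in np.unique(labels):
        idx = labels == c
        a_c = alpha[idx]
        # Squared residual ||phi(y) - Phi_c a_c||^2, up to the constant k(y, y).
        res = a_c @ K[np.ix_(idx, idx)] @ a_c - 2.0 * a_c @ k_y[idx]
        if res < best_res:
            best, best_res = c, res
    return best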
Distributionally Robust Learning for Unsupervised Domain Adaptation
We propose a distributionally robust learning (DRL) method for unsupervised domain adaptation (UDA) that scales to modern computer vision benchmarks. DRL can be naturally formulated as a competitive two-player game between a predictor and an adversary that is allowed to corrupt the labels, subject to certain constraints, and it reduces to incorporating a density ratio between the source and target domains (under the standard log loss). This formulation motivates the use of two jointly trained neural networks: a discriminative network between the source and target domains for density-ratio estimation, in addition to the standard classification network. The use of a density ratio in DRL prevents the model from being overconfident on target inputs far away from the source domain. Thus, DRL provides conservative confidence estimation in the target domain, even when the target labels are not available. This conservatism motivates the use of DRL for sample selection in self-training, and we term the approach distributionally robust self-training (DRST). In our experiments, DRST generates more calibrated probabilities and achieves state-of-the-art self-training accuracy on benchmark datasets. We demonstrate that DRST captures shape features more effectively and reduces the extent of distributional shift during self-training.
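One common way to obtain such a density ratio is to read it off a source-vs-target domain discriminator: with balanced domain priors, p_target(x) / p_source(x) = d(x) / (1 - d(x)), where d(x) is the discriminator's probability that x comes from the target domain. The sketch below only illustrates that identity with logistic regression on frozen features; DRL itself trains the discriminative network jointly with the classifier, and exactly where the ratio enters the loss depends on the formulation.

import numpy as np
from sklearn.linear_model import LogisticRegression

def density_ratio(src_feats, tgt_feats, query_feats):
    """Estimate p_target(x) / p_source(x) with a domain discriminator.

    All arguments are (n_i, d) feature arrays; the jointly trained
    discriminator network is replaced here by logistic regression purely
    for a self-contained illustration.
    """
    X = np.vstack([src_feats, tgt_feats])
    # Domain labels: 0 = source, 1 = target.
    y = np.concatenate([np.zeros(len(src_feats)), np.ones(len(tgt_feats))])
    disc = LogisticRegression(max_iter=1000).fit(X, y)
    p_tgt = disc.predict_proba(query_feats)[:, 1]
    # With balanced domain priors, p_t(x)/p_s(x) = d(x) / (1 - d(x)).
    return p_tgt / np.clip(1.0 - p_tgt, 1e-6, None)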
TimeMAE: Self-Supervised Representations of Time Series with Decoupled Masked Autoencoders
Enhancing the expressive capacity of deep learning-based time series models
with self-supervised pre-training has become increasingly prevalent in
time series classification. Even though numerous efforts have been devoted to
developing self-supervised models for time series data, we argue that the
current methods are not sufficient to learn optimal time series representations
because they rely solely on unidirectional encoding over sparse point-wise input units. In
this work, we propose TimeMAE, a novel self-supervised paradigm for learning
transferable time series representations based on transformer networks. The
distinct characteristics of TimeMAE lie in processing each time series into
a sequence of non-overlapping sub-series via window-slicing partitioning,
followed by random masking strategies over the semantic units of localized
sub-series. Such a simple yet effective setting can help us achieve the goal of
killing three birds with one stone, i.e., (1) learning enriched contextual
representations of time series with a bidirectional encoding scheme; (2)
increasing the information density of basic semantic units; (3) efficiently
encoding representations of time series using transformer networks.
Nevertheless, it is non-trivial to perform the reconstruction task over such a
newly formulated modeling paradigm. To solve the discrepancy issue incurred by
newly injected masked embeddings, we design a decoupled autoencoder
architecture, which learns the representations of visible (unmasked) positions
and masked ones with two different encoder modules, respectively. Furthermore,
we construct two types of informative targets to accomplish the corresponding
pretext tasks. One is to create a tokenizer module that assigns a codeword to
each masked region, allowing the masked codeword classification (MCC) task to
be completed effectively...Comment: Submitted to IEEE TRANSACTIONS ON KNOWLEDGE AND DATA
ENGINEERING(TKDE), under revie
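As an illustration of the input processing described above, the sketch below slices a univariate series into non-overlapping sub-series and randomly masks a subset of them; the window length and mask ratio are illustrative defaults, not TimeMAE's reported settings.

import numpy as np

def slice_and_mask(series, window=8, mask_ratio=0.6, seed=0):
    """Turn a 1-D series into non-overlapping sub-series and mask some of them.

    Returns (windows, mask): windows has shape (n_windows, window), and
    mask[i] is True for sub-series hidden from the visible-position encoder.
    """
    rng = np.random.default_rng(seed)
    n_windows = len(series) // window            # drop the trailing remainder
    windows = series[: n_windows * window].reshape(n_windows, window)
    # Randomly choose which localized sub-series (semantic units) to mask.
    n_masked = int(round(mask_ratio * n_windows))
    masked_idx = rng.choice(n_windows, size=n_masked, replace=False)
    mask = np.zeros(n_windows, dtype=bool)
    mask[masked_idx] = True
    return windows, mask

# Example: a toy series of length 64 split into 8 sub-series, about 5 of them masked.
windows, mask = slice_and_mask(np.sin(np.linspace(0.0, 6.0, 64)))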