Generalized Zero Shot Learning For Medical Image Classification
In many real world medical image classification settings we do not have
access to samples of all possible disease classes, while a robust system is
expected to give high performance in recognizing novel test data. We propose a
generalized zero shot learning (GZSL) method that uses self supervised learning
(SSL) for: 1) selecting anchor vectors of different disease classes; and 2)
training a feature generator. Our approach does not require class attribute
vectors which are available for natural images but not for medical images. SSL
ensures that the anchor vectors are representative of each class. SSL is also
used to generate synthetic features of unseen classes. Using a simpler
architecture, our method matches a state of the art SSL based GZSL method for
natural images and outperforms all methods for medical images. Our method is
adaptable enough to accommodate class attribute vectors when they are available
for natural images.
TET-GAN: Text Effects Transfer via Stylization and Destylization
Text effects transfer technology automatically makes the text dramatically
more impressive. However, previous style transfer methods either study the
model for general style, which cannot handle the highly-structured text effects
along the glyph, or require manual design of subtle matching criteria for text
effects. In this paper, we focus on the use of the powerful representation
abilities of deep neural features for text effects transfer. For this purpose,
we propose a novel Texture Effects Transfer GAN (TET-GAN), which consists of a
stylization subnetwork and a destylization subnetwork. The key idea is to train
our network to accomplish both the objective of style transfer and style
removal, so that it can learn to disentangle and recombine the content and
style features of text effects images. To support the training of our network,
we propose a new text effects dataset with as many as 64 professionally
designed styles on 837 characters. We show that the disentangled feature
representations enable us to transfer or remove all these styles on arbitrary
glyphs using one network. Furthermore, the flexible network design empowers
TET-GAN to efficiently extend to a new text style via one-shot learning where
only one example is required. We demonstrate the superiority of the proposed
method in generating high-quality stylized text over the state-of-the-art
methods.
Comment: Accepted by AAAI 2019. Code and dataset will be available at
http://www.icst.pku.edu.cn/struct/Projects/TETGAN.htm
Graceful Degradation and Related Fields
When machine learning models encounter data that is outside the distribution
on which they were trained, they tend to behave poorly, most prominently by
being over-confident in erroneous predictions. Such behaviours will have
disastrous effects on real-world machine learning systems. In this field,
graceful degradation refers to the optimisation of model performance as it
encounters this out-of-distribution data. This work presents a definition and
discussion of graceful degradation and where it can be applied in deployed
visual systems. Following this, a survey of relevant areas is undertaken,
novelly splitting the graceful degradation problem into active and passive
approaches. In passive approaches, graceful degradation is handled and achieved
by the model in a self-contained manner; in active approaches, the model is
updated upon encountering epistemic uncertainties. This work communicates the
importance of the problem and aims to prompt the development of machine
learning strategies that are aware of graceful degradation.
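
As a rough illustration of the over-confidence problem described in this abstract, the sketch below abstains when the top softmax probability is low. This is not the survey's method: maximum-softmax-probability thresholding is a common baseline swapped in here for illustration, and the function names and the 0.8 threshold are assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_or_abstain(logits, threshold=0.8):
    """Return (class_index, confidence), or (None, confidence) to abstain.

    The threshold is a hypothetical value; in practice it would be tuned
    on held-out data.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return None, probs[best]  # degrade gracefully: decline to predict
    return best, probs[best]


if __name__ == "__main__":
    print(predict_or_abstain([5.0, 0.0, 0.0]))  # confident input
    print(predict_or_abstain([1.0, 1.0, 1.0]))  # ambiguous input, abstains
```

Note that softmax confidence is itself often unreliable on out-of-distribution inputs, which is part of the motivation the abstract gives for studying the problem.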
See More and Know More: Zero-shot Point Cloud Segmentation via Multi-modal Visual Data
Zero-shot point cloud segmentation aims to make deep models capable of
recognizing novel objects in point clouds that are unseen in the training phase.
Recent trends favor the pipeline which transfers knowledge from seen classes
with labels to unseen classes without labels. They typically align visual
features with semantic features obtained from word embedding by the supervision
of seen classes' annotations. However, point clouds contain limited information
to fully match with semantic features. In fact, the rich appearance information
of images is a natural complement to the textureless point cloud, which is not
well explored in previous literature. Motivated by this, we propose a novel
multi-modal zero-shot learning method to better utilize the complementary
information of point clouds and images for more accurate visual-semantic
alignment. Extensive experiments are performed on two popular benchmarks, i.e.,
SemanticKITTI and nuScenes, and our method outperforms current SOTA methods
with 52% and 49% improvement on average for unseen class mIoU, respectively.
Comment: Accepted by ICCV 202
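
The "unseen class mIoU" metric quoted above can be sketched as follows. This is a minimal illustration of mean intersection-over-union restricted to a chosen set of classes, not the authors' evaluation code; the function name, the toy labels, and the choice to skip classes that appear in neither prediction nor ground truth are assumptions.

```python
def mean_iou(y_true, y_pred, classes):
    """Mean IoU over `classes`, given per-point true and predicted labels."""
    ious = []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        inter = sum(1 for t, p in pairs if t == c and p == c)
        union = sum(1 for t, p in pairs if t == c or p == c)
        if union > 0:  # assumption: ignore classes absent from both
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    # Toy per-point labels; classes 1 and 2 play the role of unseen classes.
    y_true = [0, 0, 1, 1, 2, 2]
    y_pred = [0, 1, 1, 1, 2, 0]
    print(mean_iou(y_true, y_pred, classes=[1, 2]))
```

Restricting `classes` to the unseen label set gives the unseen-class figure; passing all labels gives the overall mIoU.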