Adversarial Discriminative Sim-to-real Transfer of Visuo-motor Policies
Various approaches have been proposed to learn visuo-motor policies for
real-world robotic applications. One solution is first learning in simulation
then transferring to the real world. In the transfer, most existing approaches
need real-world images with labels. However, the labelling process is often
expensive or even impractical in many robotic applications. In this paper, we
propose an adversarial discriminative sim-to-real transfer approach to reduce
the cost of labelling real data. The effectiveness of the approach is
demonstrated with modular networks in a table-top object reaching task where a
7 DoF arm is controlled in velocity mode to reach a blue cuboid in clutter
through visual observations. The adversarial transfer approach reduced the
labelled real data requirement by 50%. Policies can be transferred to real
environments with only 93 labelled and 186 unlabelled real images. The
transferred visuo-motor policies are robust to novel (not seen in training)
objects in clutter and even a moving target, achieving a 97.8% success rate and
1.8 cm control accuracy.
Comment: Under review for the International Journal of Robotics Research
Graceful Degradation and Related Fields
When machine learning models encounter data outside the distribution on which
they were trained, they tend to behave poorly, most prominently by being
over-confident in erroneous predictions. Such behaviours will have
disastrous effects on real-world machine learning systems. In this field,
graceful degradation refers to the optimisation of model performance as it
encounters this out-of-distribution data. This work presents a definition and
discussion of graceful degradation and where it can be applied in deployed
visual systems. Following this, a survey of relevant areas is undertaken,
novelly splitting the graceful degradation problem into active and passive
approaches. In passive approaches, graceful degradation is handled and achieved
by the model in a self-contained manner; in active approaches, the model is
updated upon encountering epistemic uncertainties. This work communicates the
importance of the problem and aims to prompt the development of machine
learning strategies that are aware of graceful degradation.
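As a rough illustration of what a passive, self-contained strategy might look like, the sketch below has a classifier abstain when its top softmax probability falls below a threshold. This is a generic baseline and is not taken from the survey; the function names and the `threshold` value are hypothetical. Because the abstract notes that models tend to be over-confident on out-of-distribution data, a raw softmax threshold should be read as a starting point only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_or_abstain(logits, threshold=0.8):
    """Return (class_index, confidence), or (None, confidence) to abstain.

    `threshold` is an illustrative assumption, not a recommended value.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        # Low confidence: degrade gracefully by declining to predict.
        return None, probs[best]
    return best, probs[best]
```

A caller could route abstentions to a fallback, such as a human operator or a conservative default action, instead of acting on an unreliable prediction.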
Latent Embedding Feedback and Discriminative Features for Zero-Shot Classification
Zero-shot learning strives to classify unseen categories for which no data is
available during training. In the generalized variant, the test samples can
further belong to seen or unseen categories. The state-of-the-art relies on
Generative Adversarial Networks that synthesize unseen class features by
leveraging class-specific semantic embeddings. During training, they generate
semantically consistent features, but discard this constraint during feature
synthesis and classification. We propose to enforce semantic consistency at all
stages of (generalized) zero-shot learning: training, feature synthesis and
classification. We first introduce a feedback loop, from a semantic embedding
decoder, that iteratively refines the generated features during both the
training and feature synthesis stages. The synthesized features together with
their corresponding latent embeddings from the decoder are then transformed
into discriminative features and utilized during classification to reduce
ambiguities among categories. Experiments on (generalized) zero-shot object and
action classification reveal the benefit of semantic consistency and iterative
feedback, outperforming existing methods on six zero-shot learning benchmarks.
Source code at https://github.com/akshitac8/tfvaegan.
Comment: Accepted for publication at ECCV 202
Enhancing deep transfer learning for image classification
Although deep learning models require large amounts of labelled training data to perform well, they are applied to many computer vision tasks such as image classification. Current models also do not perform well across different domain settings such as illumination, camera angle and real-to-synthetic. Thus the models are more likely to misclassify unknown classes as known classes. These issues challenge the supervised learning paradigm of the models and encourage the study of transfer learning approaches. Transfer learning allows us to utilise the knowledge acquired from related domains to improve performance on a target domain. Existing transfer learning approaches lack proper high-level source domain feature analyses and are prone to negative transfer because they do not explore proper discriminative information across domains. Current approaches also fail to discover the necessary visual-semantic linkage and are biased towards the source domain. In this thesis, to address these issues and improve image classification performance, we make several contributions to three different deep transfer learning scenarios, i.e., the target domain has (i) labelled data; (ii) no labelled data; and (iii) no visual data. Firstly, to improve inductive transfer learning for the first scenario, we analyse the importance of high-level deep features, propose utilising them in sequential transfer learning approaches, and investigate the suitable conditions for optimal performance. Secondly, to improve image classification across different domains in an open set setting by reducing negative transfer (second scenario), we propose two novel architectures. The first model has an adaptive weighting module based on underlying domain distinctive information, and the second model has an information-theoretic weighting module to reduce negative transfer.
Thirdly, to learn visual classifiers when no visual data is available (third scenario) and reduce source domain bias, we propose two novel models. One model has a new two-step dense attention mechanism to discover semantic attribute-guided local visual features and mutual learning loss. The other model utilises bidirectional mapping and adversarial supervision to learn the joint distribution of source-target domains simultaneously. We propose a new pointwise mutual information dependent loss in the first model and a distance-based loss in the second one for handling source domain bias. We perform extensive evaluations on benchmark datasets and demonstrate that the proposed models outperform contemporary works.
Doctor of Philosophy
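The abstract above mentions a loss that depends on pointwise mutual information (PMI) but does not give its form. The sketch below only computes the standard PMI quantity, log p(x, y) / (p(x) p(y)), from co-occurrence counts; it is not the thesis's loss, and the function and parameter names are hypothetical.

```python
import math


def pmi(joint_count, x_count, y_count, total):
    """Pointwise mutual information of events x and y from raw counts.

    Positive values mean x and y co-occur more often than independence
    would predict; zero means they look independent.
    """
    if joint_count == 0:
        # The events never co-occur, so log(0) is taken as -infinity.
        return float("-inf")
    p_xy = joint_count / total
    p_x = x_count / total
    p_y = y_count / total
    return math.log(p_xy / (p_x * p_y))
```

For instance, with 100 observations in which x and y each occur 50 times, a joint count of 25 gives a PMI of zero (independence), while a joint count of 50 gives log 2.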
Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective
This paper takes a problem-oriented perspective and presents a comprehensive
review of transfer learning methods, both shallow and deep, for cross-dataset
visual recognition. Specifically, it categorises the cross-dataset recognition
into seventeen problems based on a set of carefully chosen data and label
attributes. Such a problem-oriented taxonomy has allowed us to examine how
different transfer learning approaches tackle each problem and how well each
problem has been researched to date. The comprehensive problem-oriented review
of the advances in transfer learning with respect to the problem has not only
revealed the challenges in transfer learning for visual recognition, but also
the problems (e.g. eight of the seventeen problems) that have been scarcely
studied. This survey not only presents an up-to-date technical review for
researchers, but also a systematic approach and a reference for a machine
learning practitioner to categorise a real problem and to look up a
possible solution accordingly.
Towards Practicality of Sketch-Based Visual Understanding
Sketches have been used to conceptualise and depict visual objects from
pre-historic times. Sketch research has flourished in the past decade,
particularly with the proliferation of touchscreen devices. Much of the
utilisation of sketch has been anchored around the fact that it can be used to
delineate visual concepts universally irrespective of age, race, language, or
demography. The fine-grained interactive nature of sketches facilitates the
application of sketches to various visual understanding tasks, like image
retrieval, image-generation or editing, segmentation, 3D-shape modelling etc.
However, sketches are highly abstract and subjective based on the perception of
individuals. Although most agree that sketches provide fine-grained control to
the user to depict a visual object, many consider sketching a tedious process
due to their limited sketching skills compared to other query/support
modalities like text/tags. Furthermore, collecting fine-grained sketch-photo
association is a significant bottleneck to commercialising sketch applications.
Therefore, this thesis aims to progress sketch-based visual understanding
towards more practicality.
Comment: PhD thesis successfully defended by Ayan Kumar Bhunia, Supervisor:
Prof. Yi-Zhe Song, Thesis Examiners: Prof Stella Yu and Prof Adrian Hilto
Semi-Supervised Learning with Scarce Annotations
While semi-supervised learning (SSL) algorithms provide an efficient way to
make use of both labelled and unlabelled data, they generally struggle when the
number of annotated samples is very small. In this work, we consider the
problem of SSL multi-class classification with very few labelled instances. We
introduce two key ideas. The first is a simple but effective one: we leverage
the power of transfer learning among different tasks and self-supervision to
initialize a good representation of the data without making use of any label.
The second idea is a new algorithm for SSL that can effectively exploit such a
pre-trained representation.
The algorithm works by alternating two phases, one fitting the labelled
points and one fitting the unlabelled ones, with carefully-controlled
information flow between them. The benefits are greatly reduced overfitting to
the labelled data and the avoidance of issues with balancing labelled and
unlabelled losses during training. We show empirically that this method can
successfully train competitive models with as few as 10 labelled data points
per class. More generally, we show that the idea of bootstrapping features using
self-supervised learning always improves SSL on standard benchmarks. We show
that our algorithm works increasingly well compared to other methods when
refining from other tasks or datasets.
Comment: Workshop on Deep Vision, CVPR 202
Colour for the Advancement of Deep Learning in Computer Vision
This thesis explores several research areas in Deep Learning for computer vision that concern colour. First, it considers one of the longest-standing challenges for Deep Learning: how can Deep Learning algorithms learn successfully without using human-annotated data? To that end, it examines using the colours in images to learn meaningful visual representations as a substitute for learning from hand-annotated data. Second, on a related topic, it applies Deep Learning to automate the complex graphics task of image colourisation, the process of adding colours to black-and-white images. Third, it explores colour spaces and how the representation of colours in images affects the performance of Deep Learning models.