Visual analysis with limited supervision
University of Technology Sydney, Faculty of Engineering and Information Technology. Visual analysis is an attractive research topic in computer vision, with two main directions: visual retrieval and visual classification. In recent years, visual retrieval has been investigated and deployed in many real-world applications, for instance person re-identification, while visual classification is also widely studied, for example in image classification. Typical visual analysis methods are supervised learning algorithms, which demand extensive labeled data to train models that achieve acceptable performance. However, collecting and annotating data in the real world is difficult because resources, such as human labor for annotation, are limited. It is therefore important to develop methods that complete the visual analysis mission with limited supervision.
In this thesis, we propose to address the visual analysis problem with limited supervision. Specifically, we treat the limited-supervision problem in three scenarios according to the amount of labeled data. In the first scenario, no labeled data are provided and only limited human labor for annotation is available. In the second scenario, scarce labeled data and abundant unlabeled data are accessible. In the third scenario, only a few instances in the target dataset are labeled, and multiple sources of labeled data from different domains are available.
In Chapter 2 and Chapter 3, we discuss the first scenario, where no labeled data are provided and only limited human labor for annotation is available. We propose to solve the problem via active learning. Unlike conventional active learning, which usually starts with a set of labeled data as a reference, in this thesis we adopt active learning with no pre-given labeled data; we refer to these algorithms as Early Active Learning. First, we attempt to select the most contributive instances for annotation, which are later used to train supervised models. We demonstrate that even by annotating a few selected instances, the proposed method can achieve comparable performance in visual retrieval. Second, we extend instance-based early active learning to pair-based early active learning. Rather than selecting individual instances, the pair-based variant selects the most informative pairs for annotation, which is essential in visual retrieval.
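Selecting contributive instances before any labels exist can be sketched with a generic diversity-based criterion. The following is a minimal farthest-point-sampling proxy, an assumed illustration rather than the thesis's actual selection criterion:

```python
import numpy as np

def select_for_annotation(features, budget):
    """Pick `budget` diverse, representative instances to annotate first.
    Farthest-point sampling is used here as a generic stand-in for the
    "most contributive" criterion; it is not the thesis's exact method."""
    # Start from the instance closest to the data mean.
    centroid = features.mean(axis=0)
    dists_to_mean = ((features - centroid) ** 2).sum(axis=1)
    chosen = [int(np.argmin(dists_to_mean))]
    # Greedily add the instance farthest from all chosen ones,
    # which keeps the annotated set spread over the whole dataset.
    while len(chosen) < budget:
        diff = features[:, None, :] - features[chosen][None, :, :]
        d = (diff ** 2).sum(axis=2).min(axis=1)
        chosen.append(int(np.argmax(d)))
    return chosen
```

The selected indices would then be sent to a human annotator, and the labeled subset used to train the supervised retrieval model.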
In Chapter 4, for the second scenario, we address the visual retrieval problem when there are scarce labeled data and abundant unlabeled data. We propose to utilize both the labeled and the unlabeled data in a semi-supervised attribute learning scheme. The proposed method jointly learns latent attributes with appropriate dimensions and estimates the pairwise probability of the data.
In Chapter 5 and Chapter 6, for the third scenario, we focus on visual classification with few or no labels in the target dataset but with labeled data available from other domains. To improve performance in the target domain, we adopt transfer learning algorithms that transfer helpful knowledge from the labeled source domains. First, in Chapter 5, we consider the few-shot visual classification problem: we have access to multiple well-labeled source datasets but only a limited set of labeled data in the target dataset. An Analogical Transfer Learning scheme is proposed for this problem, which transfers knowledge from the source domains to enhance the target domain models; within it, an analogy-revision scheme is designed to select only the helpful source instances. Second, in Chapter 6, we tackle the more difficult problem of visual retrieval with no labeled data in the target domain. A Domain-aware Unsupervised Cross-dataset Transfer Learning algorithm is proposed to address it. The importance of universal and domain-unique appearances is weighed simultaneously, and both jointly contribute to representation learning; the algorithm leverages the common and domain-unique representations across datasets for unsupervised visual retrieval.
Fine-Grained Product Class Recognition for Assisted Shopping
Assistive solutions for a better shopping experience can improve the quality
of life of people, in particular also of visually impaired shoppers. We present
a system that visually recognizes the fine-grained product classes of items on
a shopping list, in shelf images taken with a smartphone in a grocery store.
Our system consists of three components: (a) We automatically recognize useful
text on product packaging, e.g., product name and brand, and build a mapping of
words to product classes based on the large-scale GroceryProducts dataset. When
the user populates the shopping list, we automatically infer the product class
of each entered word. (b) We perform fine-grained product class recognition
when the user is facing a shelf. We discover discriminative patches on product
packaging to differentiate between visually similar product classes and to
increase the robustness against continuous changes in product design. (c) We
continuously improve the recognition accuracy through active learning. Our
experiments show the robustness of the proposed method against cross-domain
challenges, and the scalability to an increasing number of products with
minimal re-training.
Comment: Accepted at ICCV Workshop on Assistive Computer Vision and Robotics (ICCV-ACVR) 201
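The word-to-class mapping in component (a) can be illustrated with a simple voting scheme. This is a hypothetical sketch: the `(packaging_text, product_class)` pair format and the majority-vote rule are assumptions, not the paper's actual pipeline:

```python
from collections import Counter, defaultdict

def build_word_to_class_map(catalog):
    """catalog: iterable of (packaging_text, product_class) pairs, as
    might be parsed from a GroceryProducts-style dataset (this format
    is an assumption). Each word votes for the classes it appears with."""
    word2classes = defaultdict(Counter)
    for text, cls in catalog:
        for word in set(text.lower().split()):
            word2classes[word][cls] += 1
    return word2classes

def infer_product_class(entry, word2classes):
    """Resolve a shopping-list entry to a product class by majority
    vote over its words; returns None if no word is known."""
    votes = Counter()
    for word in entry.lower().split():
        votes.update(word2classes.get(word, Counter()))
    return votes.most_common(1)[0][0] if votes else None
```

When the user types a shopping-list entry, the inferred class tells the recognizer which fine-grained product classes to look for on the shelf.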
Cross-dataset domain adaptation for the classification of COVID-19 using chest computed tomography images
Detecting COVID-19 patients using Computed Tomography (CT) images of the
lungs is an active area of research. Datasets of CT images from COVID-19
patients are becoming available. Deep learning (DL) solutions and in particular
Convolutional Neural Networks (CNN) have achieved impressive results for the
classification of COVID-19 CT images, but only when the training and testing
take place within the same dataset. Work on the cross-dataset problem is still
limited and the achieved results are low. Our work tackles the cross-dataset
problem through a Domain Adaptation (DA) technique with deep learning. Our
proposed solution, COVID19-DANet, is based on a pre-trained CNN backbone for
feature extraction. For this task, we select the pre-trained EfficientNet-B3
CNN because it has achieved impressive classification accuracy in previous
work. The backbone CNN is followed by a prototypical layer which is a concept
borrowed from prototypical networks in few-shot learning (FSL). It computes a
cosine distance between given samples and the class prototypes and then
converts them to class probabilities using the Softmax function. To train the
COVID19-DANet model, we propose a combined loss function that is composed of
the standard cross-entropy loss for class discrimination and another entropy
loss computed over the unlabelled target set only. This so-called unlabelled
target entropy loss is minimized and maximized in an alternating fashion to
reach the two objectives of class discrimination and domain invariance.
COVID19-DANet is tested under four cross-dataset scenarios using the
SARS-CoV-2-CT and COVID19-CT datasets and has achieved encouraging results
compared to recent work in the literature.
Comment: 31 pages, 15 figures
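The prototypical layer and the combined loss described above can be sketched numerically. This is a simplified, assumed illustration (the cosine scale and the entropy weight `lam` are made-up values, and the alternating min/max schedule is collapsed into a single term):

```python
import numpy as np

def cosine_logits(emb, prototypes, scale=10.0):
    # Cosine similarity between each embedding and each class
    # prototype, scaled before the softmax (scale is an assumed value).
    e = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    return scale * (e @ p.T)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    ez = np.exp(z)
    return ez / ez.sum(axis=1, keepdims=True)

def combined_loss(src_emb, src_labels, tgt_emb, prototypes, lam=0.1):
    # Standard cross-entropy on labelled source samples
    # (class discrimination objective).
    p_src = softmax(cosine_logits(src_emb, prototypes))
    ce = -np.mean(np.log(p_src[np.arange(len(src_labels)), src_labels] + 1e-12))
    # Entropy over the unlabelled target set; the paper alternates
    # between minimizing and maximizing it, shown here as one term.
    p_tgt = softmax(cosine_logits(tgt_emb, prototypes))
    ent = -np.mean((p_tgt * np.log(p_tgt + 1e-12)).sum(axis=1))
    return ce, ent, ce + lam * ent
```

In practice both terms would be computed on a differentiable framework's tensors so that gradients flow back into the backbone; plain numpy is used here only to show the arithmetic.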
Nordic Post-Graduate Sustainable Design and Engineering Research from a Supervisor Perspective
The multi- and interdisciplinary field of sustainable product innovation (SPI) is rapidly expanding as an arena for scientific research. Universities in Nordic countries can be considered an exponent of this type of research, with active research groups in, among others, Göteborg, Helsinki, Lund, Lyngby, Linköping and Trondheim. In the context of a Nordforsk-funded project, seven second-generation PhD supervisors from these universities, who have been active in this field for many years, discuss funding, publication, research traditions, education and supervision practices related to PhD research in this field. A number of recommendations to improve current practices are made, including mapping the currently existing differences between academic institutions, studying the cross-over learning effects between academic and non-academic partners, and developing ‘quality indicators’ of research in the SPI domain.
ActiveSelfHAR: Incorporating Self Training into Active Learning to Improve Cross-Subject Human Activity Recognition
Deep learning-based human activity recognition (HAR) methods have shown great
promise in the applications of smart healthcare systems and wireless body
sensor network (BSN). Despite their demonstrated performance in laboratory
settings, the real-world implementation of such methods is still hindered by
the cross-subject issue when adapting to new users. To solve this issue, we
propose ActiveSelfHAR, a framework that combines active learning's benefit of
sparsely acquiring data with actual labels and self-training's benefit of
effectively utilizing unlabeled data to enable the deep model to adapt to the
target domain, i.e., the new users. In this framework, the model trained in the
last iteration or the source domain is first utilized to generate pseudo labels
of the target-domain samples and construct a self-training set based on the
confidence score. Second, we propose to use the spatio-temporal relationships
among the samples in the non-self-training set to augment the core set selected
by active learning. Finally, we combine the self-training set and the augmented
core set to fine-tune the model. We demonstrate our method by comparing it with
state-of-the-art methods on two IMU-based datasets and an EMG-based dataset.
Our method achieves HAR accuracies similar to the upper bound, i.e., fully
supervised fine-tuning, with less than 1% of the target dataset labeled,
and significantly improves data efficiency and time cost. Our work highlights
the potential of implementing user-independent HAR methods in smart
healthcare systems and BSN.
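The first step of the framework, splitting target samples by confidence into a pseudo-labeled self-training set and a pool for active-learning queries, can be sketched as follows. The threshold value is an assumption, not the paper's setting:

```python
import numpy as np

def split_target_set(probs, threshold=0.9):
    """Split target-domain samples by the model's confidence score.
    probs: (n_samples, n_classes) class probabilities from the model
    trained on the source domain or the previous iteration.
    High-confidence predictions become pseudo-labels (the self-training
    set); the rest form the pool from which active learning selects a
    core set for true labeling. The 0.9 threshold is an assumed value."""
    conf = probs.max(axis=1)
    self_train_idx = np.flatnonzero(conf >= threshold)
    pseudo_labels = probs[self_train_idx].argmax(axis=1)
    query_pool_idx = np.flatnonzero(conf < threshold)
    return self_train_idx, pseudo_labels, query_pool_idx
```

The self-training set (with pseudo-labels) and the actively labeled core set would then be combined to fine-tune the model, as the abstract describes.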
Reducing model bias in a deep learning classifier using domain adversarial neural networks in the MINERvA experiment
We present a simulation-based study using deep convolutional neural networks
(DCNNs) to identify neutrino interaction vertices in the MINERvA passive
targets region, and illustrate the application of domain adversarial neural
networks (DANNs) in this context. DANNs are designed to be trained in one
domain (simulated data) but tested in a second domain (physics data) and
utilize unlabeled data from the second domain so that during training only
features which are unable to discriminate between the domains are promoted.
MINERvA is a neutrino-nucleus scattering experiment using the NuMI beamline at
Fermilab. -dependent cross sections are an important part of the physics
program, and these measurements require vertex finding in complicated events.
To illustrate the impact of the DANN, we used a modified set of simulations in
place of physics data during training of the DANN and then used the labels
of the modified simulation during its evaluation. We find that deep
learning based methods offer significant advantages over our prior track-based
reconstruction for the task of vertex finding, and that DANNs are able to
improve the performance of deep networks by leveraging available unlabeled data
and by mitigating network performance degradation rooted in biases in the
physics models used for training.
Comment: 41 pages
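The DANN mechanism the abstract relies on, promoting only features that cannot discriminate between domains, is usually implemented with a gradient-reversal layer. A minimal conceptual sketch, not MINERvA's actual code:

```python
import numpy as np

class GradReverseLayer:
    """Sketch of the gradient-reversal trick behind DANNs: the layer
    is the identity in the forward pass, but negates (and scales) the
    gradient flowing back from the domain classifier, so the shared
    feature extractor is pushed AWAY from anything that distinguishes
    simulated data from physics data."""

    def __init__(self, lam=1.0):
        self.lam = lam  # reversal strength, often annealed during training

    def forward(self, x):
        # Identity: domain classifier sees the features unchanged.
        return x

    def backward(self, grad_from_domain_classifier):
        # Reversed gradient reaches the shared feature extractor.
        return -self.lam * grad_from_domain_classifier
```

In an autodiff framework this is a one-line custom gradient; the class above only makes the forward/backward asymmetry explicit.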