
    Learning feed-forward one-shot learners

    One-shot learning is usually tackled by using generative models or discriminative embeddings. Discriminative methods based on deep learning, which are very effective in other learning scenarios, are ill-suited for one-shot learning as they need large amounts of training data. In this paper, we propose a method to learn the parameters of a deep model in one shot. We construct the learner as a second deep network, called a learnet, which predicts the parameters of a pupil network from a single exemplar. In this manner we obtain an efficient feed-forward one-shot learner, trained end-to-end by minimizing a one-shot classification objective in a learning-to-learn formulation. In order to make the construction feasible, we propose a number of factorizations of the parameters of the pupil network. We demonstrate encouraging results by learning characters from single exemplars in Omniglot, and by tracking visual objects from a single initial exemplar in the Visual Object Tracking benchmark. Comment: The first three authors contributed equally, and are listed in alphabetical order.
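The core idea of the abstract, one network emitting the parameters of another, can be illustrated with a minimal sketch. This is not the paper's architecture: the dimensions, the single predicted linear layer, and all names here are illustrative assumptions, with a learnet reduced to one affine map.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_embed = 64, 32  # illustrative sizes, not from the paper

# Learnet parameters: these are what training adjusts end-to-end
# across many simulated one-shot episodes.
W_l = rng.normal(0, 0.1, (d_in * d_embed + d_embed, d_in))
b_l = np.zeros(d_in * d_embed + d_embed)

def learnet(exemplar):
    """Predict the pupil's weight matrix and bias from ONE exemplar."""
    theta = W_l @ exemplar + b_l
    W_p = theta[:d_in * d_embed].reshape(d_embed, d_in)
    b_p = theta[d_in * d_embed:]
    return W_p, b_p

def pupil(x, W_p, b_p):
    """Pupil network: a single linear layer with predicted parameters."""
    return np.maximum(W_p @ x + b_p, 0.0)  # ReLU

exemplar = rng.normal(size=d_in)   # the one "training" example
query = rng.normal(size=d_in)      # a new input to classify/track
W_p, b_p = learnet(exemplar)
features = pupil(query, W_p, b_p)
print(features.shape)  # (32,)
```

Note that even this toy learnet must output `d_in * d_embed + d_embed` numbers per exemplar, which hints at why the paper needs factorizations of the pupil's parameters to keep the construction feasible.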

    Enhancing endoscopic navigation and polyp detection using artificial intelligence

    Colorectal cancer (CRC) is one of the most common and deadly forms of cancer. It has a very high mortality rate if the disease advances to late stages; however, early diagnosis and treatment can be curative, and are hence essential to enhancing disease management. Colonoscopy is considered the gold standard for CRC screening and early therapeutic treatment. The effectiveness of colonoscopy is highly dependent on the operator’s skill, as a high level of hand-eye coordination is required to control the endoscope and fully examine the colon wall. Because of this, detection rates can vary between different gastroenterologists, and technologies have been proposed as solutions to assist disease detection and standardise detection rates. This thesis focuses on developing artificial intelligence algorithms to assist gastroenterologists during colonoscopy, with the potential to ensure a baseline standard of quality in CRC screening. To achieve such assistance, the technical contributions develop deep learning methods and architectures for automated endoscopic image analysis, addressing both the detection of lesions in the endoscopic image and the 3D mapping of the endoluminal environment. The proposed detection models can run in real-time and assist visualization of different polyp types. Meanwhile, the 3D reconstruction and mapping models developed are the basis for ensuring that the entire colon has been examined appropriately, and support quantitative measurement of polyp sizes using the image during a procedure. Results and validation studies presented within the thesis demonstrate how the developed algorithms perform on both general scenes and on clinical data. The feasibility of clinical translation is demonstrated for all of the models on endoscopic data from human participants during CRC screening examinations.
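The abstract's "quantitative measurement of polyp sizes using the image" presupposes some way of turning a pixel extent into a metric size. A minimal sketch of that step, assuming the standard pinhole camera model and a per-polyp depth supplied by a 3D reconstruction (function name and values are hypothetical, not from the thesis):

```python
def polyp_size_mm(pixel_extent_px: float, depth_mm: float,
                  focal_length_px: float) -> float:
    """Back-project an image-plane extent to metric size with the
    pinhole model: size = pixel_extent * depth / focal_length."""
    return pixel_extent_px * depth_mm / focal_length_px

# Example: a polyp spanning 120 px, 25 mm from a camera with f = 600 px
print(polyp_size_mm(120, 25.0, 600.0))  # 5.0 (mm)
```

The hard part in practice is obtaining `depth_mm` reliably inside the colon, which is precisely what the thesis's reconstruction and mapping models are for.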

    Using Neural Networks to identify Individual Animals from Photographs

    Effective management requires knowledge of animal population sizes. This can be accomplished in various ways, but a very popular way is mark-recapture studies. Mark-recapture studies need a way of telling if a captured animal has been previously seen. For traditional mark-recapture, this is achieved by applying a tag to the animal. For non-invasive mark-recapture methods which exploit photographs, there is no tag on the animal’s body. As a result, these methods require animals to be individually identifiable. They assess if an animal has been caught before by examining photographs for animals which have individual-specific marks (Cross et al., 2014; Gomez et al., 2016; Beijbom et al., 2016; Körschens, Barz, and Denzler, 2018). This study develops a model which can reliably match photographs of the same individual based on individual-specific marks. The model consists of two main parts: an object detection model, and a classifier which takes two photos as input and outputs a predicted probability that the pair is from the same individual (a match). The object detection model is a convolutional neural network (CNN) and the matching classifier is a special kind of CNN called a siamese network. The siamese network uses a pair of CNNs that share weights to summarise the images, followed by some dense layers which combine the summaries into measures of similarity which can be used to predict a match. The model is tested on two case studies, humpback whales (HBWs) and western leopard toads (WLTs). The HBW dataset consists of images originally collected by various institutions across the globe and uploaded to the Happywhale platform, which encourages scientists to identify individual mammals. HBWs can be identified by their fins and special markings. There is a large amount of data for this problem. The WLT dataset consists of images collected by citizen scientists in South Africa. They were either uploaded to iSpot, a citizen science project which collects images, or sent to the WLT project, a conservation project staffed by volunteers. WLTs can be identified by their unique spots. There is little data for this problem. One part of this dataset consists of labelled individuals and another part is unlabelled. The model was able to give good results for both HBWs and WLTs. In 95% of the cases the model managed to correctly identify if a pair of images is from the same HBW individual or not. It accurately identified if a pair of images is drawn from the same WLT individual or not in 87% of the cases. This study also assessed the effectiveness of the semi-supervised approach on the WLT unlabelled dataset. In this study, the semi-supervised approach was partially successful. The model was able to identify new individuals and matches which were not identified before, but they were relatively few in number. Without an exhaustive check of the data, it is not clear whether this is due to the failure of the semi-supervised approach, or because there are not many matches in the data. After adding the newly identified and labelled individuals to the WLT labelled dataset, the model slightly improved its performance and correctly identified 89% of WLT pairs. A number of computer-aided photo-matching algorithms have been proposed (Matthé et al., 2017). This study also assessed the performance of Wild-ID (Bolger et al., 2012), one of the commonly used photo-matching algorithms, on both HBW and WLT datasets. The model developed in this thesis achieved very competitive results compared with Wild-ID. Model accuracies for the proposed siamese network were much higher than those returned by Wild-ID on the HBW dataset, and roughly the same on the WLT dataset.
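The siamese arrangement described above, twin encoders with shared weights feeding a dense head that scores similarity, can be sketched in a few lines. This is a toy stand-in, not the thesis's model: the encoders here are single `tanh` layers rather than CNNs, and all dimensions are invented.

```python
import numpy as np

rng = np.random.default_rng(1)
d_img, d_emb = 128, 16  # illustrative sizes

# Shared embedding weights: both "twin" towers use the SAME matrix,
# so the two photos are summarised in a common space.
W_e = rng.normal(0, 0.1, (d_emb, d_img))
# Dense head turning an element-wise distance into a match logit.
w_h = rng.normal(0, 0.1, d_emb)

def embed(x):
    return np.tanh(W_e @ x)

def match_probability(photo_a, photo_b):
    """Predicted probability that the two photos show the same individual."""
    diff = np.abs(embed(photo_a) - embed(photo_b))  # similarity features
    logit = w_h @ diff
    return 1.0 / (1.0 + np.exp(-logit))            # sigmoid

p = match_probability(rng.normal(size=d_img), rng.normal(size=d_img))
print(0.0 < p < 1.0)  # True
```

Because the weights are shared, swapping the two photos yields the same probability, a symmetry a matching classifier generally should have.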

    A Survey on Self-Supervised Representation Learning

    Learning meaningful representations is at the heart of many tasks in the field of modern machine learning. Recently, many methods have been introduced that allow learning of image representations without supervision. These representations can then be used in downstream tasks like classification or object detection. The quality of these representations is close to that of supervised learning, while no labeled images are needed. This survey paper provides a comprehensive review of these methods in a unified notation, points out similarities and differences between them, and proposes a taxonomy which sets these methods in relation to each other. Furthermore, our survey summarizes the most recent experimental results reported in the literature in the form of a meta-study. Our survey is intended as a starting point for researchers and practitioners who want to dive into the field of representation learning.
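One representative family among the methods such surveys cover is contrastive learning, where two augmented views of the same image are pulled together and other images pushed apart via an InfoNCE-style loss. A minimal numpy sketch under that assumption (this is a generic illustration, not a method from the survey):

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.5):
    """Contrastive (InfoNCE-style) loss between embeddings z1, z2 of two
    augmented views of the same batch; matching rows are the positives."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)  # L2-normalise
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = (z1 @ z2.T) / temperature          # pairwise cosine similarities
    # Log-softmax over each row; positives sit on the diagonal.
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    idx = np.arange(len(z1))
    return -log_softmax[idx, idx].mean()

rng = np.random.default_rng(2)
z1 = rng.normal(size=(8, 32))                   # embeddings of view 1
z2 = z1 + 0.01 * rng.normal(size=(8, 32))       # slightly perturbed view 2
loss = info_nce_loss(z1, z2)
print(loss > 0)  # True
```

Minimising this loss with no labels at all is what lets the learned encoder transfer to downstream classification or detection, the central promise the survey examines.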