    Focus on the Positives: Self-Supervised Learning for Biodiversity Monitoring

    We address the problem of learning self-supervised representations from unlabeled image collections. Unlike existing approaches that attempt to learn useful features by maximizing similarity between augmented versions of each input image or by speculatively picking negative samples, we instead also make use of the natural variation that occurs in image collections captured using static monitoring cameras. To achieve this, we exploit readily available context data that encodes information such as the spatial and temporal relationships between the input images. By first identifying high-probability positive pairs at training time, i.e. those images that are likely to depict the same visual concept, we are able to learn representations that are surprisingly effective for downstream supervised classification. For the critical task of global biodiversity monitoring, this results in image features that can be adapted to challenging visual species classification tasks with limited human supervision. We present results on four different camera trap image collections, across three different families of self-supervised learning methods, and show that careful image selection at training time results in superior performance compared to existing baselines such as conventional self-supervised training and transfer learning.
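
    To illustrate the idea of replacing augmentation-only positives with context-selected ones, the sketch below pairs images from the same static camera taken close together in time and feeds those pairs to a standard InfoNCE loss. The pairing rule, the 60-second window, and all function names are assumptions for illustration, not the paper's exact procedure.

    # Hypothetical sketch: choose "likely positive" pairs from camera-trap
    # context metadata (camera id + timestamp), then use them in an
    # InfoNCE-style contrastive loss. All names are illustrative.
    import torch
    import torch.nn.functional as F

    def context_positive_pairs(camera_ids, timestamps, max_gap_seconds=60):
        """Return index pairs (i, j) of images from the same static camera
        within a short time window -- a proxy for 'same visual concept'."""
        pairs = []
        n = len(camera_ids)
        for i in range(n):
            for j in range(i + 1, n):
                if (camera_ids[i] == camera_ids[j]
                        and abs(timestamps[i] - timestamps[j]) <= max_gap_seconds):
                    pairs.append((i, j))
        return pairs

    def info_nce(embeddings, pairs, temperature=0.1):
        """InfoNCE over context-selected positives; every other image in
        the batch acts as a negative."""
        z = F.normalize(embeddings, dim=1)
        sim = z @ z.t() / temperature          # (n, n) cosine similarities
        loss = 0.0
        for i, j in pairs:
            logits = sim[i].clone()
            logits[i] = float('-inf')          # exclude self-similarity
            loss = loss + F.cross_entropy(logits.unsqueeze(0),
                                          torch.tensor([j]))
        return loss / max(len(pairs), 1)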

    Patch based synthesis for single depth image super-resolution

    We present an algorithm to synthetically increase the resolution of a solitary depth image using only a generic database of local patches. Modern range sensors measure depth with non-Gaussian noise and at lower starting resolutions than typical visible-light cameras. While patch-based approaches for upsampling intensity images continue to improve, this is the first exploration of patch-based synthesis for depth images. We match against the height field of each low-resolution input depth patch and search our database for a list of appropriate high-resolution candidate patches. Selecting the right candidate at each location in the depth image is then posed as a Markov random field labeling problem. Our experiments also show that further depth-specific processing, such as noise removal and correct patch normalization, dramatically improves our results. Perhaps surprisingly, even better results are achieved on a variety of real test scenes by providing our algorithm with only synthetic training depth data.
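
    A minimal sketch of the candidate-lookup step described above, assuming the database stores mean-normalized low-resolution patches alongside their high-resolution counterparts. The subsequent MRF labeling is not implemented here; this only returns ranked candidates per location, and all names are illustrative.

    # Illustrative patch lookup, not the paper's exact pipeline: each
    # mean-normalized low-res depth patch queries a database of
    # (low-res, high-res) patch pairs for its best-matching candidates.
    import numpy as np

    def normalize_patch(p):
        # Subtract the mean depth so matching is invariant to absolute
        # distance -- one depth-specific normalization the abstract notes.
        return p - p.mean()

    def candidate_patches(lr_patch, db_lr, db_hr, k=5):
        """Return the k high-res patches whose low-res counterparts best
        match the query (assumes db_lr entries were mean-normalized when
        the database was built)."""
        q = normalize_patch(lr_patch).ravel()
        d = ((db_lr.reshape(len(db_lr), -1) - q) ** 2).sum(axis=1)
        idx = np.argsort(d)[:k]
        return db_hr[idx], d[idx]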

    The temporal opportunist: self-supervised multi-frame monocular depth

    Self-supervised monocular depth estimation networks are trained to predict scene depth using nearby frames as a supervision signal during training. However, for many applications, sequence information in the form of video frames is also available at test time. The vast majority of monocular networks do not make use of this extra signal, thus ignoring valuable information that could be used to improve the predicted depth. Those that do either use computationally expensive test-time refinement techniques or off-the-shelf recurrent networks, which only indirectly make use of the geometric information that is inherently available. We propose ManyDepth, an adaptive approach to dense depth estimation that can make use of sequence information at test time, when it is available. Taking inspiration from multi-view stereo, we propose a deep end-to-end cost-volume-based approach that is trained using self-supervision only. We present a novel consistency loss that encourages the network to ignore the cost volume when it is deemed unreliable, e.g. in the case of moving objects, and an augmentation scheme to cope with static cameras. Our detailed experiments on both KITTI and Cityscapes show that we outperform all published self-supervised baselines, including those that use single or multiple frames at test time.
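
    The consistency idea can be sketched loosely as follows: where the cost volume is flagged unreliable (for example over moving objects), the multi-frame prediction is pulled toward a detached single-frame prediction, teaching the network to fall back on monocular cues there. The mask and function signature below are assumptions, not ManyDepth's actual implementation.

    # Loose sketch of a consistency-style loss in the spirit of the
    # abstract; the unreliable-region mask is invented for illustration.
    import torch

    def consistency_loss(multi_frame_depth, single_frame_depth, unreliable_mask):
        """L1 penalty between the multi-frame prediction and a frozen
        single-frame prediction, applied only inside the unreliable region
        (unreliable_mask is a {0, 1} float tensor)."""
        target = single_frame_depth.detach()   # teacher is not updated
        diff = (multi_frame_depth - target).abs() * unreliable_mask
        return diff.sum() / unreliable_mask.sum().clamp(min=1)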

    Ecology Meets Computer Science: Designing Tools to Reconcile People, Data, and Practices

    Ecoacoustics draws together computer scientists and ecologists to achieve an understanding of ecosystems and wildlife using acoustic recordings of the environment. Computer scientists are challenged to manage increasingly large datasets while developing analytic and visualisation tools. Ecologists struggle to find and use tools that answer highly heterogeneous research questions. These two fields are naturally drawn together at the tool interface; however, less attention has been paid to how their practices influence tool design and use. We interviewed and collected email correspondence from four computer scientists and eight ecologists to learn how their practices indicate opportunities for reconciling differences through design. We found that different temporal rhythms, relationships to data, and data-driven questions demand tool configuration, data integration, and standardisation. This research outlines interfacing opportunities for new ecological research utilising large acoustic datasets, and contributes to evolving HCI approaches in areas making use of big data and human-in-the-loop processes.