20,760 research outputs found

    Deep semi-supervised segmentation with weight-averaged consistency targets

    Full text link
    Recently proposed techniques for semi-supervised learning such as Temporal Ensembling and Mean Teacher have achieved state-of-the-art results in many important classification benchmarks. In this work, we expand the Mean Teacher approach to segmentation tasks and show that it can bring important improvements in a realistic small data regime using a publicly available multi-center dataset from the Magnetic Resonance Imaging (MRI) domain. We also devise a method to solve the problems that arise when using traditional data augmentation strategies for segmentation tasks on our new training scheme.Comment: 8 pages, 1 figure, accepted for DLMIA/MICCA

    Relaxed Spatio-Temporal Deep Feature Aggregation for Real-Fake Expression Prediction

    Get PDF
    Frame-level visual features are generally aggregated in time with the techniques such as LSTM, Fisher Vectors, NetVLAD etc. to produce a robust video-level representation. We here introduce a learnable aggregation technique whose primary objective is to retain short-time temporal structure between frame-level features and their spatial interdependencies in the representation. Also, it can be easily adapted to the cases where there have very scarce training samples. We evaluate the method on a real-fake expression prediction dataset to demonstrate its superiority. Our method obtains 65% score on the test dataset in the official MAP evaluation and there is only one misclassified decision with the best reported result in the Chalearn Challenge (i.e. 66:7%) . Lastly, we believe that this method can be extended to different problems such as action/event recognition in future.Comment: Submitted to International Conference on Computer Vision Workshop

    Large-Scale Mapping of Human Activity using Geo-Tagged Videos

    Full text link
    This paper is the first work to perform spatio-temporal mapping of human activity using the visual content of geo-tagged videos. We utilize a recent deep-learning based video analysis framework, termed hidden two-stream networks, to recognize a range of activities in YouTube videos. This framework is efficient and can run in real time or faster which is important for recognizing events as they occur in streaming video or for reducing latency in analyzing already captured video. This is, in turn, important for using video in smart-city applications. We perform a series of experiments to show our approach is able to accurately map activities both spatially and temporally. We also demonstrate the advantages of using the visual content over the tags/titles.Comment: Accepted at ACM SIGSPATIAL 201

    Automatic active acoustic target detection in turbulent aquatic environments

    Get PDF
    This work is funded by the Environment and Food Security theme Ph.D. studentship from the University of Aberdeen, the Natural Environment Research Council (NERC) and Department for Environment, Food, and Rural Affairs (Defra grant NE/J004308/1), and the Marine Collaboration Research Forum (MarCRF). We would like to gratefully acknowledge the support from colleagues at Marine Scotland Science.Peer reviewedPublisher PD
    • …
    corecore