Deep semi-supervised segmentation with weight-averaged consistency targets
Recently proposed techniques for semi-supervised learning such as Temporal
Ensembling and Mean Teacher have achieved state-of-the-art results in many
important classification benchmarks. In this work, we expand the Mean Teacher
approach to segmentation tasks and show that it can bring important
improvements in a realistic small data regime using a publicly available
multi-center dataset from the Magnetic Resonance Imaging (MRI) domain. We also
devise a method to solve the problems that arise when traditional data
augmentation strategies for segmentation tasks are used with our new training scheme.
Comment: 8 pages, 1 figure, accepted for DLMIA/MICCA
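The "weight-averaged consistency targets" of Mean Teacher can be sketched briefly: a teacher model's weights track an exponential moving average (EMA) of the student's weights, and an unsupervised consistency cost penalizes disagreement between their predictions. The NumPy sketch below is illustrative only; the function names, the toy weights, and the MSE consistency cost are assumptions, not the paper's exact implementation:

```python
import numpy as np

def ema_update(teacher, student, alpha=0.99):
    """Move each teacher weight toward the student's weight by an EMA step."""
    return {k: alpha * teacher[k] + (1 - alpha) * student[k] for k in teacher}

def consistency_loss(student_pred, teacher_pred):
    """Mean squared error between student and teacher predictions
    (one common choice of consistency cost)."""
    return float(np.mean((student_pred - teacher_pred) ** 2))

# Toy example: a single "layer" of weights.
student_w = {"w": np.ones(4)}
teacher_w = {"w": np.zeros(4)}
teacher_w = ema_update(teacher_w, student_w, alpha=0.9)
# with alpha=0.9 the teacher moves 10% of the way toward the student
loss = consistency_loss(np.array([0.2, 0.8]), np.array([0.3, 0.7]))
```

On unlabeled images, only the consistency term contributes to the gradient, which is what lets the scheme exploit data without segmentation masks.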
Supporting Story Synthesis: Bridging the Gap between Visual Analytics and Storytelling
Visual analytics usually deals with complex data and uses sophisticated algorithmic, visual, and interactive techniques. Findings of the analysis often need to be communicated to an audience that lacks visual analytics expertise, which requires analysis outcomes to be presented in simpler ways than those typically used in visual analytics systems. However, it is not only the analytical visualizations that may be too complex for the target audience, but also the information that needs to be presented. Hence, a gap exists on the path from obtaining analysis findings to communicating them, involving two aspects: information complexity and display complexity. We propose a general framework in which data analysis and result presentation are linked by story synthesis, in which the analyst creates and organizes story contents. Differently from previous research, where analytic findings are represented by stored display states, we treat findings as data constructs. In story synthesis, findings are selected, assembled, and arranged in views using meaningful layouts that take into account the structure of the information and the inherent properties of its components. We propose a workflow for applying the proposed framework in designing visual analytics systems and demonstrate the generality of the approach by applying it to two domains: social media and movement analysis.
Relaxed Spatio-Temporal Deep Feature Aggregation for Real-Fake Expression Prediction
Frame-level visual features are generally aggregated in time with the
techniques such as LSTM, Fisher Vectors, NetVLAD etc. to produce a robust
video-level representation. We here introduce a learnable aggregation technique
whose primary objective is to retain short-time temporal structure between
frame-level features and their spatial interdependencies in the representation.
Also, it can be easily adapted to cases where training samples are very
scarce. We evaluate the method on a real-fake expression prediction
dataset to demonstrate its superiority. Our method obtains a 65% score on the
test dataset in the official MAP evaluation, only one misclassified
decision away from the best result reported in the ChaLearn Challenge (i.e., 66.7%).
Lastly, we believe that this method can be extended to different problems such
as action/event recognition in the future.
Comment: Submitted to International Conference on Computer Vision Workshop
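The abstract does not spell out the aggregation mechanism. As a hedged illustration of the general idea of retaining short-time temporal structure before pooling, one could convolve frame-level features with a small learnable temporal kernel and then pool over time; the names, the kernel values, and the max-pooling choice below are assumptions, not the paper's method:

```python
import numpy as np

def temporal_conv_aggregate(frames, kernel):
    """Aggregate a T x D matrix of frame features: a valid temporal
    convolution mixes K consecutive frames (preserving short-time order
    within each window), then max pooling over time yields one D-dim
    video-level descriptor."""
    T, D = frames.shape
    K = len(kernel)
    windows = np.stack([frames[t:t + K].T @ kernel for t in range(T - K + 1)])
    return windows.max(axis=0)

frames = np.arange(12, dtype=float).reshape(6, 2)  # 6 frames, 2-dim features
kernel = np.array([0.2, 0.3, 0.5])                 # learnable in practice
video_vec = temporal_conv_aggregate(frames, kernel)
```

Unlike plain averaging, each pooled value here depends on the ordering of frames inside its window, which is the property the abstract emphasizes.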
Visual Analytic Design for Detecting Airborne Pollution Sources
Using the VAST Challenge 2017 dataset as an illustration, the design choices of a visual analytic system for predicting the source of air pollution are described. Probabilistic Source Cones are visual symbols representing the probability of the source location of a pollution event. Using transparency to indicate probability, multiple cones may be overlaid in order to provide a fuzzy triangulation of likely sources. This enabled the correct prediction and elimination of pollution sources at a precision far in excess of the spatial density of the sensors themselves.
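As a rough sketch of how overlaid cones could accumulate into a fuzzy triangulation, each detection event contributes an upwind cone of candidate source locations, and summing the cones over a grid highlights where they overlap. All geometry, names, and event values below are illustrative assumptions, not the system's actual implementation:

```python
import numpy as np

def cone_probability(grid_xy, sensor_xy, wind_dir, half_angle):
    """1.0 inside the upwind cone anchored at a sensor, 0.0 outside:
    a crude stand-in for a per-event source-probability field."""
    v = grid_xy - sensor_xy
    ang = np.arctan2(v[..., 1], v[..., 0])
    diff = np.abs((ang - wind_dir + np.pi) % (2 * np.pi) - np.pi)
    return (diff <= half_angle).astype(float)

# Overlay cones from three hypothetical detection events on a grid.
xs, ys = np.meshgrid(np.linspace(-5, 5, 50), np.linspace(-5, 5, 50))
grid = np.stack([xs, ys], axis=-1)
events = [((0.0, 0.0), 0.0),            # (sensor position, wind direction)
          ((0.0, -2.0), np.pi / 4),
          ((0.0, 2.0), -np.pi / 4)]
heat = sum(cone_probability(grid, np.array(s), w, np.pi / 8) for s, w in events)
likely = np.unravel_index(heat.argmax(), heat.shape)  # cell of max overlap
```

Rendering each cone with partial transparency and compositing them visually amounts to the same accumulation the sum performs numerically.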
Large-Scale Mapping of Human Activity using Geo-Tagged Videos
This paper is the first work to perform spatio-temporal mapping of human
activity using the visual content of geo-tagged videos. We utilize a recent
deep-learning based video analysis framework, termed hidden two-stream
networks, to recognize a range of activities in YouTube videos. This framework
is efficient and can run in real time or faster, which is important for
recognizing events as they occur in streaming video and for reducing latency in
analyzing already captured video. This is, in turn, important for using video
in smart-city applications. We perform a series of experiments to show that our
approach is able to accurately map activities both spatially and temporally. We
also demonstrate the advantages of using the visual content over the tags/titles.
Comment: Accepted at ACM SIGSPATIAL 201
Automatic active acoustic target detection in turbulent aquatic environments
This work is funded by the Environment and Food Security theme Ph.D. studentship from the University of Aberdeen, the Natural Environment Research Council (NERC) and Department for Environment, Food, and Rural Affairs (Defra grant NE/J004308/1), and the Marine Collaboration Research Forum (MarCRF). We would like to gratefully acknowledge the support from colleagues at Marine Scotland Science.
Peer reviewed
Publisher PD
- …