Thermo-visual feature fusion for object tracking using multiple spatiogram trackers
In this paper, we propose a framework that can efficiently combine features for robust tracking based on fusing the outputs of multiple spatiogram trackers. This is achieved without the exponential increase in storage and processing that other multimodal tracking approaches suffer from. The framework allows the features to be split arbitrarily between the trackers, and provides the flexibility to add, remove or dynamically weight features. We derive a mean-shift type algorithm for the framework that allows efficient object tracking with very low computational overhead. We especially target the fusion of thermal infrared and visible spectrum features as the most useful features for automated surveillance applications. Results are shown on multimodal video sequences, clearly illustrating the benefits of combining multiple features using our framework.
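The idea of fusing the outputs of several trackers with adjustable per-feature weights can be illustrated with a very small sketch. This is not the paper's mean-shift derivation over spatiograms; it assumes, purely for illustration, that each tracker reports a 2D position estimate and that fusion is a normalized weighted average. The function name `fuse_estimates` and the feature names are hypothetical.

```python
def fuse_estimates(estimates, weights):
    """Fuse per-feature position estimates by a normalized weighted average.

    estimates: dict mapping feature name -> (x, y) position estimate
    weights:   dict mapping feature name -> non-negative weight
    Both structures are assumptions made for this illustration.
    """
    total = sum(weights[name] for name in estimates)
    if total <= 0:
        raise ValueError("at least one feature needs a positive weight")
    x = sum(weights[name] * estimates[name][0] for name in estimates) / total
    y = sum(weights[name] * estimates[name][1] for name in estimates) / total
    return (x, y)


# Hypothetical outputs from a visible-spectrum and a thermal tracker.
estimates = {"visible": (10.0, 20.0), "thermal": (14.0, 24.0)}
# Weighting the thermal tracker more heavily, e.g. in poor lighting.
weights = {"visible": 1.0, "thermal": 3.0}
fused = fuse_estimates(estimates, weights)
```

Adding or removing a feature in this sketch only changes the dictionaries, and the cost grows linearly with the number of features, which is the kind of flexibility the abstract describes.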
Zero-Shot Event Detection by Multimodal Distributional Semantic Embedding of Videos
We propose a new zero-shot Event Detection method by Multi-modal
Distributional Semantic embedding of videos. Our model embeds object and action
concepts as well as other available modalities from videos into a
distributional semantic space. To our knowledge, this is the first Zero-Shot
event detection model that is built on top of distributional semantics and
extends it in the following directions: (a) semantic embedding of multimodal
information in videos (with focus on the visual modalities), (b) automatically
determining relevance of concepts/attributes to a free text query, which could
be useful for other applications, and (c) retrieving videos by free text event
query (e.g., "changing a vehicle tire") based on their content. We embed videos
into a distributional semantic space and then measure the similarity between
videos and the event query in a free text form. We validated our method on the
large TRECVID MED (Multimedia Event Detection) challenge. Using only the event
title as a query, our method outperformed the state of the art, which relies on
long event descriptions, improving MAP from 12.6% to 13.5% and ROC-AUC from 0.73
to 0.83. It is also an order of magnitude faster.
Comment: To appear in AAAI 201
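A minimal sketch of retrieval in a shared semantic space: videos and a free-text query are mapped to vectors, and videos are ranked by cosine similarity to the query. The embedding scheme here (averaging word vectors for the query, and a confidence-weighted sum of concept vectors for each video) and the toy 3-dimensional vectors are assumptions for illustration, not the model from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def embed_query(words, word_vectors, dim):
    """Average the vectors of the query words that have an embedding."""
    vecs = [word_vectors[w] for w in words if w in word_vectors]
    if not vecs:
        return [0.0] * dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def embed_video(concept_scores, word_vectors, dim):
    """Sum concept word vectors, weighted by (hypothetical) detector confidence."""
    out = [0.0] * dim
    for concept, score in concept_scores.items():
        vec = word_vectors.get(concept)
        if vec is not None:
            out = [o + score * v for o, v in zip(out, vec)]
    return out


def rank_videos(query_words, videos, word_vectors, dim):
    """Return (video_name, similarity) pairs, most similar first."""
    q = embed_query(query_words, word_vectors, dim)
    scored = [
        (name, cosine_similarity(q, embed_video(scores, word_vectors, dim)))
        for name, scores in videos.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy word vectors and concept detections, invented for this example.
word_vectors = {
    "tire": [1.0, 0.0, 0.0],
    "car": [0.9, 0.1, 0.0],
    "wheel": [0.8, 0.2, 0.0],
    "dog": [0.0, 1.0, 0.0],
    "park": [0.0, 0.8, 0.6],
}
videos = {
    "v1": {"car": 0.9, "wheel": 0.7},
    "v2": {"dog": 0.8, "park": 0.5},
}
ranking = rank_videos(["tire"], videos, word_vectors, dim=3)
```

Because the query never has to match a detector name exactly, a query word such as "tire" can still retrieve a video whose detected concepts are "car" and "wheel", which is the zero-shot property the abstract relies on.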
Deep learning in remote sensing: a review
Standing at the paradigm shift towards data-intensive science, machine
learning techniques are becoming increasingly important. In particular, as a
major breakthrough in the field, deep learning has proven to be an extremely
powerful tool in many fields. Shall we embrace deep learning as the key to all?
Or, should we resist a 'black-box' solution? There are controversial opinions
in the remote sensing community. In this article, we analyze the challenges of
using deep learning for remote sensing data analysis, review the recent
advances, and provide resources to make deep learning in remote sensing
ridiculously simple to start with. More importantly, we encourage remote sensing
scientists to bring their expertise into deep learning, and use it as an
implicit general model to tackle unprecedented, large-scale, influential
challenges, such as climate change and urbanization.
Comment: Accepted for publication in IEEE Geoscience and Remote Sensing Magazine
Multispectral object segmentation and retrieval in surveillance video
This paper describes a system for object segmentation and feature extraction for surveillance video. Segmentation is performed by a dynamic vision system that fuses information from thermal infrared video with standard CCTV video in order to detect and track objects. Separate background modelling in each modality and dynamic mutual information based thresholding are used to provide initial foreground candidates for tracking. The belief in the validity of these candidates is ascertained using knowledge of foreground pixels and temporal linking of candidates. The transferable belief model is used to combine these sources of information and segment objects. Extracted objects are subsequently tracked using adaptive thermo-visual appearance models. In order to facilitate search and classification of objects in large archives, retrieval features from both modalities are extracted for tracked objects. Overall system performance is demonstrated in a simple retrieval scenario.