1,034 research outputs found
Point-wise mutual information-based video segmentation with high temporal consistency
In this paper, we tackle the problem of temporally consistent boundary
detection and hierarchical segmentation in videos. While finding the best
high-level reasoning of region assignments in videos is the focus of much
recent research, temporal consistency in boundary detection has so far only
rarely been tackled. We argue that temporally consistent boundaries are a key
component to temporally consistent region assignment. The proposed method is
based on the point-wise mutual information (PMI) of spatio-temporal voxels.
Temporal consistency is established by an evaluation of PMI-based point
affinities in the spectral domain over space and time. Thus, the proposed
method is independent of any optical flow computation or previously learned
motion models. The proposed low-level video segmentation method outperforms the
learning-based state of the art in terms of standard region metrics
A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community
In recent years, deep learning (DL), a re-branding of neural networks (NNs),
has risen to the top in numerous areas, namely computer vision (CV), speech
recognition, natural language processing, etc. Whereas remote sensing (RS)
possesses a number of unique challenges, primarily related to sensors and
applications, inevitably RS draws from many of the same theories as CV; e.g.,
statistics, fusion, and machine learning, to name a few. This means that the RS
community should be aware of, if not at the leading edge of, of advancements
like DL. Herein, we provide the most comprehensive survey of state-of-the-art
RS DL research. We also review recent new developments in the DL field that can
be used in DL for RS. Namely, we focus on theories, tools and challenges for
the RS community. Specifically, we focus on unsolved challenges and
opportunities as it relates to (i) inadequate data sets, (ii)
human-understandable solutions for modelling physical phenomena, (iii) Big
Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and
learning algorithms for spectral, spatial and temporal data, (vi) transfer
learning, (vii) an improved theoretical understanding of DL systems, (viii)
high barriers to entry, and (ix) training and optimizing the DL.Comment: 64 pages, 411 references. To appear in Journal of Applied Remote
Sensin
Learning to Segment Moving Objects in Videos
We segment moving objects in videos by ranking spatio-temporal segment
proposals according to "moving objectness": how likely they are to contain a
moving object. In each video frame, we compute segment proposals using multiple
figure-ground segmentations on per frame motion boundaries. We rank them with a
Moving Objectness Detector trained on image and motion fields to detect moving
objects and discard over/under segmentations or background parts of the scene.
We extend the top ranked segments into spatio-temporal tubes using random
walkers on motion affinities of dense point trajectories. Our final tube
ranking consistently outperforms previous segmentation methods in the two
largest video segmentation benchmarks currently available, for any number of
proposals. Further, our per frame moving object proposals increase the
detection rate up to 7\% over previous state-of-the-art static proposal
methods
Learning Video Object Segmentation with Visual Memory
This paper addresses the task of segmenting moving objects in unconstrained
videos. We introduce a novel two-stream neural network with an explicit memory
module to achieve this. The two streams of the network encode spatial and
temporal features in a video sequence respectively, while the memory module
captures the evolution of objects over time. The module to build a "visual
memory" in video, i.e., a joint representation of all the video frames, is
realized with a convolutional recurrent unit learned from a small number of
training video sequences. Given a video frame as input, our approach assigns
each pixel an object or background label based on the learned spatio-temporal
features as well as the "visual memory" specific to the video, acquired
automatically without any manually-annotated frames. The visual memory is
implemented with convolutional gated recurrent units, which allows to propagate
spatial information over time. We evaluate our method extensively on two
benchmarks, DAVIS and Freiburg-Berkeley motion segmentation datasets, and show
state-of-the-art results. For example, our approach outperforms the top method
on the DAVIS dataset by nearly 6%. We also provide an extensive ablative
analysis to investigate the influence of each component in the proposed
framework
Spectral-spatial classification of n-dimensional images in real-time based on segmentation and mathematical morphology on GPUs
The objective of this thesis is to develop efficient schemes for spectral-spatial n-dimensional image
classification. By efficient schemes, we mean schemes that produce good classification results in
terms of accuracy, as well as schemes that can be executed in real-time on low-cost computing
infrastructures, such as the Graphics Processing Units (GPUs) shipped in personal computers. The
n-dimensional images include images with two and three dimensions, such as images coming from
the medical domain, and also images ranging from ten to hundreds of dimensions, such as the multiand
hyperspectral images acquired in remote sensing.
In image analysis, classification is a regularly used method for information retrieval in areas such as
medical diagnosis, surveillance, manufacturing and remote sensing, among others. In addition, as
the hyperspectral images have been widely available in recent years owing to the reduction in the
size and cost of the sensors, the number of applications at lab scale, such as food quality control, art
forgery detection, disease diagnosis and forensics has also increased. Although there are many
spectral-spatial classification schemes, most are computationally inefficient in terms of execution
time. In addition, the need for efficient computation on low-cost computing infrastructures is
increasing in line with the incorporation of technology into everyday applications.
In this thesis we have proposed two spectral-spatial classification schemes: one based on
segmentation and other based on wavelets and mathematical morphology. These schemes were
designed with the aim of producing good classification results and they perform better than other
schemes found in the literature based on segmentation and mathematical morphology in terms of
accuracy. Additionally, it was necessary to develop techniques and strategies for efficient GPU
computing, for example, a block–asynchronous strategy, resulting in an efficient implementation on
GPU of the aforementioned spectral-spatial classification schemes. The optimal GPU parameters
were analyzed and different data partitioning and thread block arrangements were studied to exploit
the GPU resources. The results show that the GPU is an adequate computing platform for on-board
processing of hyperspectral information
Computational imaging and automated identification for aqueous environments
Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the Massachusetts Institute of Technology and the Woods Hole Oceanographic Institution June 2011Sampling the vast volumes of the ocean requires tools capable of observing from a distance while retaining detail necessary for biology and ecology, ideal for optical methods.
Algorithms that work with existing SeaBED AUV imagery are developed, including habitat classi fication with bag-of-words models and multi-stage boosting for rock sh detection.
Methods for extracting images of sh from videos of longline operations are demonstrated.
A prototype digital holographic imaging device is designed and tested for quantitative
in situ microscale imaging. Theory to support the device is developed, including particle
noise and the effects of motion. A Wigner-domain model provides optimal settings and
optical limits for spherical and planar holographic references.
Algorithms to extract the information from real-world digital holograms are created.
Focus metrics are discussed, including a novel focus detector using local Zernike moments.
Two methods for estimating lateral positions of objects in holograms without reconstruction
are presented by extending a summation kernel to spherical references and using a local
frequency signature from a Riesz transform. A new metric for quickly estimating object
depths without reconstruction is proposed and tested. An example application, quantifying
oil droplet size distributions in an underwater plume, demonstrates the efficacy of the
prototype and algorithms.Funding was provided by NOAA Grant #5710002014, NOAA NMFS Grant #NA17RJ1223, NSF Grant #OCE-0925284, and NOAA Grant #NA10OAR417008
Deep learning in remote sensing: a review
Standing at the paradigm shift towards data-intensive science, machine
learning techniques are becoming increasingly important. In particular, as a
major breakthrough in the field, deep learning has proven as an extremely
powerful tool in many fields. Shall we embrace deep learning as the key to all?
Or, should we resist a 'black-box' solution? There are controversial opinions
in the remote sensing community. In this article, we analyze the challenges of
using deep learning for remote sensing data analysis, review the recent
advances, and provide resources to make deep learning in remote sensing
ridiculously simple to start with. More importantly, we advocate remote sensing
scientists to bring their expertise into deep learning, and use it as an
implicit general model to tackle unprecedented large-scale influential
challenges, such as climate change and urbanization.Comment: Accepted for publication IEEE Geoscience and Remote Sensing Magazin
- …