Computational intelligence approaches to robotics, automation, and control [Volume guest editors]
No abstract available
Novel data association methods for online multiple human tracking
PhD Thesis
Video-based multiple human tracking plays a crucial role in many applications
such as intelligent video surveillance, human behavior analysis, and
health-care systems. The detection-based tracking framework has become
the dominant paradigm in this research field, and the major task is to accurately
perform the data association between detections across the frames.
However, online multiple human tracking, which relies only on the detections
given up to the present time for the data association, becomes more
challenging with noisy detections, missed detections, and occlusions. To
address these problems, three novel data association
methods for online multiple human tracking are presented in this thesis:
online group-structured dictionary learning, enhanced detection
reliability, and multi-level cooperative fusion.
The first proposed method aims to address the noisy detections and
occlusions. In this method, sequential Monte Carlo probability hypothesis
density (SMC-PHD) filtering is the core element for accomplishing the
tracking task, where the measurements are produced by the detection-based
tracking framework. To enhance the measurement model, a novel adaptive
gating strategy is developed to aid the classification of measurements. In
addition, online group-structured dictionary learning with a maximum voting
method is proposed to robustly estimate the target birth intensity. It
enables the new-born targets in the tracking process to be accurately initialized
from noisy sensor measurements. To improve the adaptability of the
group-structured dictionary to target appearance changes, the simultaneous
codeword optimization (SimCO) algorithm is employed for the dictionary
update.
The second proposed method relates to accurate measurement selection
of detections, which further refines the noisy detections prior to the tracking
pipeline. In order to achieve more reliable measurements in the Gaussian
mixture (GM)-PHD filtering process, a global-to-local enhanced confidence
rescoring strategy is proposed by exploiting the classification power of a mask
region-convolutional neural network (R-CNN). Then, an improved pruning
algorithm, namely soft-aggregated non-maximal suppression (Soft-ANMS), is
devised to further enhance the selection step. In addition, to avoid the misuse
of ambiguous measurements in the tracking process, person re-identification
(ReID) features driven by convolutional neural networks (CNNs) are integrated
to model the target appearances.
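The abstract does not specify how Soft-ANMS works, so the following is only a minimal sketch of the standard Gaussian Soft-NMS idea that such pruning schemes build on: instead of discarding detections that overlap a stronger one, their scores are decayed according to the overlap. The function names, the `(x1, y1, x2, y2)` box format, and the `sigma` and `score_threshold` defaults are assumptions made for illustration, not the thesis's method.

```python
import math


def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS (illustrative; not the thesis's Soft-ANMS).

    Repeatedly keeps the highest-scoring detection and multiplies the
    scores of the remaining ones by exp(-iou^2 / sigma). Detections whose
    decayed score falls below score_threshold are dropped.
    Returns a list of (box, score) pairs in the order they were kept.
    """
    remaining = list(zip(boxes, scores))
    kept = []
    while remaining:
        best_idx = max(range(len(remaining)), key=lambda i: remaining[i][1])
        best_box, best_score = remaining.pop(best_idx)
        kept.append((best_box, best_score))
        decayed = []
        for box, score in remaining:
            overlap = iou(best_box, box)
            new_score = score * math.exp(-(overlap ** 2) / sigma)
            if new_score >= score_threshold:
                decayed.append((box, new_score))
        remaining = decayed
    return kept


if __name__ == "__main__":
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    demo_scores = [0.9, 0.8, 0.7]
    for box, score in soft_nms(demo_boxes, demo_scores):
        print(box, round(score, 3))
```

In this sketch a heavily overlapping detection survives with a reduced score rather than being deleted, which is the property that makes soft suppression attractive when people occlude one another.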
The third proposed method focuses on addressing the issues of missed
detections and occlusions. This method integrates two human detectors
with different characteristics (full-body and body-parts) in the GM-PHD
filter, and investigates their complementary benefits for tracking multiple
targets. For each detector domain, a novel discriminative correlation matching
(DCM) model is proposed for integration in the feature-level fusion and,
together with spatio-temporal information, is used to reduce ambiguous
identity associations in the GM-PHD filter. Moreover, a robust fusion
center is proposed within the decision-level fusion to mitigate the sensitivity
to missed detections in the fusion process, thereby improving the fusion
performance and tracking consistency.
The effectiveness of these proposed methods is investigated using the
MOTChallenge benchmark, which is a framework for the standardized evaluation
of multiple object tracking methods. Detailed evaluations on challenging
video datasets, as well as comparisons with recent state-of-the-art
techniques, confirm the improved multiple human tracking performance.
Modern Statistical/Machine Learning Techniques for Bio/Neuro-imaging Applications
Developments in modern bio-imaging techniques have allowed the routine collection of a vast amount of data from various techniques. The challenges lie in how to build accurate and efficient models to draw conclusions from the data and facilitate scientific discoveries. Fortunately, recent advances in statistics, machine learning, and deep learning provide valuable tools. This thesis describes some of our efforts to build scalable Bayesian models for four bio-imaging applications: (1) Stochastic Optical Reconstruction Microscopy (STORM) imaging, (2) particle tracking, (3) voltage smoothing, and (4) detecting color-labeled neurons in C. elegans and assigning identities to the detections.
Sparse Bayesian information filters for localization and mapping
Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the Massachusetts Institute of Technology and the Woods Hole Oceanographic Institution, February 2008.
This thesis formulates an estimation framework for Simultaneous Localization and
Mapping (SLAM) that addresses the problem of scalability in large environments.
We describe an estimation-theoretic algorithm that achieves significant gains in computational
efficiency while maintaining consistent estimates for the vehicle pose and
the map of the environment.
We specifically address the feature-based SLAM problem in which the robot represents
the environment as a collection of landmarks. The thesis takes a Bayesian
approach whereby we maintain a joint posterior over the vehicle pose and feature
states, conditioned upon measurement data. We model the distribution as Gaussian
and parametrize the posterior in the canonical form, in terms of the information
(inverse covariance) matrix. When sparse, this representation is amenable to computationally
efficient Bayesian SLAM filtering. However, while a large majority of the
elements within the normalized information matrix are very small in magnitude, it is
fully populated nonetheless. Recent feature-based SLAM filters achieve the scalability
benefits of a sparse parametrization by explicitly pruning these weak links in an effort
to enforce sparsity. We analyze one such algorithm, the Sparse Extended Information
Filter (SEIF), which has laid much of the groundwork concerning the computational
benefits of the sparse canonical form. The thesis performs a detailed analysis of the
process by which the SEIF approximates the sparsity of the information matrix and
reveals key insights into the consequences of different sparsification strategies. We
demonstrate that the SEIF yields a sparse approximation to the posterior that is inconsistent,
suffering from exaggerated confidence estimates. This overconfidence has
detrimental effects on important aspects of the SLAM process and affects the higher
level goal of producing accurate maps for subsequent localization and path planning.
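For readers unfamiliar with the canonical form mentioned above, the standard textbook relations can be written out as follows. The symbols ($\mu$, $\Sigma$, $\eta$, $\Lambda$, $H$, $R$, $z$, $h$) are conventional notation assumed here for illustration and are not taken from the thesis itself.

```latex
% Moment form and canonical (information) form of the same Gaussian
p(x) = \mathcal{N}(x;\, \mu, \Sigma) = \mathcal{N}^{-1}(x;\, \eta, \Lambda),
\qquad \Lambda = \Sigma^{-1}, \qquad \eta = \Lambda \mu .

% Extended information filter measurement update for z = h(x) + v,
% v \sim \mathcal{N}(0, R), with Jacobian H of h evaluated at the prior mean \bar{\mu}
\Lambda^{+} = \bar{\Lambda} + H^{\top} R^{-1} H ,
\qquad
\eta^{+} = \bar{\eta} + H^{\top} R^{-1} \bigl( z - h(\bar{\mu}) + H \bar{\mu} \bigr) .
```

Because the update is additive and $H$ is nonzero only for the vehicle pose and the observed landmarks, a measurement touches only a small block of $\Lambda$. That locality is what makes a sparse information matrix computationally attractive.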
This thesis proposes an alternative scalable filter that maintains sparsity while
preserving the consistency of the distribution. We leverage insights into the natural
structure of the feature-based canonical parametrization and derive a method that
actively maintains an exactly sparse posterior. Our algorithm exploits the structure
of the parametrization to achieve gains in efficiency, with a computational cost that
scales linearly with the size of the map. Unlike similar techniques that sacrifice
consistency for improved scalability, our algorithm performs inference over a posterior
that is conservative relative to the nominal Gaussian distribution. Consequently, we
preserve the consistency of the pose and map estimates and avoid the effects of an
overconfident posterior.
We demonstrate our filter alongside the SEIF and the standard EKF both in simulation
and on two real-world datasets. While we maintain the computational
advantages of an exactly sparse representation, the results show convincingly that
our method yields conservative estimates for the robot pose and map that are nearly
identical to those of the original Gaussian distribution as produced by the EKF, but
at much less computational expense.
The thesis concludes with an extension of our SLAM filter to a complex underwater
environment. We describe a systems-level framework for localization and mapping
relative to a ship hull with an Autonomous Underwater Vehicle (AUV) equipped
with a forward-looking sonar. The approach utilizes our filter to fuse measurements
of vehicle attitude and motion from onboard sensors with data from sonar images of
the hull. We employ the system to perform three-dimensional, 6-DOF SLAM on a
ship hull
Theory, Design, and Implementation of Landmark Promotion Cooperative Simultaneous Localization and Mapping
Simultaneous Localization and Mapping (SLAM) is a challenging problem in practice, and the use of multiple robots and inexpensive sensors places even more demands on the designer. Cooperative SLAM poses specific challenges in the areas of computational efficiency, software/network performance, and robustness to errors. New methods in image processing, recursive filtering, and SLAM have been developed to implement practical algorithms for cooperative SLAM on a set of inexpensive robots.
The Consolidated Unscented Mixed Recursive Filter (CUMRF) is designed to handle non-linear systems with non-Gaussian noise. This is accomplished using the Unscented Transform combined with Gaussian Mixture Models. The Robust Kalman Filter is an extension of the Kalman Filter algorithm that improves the ability to remove erroneous observations using Principal Component Analysis (PCA) and the X84 outlier rejection rule. Forgetful SLAM is a local SLAM technique that runs in nearly constant time relative to the number of visible landmarks and improves poorly performing sensors through sensor fusion and outlier rejection. Forgetful SLAM correlates all measured observations, but stops the state from growing over time. Hierarchical Active Ripple SLAM (HAR-SLAM) is a new SLAM architecture that breaks the traditional state space of SLAM into a chain of smaller state spaces, allowing multiple robots, multiple sensors, and multiple updates to occur in linear time with linear storage with respect to the number of robots, landmarks, and robot poses. This dissertation presents explicit methods for closing-the-loop, joining multiple robots, and active updates. Landmark Promotion SLAM is a hierarchy of new SLAM methods, using the Robust Kalman Filter, Forgetful SLAM, and HAR-SLAM.
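The X84 rule named above is a median-based outlier test: a value is rejected when it lies more than k median absolute deviations (MADs) from the median, with k = 5.2 being the usual choice (about 3.5 standard deviations for Gaussian data). The sketch below applies it to a list of scalar residuals. How the dissertation combines the rule with PCA inside the Robust Kalman Filter is not described in the abstract, so the function name and the scalar-residual setting are assumptions.

```python
from statistics import median


def x84_inliers(residuals, k=5.2):
    """Return a list of booleans marking which residuals pass the X84 rule.

    A residual is an inlier when |r - median| <= k * MAD, where MAD is the
    median absolute deviation of the residuals. Note that if MAD is zero
    (more than half the values are identical), every value that differs
    from the median is rejected.
    """
    values = list(residuals)
    med = median(values)
    mad = median([abs(r - med) for r in values])
    threshold = k * mad
    return [abs(r - med) <= threshold for r in values]


if __name__ == "__main__":
    # Hypothetical innovation residuals with one gross error (8.0).
    sample = [0.1, -0.2, 0.05, 0.0, 8.0, -0.1]
    print(x84_inliers(sample))
```

Because both the location and the scale estimate are medians, a single gross error such as the 8.0 above barely shifts the threshold, which is the reason for preferring this rule over a mean and standard deviation test.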
Practical aspects of SLAM are a focus of this dissertation. LK-SURF is a new image processing technique that combines Lucas-Kanade feature tracking with Speeded-Up Robust Features to perform spatial and temporal tracking. Typical stereo correspondence techniques fail at providing descriptors for features, or fail at temporal tracking. Several calibration and modeling techniques are also covered, including calibrating stereo cameras, aligning stereo cameras to an inertial system, and making neural net system models. These methods are important to improve the quality of the data and images acquired for the SLAM process.
A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community
In recent years, deep learning (DL), a re-branding of neural networks (NNs),
has risen to the top in numerous areas, namely computer vision (CV), speech
recognition, natural language processing, etc. Whereas remote sensing (RS)
possesses a number of unique challenges, primarily related to sensors and
applications, inevitably RS draws from many of the same theories as CV; e.g.,
statistics, fusion, and machine learning, to name a few. This means that the RS
community should be aware of, if not at the leading edge of, advancements
like DL. Herein, we provide the most comprehensive survey of state-of-the-art
RS DL research. We also review recent new developments in the DL field that can
be used in DL for RS. Namely, we focus on theories, tools and challenges for
the RS community. Specifically, we focus on unsolved challenges and
opportunities as they relate to (i) inadequate data sets, (ii)
human-understandable solutions for modelling physical phenomena, (iii) Big
Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and
learning algorithms for spectral, spatial and temporal data, (vi) transfer
learning, (vii) an improved theoretical understanding of DL systems, (viii)
high barriers to entry, and (ix) training and optimizing the DL.
Comment: 64 pages, 411 references. To appear in Journal of Applied Remote
Sensing