Appearance-free Tripartite Matching for Multiple Object Tracking
Multiple Object Tracking (MOT) detects the trajectories of multiple objects
given an input video. It has become more and more important for various
research and industry areas, such as cell tracking for biomedical research and
human tracking in video surveillance. Most existing algorithms depend on the
uniqueness of the object's appearance, and the dominating bipartite matching
scheme ignores the speed smoothness. Although several methods have incorporated
velocity smoothness for tracking, they either fail to pursue globally smooth
velocity or are often trapped in local optima. We focus on the general MOT
problem regardless of the appearance and propose an appearance-free tripartite
matching to avoid the irregular velocity problem of the bipartite matching. The
tripartite matching is formulated as maximizing the likelihood of the state
vectors constituted of the position and velocity of objects, which results in a
chain-dependent structure. We resort to the dynamic programming algorithm to
find such a maximum likelihood estimate. To overcome the high computational
cost induced by the vast search space of dynamic programming when many objects
are to be tracked, we decompose the space by the number of disappearing objects
and propose a reduced-space approach by truncating the decomposition. Extensive
simulations have shown the superiority and efficiency of our proposed method,
and the comparisons with top methods on Cell Tracking Challenge also
demonstrate our competence. We also applied our method to track the motion of
natural killer cells around tumor cells in a cancer study. The source
code is available at https://github.com/szcf-weiya/TriMatchMOT.
Comment: 36 pages, 14 figures
Background Subtraction in Real Applications: Challenges, Current Models and Future Directions
Computer vision applications based on videos often require the detection of
moving objects in their first step. Background subtraction is then applied in
order to separate the background and the foreground. In the literature, background
subtraction is among the most investigated fields in computer vision,
with a large number of publications. Most of them concern the application of
mathematical and machine learning models to be more robust to the challenges
met in videos. However, the ultimate goal is that the background subtraction
methods developed in research could be employed in real applications like
traffic surveillance. But looking at the literature, we can remark that there
is often a gap between the current methods used in real applications and the
current methods in fundamental research. In addition, the videos evaluated in
large-scale datasets are not exhaustive, in that they cover only
part of the complete spectrum of the challenges met in real applications. In
this context, we attempt to provide as exhaustive a survey as possible of
real applications that used background subtraction in order to identify the
real challenges met in practice and the background models currently used, and to
provide future directions. Thus, challenges are investigated in terms of
camera, foreground objects and environments. In addition, we identify the
background models that are effectively used in these applications in order to
find potentially usable recent background models in terms of robustness, time and
memory requirements.
Comment: Submitted to Computer Science Review
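As a rough illustration of the basic idea of separating foreground from background (not a model taken from the survey itself), here is a minimal sketch of running-average background subtraction. The function names and the `alpha` and `threshold` parameters are hypothetical choices made for this example, and frames are assumed to be plain nested lists of grayscale intensities.

```python
def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the background estimate (exponential running average)."""
    return [
        [(1 - alpha) * b + alpha * f for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def foreground_mask(background, frame, threshold=30):
    """Mark a pixel as foreground when it differs from the background by more than threshold."""
    return [
        [abs(f - b) > threshold for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


# Toy 2x3 example: one bright pixel appears against a flat background.
background = [[10.0, 10.0, 10.0], [10.0, 10.0, 10.0]]
frame = [[10, 200, 10], [12, 10, 10]]

mask = foreground_mask(background, frame)          # detect before updating the model
background = update_background(background, frame)  # then absorb the frame slowly
```

A small `alpha` makes the model adapt slowly, so a briefly visible object is not absorbed into the background; practical systems typically use more robust models than this one.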
Multiple Object Tracking: A Literature Review
Multiple Object Tracking (MOT) is an important computer vision problem which
has gained increasing attention due to its academic and commercial potential.
Although different kinds of approaches have been proposed to tackle this
problem, it still remains challenging due to factors like abrupt appearance
changes and severe object occlusions. In this work, we contribute the first
comprehensive and most recent review on this problem. We inspect the recent
advances in various aspects and propose some interesting directions for future
research. To the best of our knowledge, there has not been any extensive review
on this topic in the community. We endeavor to provide a thorough review on the
development of this problem in recent decades. The main contributions of this
review are fourfold: 1) Key aspects of a multiple object tracking system,
including formulation, categorization, key principles, and evaluation of MOT, are
discussed. 2) Instead of enumerating individual works, we discuss existing
approaches according to various aspects, in each of which methods are divided
into different groups and each group is discussed in detail for the principles,
advances and drawbacks. 3) We examine experiments of existing publications and
summarize results on popular datasets to provide quantitative comparisons. We
also point to some interesting discoveries by analyzing these results. 4) We
provide a discussion about issues of MOT research, as well as some interesting
directions which could become research efforts in the future.
A Survey on Content-Aware Video Analysis for Sports
Sports data analysis is becoming increasingly large-scale, diversified, and
shared, but difficulty persists in rapidly accessing the most crucial
information. Previous surveys have focused on the methodologies of sports video
analysis from the spatiotemporal viewpoint instead of a content-based
viewpoint, and few of these studies have considered semantics. This study
develops a deeper interpretation of content-aware sports video analysis by
examining the insight offered by research into the structure of content under
different scenarios. On the basis of this insight, we provide an overview of
the themes particularly relevant to the research on content-aware systems for
broadcast sports. Specifically, we focus on the video content analysis
techniques applied in sportscasts over the past decade from the perspectives of
fundamentals and general review, a content hierarchical model, and trends and
challenges. Content-aware analysis methods are discussed with respect to
object-, event-, and context-oriented groups. In each group, the gap between
sensation and content excitement must be bridged using proper strategies. In
this regard, a content-aware approach is required to determine user demands.
Finally, the paper summarizes the future trends and challenges for sports video
analysis. We believe that our findings can advance the field of research on
content-aware video analysis for broadcast sports.
Comment: Accepted for publication in IEEE Transactions on Circuits and Systems
for Video Technology (TCSVT)
All Weather Perception: Joint Data Association, Tracking, and Classification for Autonomous Ground Vehicles
A novel probabilistic perception algorithm is presented as a real-time joint
solution to data association, object tracking, and object classification for an
autonomous ground vehicle in all-weather conditions. The presented algorithm
extends a Rao-Blackwellized Particle Filter originally built with a particle
filter for data association and a Kalman filter for multi-object tracking
(Miller et al. 2011a) to now also include multiple model tracking for
classification. Additionally, a state-of-the-art vision detection algorithm that
includes heading information for autonomous ground vehicle (AGV) applications
was implemented. Cornell's AGV from the DARPA Urban Challenge was upgraded and
used to experimentally examine if and how state-of-the-art vision algorithms
can complement or replace lidar and radar sensors. Sensor and algorithm
performance in adverse weather and lighting conditions is tested. Experimental
evaluation demonstrates robust all-weather data association, tracking, and
classification where camera, lidar, and radar sensors complement each other
inside the joint probabilistic perception algorithm.
Comment: 35 pages, 21 figures, 14 tables
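The Kalman filter component mentioned in this abstract can be illustrated with a minimal scalar sketch. This is not the paper's Rao-Blackwellized Particle Filter: it assumes a one-dimensional random-walk state, and `kalman_step`, the process noise `q`, and the measurement noise `r` are hypothetical names chosen for illustration.

```python
def kalman_step(x, p, z, q=0.01, r=1.0):
    """One predict/update cycle of a scalar Kalman filter.

    x, p: prior state estimate and its variance
    z:    new measurement
    q, r: process and measurement noise variances (assumed values)
    """
    # Predict: a random-walk model keeps the estimate and inflates its variance.
    x_pred = x
    p_pred = p + q

    # Update: weight the measurement residual by the Kalman gain.
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (z - x_pred)
    p_new = (1 - gain) * p_pred
    return x_new, p_new


# Track a slowly varying quantity from noisy readings.
estimate, variance = 0.0, 1.0
for reading in [1.1, 0.9, 1.05, 0.98]:
    estimate, variance = kalman_step(estimate, variance, reading)
```

Each measurement pulls the estimate toward the reading by an amount set by the gain, and the variance shrinks as evidence accumulates.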
Temporal Dynamic Appearance Modeling for Online Multi-Person Tracking
Robust online multi-person tracking requires the correct associations of
online detection responses with existing trajectories. We address this problem
by developing a novel appearance modeling approach to provide accurate
appearance affinities to guide data association. In contrast to most existing
algorithms that only consider the spatial structure of human appearances, we
exploit the temporal dynamic characteristics within temporal appearance
sequences to discriminate different persons. The temporal dynamics
sufficiently complement the spatial structure of varying appearances in the
feature space, which significantly improves the affinity measurement between
trajectories and detections. We propose a feature selection algorithm to
describe the appearance variations with mid-level semantic features, and
demonstrate its usefulness in terms of temporal dynamic appearance modeling.
Moreover, the appearance model is learned incrementally by alternately
evaluating newly-observed appearances and adjusting the model parameters to be
suitable for online tracking. Reliable tracking of multiple persons in complex
scenes is achieved by incorporating the learned model into an online
tracking-by-detection framework. Our experiments on the challenging benchmark
MOTChallenge 2015 demonstrate that our method outperforms the state-of-the-art
multi-person tracking algorithms.
Detection, Recognition and Tracking of Moving Objects from Real-time Video via SP Theory of Intelligence and Species Inspired PSO
In this paper, we address the basic problem of recognizing moving objects in
video images using the SP Theory of Intelligence. The SP Theory of
Intelligence, a framework for artificial intelligence, was first
introduced by Gerard J Wolff, where S stands for Simplicity and P stands for
Power. Using the concept of multiple alignment, we detect and recognize objects
of our interest in video frames with multilevel hierarchical parts and
subparts, based on polythetic categories. We track the recognized objects using
the species based Particle Swarm Optimization (PSO). First, we extract the
multiple alignment of our object of interest from training images. In order to
recognize accurately and handle occlusion, we use the polythetic concepts on
raw data line to omit redundant noise by searching for the best alignment
representing the features from the extracted alignments. We recognize the
domain of interest from the video scenes in the form of a wide variety of multiple
alignments to handle scene variability. Unsupervised learning is done in the SP
model following the DONSVIC principle and natural structures are discovered via
information compression and pattern analysis. After successful recognition of
objects, we use a species-based PSO algorithm, as the alignments of our object of
interest are analogous to the observation likelihood and fitness ability of species.
Subsequently, we analyze the competition and repulsion among species with
annealed Gaussian based PSO. We have tested our algorithms on David, Walking2,
FaceOcc1, Jogging and Dudek, obtaining very satisfactory and competitive
results.
FlightGoggles: A Modular Framework for Photorealistic Camera, Exteroceptive Sensor, and Dynamics Simulation
FlightGoggles is a photorealistic sensor simulator for perception-driven
robotic vehicles. The key contributions of FlightGoggles are twofold. First,
FlightGoggles provides photorealistic exteroceptive sensor simulation using
graphics assets generated with photogrammetry. Second, it provides the ability
to combine (i) synthetic exteroceptive measurements generated in silico in real
time and (ii) vehicle dynamics and proprioceptive measurements generated in
motio by vehicle(s) in a motion-capture facility. FlightGoggles is capable of
simulating a virtual-reality environment around autonomous vehicle(s). While a
vehicle is in flight in the FlightGoggles virtual reality environment,
exteroceptive sensors are rendered synthetically in real time while all complex
extrinsic dynamics are generated organically through the natural interactions
of the vehicle. The FlightGoggles framework allows researchers to
accelerate development by circumventing the need to estimate complex and
hard-to-model interactions such as aerodynamics, motor mechanics, battery
electrochemistry, and behavior of other agents. The ability to perform
vehicle-in-the-loop experiments with photorealistic exteroceptive sensor
simulation facilitates novel research directions involving, e.g., fast and
agile autonomous flight in obstacle-rich environments, safe human interaction,
and flexible sensor selection. FlightGoggles has been utilized as the main test
for selecting nine teams that will advance in the AlphaPilot autonomous drone
racing challenge. We survey approaches and results from the top AlphaPilot
teams, which may be of independent interest.
Comment: Initial version appeared at IROS 2019. Supplementary material can be
found at https://flightgoggles.mit.edu. Revision includes description of new
FlightGoggles features, such as a photogrammetric model of the MIT Stata
Center, new rendering settings, and a Python API
Markov Decision Processes with Applications in Wireless Sensor Networks: A Survey
Wireless sensor networks (WSNs) consist of autonomous and resource-limited
devices. The devices cooperate to monitor one or more physical phenomena within
an area of interest. WSNs operate as stochastic systems because of randomness
in the monitored environments. For long service time and low maintenance cost,
WSNs require adaptive and robust methods to address data exchange, topology
formulation, resource and power optimization, sensing coverage and object
detection, and security challenges. In these problems, sensor nodes are to make
optimized decisions from a set of accessible strategies to achieve design
goals. This survey reviews numerous applications of the Markov decision process
(MDP) framework, a powerful decision-making tool to develop adaptive algorithms
and protocols for WSNs. Furthermore, various solution methods are discussed and
compared to serve as a guide for using MDPs in WSNs.
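To make the MDP framing concrete, the following is a minimal sketch of value iteration, one standard solution method, on a toy sensor node. The two battery states, the two actions, and all transition and reward numbers are invented for illustration and are not taken from the survey.

```python
def value_iteration(states, actions, transition, reward, gamma=0.9, tol=1e-8):
    """Return (values, policy) for a finite MDP via synchronous value iteration."""
    values = {s: 0.0 for s in states}

    def q_value(s, a):
        # Immediate reward plus discounted expected value of the next state.
        return reward[(s, a)] + gamma * sum(
            p * values[s2] for s2, p in transition[(s, a)].items()
        )

    while True:
        new_values = {s: max(q_value(s, a) for a in actions) for s in states}
        delta = max(abs(new_values[s] - values[s]) for s in states)
        values = new_values
        if delta < tol:
            break

    # Greedy policy with respect to the converged values.
    policy = {s: max(actions, key=lambda a: q_value(s, a)) for s in states}
    return values, policy


# Hypothetical sensor node: sensing drains the battery, sleeping recharges it.
states = ["high", "low"]
actions = ["sense", "sleep"]
transition = {
    ("high", "sense"): {"low": 1.0},
    ("high", "sleep"): {"high": 1.0},
    ("low", "sense"): {"low": 1.0},
    ("low", "sleep"): {"high": 1.0},
}
reward = {
    ("high", "sense"): 1.0,   # useful measurement
    ("high", "sleep"): 0.0,
    ("low", "sense"): -1.0,   # penalty for sensing on a weak battery
    ("low", "sleep"): 0.0,
}

values, policy = value_iteration(states, actions, transition, reward)
```

In this toy model the resulting policy senses when the battery is high and sleeps when it is low, which is the kind of adaptive duty cycling an MDP formulation is meant to capture.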