5,719 research outputs found
Trajectory based video analysis in multi-camera setups
PhD
This thesis presents an automated framework for activity analysis in multi-camera
setups. We start with the calibration of cameras, particularly those without overlapping
views. An algorithm is presented that exploits trajectory observations in each view
and works iteratively on camera pairs. First, outliers are identified and removed
from observations of each camera. Next, spatio-temporal information derived from
the available trajectory is used to estimate unobserved trajectory segments in areas
uncovered by the cameras. The unobserved trajectory estimates are used to estimate
the relative position of each camera pair, whereas the exit-entrance direction of
each object is used to estimate their relative orientation. The process continues and
iteratively approximates the configuration of all cameras with respect to each other.
Finally, we refine the initial configuration estimates with bundle adjustment, based
on the observed and estimated trajectory segments. For cameras with overlapping
views, state-of-the-art homography based approaches are used for calibration.
Next we establish object correspondence across multiple views. Our algorithm
consists of three steps, namely association, fusion and linkage. For association,
local trajectory pairs corresponding to the same physical object are estimated using
multiple spatio-temporal features on a common ground plane. To disambiguate
spurious associations, we employ a hybrid approach that utilises the matching results
on the image plane and ground plane. The trajectory segments after association
are fused by adaptive averaging. Trajectory linkage then integrates segments and generates a single trajectory of an object across the entire observed area.
Finally, for activity analysis, clustering is applied to complete trajectories. Our
clustering algorithm is based on four main steps, namely the extraction of a set of
representative trajectory features, non-parametric clustering, cluster merging and
information fusion for the identification of normal and rare object motion patterns.
First we transform the trajectories into a set of feature spaces on which Mean-shift
identifies the modes and the corresponding clusters. Furthermore, a merging
procedure is devised to refine these results by combining similar adjacent clusters.
The final common patterns are estimated by fusing the clustering results across all
feature spaces. Clusters corresponding to reoccurring trajectories are considered as
normal, whereas sparse trajectories are associated with abnormal and rare events.
The performance of the proposed framework is evaluated on standard data-sets
and compared with state-of-the-art techniques. Experimental results show that
the proposed framework outperforms state-of-the-art algorithms both in terms of
accuracy and robustness.
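The clustering stage described above (Mean-shift mode finding in a feature space, merging of adjacent modes, and a support threshold that separates normal from rare motion patterns) can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the flat-kernel Mean-shift, the bandwidth, the merge radius, and the `min_support` threshold are all illustrative assumptions.

```python
import numpy as np

def mean_shift(points, bandwidth, n_iter=50):
    """Shift each point toward its local density mode (flat kernel)."""
    modes = points.astype(float).copy()
    for _ in range(n_iter):
        for i, p in enumerate(modes):
            # neighbors within the bandwidth of the current mode estimate
            neighbors = points[np.linalg.norm(points - p, axis=1) < bandwidth]
            modes[i] = neighbors.mean(axis=0)
    return modes

def cluster_trajectories(features, bandwidth=1.0, min_support=3):
    """Cluster trajectory features; sparse clusters are flagged as rare."""
    modes = mean_shift(features, bandwidth)
    labels = -np.ones(len(features), dtype=int)
    centers = []
    for i, m in enumerate(modes):
        for k, c in enumerate(centers):
            if np.linalg.norm(m - c) < bandwidth / 2:  # merge adjacent modes
                labels[i] = k
                break
        else:
            centers.append(m)
            labels[i] = len(centers) - 1
    counts = np.bincount(labels)
    rare = counts[labels] < min_support  # low-support clusters -> rare events
    return labels, rare
```

In the full framework this would be run per feature space and the labels fused across spaces; the sketch shows a single feature space only.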
Algorithms for trajectory integration in multiple views
PhD
This thesis addresses the problem of deriving a coherent and accurate localization
of moving objects from partial visual information when data are generated by cameras
placed in different view angles with respect to the scene. The framework is built around
applications of scene monitoring with multiple cameras. Firstly, we demonstrate how a
geometric-based solution exploits the relationships between corresponding feature points
across views and improves accuracy in object location. Then, we improve the estimation
of objects' locations with geometric transformations that account for lens distortions.
Additionally, we study the integration of the partial visual information generated by each
individual sensor and their combination into one single frame of observation that considers
object association and data fusion. Our approach is fully image-based, only relies on 2D
constructs and does not require any complex computation in 3D space. We exploit the
continuity and coherence in objects' motion when crossing cameras' fields of view. Additionally,
we work under the assumption of a planar ground plane and a wide baseline (i.e.
cameras' viewpoints are far apart). The main contributions are: i) the development of a
framework for distributed visual sensing that accounts for inaccuracies in the geometry
of multiple views; ii) the reduction of trajectory mapping errors using a statistical-based
homography estimation; iii) the integration of a polynomial method for correcting inaccuracies
caused by the cameras' lens distortion; iv) a global trajectory reconstruction
algorithm that associates and integrates fragments of trajectories generated by each camera.
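The core geometric step in this thesis — mapping trajectories observed on each image plane onto a common ground plane via a homography — can be sketched with the standard Direct Linear Transform. Note this is a plain least-squares DLT for illustration, not the statistical homography estimator the thesis actually proposes for reducing mapping error.

```python
import numpy as np

def estimate_homography(src, dst):
    """DLT: fit H such that dst ~ H @ src in homogeneous coordinates."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # the null vector of A (last right singular vector) holds H's entries
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def map_points(H, pts):
    """Project 2D points through homography H (with perspective divide)."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    out = pts_h @ H.T
    return out[:, :2] / out[:, 2:3]
```

With at least four non-degenerate correspondences per camera pair, each camera's trajectory fragments can be mapped into the shared ground-plane frame before association and fusion.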
C-blox: A Scalable and Consistent TSDF-based Dense Mapping Approach
In many applications, maintaining a consistent dense map of the environment
is key to enabling robotic platforms to perform higher level decision making.
Several works have addressed the challenge of creating precise dense 3D maps
from visual sensors providing depth information. However, during operation over
longer missions, reconstructions can easily become inconsistent due to
accumulated camera tracking error and delayed loop closure. Without explicitly
addressing the problem of map consistency, recovery from such distortions tends
to be difficult. We present a novel system for dense 3D mapping which addresses
the challenge of building consistent maps while dealing with scalability.
Central to our approach is the representation of the environment as a
collection of overlapping TSDF subvolumes. These subvolumes are localized
through feature-based camera tracking and bundle adjustment. Our main
contribution is a pipeline for identifying stable regions in the map, and to
fuse the contributing subvolumes. This approach allows us to reduce map growth
while still maintaining consistency. We demonstrate the proposed system on a
publicly available dataset and simulation engine, and demonstrate the efficacy
of the proposed approach for building consistent and scalable maps. Finally we
demonstrate our approach running in real-time on-board a lightweight MAV.
Comment: 8 pages, 5 figures, conference
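The TSDF representation at the heart of this system fuses each new depth observation into a voxel grid by a truncated, weighted running average. The sketch below shows that standard per-voxel update only; the grid shape, truncation distance, and weights are illustrative, and the subvolume bookkeeping, camera tracking, and bundle adjustment that C-blox adds are omitted.

```python
import numpy as np

class TSDFVolume:
    """Minimal truncated signed distance volume with weighted-average fusion."""

    def __init__(self, shape, trunc=0.1):
        self.tsdf = np.ones(shape)       # 1.0 = far in front of any surface
        self.weight = np.zeros(shape)    # per-voxel observation weight
        self.trunc = trunc

    def integrate(self, sdf, obs_weight=1.0):
        """Fuse one signed-distance observation (same grid shape)."""
        d = np.clip(sdf / self.trunc, -1.0, 1.0)  # truncate and normalize
        w_new = self.weight + obs_weight
        # weighted running average of TSDF values
        self.tsdf = (self.tsdf * self.weight + d * obs_weight) / w_new
        self.weight = w_new
```

The zero-crossing of the fused `tsdf` field then gives the reconstructed surface; averaging over many observations is what suppresses per-frame depth noise.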
The Interstate-24 3D Dataset: a new benchmark for 3D multi-camera vehicle tracking
This work presents a novel video dataset recorded from overlapping highway
traffic cameras along an urban interstate, enabling multi-camera 3D object
tracking in a traffic monitoring context. Data is released from 3 scenes
containing video from at least 16 cameras each, totaling 57 minutes in length.
877,000 3D bounding boxes and corresponding object tracklets are fully and
accurately annotated for each camera field of view and are combined into a
spatially and temporally continuous set of vehicle trajectories for each scene.
Lastly, existing algorithms are combined to benchmark a number of 3D
multi-camera tracking pipelines on the dataset, with results indicating that
the dataset is challenging due to the difficulty of matching objects traveling
at high speeds across cameras and heavy object occlusion, potentially for
hundreds of frames, during congested traffic. This work aims to enable the
development of accurate and automatic vehicle trajectory extraction algorithms,
which will play a vital role in understanding impacts of autonomous vehicle
technologies on the safety and efficiency of traffic.
Data fusion in ubiquitous networked robot systems for urban services
There is a clear trend in the use of robots
to accomplish services that can help humans. In this
paper, robots acting in urban environments are considered
for the task of person guiding. Nowadays, it is
common to have ubiquitous sensors integrated within
the buildings, such as camera networks, and wireless
communications like 3G or WiFi. Such infrastructure
can be directly used by robotic platforms. The paper
shows how combining the information from the robots
and the sensors allows tracking failures to be overcome,
by being more robust under occlusion, clutter, and
lighting changes. The paper describes the algorithms
for tracking with a set of fixed surveillance cameras
and the algorithms for position tracking using the signal
strength received by a wireless sensor network (WSN).
Moreover, an algorithm to obtain estimates of the positions of people from cameras on board robots is
described. The estimates from all these sources are then
combined using a decentralized data fusion algorithm
to provide an increase in performance. This scheme is
scalable and can handle communication latencies and
failures. We present results of the system operating in
real time on a large outdoor environment, including 22
nonoverlapping cameras, WSN, and several robots.
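One standard building block for the decentralized fusion of position estimates from such heterogeneous sources (fixed cameras, WSN signal strength, on-board cameras) is covariance intersection, which remains consistent even when the cross-correlation between sources is unknown. The sketch below is an illustrative fusion rule, not necessarily the exact algorithm used in the paper, and the trace-based weight is a simple heuristic.

```python
import numpy as np

def covariance_intersection(x1, P1, x2, P2, omega=None):
    """Fuse two estimates (mean, covariance) with unknown cross-correlation."""
    I1, I2 = np.linalg.inv(P1), np.linalg.inv(P2)
    if omega is None:
        # heuristic weight: trust the estimate carrying more information
        omega = np.trace(I1) / (np.trace(I1) + np.trace(I2))
    P = np.linalg.inv(omega * I1 + (1 - omega) * I2)
    x = P @ (omega * I1 @ x1 + (1 - omega) * I2 @ x2)
    return x, P
```

Because each node only exchanges means and covariances, the scheme scales with the number of sensors and degrades gracefully under communication latency or failure, which matches the requirements stated in the abstract.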
Learning Higher-order Transition Models in Medium-scale Camera Networks
We present a Bayesian framework for learning higher-order transition models in video surveillance networks. Such higher-order models describe object movement between cameras in the network and have a greater predictive power for multi-camera tracking than camera adjacency alone. These models also provide inherent resilience to camera failure, filling in gaps left by single or even multiple non-adjacent camera failures. Our approach to estimating higher-order transition models relies on the accurate assignment of camera observations to the underlying trajectories of objects moving through the network. We address this data association problem by gathering the observations and evaluating alternative partitions of the observation set into individual object trajectories. Searching the complete partition space is intractable, so an incremental approach is taken, iteratively adding observations and pruning unlikely partitions. Partition likelihood is determined by the evaluation of a probabilistic graphical model. When the algorithm has considered all observations, the most likely (MAP) partition is taken as the true object trajectories. From these recovered trajectories, the higher-order statistics we seek can be derived and employed for tracking. The partitioning algorithm we present is parallel in nature and can be readily extended to distributed computation in medium-scale smart camera networks.
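Once the MAP partition has yielded object trajectories as camera-ID sequences, the higher-order statistics reduce to conditional transition counts. The sketch below is a minimal maximum-likelihood estimate of P(next camera | previous `order` cameras); it assumes the data-association step has already produced the trajectories, and the camera names are hypothetical.

```python
from collections import Counter, defaultdict

def fit_transition_model(trajectories, order=2):
    """Estimate P(next camera | previous `order` cameras) from trajectories."""
    counts = defaultdict(Counter)
    for traj in trajectories:
        for i in range(len(traj) - order):
            context = tuple(traj[i:i + order])      # previous `order` cameras
            counts[context][traj[i + order]] += 1   # count observed successor
    # normalize counts into conditional probabilities per context
    return {ctx: {cam: n / sum(nxt.values()) for cam, n in nxt.items()}
            for ctx, nxt in counts.items()}
```

This makes the advantage over adjacency concrete: with `order=2`, objects arriving at camera B from A can be predicted differently from objects arriving at B from X, which a first-order (adjacency) model cannot distinguish.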