Unsupervised Domain Adaptation for Multispectral Pedestrian Detection
Multimodal information (e.g., visible and thermal) can generate robust
pedestrian detections to facilitate around-the-clock computer vision
applications, such as autonomous driving and video surveillance. However, it
still remains a crucial challenge to train a reliable detector working well in
different multispectral pedestrian datasets without manual annotations. In this
paper, we propose a novel unsupervised domain adaptation framework for
multispectral pedestrian detection, by iteratively generating pseudo
annotations and updating the parameters of our designed multispectral
pedestrian detector on the target domain. Pseudo annotations are generated using
the detector trained on the source domain, and then updated by fixing the
parameters of the detector and minimizing the cross-entropy loss without
back-propagation. Training labels are generated from the pseudo annotations by
considering the characteristics of similarity and complementarity between
well-aligned visible and infrared image pairs. The parameters of the detector are
updated using the generated labels by minimizing our defined multi-detection
loss function with back-propagation. The optimal parameters of the detector can be
obtained after iteratively updating the pseudo annotations and parameters.
Experimental results show that our proposed unsupervised multimodal domain
adaptation method achieves significantly higher detection performance than the
approach without domain adaptation, and is competitive with the supervised
multispectral pedestrian detectors.
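The alternation described in this abstract (generate pseudo annotations with the parameters fixed, then update the parameters on those annotations) can be illustrated with a deliberately tiny sketch. Everything here is an assumption made for illustration: a 1-D nearest-centroid classifier stands in for the detector, the data are made up, and the names `fit_centroids`, `predict`, and `adapt` are hypothetical. It does not reproduce the paper's detector, label generation, or multi-detection loss.

```python
# Toy self-training loop: a 1-D nearest-centroid classifier stands in
# for the detector (an assumption, not the paper's model).

def fit_centroids(xs, ys):
    """Return the mean feature value of each class (0 and 1)."""
    means = {}
    for label in (0, 1):
        members = [x for x, y in zip(xs, ys) if y == label]
        means[label] = sum(members) / len(members)
    return means

def predict(centroids, xs):
    """Assign each sample to the class with the nearest centroid."""
    return [min(centroids, key=lambda c: abs(x - centroids[c])) for x in xs]

def accuracy(predicted, truth):
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)

def adapt(source_x, source_y, target_x, rounds=5):
    """Alternate pseudo-labeling and parameter updates on the target domain."""
    centroids = fit_centroids(source_x, source_y)  # train on labeled source data
    for _ in range(rounds):
        pseudo = predict(centroids, target_x)      # step 1: parameters fixed
        if len(set(pseudo)) < 2:                   # a class vanished; stop early
            break
        centroids = fit_centroids(target_x, pseudo)  # step 2: update parameters
    return centroids

# Made-up data: the target domain is the source domain shifted by +3.
source_x = [0.0, 1.0, 2.0, 6.0, 7.0, 8.0]
source_y = [0, 0, 0, 1, 1, 1]
target_x = [3.0, 3.5, 4.5, 9.0, 10.0, 11.0]
target_y = [0, 0, 0, 1, 1, 1]  # used only for evaluation, never for training

before = accuracy(predict(fit_centroids(source_x, source_y), target_x), target_y)
after = accuracy(predict(adapt(source_x, source_y, target_x), target_x), target_y)
print(before, after)
```

On this toy data the source-only classifier mislabels one target sample, and the adapted classifier labels all six correctly. Real pseudo-labeling can also reinforce its own mistakes, which is presumably why the abstract refines the pseudo annotations before using them.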
CT-Mapper: Mapping Sparse Multimodal Cellular Trajectories using a Multilayer Transportation Network
Mobile phone data have recently become an attractive source of information
about mobility behavior. Since cell phone data can be captured in a passive way
for a large user population, they can be harnessed to collect well-sampled
mobility information. In this paper, we propose CT-Mapper, an unsupervised
algorithm that enables the mapping of mobile phone traces over a multimodal
transport network. One of the main strengths of CT-Mapper is its ability to
map noisy, sparse, multimodal cellular trajectories over a multilayer
transportation network whose layers have different physical properties, rather
than only trajectories associated with a single layer. Such a network is
modeled by a large multilayer graph in which the nodes correspond to
metro/train stations or road intersections and edges correspond to connections
between them. The mapping problem is modeled by an unsupervised HMM where the
observations correspond to sparse user mobile trajectories and the hidden
states to the multilayer graph nodes. The HMM is unsupervised because the transition
and emission probabilities are inferred from the physical
transportation properties and the spatial coverage of
antenna base stations, respectively. To evaluate CT-Mapper, we collected cellular traces with
their corresponding GPS trajectories for a group of volunteer users in Paris
and vicinity (France). We show that CT-Mapper is able to accurately retrieve
the real cell phone user paths despite the sparsity of the observed trace
trajectories. Furthermore, our transition probability model is up to 20% more
accurate than other naive models. Comment: Under revision in Computer Communication Journal
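As a rough illustration of the HMM formulation in this abstract, the sketch below decodes the most likely node sequence with the standard Viterbi algorithm. The three-node graph, the cell names, and all probabilities are invented for the example, and `viterbi` is a hypothetical helper. CT-Mapper's actual transition and emission models are not given in the abstract and are not reproduced here.

```python
import math

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path (log-space Viterbi)."""
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # Initialise with the first observation.
    best = {s: log(start_p[s]) + log(emit_p[s].get(observations[0], 0.0))
            for s in states}
    backpointers = []
    for obs in observations[1:]:
        new_best, pointers = {}, {}
        for s in states:
            # Best predecessor of s, given the allowed graph transitions.
            prev = max(states, key=lambda p: best[p] + log(trans_p[p].get(s, 0.0)))
            new_best[s] = (best[prev] + log(trans_p[prev].get(s, 0.0))
                           + log(emit_p[s].get(obs, 0.0)))
            pointers[s] = prev
        best = new_best
        backpointers.append(pointers)

    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]

# Hypothetical chain graph A - B - C; a missing entry means "no edge".
states = ["A", "B", "C"]
start_p = {s: 1 / 3 for s in states}
trans_p = {
    "A": {"A": 0.5, "B": 0.5},
    "B": {"A": 0.25, "B": 0.5, "C": 0.25},
    "C": {"B": 0.5, "C": 0.5},
}
# Assumed probability of observing each antenna cell from each node.
emit_p = {
    "A": {"cell_1": 0.8, "cell_2": 0.2},
    "B": {"cell_1": 0.2, "cell_2": 0.6, "cell_3": 0.2},
    "C": {"cell_2": 0.2, "cell_3": 0.8},
}

path = viterbi(["cell_1", "cell_2", "cell_3"], states, start_p, trans_p, emit_p)
print(path)  # ['A', 'B', 'C']
```

Working in log space avoids numerical underflow on long traces. Transitions absent from `trans_p` get probability zero, so the decoded path can only follow edges of the assumed graph.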
Self-Selective Correlation Ship Tracking Method for Smart Ocean System
In recent years, with the development of the marine industry, the navigation
environment has become more complicated. Some artificial intelligence
technologies, such as computer vision, can recognize, track and count
sailing ships to ensure maritime security and facilitate management
for the Smart Ocean System. To address the scaling problem and the boundary effect
problem of traditional correlation filtering methods, we propose a
self-selective correlation filtering method based on box regression (BRCF). The
proposed method mainly includes: 1) a self-selective model with a negative-sample
mining method, which effectively reduces the boundary effect while strengthening
the classification ability of the classifier; 2) a bounding box
regression method combined with a key point matching method for scale
prediction, leading to fast and efficient calculation. The experimental
results show that the proposed method can effectively deal with
ship size changes and background interference. Its success rate and precision
were higher than those of Discriminative Scale Space Tracking (DSST) by over 8
percentage points on the marine traffic dataset of our laboratory. In terms of
processing speed, the proposed method is faster than DSST by nearly 22 frames
per second (FPS).
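The abstract reports success rate and precision without defining them. The sketch below shows how these two metrics are commonly computed in single-object tracking benchmarks: success as the fraction of frames whose overlap (IoU) with the ground truth exceeds a threshold, and precision as the fraction of frames whose center error stays within a pixel threshold. The 0.5 IoU and 20-pixel thresholds, the `(x, y, w, h)` box layout, the function names, and the sample boxes are all assumptions; the paper's exact evaluation protocol may differ.

```python
import math

def iou(a, b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    inter_w = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def center_error(a, b):
    """Euclidean distance between the centers of two (x, y, w, h) boxes."""
    ax, ay = a[0] + a[2] / 2, a[1] + a[3] / 2
    bx, by = b[0] + b[2] / 2, b[1] + b[3] / 2
    return math.hypot(ax - bx, ay - by)

def success_rate(predicted, truth, iou_threshold=0.5):
    """Fraction of frames whose IoU exceeds the (assumed) threshold."""
    hits = sum(1 for p, t in zip(predicted, truth) if iou(p, t) > iou_threshold)
    return hits / len(truth)

def precision(predicted, truth, max_center_error=20.0):
    """Fraction of frames whose center error is within the (assumed) threshold."""
    hits = sum(1 for p, t in zip(predicted, truth)
               if center_error(p, t) <= max_center_error)
    return hits / len(truth)

# Made-up four-frame track: the tracker drifts and then loses the target.
truth = [(0, 0, 10, 10), (10, 0, 10, 10), (20, 0, 10, 10), (30, 0, 10, 10)]
predicted = [(0, 0, 10, 10), (12, 0, 10, 10), (27, 0, 10, 10), (100, 0, 10, 10)]

print(success_rate(predicted, truth))  # 0.5
print(precision(predicted, truth))     # 0.75
```

Under these definitions, a gain of 8 percentage points means, for example, a success rate moving from 0.60 to 0.68.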
WiDEVIEW: An UltraWideBand and Vision Dataset for Deciphering Pedestrian-Vehicle Interactions
Robust and accurate tracking and localization of road users like pedestrians
and cyclists is crucial to ensure safe and effective navigation of Autonomous
Vehicles (AVs), particularly so in urban driving scenarios with complex
vehicle-pedestrian interactions. Existing datasets that are useful to
investigate vehicle-pedestrian interactions are mostly image-centric and thus
vulnerable to vision failures. In this paper, we investigate Ultra-wideband
(UWB) as an additional modality for road users' localization to enable a better
understanding of vehicle-pedestrian interactions. We present WiDEVIEW, the
first multimodal dataset that integrates LiDAR, three RGB cameras, GPS/IMU, and
UWB sensors for capturing vehicle-pedestrian interactions in an urban
autonomous driving scenario. Ground truth image annotations are provided in the
form of 2D bounding boxes and the dataset is evaluated on standard 2D object
detection and tracking algorithms. The feasibility of UWB is evaluated for
typical traffic scenarios in both line-of-sight and non-line-of-sight
conditions using LiDAR as ground truth. We establish that UWB range data has
accuracy comparable to LiDAR, with an error of 0.19 meters, and that
anchor-tag range data is reliable up to 40 meters in line-of-sight conditions. UWB
performance in non-line-of-sight conditions depends on the nature of the
obstruction (trees vs. buildings). Further, we provide a qualitative analysis
of UWB performance for scenarios susceptible to intermittent vision failures.
The dataset can be downloaded via https://github.com/unmannedlab/UWB_Dataset
Entity Recognition via Multimodal Sensor Fusion with Smart Phones
This thesis explores taking the sensors within a cell phone beyond the current state of recognition activities. Current state-of-the-art sensor recognition processes tend to focus on recognizing user activity. Using the same sensors available for user activity classification, this thesis validates the ability to gather data about entities separate from the user carrying the smart phone. The ability to sense entities, and thus to recognize and classify a multitude of items, situations, and phenomena, opens a new realm of possibilities for how devices perceive and react to their environment.