1,228 research outputs found
Understanding People Flow in Transportation Hubs
In this paper, we aim to monitor the flow of people in large public
infrastructures. We propose an unsupervised methodology to cluster people flow
patterns into the most typical and meaningful configurations. By processing 3D
images from a network of depth cameras, we build a descriptor for the flow
pattern. We define a data-irregularity measure that assesses how well each
descriptor fits a data model. This allows us to rank flow patterns from highly
distinctive (outliers) to very common ones. By discarding outliers, we obtain
more reliable key configurations (classes). Synthetic experiments show that the
proposed method is superior to standard clustering methods. We applied it in an
operational scenario for 14 days in the X-ray screening area of an
international airport. Results show that our methodology is able to
successfully summarize the representative patterns for such a long observation
period, providing relevant information for airport management. Beyond regular
flows, our method identifies a set of rare events corresponding to uncommon
activities (cleaning, special security and circulating staff).
Comment: 10 pages, 19 figures, 1 table
Active Regression with Adaptive Huber Loss
This paper addresses the scalar regression problem through a novel solution
to exactly optimize the Huber loss in a general semi-supervised setting, which
combines multi-view learning and manifold regularization. We propose a
principled algorithm to 1) avoid computationally expensive iterative schemes
while 2) adapting the Huber loss threshold in a data-driven fashion and 3)
actively balancing the use of labelled data to remove noisy or inconsistent
annotations at the training stage. In a wide experimental evaluation, dealing
with diverse applications, we assess the superiority of our paradigm which is
able to combine robustness towards noise with both strong performance and low
computational cost.
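For reference, the standard Huber loss named in the abstract above can be sketched as follows. This is only the textbook definition with a fixed threshold; the paper's data-driven threshold adaptation and its exact optimization scheme are not reproduced here, and the function name and the `delta` parameter are illustrative assumptions.

```python
def huber_loss(residual: float, delta: float = 1.0) -> float:
    """Textbook Huber loss for one residual (prediction minus target).

    Quadratic for |residual| <= delta, linear beyond it, which limits
    the influence of large (possibly noisy) residuals.
    `delta` is a fixed, hand-chosen threshold in this sketch.
    """
    magnitude = abs(residual)
    if magnitude <= delta:
        return 0.5 * residual * residual
    return delta * (magnitude - 0.5 * delta)


def mean_huber_loss(predictions, targets, delta: float = 1.0) -> float:
    """Average Huber loss over paired predictions and targets."""
    losses = [huber_loss(p - t, delta) for p, t in zip(predictions, targets)]
    return sum(losses) / len(losses)
```

A smaller `delta` makes the loss behave more like absolute error and a larger one more like squared error, which is why the choice of threshold matters for noisy labels.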
Efficient Discriminative Nonorthogonal Binary Subspace with its Application to Visual Tracking
One of the crucial problems in visual tracking is how the object is
represented. Conventional appearance-based trackers use increasingly complex
features in order to be robust. However, complex representations
typically not only require more computation for feature extraction, but also
make the state inference complicated. We show that with a careful feature
selection scheme, extremely simple yet discriminative features can be used for
robust object tracking. The central component of the proposed method is a
succinct and discriminative representation of the object using discriminative
non-orthogonal binary subspace (DNBS) which is spanned by Haar-like features.
The DNBS representation inherits the merits of the original NBS in that it
efficiently describes the object. It also incorporates the discriminative
information to distinguish foreground from background. However, the problem of
finding the DNBS bases from an over-complete dictionary is NP-hard. We propose
a greedy algorithm called discriminative optimized orthogonal matching pursuit
(D-OOMP) to solve this problem. An iterative formulation named iterative D-OOMP
is further developed to drastically reduce the redundant computation between
iterations and a hierarchical selection strategy is integrated for reducing the
search space of features. The proposed DNBS representation is applied to object
tracking through SSD-based template matching. We validate the effectiveness of
our method through extensive experiments on challenging videos with comparisons
against several state-of-the-art trackers and demonstrate its capability to
track objects in clutter and against moving backgrounds.
Comment: 15 pages
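The SSD-based template matching mentioned in the abstract above can be illustrated with a minimal brute-force sketch. It assumes grayscale images stored as nested lists of numbers and compares raw pixels; it does not implement the DNBS representation or D-OOMP, and the function names are hypothetical.

```python
def ssd(patch, template):
    """Sum of squared differences between two equally sized 2D patches."""
    return sum(
        (p - t) ** 2
        for patch_row, template_row in zip(patch, template)
        for p, t in zip(patch_row, template_row)
    )


def best_match(image, template):
    """Slide the template over the image and return (score, row, col)
    of the position with the lowest SSD score."""
    t_height, t_width = len(template), len(template[0])
    i_height, i_width = len(image), len(image[0])
    best = None
    for row in range(i_height - t_height + 1):
        for col in range(i_width - t_width + 1):
            patch = [r[col:col + t_width] for r in image[row:row + t_height]]
            score = ssd(patch, template)
            if best is None or score < best[0]:
                best = (score, row, col)
    return best
```

In a tracker, the search would normally be restricted to a window around the previous object position rather than the whole frame.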
cvpaper.challenge in 2015 - A review of CVPR2015 and DeepSurvey
The "cvpaper.challenge" is a group composed of members from AIST, Tokyo Denki
Univ. (TDU), and Univ. of Tsukuba that aims to systematically summarize papers
on computer vision, pattern recognition, and related fields. For this
particular review, we focused on reading all 602 conference papers
presented at CVPR2015, the premier annual computer vision event held in
June 2015, in order to grasp the trends in the field. Further, we propose
"DeepSurvey" as a mechanism embodying the entire process, from reading
all the papers, through generating ideas, to writing the paper.
Comment: Survey Paper
A survey on trajectory clustering analysis
This paper comprehensively surveys the development of trajectory clustering.
Considering the critical role of trajectory data mining in modern intelligent
systems for surveillance security, abnormal behavior detection, crowd behavior
analysis, and traffic control, trajectory clustering has attracted growing
attention. Existing trajectory clustering methods can be grouped into three
categories: unsupervised, supervised and semi-supervised algorithms. In spite
of achieving a certain level of development, trajectory clustering is limited
in its success by complex conditions such as application scenarios and data
dimensions. This paper provides a holistic understanding and deep insight into
trajectory clustering, and presents a comprehensive analysis of representative
methods and promising future directions.
Non-Volume Preserving-based Feature Fusion Approach to Group-Level Expression Recognition on Crowd Videos
Group-level emotion recognition (ER) is a growing research area, as the
demand for assessing crowds of all sizes is of increasing interest in both the
security arena and social media. This work extends the earlier ER
investigations, which focused on either group-level ER on single images or
within a video, by fully investigating group-level expression recognition on
crowd videos. In this paper, we propose an effective deep feature level fusion
mechanism to model the spatial-temporal information in the crowd videos. In our
approach, the fusion is performed in the deep feature domain by a
generative probabilistic model, Non-Volume Preserving Fusion (NVPF), that
models spatial information relationships. Furthermore, we extend the proposed
spatial NVPF approach to a spatial-temporal NVPF (TNVPF) approach to learn the temporal
information between frames. In order to demonstrate the robustness and
effectiveness of each component in the proposed approach, three experiments
were conducted: (i) evaluation on the AffectNet database to benchmark the proposed
EmoNet for recognizing facial expressions; (ii) evaluation on EmotiW2018 to
benchmark the proposed deep feature level fusion mechanism NVPF; and (iii)
evaluation of the proposed TNVPF on an innovative Group-level Emotion on Crowd Videos
(GECV) dataset composed of 627 videos collected from publicly available
sources. The GECV dataset is a collection of videos containing crowds of people.
Each video is labeled with emotion categories at three levels: individual
faces, group of people, and the entire video frame.
Comment: Under review at Pattern Recognition
Modeling of Facial Aging and Kinship: A Survey
Computational facial models that capture properties of facial cues related to
aging and kinship increasingly attract the attention of the research community,
enabling the development of reliable methods for age progression, age
estimation, age-invariant facial characterization, and kinship verification
from visual data. In this paper, we review recent advances in modeling of
facial aging and kinship. In particular, we provide an up-to-date, complete
list of available annotated datasets and an in-depth analysis of geometric,
hand-crafted, and learned facial representations that are used for facial aging
and kinship characterization. Moreover, evaluation protocols and metrics are
reviewed and notable experimental results for each surveyed task are analyzed.
This survey allows us to identify challenges and discuss future research
directions for the development of robust facial models in real-world
conditions.
Comparative study of motion detection methods for video surveillance systems
The objective of this study is to compare several change detection methods
for a single static camera and identify the best method for different complex
environments and backgrounds in indoor and outdoor scenes. To this end, we used
the CDnet video dataset as a benchmark that consists of many challenging
problems, ranging from basic simple scenes to complex scenes affected by bad
weather and dynamic backgrounds. Twelve change detection methods, ranging from
simple temporal differencing to more sophisticated methods, were tested and
several performance metrics were used to precisely evaluate the results.
Because most of the considered methods have not previously been evaluated on
this recent large-scale dataset, this work fills a gap in the literature and
complements previous comparative evaluations. Our experimental results
show that there is no perfect method for all challenging cases: each method
performs well in certain cases and fails in others. However, this study enables
the user to identify the most suitable method for his or her needs.
Comment: 69 pages, 18 figures, journal paper
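As an illustration of the simplest method family named in the abstract above, temporal differencing can be sketched in a few lines. This assumes grayscale frames stored as nested lists of intensities; the threshold value and the function name are hypothetical, and this is not one of the twelve evaluated implementations.

```python
def temporal_difference_mask(prev_frame, curr_frame, threshold=25):
    """Mark a pixel as changed (1) when its absolute intensity difference
    between two consecutive frames exceeds the threshold, else 0."""
    return [
        [1 if abs(curr - prev) > threshold else 0
         for prev, curr in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]
```

Because it compares only two frames, this approach is fast but sensitive to noise and dynamic backgrounds, which is the kind of trade-off such comparative studies examine.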
From Social to Individuals: a Parsimonious Path of Multi-level Models for Crowdsourced Preference Aggregation
In crowdsourced preference aggregation, it is often assumed that all the
annotators are subject to a common preference or social utility function which
generates their comparison behaviors in experiments. However, in reality
annotators are subject to variations due to multi-criteria, abnormal, or a
mixture of such behaviors. In this paper, we propose a parsimonious
mixed-effects model, which takes into account both the fixed effect that the
majority of annotators follows a common linear utility model, and the random
effect that some annotators might deviate significantly from the common model and
exhibit strongly personalized preferences. The key algorithm in this paper
establishes a dynamic path from the social utility to individual variations,
with different levels of sparsity on personalization. The algorithm is based on
the Linearized Bregman Iterations, which leads to easy parallel implementations
to meet the need of large-scale data analysis. In this unified framework, three
kinds of random utility models are presented, including the basic linear model
with L2 loss, the Bradley-Terry model, and the Thurstone-Mosteller model. The validity
of these multi-level models is supported by experiments with both simulated
and real-world datasets, which show that the parsimonious multi-level models
exhibit improvements in both interpretability and predictive precision compared
with traditional HodgeRank.
Comment: Accepted by IEEE Transactions on Pattern Analysis and Machine
Intelligence as a regular paper. arXiv admin note: substantial text overlap
with arXiv:1607.0340
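The Bradley-Terry model named in the abstract above has a simple closed form for a single pairwise comparison, sketched below. The scores here are plain numbers supplied by the caller; the paper's mixed-effects structure and the Linearized Bregman Iterations are not reproduced, and the function name is an assumption.

```python
import math


def bradley_terry_probability(score_i: float, score_j: float) -> float:
    """Probability that item i is preferred over item j under the
    standard Bradley-Terry model with log-scale utility scores:
    P(i > j) = exp(s_i) / (exp(s_i) + exp(s_j)),
    written here in the equivalent logistic form."""
    return 1.0 / (1.0 + math.exp(-(score_i - score_j)))
```

Equal scores give a probability of 0.5, and the probabilities for the two orderings of a pair always sum to 1.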
cvpaper.challenge in 2016: Futuristic Computer Vision through 1,600 Papers Survey
This paper presents futuristic challenges discussed in the cvpaper.challenge. In
2015 and 2016, we thoroughly studied 1,600+ papers in several
conferences/journals such as CVPR/ICCV/ECCV/NIPS/PAMI/IJCV
- …