Person Re-identification with Correspondence Structure Learning
This paper addresses the problem of handling spatial misalignments due to
camera-view changes or human-pose variations in person re-identification. We
first introduce a boosting-based approach to learn a correspondence structure
which indicates the patch-wise matching probabilities between images from a
target camera pair. The learned correspondence structure can not only capture
the spatial correspondence pattern between cameras but also handle the
viewpoint or human-pose variation in individual images. We further introduce a
global constraint-based matching process. It integrates a global matching constraint over
the learned correspondence structure to exclude cross-view misalignments during
the image patch matching process, hence achieving a more reliable matching
score between images. Experimental results on various datasets demonstrate the
effectiveness of our approach.
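The patch-wise matching idea described above can be sketched as follows. This is an illustrative toy, not the authors' implementation: the function names, the cosine patch similarity, and the toy identity structure are all assumptions.

```python
import numpy as np

def matching_score(feats_a, feats_b, C):
    """Aggregate patch similarities weighted by a learned correspondence structure.

    feats_a: (P, d) patch descriptors from camera A
    feats_b: (P, d) patch descriptors from camera B
    C:       (P, P) patch-wise matching probabilities between the two views
    """
    # Cosine similarity between every cross-view patch pair.
    a = feats_a / np.linalg.norm(feats_a, axis=1, keepdims=True)
    b = feats_b / np.linalg.norm(feats_b, axis=1, keepdims=True)
    sim = a @ b.T                      # (P, P) patch-similarity matrix
    # Weight each pairwise similarity by its correspondence probability and
    # sum, so poorly corresponding (misaligned) pairs contribute little.
    return float(np.sum(C * sim))

rng = np.random.default_rng(0)
fa, fb = rng.normal(size=(6, 8)), rng.normal(size=(6, 8))
C = np.eye(6)                          # toy structure: perfectly aligned patches
score = matching_score(fa, fb, C)
```

In the paper the structure C is learned by boosting from a target camera pair; here it is simply fixed to illustrate how it weights the patch-level similarities into one image-level score.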
Pose-Normalized Image Generation for Person Re-identification
Person Re-identification (re-id) faces two major challenges: the lack of
cross-view paired training data and learning discriminative identity-sensitive
and view-invariant features in the presence of large pose variations. In this
work, we address both problems by proposing a novel deep person image
generation model for synthesizing realistic person images conditional on the
pose. The model is based on a generative adversarial network (GAN) designed
specifically for pose normalization in re-id, thus termed pose-normalization
GAN (PN-GAN). With the synthesized images, we can learn a new type of deep
re-id feature free of the influence of pose variations. We show that this
feature is strong on its own and complementary to features learned with the
original images. Importantly, under the transfer learning setting, we show that
our model generalizes well to any new re-id dataset without the need for
collecting any training data for model fine-tuning. The model thus has the
potential to make re-id models truly scalable.
Comment: 10 pages, 5 figures
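The claim that the pose-normalized feature is complementary to the original one suggests a simple fusion at matching time. The sketch below is a hypothetical illustration (the fusion scheme and all names are assumptions, not the paper's code): concatenate the two descriptors and compare by cosine similarity.

```python
import numpy as np

def l2_normalize(x):
    return x / np.linalg.norm(x)

def fused_descriptor(feat_original, feat_pose_normalized):
    # Concatenation fusion of the feature from the original image and the
    # feature from its pose-normalized (synthesized) counterpart.
    return l2_normalize(np.concatenate([l2_normalize(feat_original),
                                        l2_normalize(feat_pose_normalized)]))

def similarity(desc_q, desc_g):
    # Cosine similarity between query and gallery descriptors.
    return float(desc_q @ desc_g)

rng = np.random.default_rng(1)
q = fused_descriptor(rng.normal(size=16), rng.normal(size=16))
g = fused_descriptor(rng.normal(size=16), rng.normal(size=16))
s = similarity(q, g)
```

The point of the fusion is that pose-sensitive cues from the original image and pose-invariant cues from the synthesized image can both contribute to the final ranking.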
Learning Correspondence Structures for Person Re-identification
This paper addresses the problem of handling spatial misalignments due to
camera-view changes or human-pose variations in person re-identification. We
first introduce a boosting-based approach to learn a correspondence structure
which indicates the patch-wise matching probabilities between images from a
target camera pair. The learned correspondence structure can not only capture
the spatial correspondence pattern between cameras but also handle the
viewpoint or human-pose variation in individual images. We further introduce a
global constraint-based matching process. It integrates a global matching
constraint over the learned correspondence structure to exclude cross-view
misalignments during the image patch matching process, hence achieving a more
reliable matching score between images. Finally, we also extend our approach by
introducing a multi-structure scheme, which learns a set of local
correspondence structures to capture the spatial correspondence sub-patterns
between a camera pair, so as to handle the spatial misalignments between
individual images in a more precise way. Experimental results on various
datasets demonstrate the effectiveness of our approach.
Comment: IEEE Trans. Image Processing, vol. 26, no. 5, pp. 2438-2453, 2017. The project page for this paper is available at http://min.sjtu.edu.cn/lwydemo/personReID.htm. arXiv admin note: text overlap with arXiv:1504.0624
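The multi-structure extension in this journal version can be sketched as a selection over a set of candidate correspondence structures, scoring an image pair under each and keeping the best. This is a schematic toy under assumed names, not the published method:

```python
import numpy as np

def pair_score(sim, C):
    """Score one image pair under one structure: correspondence-weighted
    sum of the (P, P) patch-similarity matrix `sim`."""
    return float(np.sum(C * sim))

def multi_structure_score(sim, structures):
    """Best score over a set of local correspondence structures, each
    capturing one spatial correspondence sub-pattern."""
    return max(pair_score(sim, C) for C in structures)

P = 4
sim = np.eye(P)                       # toy similarity: patches align in place
shifted = np.roll(np.eye(P), 1, axis=1)   # structure for a shifted layout
structures = [np.eye(P), shifted]
best = multi_structure_score(sim, structures)
```

Taking the maximum lets each pair be explained by whichever sub-pattern (e.g. viewpoint or pose configuration) fits it best, rather than forcing one global structure onto all pairs.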
What-and-Where to Match: Deep Spatially Multiplicative Integration Networks for Person Re-identification
Matching pedestrians across disjoint camera views, known as person
re-identification (re-id), is a challenging problem that is of importance to
visual recognition and surveillance. Most existing methods exploit local
regions through spatial manipulation to perform matching via local
correspondence. However, they essentially extract \emph{fixed} representations
from pre-divided regions of each image and then perform matching on the
extracted representations. Models in this pipeline cannot capture the finer
local patterns that are crucial for distinguishing positive pairs from
negative ones, and therefore underperform. In this paper, we
propose a novel deep multiplicative integration gating function, which answers
the question of \emph{what-and-where to match} for effective person re-id. To
address \emph{what} to match, our deep network emphasizes common local patterns
by learning joint representations in a multiplicative way. The network
comprises two Convolutional Neural Networks (CNNs) to extract convolutional
activations, and generates relevant descriptors for pedestrian matching. This
leads to flexible representations for pair-wise images. To address
\emph{where} to match, we combat the spatial misalignment by performing
spatially recurrent pooling via a four-directional recurrent neural network to
impose spatial dependency over all positions with respect to the entire image.
The proposed network is designed to be end-to-end trainable to characterize
local pairwise feature interactions in a spatially aligned manner. To
demonstrate the superiority of our method, extensive experiments are conducted
over three benchmark data sets: VIPeR, CUHK03 and Market-1501.
Comment: Published at Pattern Recognition, Elsevier
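The two core operations of the abstract — multiplicative integration ("what" to match) and four-directional spatial pooling ("where" to match) — can be illustrated with a crude numpy stand-in. The cumulative-sum sweeps below are a non-recurrent placeholder for the paper's four-directional RNN; everything here is an assumption for illustration only.

```python
import numpy as np

def multiplicative_integration(fa, fb):
    """Elementwise (Hadamard) product of two (H, W, C) activation maps:
    patterns are emphasized only where BOTH inputs respond."""
    return fa * fb

def four_directional_pool(x):
    """Aggregate spatial context with top/bottom/left/right sweeps, a crude
    stand-in for spatially recurrent pooling over the whole image."""
    sweeps = [np.cumsum(x, axis=0),                      # top-to-bottom
              np.cumsum(x[::-1], axis=0)[::-1],          # bottom-to-top
              np.cumsum(x, axis=1),                      # left-to-right
              np.cumsum(x[:, ::-1], axis=1)[:, ::-1]]    # right-to-left
    return sum(sweeps) / 4.0

rng = np.random.default_rng(2)
fa, fb = rng.normal(size=(4, 3, 2)), rng.normal(size=(4, 3, 2))
joint = multiplicative_integration(fa, fb)   # joint "what to match" map
pooled = four_directional_pool(joint)        # spatially contextual "where" map
```

The multiplicative form suppresses activations present in only one image of the pair, which is why it highlights common local patterns shared by a positive pair.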
Deployable Vibration Control Systems for Lightweight Structures
The recent push towards lightweight, efficient, and innovative structural designs has brought forth a range of vibration control issues related to implementation, effectiveness, and control system design that are not fully addressed by existing strategies. In many cases, these structures are capable of withstanding day-to-day loads and only experience excessive vibrations during predictable peak-loading events such as large crowds or wind storms. At the same time, the use of lightweight material coupled with innovative construction methods has given rise to temporary structures which are designed to facilitate rapid implementation and intended for short-term applications. Both scenarios point towards a vibration control system that is suitable for immediate, short-term applications which motivates the concept of deployable autonomous control systems (DACSs).
The deployability aspect implies the control system is capable of being readily implemented on a range of structures with only minor customization to the structure or device while the autonomy aspect refers to the ability of the system to react to changes in the dynamic response and effectively control different structural modes of vibration. A prototype device, consisting of an electromagnetic mass damper (EMD) mounted on an unmanned ground vehicle (UGV) equipped with vision sensors and on-board computational hardware, is developed to study the vibration control performance and demonstrate the advantages of the DACS concept. Both numerical and experimental modelling techniques are used to identify system models for each component of the prototype device. Given the system models, the dynamic interaction between the device and underlying structure is derived theoretically and validated experimentally.
The use of an EMD and UGV introduces a number of practical challenges associated with controller design. These challenges arise due to the presence of physical operating constraints as well as uncertainty in the controller model. Three different candidate controllers, based on linear-quadratic Gaussian (LQG), model-predictive control (MPC), and robust H-infinity control theory, are formulated for the prototype device and comparatively assessed with respect to their ability
to address these challenges. The MPC framework provides a systematic approach to incorporate physical operating constraints directly in the control formulation while robust synthesis of an H-infinity controller is well suited for addressing uncertainty in both the controller and structure models.
A key property of the prototype device is the ability to reposition itself at different locations on the structure. To study the impact of this mobility on the overall control performance, a simultaneous localization and mapping (SLAM) solution is implemented for bridge structures. The SLAM solution generates a map of the structure that can later be used for autonomous navigation of the prototype device. In achieving autonomous mobility, the location of the control force can be added as an additional parameter in the controller formulation.
The overall performance of the prototype device is evaluated through a combination of numerical simulations and experimental studies. Real-time hybrid simulation (RTHS) is used extensively to study the dynamic interaction effects and evaluate the control performance of the prototype device on various structures. A full-scale modular aluminum pedestrian bridge is used to demonstrate autonomous navigation and assess the advantages of a mobile control device. The results from each study point towards DACSs as being a favourable alternative to existing control systems for immediate, short-term vibration control applications.
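As a rough illustration of the controller-design space discussed above, the following sketch computes a discrete-time LQR state-feedback gain (the deterministic core of an LQG design) for a single-degree-of-freedom structure driven by a control force. All parameter values, the forward-Euler discretization, and the fixed-point Riccati iteration are illustrative assumptions, not taken from the thesis.

```python
import numpy as np

m, c, k = 100.0, 5.0, 4000.0         # mass [kg], damping [N s/m], stiffness [N/m]
dt = 0.001                           # discretization step [s]

# Forward-Euler discretization of  m*x'' + c*x' + k*x = u,  state [x, x'].
A = np.array([[1.0, dt],
              [-k / m * dt, 1.0 - c / m * dt]])
B = np.array([[0.0], [dt / m]])
Q = np.diag([1e4, 1.0])              # penalize displacement heavily
R = np.array([[1e-4]])               # relatively cheap control effort

# Solve the discrete algebraic Riccati equation by value iteration.
P = Q.copy()
for _ in range(20000):
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P_next = Q + A.T @ P @ (A - B @ K)
    if np.max(np.abs(P_next - P)) < 1e-6 * np.max(np.abs(P_next)):
        P = P_next
        break
    P = P_next
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)   # optimal gain, u = -K x

# A closed-loop spectral radius below 1 means the controlled structure
# settles faster than the lightly damped open-loop system.
rho = max(abs(np.linalg.eigvals(A - B @ K)))
```

The MPC and H-infinity candidates mentioned above differ precisely where this sketch is silent: MPC would impose the device's physical operating constraints (e.g. stroke and force limits) at each step, and H-infinity synthesis would guard against the model uncertainty this nominal design ignores.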
Target eccentricity effects for defensive responses
Defensive actions involving goal-directed responses to visual stimuli presented in different parts of the viewing field commonly include movements either toward (TOWARD) or away from (AWAY) the actual stimulus. One can categorize the type of defensive movements by outcome or the level of stimulus-response (S-R) compatibility, where a congruent response corresponds to a response in the TOWARD condition and an incongruent response corresponds to a response in the AWAY condition. In an effort to better understand defensive responses, which have received less attention in the literature than offensive movements despite their importance in combative situations, we studied the responses of quick yaw head rotations in the TOWARD and AWAY conditions to visual stimuli presented in different parts of the viewing field. In the first experiment (chapter 2) we examined the test-retest reliability of the primary and secondary measures associated with the quick yaw head rotations. After achieving an acceptable level of reliability for most measures, we investigated the effects of S-R compatibility and target eccentricity on the primary measures of reaction time of head rotation (RT) and activity of the sternocleidomastoid muscles of the neck (premotor RT) and the secondary measures of movement time, peak velocity, head excursion and the electromechanical delay for yaw head rotations (chapter 3). We found an increase in RT and premotor RT for yaw head rotations with large increases in visual field target eccentricity and involving incongruent responses observed in the AWAY condition. In chapter 4 we examined the effects of practice in the TOWARD or AWAY condition on performances in both conditions. We observed a shorter RT and premotor RT after 6 days of practice (over 2 weeks), regardless of condition practiced or of performance.
Most subjects who practiced in the TOWARD condition produced greater decreases in RT and premotor RT for the TOWARD condition, and most subjects who practiced in the AWAY condition produced greater decreases in RT and premotor RT for the AWAY condition. These data also suggest that faster reactions to stimuli in the central visual field occur with practice. These results suggest that reactions will be slowest when responding to objects in the far peripheral visual field and when trying to avoid object contact. RT and premotor RT at each eccentricity and for each condition can improve with practice. The present results also suggest small but potential added benefits of condition-specific training. The parallel findings for RT and premotor RT suggest that the outcomes observed for quick yaw head rotation RTs were primarily due to changes in neural processing time.
3D objects and scenes classification, recognition, segmentation, and reconstruction using 3D point cloud data: A review
Three-dimensional (3D) point cloud analysis has become one of the most attractive
subjects in realistic imaging and machine vision due to its simplicity,
flexibility, and powerful capacity for visualization. Indeed, the
representation of scenes and buildings using 3D shapes and formats has enabled
many applications, including autonomous driving and the reconstruction of
scenes and objects. Nevertheless, working with this emerging type of data
remains challenging for object representation, scene recognition,
segmentation, and reconstruction. In this regard, significant effort has
recently been devoted to developing novel strategies based on
techniques such as deep learning models. To that end, we present in this paper
a comprehensive review of existing tasks on 3D point clouds: a well-defined
taxonomy of existing techniques is constructed based on the nature of the
adopted algorithms, application scenarios, and main objectives. Various tasks
performed on 3D point cloud data are investigated, including object and scene
detection, recognition, segmentation, and reconstruction. In addition, we
introduce a list of the datasets used, discuss the respective evaluation
metrics, and compare the performance of existing solutions to better
characterize the state of the art and identify their limitations and
strengths. Lastly, we elaborate on the current challenges facing this
technology and on future trends attracting considerable interest, which could
be a starting point for upcoming research studies.
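A property central to many of the deep point-cloud classifiers such a review covers is permutation invariance: a point cloud is an unordered set, so the global descriptor must not depend on point order. The toy below illustrates the standard recipe (a shared per-point transform followed by a symmetric max-pool, in the style of PointNet-like models); the random weights stand in for a learned per-point MLP and are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
W = rng.normal(size=(3, 16))          # shared per-point linear map: R^3 -> R^16

def global_feature(points):
    """points: (N, 3) xyz coordinates -> (16,) order-invariant descriptor."""
    per_point = np.maximum(points @ W, 0.0)   # shared ReLU transform per point
    return per_point.max(axis=0)              # symmetric max-pool over the set

cloud = rng.normal(size=(100, 3))
shuffled = cloud[rng.permutation(100)]        # same points, different order
```

Because max-pooling is symmetric in its inputs, `global_feature(cloud)` and `global_feature(shuffled)` are identical, which is what lets such models consume raw, unordered point sets directly.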