
    Human re-identification using siamese convolutional neural network on Nvidia Geforce RTX 2060

    Human re-identification in multiple cameras with disjoint views is the task of matching a pair of humans appearing in different cameras with non-overlapping views. Human re-identification has been studied extensively in recent years because it plays a significant role in many applications, such as human tracking and video retrieval. However, it is a challenging task due to variations in color, pose, viewpoint, lighting conditions, low resolution, and partial occlusion. Most existing methods for human re-identification rely on various hand-crafted features and metric learning. However, hand-crafted feature methods require expert knowledge and considerable time to tune the features, and metric learning methods are not powerful enough to exploit the nonlinear relationships among samples. The main objective of this thesis is to implement a Siamese Convolutional Neural Network (SCNN) for person re-identification across multiple cameras on the NVIDIA® GeForce RTX™ 2060 platform, including person detection. This continues with validating the applicability of the SCNN and comparing it with existing techniques. In this work, global and local features of human images are extracted with the SCNN. The proposed SCNN consists of two identical Convolutional Neural Networks with shared parameters that automatically learn hierarchical feature representations directly from image pixels, which has advantages over hand-crafted design and metric learning methods. Experiments were conducted on the CUHK02 offline database with non-overlapping cameras. The proposed technique demonstrated person re-identification using an SCNN on the NVIDIA® GeForce RTX™ 2060 platform.
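    The abstract does not give the network or loss details; as a minimal sketch of the Siamese idea it describes (two branches sharing the same parameters, compared by embedding distance), the example below stands in a single shared linear layer for the convolutional branches and uses a standard contrastive loss. All names and sizes here are illustrative assumptions, not the thesis's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the convolutional branches: one shared
# linear layer. Both inputs pass through the SAME weights, which is
# the defining property of a Siamese architecture.
W = rng.standard_normal((8, 32 * 32)) * 0.01  # shared parameters

def embed(image):
    """Map a flattened 32x32 image to an 8-d embedding."""
    return W @ image.ravel()

def contrastive_loss(a, b, same_person, margin=1.0):
    """Contrastive loss: pull matching pairs together, push
    non-matching pairs at least `margin` apart in embedding space."""
    d = np.linalg.norm(embed(a) - embed(b))
    if same_person:
        return d ** 2
    return max(0.0, margin - d) ** 2

img_a = rng.random((32, 32))
img_b = rng.random((32, 32))

# An image paired with itself has zero embedding distance, so the
# "same person" loss is exactly zero.
print(contrastive_loss(img_a, img_a, same_person=True))   # 0.0
print(contrastive_loss(img_a, img_b, same_person=False))  # >= 0.0
```

    At test time, re-identification with such a model reduces to ranking gallery images by their embedding distance to the query image.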

    Crossing Generative Adversarial Networks for Cross-View Person Re-identification

    Person re-identification (re-id) refers to matching pedestrians across disjoint, non-overlapping camera views. The most effective way to match pedestrians undergoing significant visual variation is to seek reliably invariant features that describe the person of interest faithfully. Most existing methods operate in a supervised manner, producing discriminative features by relying on labeled image pairs in correspondence. However, annotating pair-wise images is prohibitively labor-intensive, and thus impractical for large-scale camera networks. Moreover, seeking comparable representations across camera views demands a flexible model that can address the complex distributions of images. In this work, we study the co-occurrence statistical patterns between pairs of images and propose a crossing Generative Adversarial Network (Cross-GAN) for learning a joint distribution of cross-image representations in an unsupervised manner. Given a pair of person images, the proposed model consists of a variational auto-encoder that encodes the pair into respective latent variables, a proposed cross-view alignment to reduce the view disparity, and an adversarial layer to seek the joint distribution of the latent representations. The learned latent representations are well aligned to reflect the co-occurrence patterns of paired images. We empirically evaluate the proposed model on challenging datasets, and our results show the importance of joint invariant features in improving the matching rates of person re-id, in comparison with semi-supervised and unsupervised state-of-the-art methods.
    Comment: 12 pages. arXiv admin note: text overlap with arXiv:1702.03431 by other author
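    The three components the abstract names (per-view encoders, a cross-view alignment term, and an adversarial layer) can be sketched minimally as follows. This is an illustrative assumption of the structure only: the paper's encoders are variational auto-encoders, while here each is a plain linear map, and the adversarial layer is reduced to a single logistic critic.

```python
import numpy as np

rng = np.random.default_rng(1)

DIM, LATENT = 64, 16

# Hypothetical per-view encoders (linear stand-ins for the VAEs).
E1 = rng.standard_normal((LATENT, DIM)) * 0.1  # encoder, camera view 1
E2 = rng.standard_normal((LATENT, DIM)) * 0.1  # encoder, camera view 2

def align_loss(x1, x2):
    """Cross-view alignment term: penalize the disparity between the
    latent codes of the same person seen from two camera views."""
    z1, z2 = E1 @ x1, E2 @ x2
    return float(np.mean((z1 - z2) ** 2))

def discriminator(z, w):
    """Adversarial-layer sketch: a logistic critic scoring whether a
    latent code looks drawn from the joint latent distribution."""
    return 1.0 / (1.0 + np.exp(-w @ z))

w = rng.standard_normal(LATENT) * 0.1
x1, x2 = rng.random(DIM), rng.random(DIM)

print(align_loss(x1, x2))          # non-negative alignment penalty
print(discriminator(E1 @ x1, w))   # critic score in (0, 1)
```

    Training would alternate between minimizing the alignment term with respect to the encoders and playing the usual adversarial min-max game against the critic; both steps are omitted here for brevity.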