Learning Modal-Invariant and Temporal-Memory for Video-based Visible-Infrared Person Re-Identification
Thanks to cross-modal retrieval techniques, visible-infrared (RGB-IR)
person re-identification (Re-ID) can be achieved by projecting the two
modalities into a common space, enabling person Re-ID in 24-hour
surveillance systems. However, in terms of probe-to-gallery matching,
almost all existing RGB-IR cross-modal person Re-ID methods focus on
image-to-image matching, while video-to-video matching, which contains
much richer spatial and temporal information, remains under-explored.
In this paper, we primarily study the video-based cross-modal
person Re-ID method. To achieve this task, a video-based RGB-IR dataset is
constructed, in which 927 valid identities with 463,259 frames and 21,863
tracklets captured by 12 RGB/IR cameras are collected. Based on our constructed
dataset, we prove that performance improves as the number of frames in
a tracklet increases, demonstrating the significance of video-to-video
matching in RGB-IR person Re-ID. Additionally, a novel method is
proposed that not only projects the two modalities into a
modal-invariant subspace, but also extracts temporal memory for motion
invariance. Thanks to these two strategies, much better results are
achieved on our video-based cross-modal person Re-ID task. The code and
dataset are released at:
https://github.com/VCMproject233/MITML
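The finding above, that matching improves as tracklets grow longer, can be illustrated with a generic tracklet-level matching baseline: per-frame embeddings (assumed already projected into the shared RGB-IR space) are average-pooled over time and compared by cosine similarity. This is a minimal sketch of video-to-video matching, not the paper's MITML method; the function names and the pooling choice are illustrative assumptions.

```python
import numpy as np

def tracklet_embedding(frame_feats):
    """Aggregate per-frame features of shape (T, D) into one tracklet
    descriptor by temporal average pooling, then L2-normalize.
    A generic baseline, not the paper's temporal-memory module."""
    v = np.asarray(frame_feats, dtype=np.float64).mean(axis=0)
    return v / (np.linalg.norm(v) + 1e-12)

def rank_gallery(probe_feats, gallery_tracklets):
    """Rank gallery tracklets (e.g. IR) against a probe tracklet
    (e.g. RGB) by cosine similarity in the shared embedding space.
    Returns gallery indices, best match first."""
    p = tracklet_embedding(probe_feats)
    sims = [float(p @ tracklet_embedding(g)) for g in gallery_tracklets]
    return sorted(range(len(sims)), key=lambda k: -sims[k])
```

With more frames per tracklet, the pooled descriptor averages out per-frame noise, which is one intuition for why longer tracklets help.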
Automatic Synchronization of Multi-User Photo Galleries
In this paper we address the problem of photo gallery synchronization,
where pictures related to the same event are collected by different
users. Existing solutions are usually based on unrealistic assumptions,
such as time consistency across photo galleries, and often rely heavily
on heuristics, therefore limiting their applicability to real-world
scenarios. We
propose a solution that achieves better generalization performance for the
synchronization task compared to the available literature. The method is
characterized by three stages: first, deep convolutional neural network
features are used to assess the visual similarity among the photos;
then, pairs of similar photos are detected across different galleries
and used to construct a graph; finally, a probabilistic graphical model
is used to estimate the temporal offset of each pair of galleries by
traversing the minimum spanning tree extracted from this graph. The
experimental evaluation is conducted on
four publicly available datasets covering different types of events,
demonstrating the strength of our proposed method. A thorough
discussion of the obtained results is provided for a critical
assessment of the synchronization quality.

Comment: Accepted to IEEE Transactions on Multimedia
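The final stage, propagating pairwise offset estimates along a spanning tree, can be sketched as follows. This is an assumed simplification of the abstract's pipeline: pairwise offsets and their uncertainties are taken as given (in the paper they come from the probabilistic graphical model), a minimum-uncertainty spanning tree is built with Kruskal's algorithm, and per-gallery offsets are accumulated by traversal from a reference gallery. The function and parameter names are hypothetical.

```python
from collections import defaultdict, deque

def sync_galleries(pair_offsets, root=0):
    """Estimate one clock offset per gallery from pairwise estimates.

    pair_offsets: dict mapping (i, j) -> (offset, uncertainty), where
    offset is the estimated shift of gallery j relative to gallery i.
    A spanning tree of minimum total uncertainty is extracted, then
    offsets are accumulated by BFS from the reference gallery `root`.
    """
    # Kruskal's algorithm: take edges in order of increasing uncertainty
    edges = sorted(pair_offsets.items(), key=lambda kv: kv[1][1])
    parent = {}

    def find(x):
        # Union-find root lookup with path compression
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = defaultdict(list)
    for (i, j), (off, _unc) in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # edge joins two components: keep it in the tree
            parent[ri] = rj
            tree[i].append((j, off))
            tree[j].append((i, -off))  # reverse edge negates the offset

    # Traverse the tree from the reference gallery, summing offsets
    offsets = {root: 0.0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v, off in tree[u]:
            if v not in offsets:
                offsets[v] = offsets[u] + off
                queue.append(v)
    return offsets
```

Traversing the tree rather than the full graph means each gallery's offset is derived from the most reliable chain of pairwise estimates, discarding noisy redundant edges such as the high-uncertainty (0, 2) link in the example below.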