Abstract. Visual identification of an individual in a crowded environ-ment observed by a distributed camera network is critical to a variety of tasks including commercial space management, border control, and crime prevention. Automatic re-identification of a human from public space CCTV video is challenging due to spatiotemporal visual feature varia-tions and strong visual similarity in people’s appearance, compounded by low-resolution and poor quality video data. Relying on re-identification using a probe image is limiting, as a linguistic description of an individ-ual’s profile may often be the only available cues. In this work, we show how mid-level semantic attributes can be used synergistically with low-level features for both identification and re-identification. Specifically, we learn an attribute-centric representation to describe people, and a met-ric for comparing attribute profiles to disambiguate individuals. This differs from existing approaches to re-identification which rely purely on bottom-up statistics of low-level features: it allows improved robustness to view and lighting; and can be used for identification as well as re-identification. Experiments demonstrate the flexibility and effectiveness of our approach compared to existing feature representations when ap-plied to benchmark datasets.
Is data on this page outdated, violates copyrights or anything else? Report the problem now and we will take corresponding actions after reviewing your request.