54 research outputs found

    A Survey on Soft Biometrics for Human Identification

    The focus has shifted to multi-biometrics because of security demands. Ancillary information extracted from primary biometric (face and body) traits, such as facial measurements, gender, skin color, ethnicity, and height, is called soft biometrics. It can be integrated to improve the speed and overall performance of a primary biometric system (e.g., by fusing the face with facial marks), or used in a fusion framework to generate a qualitative, human-interpretable semantic description of a person (e.g., old African male with blue eyes) and to limit the search over the whole dataset by gender and ethnicity. This chapter provides a holistic survey of soft biometrics that presents the major works, with a focus on facial soft biometrics, and discusses some of the proposed feature extraction and classification techniques along with their strengths and limitations.
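The search-limiting idea in this abstract can be illustrated with a minimal sketch: filter the gallery by soft traits first, then rank only the survivors with the primary (face) matcher. Everything here is an assumption made for illustration, including the `GalleryEntry` record, the trait labels, the toy 3-D face vectors, and the use of cosine similarity; the survey does not prescribe this particular scheme.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class GalleryEntry:
    """Hypothetical gallery record: soft traits plus a primary face feature."""
    person_id: str
    gender: str         # soft biometric label (illustrative values)
    ethnicity: str      # soft biometric label (illustrative values)
    face_vector: tuple  # stand-in for a face embedding


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(gallery, probe_vector, **soft_traits):
    """Keep entries matching every given soft trait, then rank them by face similarity."""
    candidates = [
        entry for entry in gallery
        if all(getattr(entry, name) == value for name, value in soft_traits.items())
    ]
    ranked = sorted(
        candidates,
        key=lambda entry: cosine_similarity(entry.face_vector, probe_vector),
        reverse=True,
    )
    return [entry.person_id for entry in ranked]


gallery = [
    GalleryEntry("p1", "male", "A", (0.9, 0.1, 0.0)),
    GalleryEntry("p2", "female", "A", (0.8, 0.2, 0.1)),
    GalleryEntry("p3", "male", "B", (0.1, 0.9, 0.2)),
    GalleryEntry("p4", "male", "A", (0.2, 0.8, 0.1)),
]

# Only two of the four entries reach the (more expensive) face comparison.
result = search(gallery, (1.0, 0.0, 0.0), gender="male", ethnicity="A")
unfiltered = search(gallery, (1.0, 0.0, 0.0))
```

Hard filtering like this trades recall for speed: a wrong soft label removes the true match entirely, which is one reason fusion schemes often weight soft traits instead of using them as a strict gate.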

    Soft Biometric Retrieval to Describe and Identify Surveillance Images


    Deep Learning Architectures for Heterogeneous Face Recognition

    Face recognition has been one of the most challenging areas of research in biometrics and computer vision. Many face recognition algorithms are designed to address illumination and pose problems for visible face images. In recent years, there has been a significant amount of research in heterogeneous face recognition (HFR). The large modality gap between faces captured in different spectra, as well as the lack of training data, makes HFR a challenging problem. In this work, we present different deep learning frameworks to address the problem of matching non-visible face photos against a gallery of visible faces. Algorithms for thermal-to-visible face recognition can be categorized as cross-spectrum feature-based methods or cross-spectrum image synthesis methods. In cross-spectrum feature-based face recognition, a thermal probe is matched in a feature subspace against a gallery of visible faces, which corresponds to the real-world scenario. The second category synthesizes a visible-like image from a thermal image, which can then be used by any commercial visible-spectrum face recognition system. Because the synthesized image can be used directly by existing systems that operate only on visible face imagery, this approach can leverage existing commercial-off-the-shelf (COTS) and government-off-the-shelf (GOTS) solutions. In addition, the synthesized images can be used by human examiners for different purposes. Some informative traits, such as age, gender, ethnicity, race, and hair color, are not distinctive enough for recognition on their own but can still complement primary information such as face and fingerprint. These traits, known as soft biometrics, can improve recognition algorithms while being much cheaper and faster to acquire.
They can be used directly in a unimodal system for some applications. Usually, soft biometric traits are used jointly with hard biometrics (face photos) under the assumption that they are available during both the training and testing phases. Our approaches look at this problem differently. We consider the case in which soft biometric information is not available during the testing phase, and our method predicts it directly in a multi-task paradigm. There are situations in which training data come with additional information that can be modeled as an auxiliary view of the data but is not available during testing. This is the learning using privileged information (LUPI) scenario. We introduce a novel framework based on deep learning techniques that leverages the auxiliary view to improve the performance of the recognition system. The formulation is general in the sense that it can be used with any visual classifier. Every use of auxiliary information has been validated extensively on publicly available benchmark datasets, and several new state-of-the-art accuracy values have been set. Examples of application domains include visual object recognition from RGB images and from depth data, handwritten digit recognition, and gesture recognition from video. We also design a novel aggregation framework that optimizes the landmark locations directly using only one image, without requiring any extra prior, which leads to robust alignment under arbitrary face deformations. Three different approaches are employed to generate the manipulated faces, and two of them perform the manipulation via adversarial attacks to fool a face recognizer. This step can be decoupled from our framework and potentially used to enhance other landmark detectors. Aggregating the manipulated faces in the different branches of the proposed method leads to robust landmark detection.
Finally, we focus on generative adversarial networks, which are a very powerful tool for synthesizing visible-like images from non-visible images. The main goal of a generative model is to approximate the true data distribution, which is not known. In general, choosing how to model the density function is challenging. Explicit models have the advantage of explicitly calculating the probability densities. There are two well-known implicit approaches, namely the Generative Adversarial Network (GAN) and the Variational AutoEncoder (VAE), which try to model the data distribution implicitly. VAEs try to maximize a lower bound on the data likelihood, while a GAN plays a minimax game between two players during its optimization. GANs overlook the explicit data density characteristics, which leads to undesirable quantitative evaluations and mode collapse. Mode collapse causes the generator to create similar-looking images with poor sample diversity. In the last chapter of the thesis, we focus on addressing this issue in the GAN framework.
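For reference, the two objectives this abstract alludes to are usually written as follows. These are the standard textbook forms; the notation (generator G, discriminator D, encoder q_phi, decoder p_theta, prior p(z)) is assumed here and is not necessarily the notation used in the thesis.

```latex
% GAN: minimax game between the generator G and the discriminator D
\min_{G} \max_{D} \; V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)} \big[ \log D(x) \big]
  + \mathbb{E}_{z \sim p_{z}(z)} \big[ \log \big( 1 - D(G(z)) \big) \big]

% VAE: evidence lower bound (ELBO) on the data log-likelihood
\log p_{\theta}(x) \;\ge\;
  \mathbb{E}_{q_{\phi}(z \mid x)} \big[ \log p_{\theta}(x \mid z) \big]
  - D_{\mathrm{KL}} \big( q_{\phi}(z \mid x) \,\|\, p(z) \big)
```

The GAN objective never evaluates a density for the generated samples, which is consistent with the abstract's remark that GANs overlook explicit density characteristics.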

    Hierarchical Trajectory Matching Method for Wide-Area Multi-Pedestrian Tracking

    Doctoral dissertation, Seoul National University Graduate School, College of Engineering, Department of Electrical and Computer Engineering, August 2020. Choi Jin-young. The goal of wide-area tracking is to track pedestrians that appear on overlapping or non-overlapping cameras, regardless of the time interval or person density. In single-camera tracking, data association based on the overlap of detection boxes is used to solve the tracking problem, but it still suffers from appearance ambiguity. Wide-area tracking, however, requires a tracking scheme that focuses on the appearance similarity of people without relying on the overlap of detection boxes. In this dissertation, we propose a tracking scheme for Wide-area Multi-Pedestrian Tracking (WaMuPeT). To achieve WaMuPeT, we propose trajectory matching in overlapping camera settings (Ch. 3), trajectory matching in non-overlapping camera settings (Ch. 4), and robust trajectory matching in dense scene settings (Ch. 5). For trajectory matching in overlapping camera settings (Ch. 3), we propose a novel deep-learning architecture for accurate 3-D localization and tracking of a pedestrian using multiple cameras. The architecture is composed of two networks: a detection network and a localization network. The detection network yields the pedestrian detections, and the localization network estimates the ground position of a pedestrian within its detection box. In addition, an attentional pass filter is introduced to connect the two networks effectively. Using the detection proposals and their 2-D ground positions obtained from the two networks, a multi-camera multi-target 3-D localization and tracking algorithm is developed using a min-cost network flow approach. The experiments show that the proposed method improves the performance of 3-D localization and tracking. For trajectory matching in non-overlapping camera settings (Ch. 4), we propose a novel re-ranking method that uses a ranking-reflected metric to measure the similarity between two ordered sets of K-nearest neighbors (OKNN).
The proposed metric, ranking-reflected similarity (RSS), reflects the ranking of the elements shared by the two OKNNs. Using RSS, a re-ranking procedure is proposed that prioritizes gallery entries whose neighbors are similar to the probe's neighbors in terms of ranking order. The experiments show that the proposed method improves Re-ID accuracy when added on to state-of-the-art methods. For robust trajectory matching in dense scene settings (Ch. 5), we propose a novel multi-pedestrian tracking framework that generates robust trajectories in dense scenes. The proposed tracking method is based on trajectory matching with a divide-and-conquer strategy, in which short-term, mid-term, and long-term trajectories are generated by successive trajectory merging stages. We also propose a novel deep-feature matching method called stable boundary selection (SBS). In SBS matching, the detections are clustered by the group similarity of their deep features so that robust trajectories can be generated. Together with the smoothing algorithms and the detection restoration algorithm, the proposed tracking method achieves state-of-the-art tracking accuracy on three public tracking datasets.
Contents:
Chapter 1 Introduction: 1.1 Background; 1.2 Related Works (1.2.1 Localization of Pedestrian Detection; 1.2.2 Pedestrian Feature from Person Re-identification; 1.2.3 Multi-Pedestrian Tracking); 1.3 Contributions; 1.4 Thesis Organization
Chapter 2 Problem Statements: 2.1 Trajectory Matching in Overlapping Camera Settings (2.1.1 Challenges; 2.1.2 Approach for the challenges); 2.2 Trajectory Matching in Non-Overlapping Camera Settings (2.2.1 Challenges; 2.2.2 Approach for the challenges); 2.3 Robust Trajectory Matching in Dense Scene Settings (2.3.1 Challenges; 2.3.2 Approach for the challenges)
Chapter 3 Trajectory Matching in Overlapping Camera Settings: 3.1 Overall Scheme; 3.2 Network Design; 3.3 MCMTT with Proposed Network
Chapter 4 Trajectory Matching in Non-overlapping Camera Settings: 4.1 Overall Scheme; 4.2 Proposed Method (4.2.1 Proposed Similarity Metric; 4.2.2 Selection of A; 4.2.3 Re-ranking Procedure)
Chapter 5 Robust Trajectory Matching in Dense Scene Settings: 5.1 Overall Scheme; 5.2 Similarity Matrix Generation; 5.3 Stable Boundary Selection; 5.4 Trajectory Smoothing; 5.5 Detection Restoration; 5.6 Trajectory Merging Process
Chapter 6 Experiments: 6.1 Dataset and Evaluation Metric (6.1.1 Trajectory Matching in Overlapping Camera Settings; 6.1.2 Trajectory Matching in Non-overlapping Camera Settings; 6.1.3 Robust Trajectory Matching in Dense Scene Settings); 6.2 Results and Discussion (6.2.1 Trajectory Matching in Overlapping Camera Settings; 6.2.2 Trajectory Matching in Non-overlapping Camera Settings; 6.2.3 Robust Trajectory Matching in Dense Scene Settings)
Chapter 7 Conclusions and Future Works: 7.1 Concluding Remarks; 7.2 Future Works
Abstract
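To make the idea of comparing two ordered K-nearest-neighbor lists concrete, here is a minimal sketch of a rank-aware overlap score. It is NOT the dissertation's RSS metric, whose exact definition is not given in this abstract. The function name `rank_weighted_overlap` and the 1/(rank+1) weighting are assumptions chosen only to show how shared neighbors near the top of both lists can count for more than shared neighbors near the bottom.

```python
def rank_weighted_overlap(knn_a, knn_b):
    """Illustrative rank-aware similarity of two ordered neighbor lists.

    Each shared item contributes 1 / ((rank_in_a + 1) * (rank_in_b + 1)),
    and the total is normalized so that identical lists score 1.0.
    Assumes both lists have the same length K and contain no duplicates.
    """
    if len(knn_a) != len(knn_b):
        raise ValueError("both neighbor lists must have the same length K")
    k = len(knn_a)
    if k == 0:
        return 0.0
    rank_in_b = {item: rank for rank, item in enumerate(knn_b)}
    score = sum(
        1.0 / ((rank_a + 1) * (rank_in_b[item] + 1))
        for rank_a, item in enumerate(knn_a)
        if item in rank_in_b
    )
    best = sum(1.0 / ((rank + 1) * (rank + 1)) for rank in range(k))
    return score / best


# Hypothetical neighbor lists: the probe's neighbors and those of two gallery entries.
probe_knn = ["g1", "g2", "g3"]
gallery_knn = {
    "a": ["g1", "g2", "g9"],  # shares the probe's top neighbors in the same order
    "b": ["g9", "g2", "g1"],  # shares the same two items, but in a different order
}

scores = {name: rank_weighted_overlap(probe_knn, knn) for name, knn in gallery_knn.items()}
reranked = sorted(scores, key=scores.get, reverse=True)
```

Both gallery entries share exactly two neighbors with the probe, so a plain set overlap could not separate them; the rank weighting is what puts "a" ahead of "b".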

    Person Re-identification: Past, Present and Future

    Person re-identification (re-ID) has become increasingly popular in the community due to its application and research significance. It aims at spotting a person of interest in other cameras. In the early days, hand-crafted algorithms and small-scale evaluation were predominantly reported. Recent years have witnessed the emergence of large-scale datasets and deep learning systems that make use of large data volumes. Considering different tasks, we classify most current re-ID methods into two classes, i.e., image-based and video-based; for both tasks, hand-crafted and deep learning systems are reviewed. Moreover, two new re-ID tasks that are much closer to real-world applications are described and discussed, i.e., end-to-end re-ID and fast re-ID in very large galleries. This paper: 1) introduces the history of person re-ID and its relationship with image classification and instance retrieval; 2) surveys a broad selection of the hand-crafted systems and the large-scale methods in both image- and video-based re-ID; 3) describes critical future directions in end-to-end re-ID and fast retrieval in large galleries; and 4) briefly discusses some important yet under-developed issues.
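The retrieval view of re-ID mentioned in this abstract can be sketched in a few lines: each probe is compared against every gallery feature, the gallery is sorted by distance, and accuracy is reported as the fraction of probes whose true identity appears in the top k results. The identities, 2-D features, and the choice of Euclidean distance are assumptions for illustration; real systems use learned, high-dimensional descriptors.

```python
from math import dist


def rank_gallery(probe_feature, gallery):
    """Return gallery identities sorted from nearest to farthest feature."""
    ordered = sorted(gallery, key=lambda item: dist(probe_feature, item[1]))
    return [identity for identity, _ in ordered]


def rank_k_accuracy(probes, gallery, k=1):
    """Fraction of probes whose true identity is among the k nearest gallery entries."""
    hits = sum(
        1 for identity, feature in probes
        if identity in rank_gallery(feature, gallery)[:k]
    )
    return hits / len(probes)


# Toy data: (identity, feature) pairs standing in for images from two cameras.
gallery = [("alice", (0.0, 0.0)), ("bob", (1.0, 1.0)), ("carol", (2.0, 0.0))]
probes = [("alice", (0.1, 0.1)), ("bob", (1.6, 0.4)), ("carol", (1.9, 0.1))]

rank1 = rank_k_accuracy(probes, gallery, k=1)
rank2 = rank_k_accuracy(probes, gallery, k=2)
```

In this toy data the "bob" probe lies closer to "carol" than to its own gallery entry, so it counts as a miss at rank 1 and a hit at rank 2.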