
    Human Re-Identification in Multi-Camera Systems

    This research involves live human re-identification on multi-camera systems. Each frame from each camera must be captured and analyzed with image processing methods. First, a histogram of oriented gradients (HOG) detector is used to identify a person in each frame. Next, Local Binary Pattern (LBP) descriptors are computed for each person to capture a fixed set of features about them. Lastly, a red, green, blue (RGB) color histogram is computed over a specific body mask. Each body is then given a label based on its LBP and color histogram information, and that label is sent to a database. This label should be the same across all the cameras. The process should also happen live. The research will include analysis of the difference between using a static body mask and using pose estimation for a more accurate color histogram. Also, regional descriptors will be used to better describe the human body. Lastly, the difference between YCrCb and RGB color histograms will be shown.
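The masked color histogram step described in the abstract above can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the function names, the nested-list image layout, the bin count, and the use of histogram intersection for comparison are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0..255


def masked_rgb_histogram(
    pixels: Sequence[Sequence[Pixel]],
    mask: Sequence[Sequence[bool]],
    bins: int = 4,
) -> List[float]:
    """Concatenated R/G/B histogram over masked pixels, normalised to sum to 1."""
    hist = [0.0] * (3 * bins)
    count = 0
    for row_px, row_mask in zip(pixels, mask):
        for px, keep in zip(row_px, row_mask):
            if not keep:
                continue  # pixel lies outside the (hypothetical) body mask
            count += 1
            for channel, value in enumerate(px):
                # Map an intensity in 0..255 to a bin index in 0..bins-1.
                hist[channel * bins + value * bins // 256] += 1.0
    if count == 0:
        return hist  # empty mask: all-zero histogram
    # Each masked pixel contributes once per channel, so divide by 3 * count.
    return [h / (3 * count) for h in hist]


def histogram_intersection(a: Sequence[float], b: Sequence[float]) -> float:
    """Similarity in [0, 1] for two histograms that each sum to 1."""
    return sum(min(x, y) for x, y in zip(a, b))
```

Two detections from different cameras could then be compared with `histogram_intersection`, with a higher score suggesting the same clothing colors. Swapping the static mask for one derived from pose estimation would only change how `mask` is produced.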

    Agent-based framework for person re-identification

    In computer-based human object re-identification, a detected human is recognised to a level sufficient to re-identify a tracked person either in a different camera capturing the same individual, often at a different angle, or in the same camera at a different time and/or with the person approaching the camera at a different angle. Instead of relying on face recognition technology, such systems study the clothing of the individuals being monitored and/or objects being carried to establish correspondence and hence re-identify the human object. Unfortunately, present human-object re-identification systems consider the entire human object as one connected region when making decisions about the similarity of two objects being matched. This assumption has a major drawback: when a person is partially occluded, a part of the occluding foreground will be picked up and used in matching. Our research revealed that when a human observer carries out a manual human-object re-identification task, attention is often drawn to some parts of the human figure/body more than others, e.g. face, brightly coloured shirt, presence of texture patterns in clothing etc., and occluding parts are ignored. In this thesis, a novel multi-agent based framework is proposed for the design of a human object re-identification system. Initially, HOG-based feature extraction is used in an SVM-based classification of a human object as being of a full-body or half-body nature. Subsequently, the relative visual significance of the top and the bottom parts of the human in re-identification is quantified by the analysis of Gray Level Co-occurrence based texture features and colour histograms obtained in the HSV colour space. Accordingly, different weights are assigned to the top and bottom of the human body using a novel probabilistic approach. The weights are then used to modify the Hybrid Spatiogram and Covariance Descriptor (HSCD) feature based re-identification algorithm adopted.
A significant novelty of the human object re-identification systems proposed in this thesis is the agent-based design procedure adopted, which separates the use of computer vision algorithms for feature extraction, comparison etc. from the decision-making process of re-identification. Multiple agents are assigned to execute different algorithmic tasks and the agents communicate to make the required logical decisions. Detailed experimental results are provided to prove that the proposed multi-agent based framework for human object re-identification performs significantly better than the state-of-the-art algorithms. Further, it is shown that the design flexibility and scalability of the proposed system allow it to be effectively utilised in more complex computer vision based video analytic/forensic tasks often conducted within distributed, multi-camera systems.

    Learning Discriminative Features for Person Re-Identification

    To meet the requirements of public safety in modern cities, more and more large-scale surveillance camera systems are deployed, resulting in an enormous amount of visual data. Automatically processing and interpreting these data promotes the development and application of visual data analytic technologies. As one of the important research topics in surveillance systems, person re-identification (re-id) aims at retrieving the target person across non-overlapping camera views that are implemented in a number of distributed space-time locations. It is a fundamental problem for many practical surveillance applications, e.g., person search, cross-camera tracking, multi-camera human behavior analysis and prediction, and it has received considerable attention from both academic and industrial domains. Learning discriminative feature representations is an essential task in person re-id. Although many methodologies have been proposed, discriminative re-id feature extraction is still a challenging problem due to: (1) Intra- and inter-personal variations. The intrinsic properties of camera deployment in surveillance systems lead to various changes in person poses, viewpoints, illumination conditions etc. This may result in large intra-personal variations and/or small inter-personal variations, thus incurring problems in matching person images. (2) Domain variations. The domain variations between different datasets give rise to the problem of the generalization capability of re-id models. Directly applying a re-id model trained on one dataset to another usually causes a large performance degradation. (3) Difficulties in data creation and annotation. Existing person re-id methods, especially deep re-id methods, rely mostly on a large set of inter-camera identity-labelled training data, requiring a tedious data collection and annotation process. This leads to poor scalability in practical person re-id applications.
Corresponding to the challenges in learning discriminative re-id features, this thesis contributes to the re-id domain by proposing three related methodologies and one new re-id setting: (1) Gaussian mixture importance estimation. Handcrafted features are usually not discriminative enough for person re-id because of noisy information, such as background clutter. To precisely evaluate the similarities between person images, the main task of distance metric learning is to filter out the noisy information. Keep It Simple and Straightforward MEtric (KISSME) is an effective method in person re-id. However, it is sensitive to the feature dimensionality and cannot capture multiple modes in a dataset. To this end, a Gaussian Mixture Importance Estimation re-id approach is proposed, which exploits Gaussian Mixture Models for estimating the observed commonalities of similar and dissimilar person pairs in the feature space. (2) Unsupervised domain-adaptive person re-id based on pedestrian attributes. In person re-id, person identities usually do not overlap among different domains (or datasets), and this raises difficulties in generalizing re-id models. Unlike person identity, pedestrian attributes, e.g., hair length, clothes type and color, are consistent across different domains (or datasets). However, most re-id datasets lack attribute annotations. On the other hand, in the field of pedestrian attribute recognition, there are a number of datasets labeled with attributes. Exploiting such data for re-id purposes can alleviate the shortage of attribute annotations in the re-id domain and improve the generalization capability of re-id models. To this end, an unsupervised domain-adaptive re-id feature learning framework is proposed to make full use of attribute annotations. Specifically, an existing unsupervised domain adaptation method has been extended to transfer attribute-based features from the attribute recognition domain to the re-id domain.
With the proposed re-id feature learning framework, domain-invariant feature representations can be effectively extracted. (3) Intra-camera supervised person re-id. Annotating large-scale re-id datasets requires a tedious data collection and annotation process and therefore leads to poor scalability in practical person re-id applications. To overcome this fundamental limitation, a new person re-id setting is considered without inter-camera identity association but only with identity labels independently annotated within each camera view. This eliminates the most time-consuming and tedious inter-camera identity association annotation process and thus significantly reduces the amount of human effort required during annotation. It hence gives rise to a more scalable and more feasible learning scenario, which is named Intra-Camera Supervised (ICS) person re-id. Under this ICS setting, a new re-id method, i.e., the Multi-task Multi-label (MATE) learning method, is formulated. Given no inter-camera association, MATE is specially designed for self-discovering the inter-camera identity correspondence. This is achieved by inter-camera multi-label learning under a joint multi-task inference framework. In addition, MATE can also efficiently learn discriminative re-id feature representations using the available identity labels within each camera view.
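For readers unfamiliar with the KISSME baseline named in the abstract above, the metric is commonly written as below. This is a sketch of the standard formulation from the metric-learning literature, not an equation taken from this thesis, and the symbols are notation introduced here: $\boldsymbol{\Sigma}_S$ and $\boldsymbol{\Sigma}_D$ denote the covariance matrices of feature differences over similar and dissimilar pairs, respectively.

```latex
d^2_{\mathbf{M}}(\mathbf{x}_i, \mathbf{x}_j)
  = (\mathbf{x}_i - \mathbf{x}_j)^{\top} \mathbf{M} \, (\mathbf{x}_i - \mathbf{x}_j),
\qquad
\mathbf{M} = \boldsymbol{\Sigma}_S^{-1} - \boldsymbol{\Sigma}_D^{-1}
```

Both covariance matrices have to be estimated and inverted, which is consistent with the abstract's remark about sensitivity to feature dimensionality, and each pair class is modelled by a single zero-mean Gaussian, which is why multiple modes cannot be captured.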

    Using latent features for short-term person re-identification with RGB-D cameras

    This paper presents a system for people re-identification in uncontrolled scenarios using RGB-depth cameras. Compared to conventional RGB cameras, the use of depth information greatly simplifies the tasks of segmentation and tracking. In a previous work, we proposed a similar architecture where people were characterized using color-based descriptors that we named bodyprints. In this work, we propose the use of latent feature models to extract more relevant information from the bodyprint descriptors by reducing their dimensionality. Latent features can also cope with missing data in case of occlusions. Different probabilistic latent feature models, such as probabilistic principal component analysis and factor analysis, are compared in the paper. The main difference between the models is how the observation noise is handled in each case. Re-identification experiments have been conducted in a real store where people behaved naturally. The results show that the use of the latent features significantly improves the re-identification rates compared to state-of-the-art works. The work presented in this paper has been funded by the Spanish Ministry of Science and Technology under the CICYT contract TEVISMART, TEC2009-09146. Oliver Moll, J.; Albiol Colomer, A.; Albiol Colomer, AJ.; Mossi García, JM. (2016). Using latent features for short-term person re-identification with RGB-D cameras. Pattern Analysis and Applications. 19(2):549-561. https://doi.org/10.1007/s10044-015-0489-8
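The difference in noise handling between the two latent feature models mentioned in the abstract above can be written out as follows. This is the textbook formulation of probabilistic PCA and factor analysis, given here as a sketch; the symbols ($\mathbf{x}$ for an observed descriptor, $\mathbf{z}$ for the latent features, $\mathbf{W}$, $\boldsymbol{\mu}$, $\sigma^2$, $\boldsymbol{\Psi}$) are notation assumed for illustration and are not taken from the paper.

```latex
\begin{align}
\mathbf{x} &= \mathbf{W}\mathbf{z} + \boldsymbol{\mu} + \boldsymbol{\epsilon},
  & \mathbf{z} &\sim \mathcal{N}(\mathbf{0}, \mathbf{I}) \\
\text{PPCA:}\quad \boldsymbol{\epsilon} &\sim \mathcal{N}(\mathbf{0}, \sigma^2 \mathbf{I}),
  & \operatorname{cov}(\mathbf{x}) &= \mathbf{W}\mathbf{W}^{\top} + \sigma^2 \mathbf{I} \\
\text{FA:}\quad \boldsymbol{\epsilon} &\sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Psi}),\ \boldsymbol{\Psi}\ \text{diagonal},
  & \operatorname{cov}(\mathbf{x}) &= \mathbf{W}\mathbf{W}^{\top} + \boldsymbol{\Psi}
\end{align}
```

In this formulation, probabilistic PCA assumes the same noise variance in every descriptor dimension, while factor analysis allows a separate variance per dimension.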

    Scalable deep feature learning for person re-identification

    Person Re-identification (Person Re-ID) is one of the fundamental and critical tasks of video surveillance systems. Given a probe image of a person obtained from one Closed Circuit Television (CCTV) camera, the objective of Person Re-ID is to identify the same person from a large gallery set of images captured by other cameras within the surveillance system. By successfully associating all the pedestrians, we can quickly search, track and even plot a movement trajectory of any person of interest within a CCTV system. Currently, most search and re-identification jobs are still processed manually by police or security officers. It is desirable to automate this process in order to reduce an enormous amount of human labour and increase the pedestrian tracking and retrieval speed. However, Person Re-ID is a challenging problem because of the many uncontrolled properties of a multi-camera surveillance system: cluttered backgrounds, large illumination variations, different human poses and different camera viewing angles. The main goal of this thesis is to develop deep learning based person re-identification models for real-world deployment in surveillance systems. This thesis focuses on learning and extracting robust feature representations of pedestrians. In this thesis, we first propose two supervised deep neural network architectures. One end-to-end Siamese network is developed for real-time person matching tasks. It focuses on extracting the correspondence feature between two images. For an offline person retrieval application, we follow the commonly used two-stage pipeline of feature extraction followed by a distance metric and propose a strong feature embedding extraction network. In addition, we surveyed many valuable training techniques proposed recently in the literature and integrated them with our newly proposed NP-Triplet loss to construct a strong Person Re-ID feature extraction model.
However, during the deployment of the online matching and offline retrieval systems, we realised that most supervised models scale poorly. A model trained on labelled images obtained from one system cannot perform well on other unseen systems. Aiming to make Person Re-ID models more scalable across different surveillance systems, the third work of this thesis presents a cross-dataset feature transfer method (MMFA). MMFA can train and transfer the model learned from one system to another simultaneously. Our goal to create a more scalable and robust person re-identification system did not stop here. In the last work of this thesis, we address the limitation of the MMFA structure and propose a multi-dataset feature generalisation approach (MMFA-AAE), which aims to learn a universal feature representation from multiple labelled datasets. Aiming to facilitate research towards Person Re-ID applications in more realistic scenarios, a new dataset, ROSE-IDENTITY-Outdoor (RE-ID-Outdoor), has been collected and annotated with the largest number of cameras and 40 mid-level attributes.

    Who is who at different cameras: people re-identification using depth cameras

    This study proposes the concept of bodyprints to perform re-identification of people in surveillance videos. Bodyprints are obtained using calibrated depth-colour cameras such as Kinect. The authors' results on a database of 40 people show that bodyprints are very robust to changes of pose, point of view and illumination. Potential applications include tracking people with networks of non-overlapping cameras. © 2012 The Institution of Engineering and Technology. The work presented in this paper has been funded by the Spanish Ministry of Science and Technology under the CICYT contract TEVISMART, TEC2009-09146. Albiol Colomer, AJ.; Albiol Colomer, A.; Oliver Moll, J.; Mossi García, JM. (2012). Who is who at different cameras: people re-identification using depth cameras. IET Computer Vision. 6(5):378-387. https://doi.org/10.1049/iet-cvi.2011.0140

    Real-world Human Re-identification: Attributes and Beyond.

    PhD thesis. Surveillance systems capable of performing a diverse range of tasks that support human intelligence and analytical efforts are becoming widespread and crucial due to increasing threats to national infrastructure and evolving business and governmental analytical requirements. Surveillance data can be critical for crime prevention, forensic analysis, and counter-terrorism activities in civilian and governmental agencies alike. However, visual surveillance data must currently be parsed by trained human operators, and therefore any utility is offset by the inherent training and staffing costs. The automated analysis of surveillance video is therefore of great scientific interest. One of the open problems within this area is that of reliably matching humans between disjoint surveillance camera views, termed re-identification. Automated re-identification facilitates human operational efficiency in the grouping of disparate and fragmented people observations through space and time into individual personal identities, a pre-requisite for higher-level surveillance tasks. However, due to the complex nature of real-world scenes and the highly variable nature of human appearance, reliably re-identifying people is non-trivial. Most re-identification approaches developed so far rely on low-level visual feature matching approaches that aim to match human detections against a known gallery of potential matches. However, for many applications an initial detection of a human may be unavailable, or a low-level feature representation may not be sufficiently invariant to the photometric or geometric variability inherent between camera views. This thesis begins by proposing a “mid-level” human-semantic representation that applies expert human knowledge of surveillance task execution to the task of re-identifying people in order to compute an attribute-based description of a human.
It further shows how this attribute-based description is synergistic with low-level data-derived features to enhance re-identification accuracy and subsequently gain further performance benefits by employing a discriminatively learned distance metric. Finally, a novel “zero-shot” scenario is proposed in which a visual probe is unavailable but re-identification is still possible via a manually provided semantic attribute description. The approach is extensively evaluated using several public benchmark datasets. One challenge in constructing an attribute-based and human-semantic representation is the requirement for extensive annotation. Mitigating this annotation cost in order to present a realistic and scalable re-identification system is the motivation for the second technical area of this thesis, where transfer learning and data mining are investigated in two different approaches. Discriminative methods trade annotation cost for enhanced performance. Because discriminative person re-identification models operate between two camera views, annotation cost scales quadratically with the number of cameras in the entire network. For practical re-identification, this is an unreasonable expectation and prohibitively expensive. By leveraging flexible multi-source transfer of re-identification models, part of this cost may be alleviated. Specifically, it is possible to leverage prior re-identification models learned for a set of source-view pairs (domains), and flexibly combine those to obtain good re-identification performance for a given target-view pair with greatly reduced annotation requirements. The volume of exhaustive annotation effort required for attribute-driven re-identification scales linearly with the number of cameras and attributes. Real-world operation of an attribute-enabled, distributed camera network would also require prohibitive quantities of annotation effort by human experts.
This effort is completely avoided by taking a data-driven approach to attribute computation, by learning an effective associated representation by crawling large volumes of Internet data. By training on a larger and more diverse array of examples, this representation is more view-invariant and generalisable than attributes trained on conventional scales. These automatically discovered attributes are shown to provide a valuable representation that significantly improves re-identification performance. Moreover, a method to map them onto existing expert-annotated-ontologies is contributed. In the final contribution of this thesis, the underlying assumptions about visual surveillance equipment and re-identification are challenged and the thesis motivates a novel research area using dynamic, mobile platforms. Such platforms violate the common assumption shared by most previous research, namely that surveillance devices are always stationary, relative to the observed scene. The most important new challenge discovered in this exciting area is that the unconstrained video is too challenging for traditional approaches to applying discriminative methods that rely on the explicit modelling of appearance translations when modelling view-pairs, or even a single view. A new dataset was collected by a remote-operated vehicle using control software developed to simulate a fully-autonomous re-identification unmanned aerial vehicle programmed to fly in proximity with humans until images of sufficient quality for re-identification are obtained. Variations of the standard re-identification model are investigated in an enhanced re-identification paradigm, and new challenges with this distinct form of re-identification are elucidated. Finally, conventional wisdom regarding re-identification in light of these observations is re-examined
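The quadratic versus linear annotation scaling mentioned in the abstract above can be made concrete with a small calculation. This is an illustrative sketch only: the function names and the idea of counting one "annotation unit" per camera pair or per camera-attribute combination are assumptions made for the example, not quantities defined in the thesis.

```python
def camera_view_pairs(num_cameras: int) -> int:
    """Number of unordered camera pairs, each needing its own pairwise labels."""
    return num_cameras * (num_cameras - 1) // 2


def attribute_annotation_units(num_cameras: int, num_attributes: int) -> int:
    """Hypothetical effort that grows linearly in cameras and in attributes."""
    return num_cameras * num_attributes


if __name__ == "__main__":
    for n in (2, 10, 20, 100):
        # Pair count grows roughly with n squared; attribute units grow with n.
        print(n, camera_view_pairs(n), attribute_annotation_units(n, 40))
```

Doubling a network from 10 to 20 cameras roughly quadruples the number of view pairs (45 to 190), whereas the linear term only doubles, which is the gap that model transfer is meant to narrow.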

    A multi-viewpoint feature-based re-identification system driven by skeleton keypoints

    Thanks to the increasing popularity of 3D sensors, robotic vision has seen huge improvements in a wide range of applications and systems in recent years. Besides the many benefits, this migration caused some incompatibilities with systems that cannot be based on range sensors, such as intelligent video surveillance systems, since the two kinds of sensor data lead to different representations of people and objects. This work aims to bridge that gap, and presents a novel re-identification system that takes advantage of multiple video flows in order to enhance the performance of a skeletal tracking algorithm, which is in turn exploited for driving the re-identification. A new, geometry-based method for joining together the detections provided by the skeletal tracker from multiple video flows is introduced, which is capable of dealing with many people in the scene and coping with the errors introduced in each view by the skeletal tracker. Such a method has a high degree of generality, and can be applied to any kind of body pose estimation algorithm. The system was tested on a public dataset for video surveillance applications, demonstrating the improvements achieved by the multi-viewpoint approach in the accuracy of both body pose estimation and re-identification. The proposed approach was also compared with a skeletal tracking system working on 3D data: the comparison confirmed the good performance level of the multi-viewpoint approach. This means that the lack of the rich information provided by 3D sensors can be compensated for by the availability of more than one viewpoint.