
    A Non-Intrusive Multi-Sensor RGB-D System for Preschool Classroom Behavior Analysis

    University of Minnesota Ph.D. dissertation. May 2017. Major: Computer Science. Advisor: Nikolaos Papanikolopoulos. 1 computer file (PDF); vii, 121 pages + 2 mp4 video files.
    Mental health disorders are a leading cause of disability in North America and can represent a significant source of financial burden. Early intervention is a key aspect of treating mental disorders, as it can dramatically increase the probability of a positive outcome. One key factor in early intervention is knowledge of risk-markers -- genetic, neural, behavioral and/or social deviations -- that indicate the development of a particular mental disorder. Once these risk-markers are known, it is important to have tools for their reliable identification. For visually observable risk-markers, discovery and screening should ideally occur in a natural environment; however, this often incurs a high cost. Current advances in technology allow for the development of assistive systems that could aid in the detection and screening of visually observable risk-markers in everyday environments, like a preschool classroom. This dissertation covers the development of such a system. The system consists of a series of networked sensors that collect data over a wide baseline. These sensors generate color images and depth maps that can be used to create a 3D point cloud reconstruction of the classroom. The wide-baseline nature of the setup helps to minimize the effects of occlusion, since data is captured from multiple distinct perspectives. These point clouds are used to detect occupants in the room and track them throughout their activities. The tracking information is then used to analyze classroom and individual behaviors, enabling screening for specific risk-markers as well as the creation of a corpus of data that could be used to discover new risk-markers. The system has been installed at the Shirley G. Moore Lab School, a research preschool classroom in the Institute of Child Development at the University of Minnesota. Recordings have been taken and analyzed from actual classes. No instruction or pre-conditioning was given to the instructors or the children in these classes. Portions of this data have also been manually annotated to create ground-truth data that was used to validate the efficacy of the proposed system.
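
    The 3D reconstruction step described above lends itself to a short illustration. Below is a minimal sketch, in Python with NumPy, of back-projecting a depth map into a camera-frame point cloud and merging clouds from several calibrated sensors into a common room frame; the intrinsics (fx, fy, cx, cy), the 4 x 4 sensor-to-room transforms, and all function names are illustrative assumptions, not details taken from the dissertation.

        import numpy as np

        def depth_to_points(depth, fx, fy, cx, cy):
            """Back-project a depth map (in meters) to an N x 3 array of camera-frame points."""
            h, w = depth.shape
            u, v = np.meshgrid(np.arange(w), np.arange(h))
            z = depth
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
            return pts[pts[:, 2] > 0]  # drop pixels with no valid depth reading

        def merge_clouds(clouds, extrinsics):
            """Transform each sensor's cloud into the shared room frame and concatenate."""
            merged = []
            for pts, T in zip(clouds, extrinsics):  # T: 4 x 4 sensor-to-room transform
                homo = np.hstack([pts, np.ones((len(pts), 1))])
                merged.append((homo @ T.T)[:, :3])
            return np.vstack(merged)

    A merged cloud of this kind is what downstream occupant detection and tracking would operate on, one frame at a time.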

    Advancing a Machine's Visual Awareness of People

    Methods to advance a machine's visual awareness of people, with a focus on understanding 'who is where' in video, are presented. 'Who' is used in a broad sense that includes not only the identity of a person but attributes of that person as well. Efforts are focused on improving algorithms in four areas of visual recognition: detection, tracking, fine-grained classification and person re-identification. Each of these problems appears quite different on the surface; however, there are two broader questions that are answered across all of the works. The first is that the machine is able to make better predictions when it has access to the extra information that is available in video. The second is that it is possible to learn on-the-fly from single examples. How each work contributes to answering these overarching questions, as well as its specific contributions to the relevant problem domain, is as follows: The first problem studied is one-shot, real-time, instance detection. Given a single image of a person, the task for the machine is to learn a detector that is specific to that individual rather than to an entire category such as faces or pedestrians. In subsequent images, the individual detector indicates the size and location of that particular person in the image. The learning must be done in real-time. To solve this problem, the proposed method starts with a pre-trained boosted category detector from which an individual-object detector is trained, with near-zero computational cost, through elementary manipulations of the thresholds of the category detector. Experiments on two challenging pedestrian and face datasets indicate that it is indeed possible to learn identity classifiers in real-time; besides being faster to train, the proposed classifier has better detection rates than previous methods. The second problem studied is real-time tracking. Given the initial location of a target person, the task for the machine is to determine the size and location of the target person in subsequent video frames, in real-time. The method proposed for solving this problem treats tracking as a repeated detection problem where potential targets are identified with a pre-trained boosted person detector and identity across frames is established by individual-specific detectors. The individual-specific detectors are learnt using the method proposed to solve the first problem. The proposed algorithm runs in real-time and is robust to drift. The tracking algorithm is benchmarked against nine state-of-the-art trackers on two benchmark datasets. Results show that the proposed method is 10% more accurate and nearly as fast as the fastest of the competing algorithms, and it is as accurate but 20 times faster than the most accurate of the competing algorithms. The third problem studied is the fine-grained classification of people. Given an image of a person, the task for the machine is to estimate characteristics of that person such as age, clothing style, sex, occupation, social status, ethnicity, emotional state and/or body type. Since fine-grained classification using the entire human body is a relatively unexplored area, a large video dataset was collected. To solve this problem, a method that uses deep neural networks and video of a person is proposed.
Results show that, when severely under-represented classes are ignored, combining information from a sequence of images of an individual before predicting the label yields a class-average accuracy 3.5-7.1% better than independently predicting the label of each image. The final problem studied is person re-identification. Given an image of a person, the task for the machine is to find images that match the identity of that person from a large set of candidate images. This is a challenging task since images of the same individual can vary significantly due to changes in clothing, viewpoint, pose, lighting and background. The method proposed for solving this problem is a two-stage deep neural network architecture that uses body part patches as inputs rather than an entire image of a person. Experiments show that rank-1 matching rates increase by 22-25.6% on benchmark datasets when compared to state-of-the-art methods.
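
    The one-shot instance detector in the first problem above is said to be derived from a pre-trained boosted category detector through elementary manipulations of its thresholds. The sketch below (Python/NumPy) gives one plausible reading of that idea, in which each weak learner's acceptance band is tightened around its response on the single exemplar; the weak-learner interface, the margin width and the scoring rule are illustrative assumptions, not the thesis implementation.

        import numpy as np

        class OneShotInstanceDetector:
            def __init__(self, weak_learners, margin=0.25):
                self.weak_learners = weak_learners  # callables: image patch -> scalar score
                self.margin = margin                # assumed width of the acceptance band
                self.bands = None

            def fit(self, exemplar_patch):
                """Record each weak learner's response on the single exemplar image."""
                responses = np.array([h(exemplar_patch) for h in self.weak_learners])
                self.bands = np.stack([responses - self.margin,
                                       responses + self.margin], axis=1)

            def score(self, patch):
                """Fraction of weak learners whose response stays inside its exemplar band."""
                responses = np.array([h(patch) for h in self.weak_learners])
                inside = (responses >= self.bands[:, 0]) & (responses <= self.bands[:, 1])
                return inside.mean()

    Because fitting only evaluates the existing weak learners once on the exemplar, the per-individual training cost is essentially negligible, which matches the near-zero-cost claim above.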
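
    For the fine-grained classification result quoted above, the abstract states only that combining information across a sequence of images beats per-frame prediction. A common way to realize such pooling, shown here purely as an assumption for illustration, is to average per-frame class probabilities over a track before taking the arg-max.

        import numpy as np

        def predict_track_label(frame_probs):
            """frame_probs: T x C array of per-frame class probabilities for one track."""
            track_prob = frame_probs.mean(axis=0)  # pool evidence across the sequence
            return int(np.argmax(track_prob)), track_prob

        def predict_frame_labels(frame_probs):
            """Baseline: predict each frame independently, with no temporal pooling."""
            return np.argmax(frame_probs, axis=1)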

    Person Re-Identification Techniques for Intelligent Video Surveillance Systems

    Nowadays, intelligent video surveillance is one of the most active research fields in computer vision and machine learning, providing useful tools for surveillance operators and forensic video investigators. Person re-identification is among these tools; it consists of recognizing whether an individual has already been observed over a network of cameras. This tool can be employed in various applications, e.g., off-line retrieval of all the video sequences showing an individual of interest whose image is given as a query, or on-line pedestrian tracking over multiple cameras. For off-line retrieval applications, one of the goals of person re-identification systems is to support video surveillance operators and forensic investigators in finding an individual of interest in videos acquired by a network of non-overlapping cameras. This is attained by sorting images of previously observed individuals by decreasing values of their similarity with a given probe individual. This task is typically achieved by exploiting clothing appearance, since classical biometric methods like face recognition are impractical in real-world video surveillance scenarios because of the low quality of the acquired images. Existing clothing appearance descriptors, together with their similarity measures, are mostly aimed at improving ranking quality. These methods usually employ a part-based body model in order to extract an image signature that can be treated independently for different body parts (e.g., torso and legs). While a re-identification model must be robust and discriminative in recognizing the individual of interest, processing time can also be crucial for tackling this task in real-world scenarios. This issue can be seen from two points of view: the processing time to construct a model (descriptor generation), which can usually be done off-line, and the processing time to find the correct individual among a set of acquired video frames (descriptor matching), which is the real-time part of a re-identification system. This thesis addresses the issue of processing time for descriptor matching, rather than improving ranking quality, which is also relevant in practical applications involving interaction with human operators. It is shown how a trade-off between processing time and ranking quality, for any given descriptor, can be achieved through a multi-stage ranking approach inspired by multi-stage approaches to classification problems in the pattern recognition literature, further adapted to the re-identification task as a ranking problem. Design criteria for such multi-stage re-identification systems are discussed, and the proposed approach is evaluated on three benchmark data sets using four state-of-the-art descriptors. Additionally, concerning processing time, typical dimensionality reduction methods are studied as a way of reducing the matching time of a descriptor when a specific person re-identification descriptor generates a high-dimensional feature space. Empirical results are also presented for this case, in which three well-known feature reduction methods are applied to two state-of-the-art descriptors on two benchmark data sets.
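
    As a rough illustration of the multi-stage idea discussed above, the sketch below (Python/NumPy) shortlists the gallery with a cheap, dimensionality-reduced descriptor distance and re-ranks only the shortlist with the full descriptor. PCA via SVD, Euclidean matching, and the shortlist size are illustrative assumptions, not the specific descriptors, reduction methods, or stage design evaluated in the thesis.

        import numpy as np

        def pca_fit(X, n_components):
            """Fit a simple PCA (mean + principal directions) via SVD."""
            mean = X.mean(axis=0)
            _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
            return mean, Vt[:n_components]

        def pca_project(X, mean, components):
            return (X - mean) @ components.T

        def multi_stage_rank(probe, gallery, shortlist_size=100, n_components=32):
            """Rank gallery descriptors with a coarse (reduced) stage, then a fine stage."""
            mean, comps = pca_fit(gallery, n_components)
            g_low = pca_project(gallery, mean, comps)
            p_low = pca_project(probe[None, :], mean, comps)[0]

            # Stage 1: coarse ranking in the reduced feature space (fast).
            coarse = np.linalg.norm(g_low - p_low, axis=1)
            shortlist = np.argsort(coarse)[:shortlist_size]

            # Stage 2: re-rank only the shortlist with the full descriptor (slow but small).
            fine = np.linalg.norm(gallery[shortlist] - probe, axis=1)
            return shortlist[np.argsort(fine)]

    In a scheme of this shape, the trade-off between matching time and ranking quality is governed mainly by the shortlist size and the reduced dimensionality.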

    A Comprehensive Survey on Deep-Learning-based Vehicle Re-Identification: Models, Data Sets and Challenges

    Vehicle re-identification (ReID) endeavors to associate vehicle images collected from a distributed network of cameras spanning diverse traffic environments. This task assumes paramount importance within the spectrum of vehicle-centric technologies, playing a pivotal role in deploying Intelligent Transportation Systems (ITS) and advancing smart city initiatives. Rapid advancements in deep learning have significantly propelled the evolution of vehicle ReID technologies in recent years. Consequently, a comprehensive survey of deep-learning-based methodologies for vehicle re-identification has become imperative. This paper extensively explores deep learning techniques applied to vehicle ReID. It outlines the categorization of these methods, encompassing supervised and unsupervised approaches, delves into existing research within these categories, introduces datasets and evaluation criteria, and delineates forthcoming challenges and potential research directions. This comprehensive assessment examines the landscape of deep learning in vehicle ReID and establishes a foundation and starting point for future work. It aims to serve as a complete reference by highlighting challenges and emerging trends, fostering advancements and applications in vehicle ReID utilizing deep learning models.
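
    Since the survey introduces evaluation criteria, it may help to recall the two measures most commonly reported in ReID work: the cumulative matching characteristic (rank-k) and mean average precision. The single-probe sketch below is a generic illustration of those metrics, not code or notation from the paper; the variable names are assumptions.

        import numpy as np

        def rank_k(ranked_gallery_ids, probe_id, k=1):
            """1.0 if a correct match appears in the top-k ranked results, else 0.0."""
            return float(probe_id in list(ranked_gallery_ids)[:k])

        def average_precision(ranked_gallery_ids, probe_id):
            """Average precision for one probe, given gallery IDs sorted by similarity."""
            hits = np.asarray(ranked_gallery_ids) == probe_id
            if hits.sum() == 0:
                return 0.0
            precisions = np.cumsum(hits) / (np.arange(len(hits)) + 1)
            return float((precisions * hits).sum() / hits.sum())

        # mAP is the mean of average_precision over all probes;
        # the CMC curve is the mean of rank_k over all probes for each k.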

    Autonomous Person-Specific Following Robot

    Following a specific user is a desired or even required capability for service robots in many human-robot collaborative applications. However, most existing person-following robots follow people without knowledge of who they are following. In this paper, we propose an identity-specific person tracker, capable of tracking and identifying nearby people, to enable person-specific following. Our proposed method uses a Sequential Nearest Neighbour with Thresholding Selection algorithm we devised to fuse an anonymous person tracker and a face recogniser. Experimental results comparing our proposed method with alternative approaches show that our method achieves better performance in tracking and identifying people, as well as improved robot performance in following a target individual.
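
    The fusion step named above, Sequential Nearest Neighbour with Thresholding Selection, is not spelled out in the abstract, so the sketch below gives only one plausible reading: among currently tracked people with a visible face, pick the track whose face embedding is the nearest neighbour of the enrolled target embedding, and accept it only if the distance passes a threshold. All names and the threshold value are illustrative assumptions, not the authors' algorithm.

        import numpy as np

        def select_target_track(tracks, target_embedding, threshold=0.6):
            """tracks: list of (track_id, face_embedding or None). Returns a track_id or None."""
            best_id, best_dist = None, np.inf
            for track_id, emb in tracks:
                if emb is None:          # this track currently has no visible face
                    continue
                dist = np.linalg.norm(emb - target_embedding)
                if dist < best_dist:
                    best_id, best_dist = track_id, dist
            # Thresholding: reject the nearest neighbour if it is not close enough.
            return best_id if best_dist < threshold else None

    The selected track would then drive the robot's following behaviour, while the anonymous tracker keeps the identity attached to that track between face detections.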