232 research outputs found

    Face Detection and Recognition Using Raspberry PI Computer

    This paper presents a face detection and recognition system built on a predefined framework and running on a Raspberry Pi computer. The theoretical section of the article surveys several techniques that can be used for face detection, including Haar cascades, Histograms of Oriented Gradients, Support Vector Machines, and deep learning methods. The paper also reviews commonly used face recognition techniques, including Fisherfaces, Eigenfaces, Local Binary Pattern histograms, SIFT and SURF descriptor-based methods, and deep learning methods. The practical part of the paper demonstrates the use of a Raspberry Pi computer, along with supplementary tools and software, to detect and recognize faces from a predefined dataset.
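
    Haar-cascade detectors like those surveyed above evaluate rectangle-difference features in constant time via an integral image. The following is a minimal illustrative sketch in Python with NumPy (not code from the paper; the function names are hypothetical):

```python
import numpy as np

def integral_image(img):
    # Cumulative sums over both axes allow O(1) rectangle sums later.
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, top, left, h, w):
    # Sum of img[top:top+h, left:left+w] from the integral image ii.
    total = ii[top + h - 1, left + w - 1]
    if top > 0:
        total -= ii[top - 1, left + w - 1]
    if left > 0:
        total -= ii[top + h - 1, left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1, left - 1]
    return total

def haar_two_rect(img, top, left, h, w):
    # Two-rectangle Haar-like feature: left half minus right half.
    ii = integral_image(img.astype(np.int64))
    half = w // 2
    return (rect_sum(ii, top, left, h, half)
            - rect_sum(ii, top, left + half, h, half))
```

    Evaluating such features at many positions and scales, and chaining weak classifiers built on them, is what makes cascade detection fast enough for a Raspberry Pi class device.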

    Cascaded Facial Detection Algorithms To Improve Recognition

    Many kinds of organizations want computer programs that can recognize biometric traits of people. One trait for which moderate success has been achieved is the face: determining where a face is in an image, and whose it is, has generated several algorithms, each with its own benefits and drawbacks. At the core of each algorithm is the need to be both fast and accurate. Cascading face detection algorithms improves accuracy, but at the cost of increased runtime. Neural networks, once trained, can quickly categorize objects and assign them identifiers. By combining cascaded face detectors with neural networks, a face in an image can be both detected and recognized. In this paper, three types of face detection algorithms are combined in various configurations to measure the accuracy of face detection against its runtime cost. By feeding the detected faces into a convolutional neural network, we can then identify the person.
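
    The accuracy-versus-runtime trade-off of cascading can be pictured abstractly: each stage re-examines only the candidates that survived the previous stage, so later (more expensive) detectors run on fewer windows. A hypothetical sketch, with detectors modeled as scoring functions:

```python
def cascade_detect(candidates, stages):
    # Each stage is a (detector, threshold) pair; a candidate survives
    # only if every stage scores it at or above its threshold.
    survivors = candidates
    for detector, threshold in stages:
        survivors = [c for c in survivors if detector(c) >= threshold]
        if not survivors:
            break
    return survivors
```

    In the paper's setting the stages would be actual face detectors and the survivors would be image windows handed to the convolutional neural network for identification.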

    Model-Based High-Dimensional Pose Estimation with Application to Hand Tracking

    This thesis presents novel techniques for computer-vision-based full-DOF human hand motion estimation. Our main contributions are: a robust skin color estimation approach; a novel resolution-independent and memory-efficient representation of hand pose silhouettes, which allows us to compute area-based similarity measures in near-constant time; a set of new segmentation-based similarity measures; a new class of similarity measures that work for nearly arbitrary input modalities; a novel edge-based similarity measure that avoids any problematic thresholding or discretization and can be computed very efficiently in Fourier space; a template hierarchy that minimizes the number of similarity computations needed to find the most likely observed hand pose; and finally, a novel image-space search method, which we naturally combine with our hierarchy. Consequently, matching can be efficiently formulated as a simultaneous template tree traversal and function maximization.
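
    The template-hierarchy idea — pruning whole subtrees of hand poses when a representative template scores poorly — can be illustrated with a deliberately simplified greedy sketch. This is hypothetical code, not the thesis's actual traversal, which interleaves the tree traversal with image-space search:

```python
def hierarchy_match(node, similarity, best=None):
    # node = (template, children). Greedy pruning: a subtree is explored
    # only if its representative template beats the best leaf so far,
    # so a strong leaf under a weak representative can be missed.
    template, children = node
    score = similarity(template)
    if best is None or score > best[1]:
        if not children:
            best = (template, score)
        else:
            for child in children:
                best = hierarchy_match(child, similarity, best)
    return best
```

    The payoff is that similarity is computed once per visited node rather than once per leaf template, which is the cost reduction the hierarchy is built for.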

    A Voting Algorithm for Dynamic Object Identification and Pose Estimation

    While object identification enables autonomous vehicles to detect and recognize objects from real-time images, pose estimation further enhances their capability of navigating in a dynamically changing environment. This thesis proposes an approach that makes use of keypoint features from 3D object models for recognition and pose estimation of dynamic objects in the context of self-driving vehicles. A voting technique is developed to select, from a repository of 3D models, the model that best matches the dynamic objects in the input image. The matching is done based on the keypoints identified in the image and the keypoints corresponding to each template model stored in the repository. A confidence score is then assigned to measure the confidence with which the system can confirm the presence of the matched object in the input image. Because humans are dynamic objects with complex structure, human models from the COCO-DensePose dataset, along with the DensePose deep-learning model developed by the Facebook research team, have been adopted and integrated into the system for 3D pose estimation of pedestrians on the road. Additionally, object tracking is performed to find the speed and location details for each of the recognized dynamic objects from consecutive image frames of the input video. This research demonstrates with experimental results that the use of 3D object models enhances the confidence of recognition and pose estimation of dynamic objects in the real-time input image. The 3D pose information of the recognized dynamic objects, along with their corresponding speed and location information, would help the autonomous navigation system of a self-driving car to take appropriate navigation decisions, thus ensuring smooth and safe driving.
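
    The voting step can be pictured as follows. This is an illustrative sketch only — real matching would compare keypoint descriptors rather than bare 2D coordinates, and the model names are made up:

```python
def vote_for_model(image_keypoints, model_repository, tol=2.0):
    # Each model earns one vote per image keypoint that lies within
    # `tol` of one of its own keypoints (descriptor matching omitted).
    def matches(model_kps):
        count = 0
        for ix, iy in image_keypoints:
            if any(abs(ix - mx) <= tol and abs(iy - my) <= tol
                   for mx, my in model_kps):
                count += 1
        return count

    scores = {name: matches(kps) for name, kps in model_repository.items()}
    best = max(scores, key=scores.get)
    # Confidence: fraction of image keypoints explained by the winner.
    confidence = scores[best] / max(len(image_keypoints), 1)
    return best, confidence
```

    The returned confidence plays the role of the score described above: a low value means even the best-matching model explains few of the observed keypoints.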

    Automated Multi-Modal Search and Rescue using Boosted Histogram of Oriented Gradients

    Unmanned Aerial Vehicles (UAVs) provide a platform for many automated tasks, and with ever-increasing advances in computing, those tasks can become more complex. This thesis expands the use of UAVs toward Search and Rescue (SAR), where a UAV can assist first responders in searching for a lost person and relay possible search areas back to SAR teams. To identify a person from an aerial perspective, low-level Histogram of Oriented Gradients (HOG) feature descriptors are computed over a segmented region, provided from thermal data, to increase classification speed. This thesis also introduces a dataset for the Bird's-Eye-View (BEV) perspective and tests the viability of low-level HOG feature descriptors on it. These descriptors, known as Boosted Histogram of Oriented Gradients (BHOG) features, discretize gradients over varying-sized cells and blocks and are trained with a cascaded Gentle AdaBoost classifier using our compiled BEV dataset. Classification is supported by multiple sensing modes, with color and thermal videos used to increase classification speed. The thermal video is segmented to indicate any Regions of Interest (ROIs), which are mapped to the color video where classification occurs. The ROIs reduce the classification time needed on the aerial platform by eliminating a per-frame sliding window. Testing reveals that, using only color data and a classifier trained for a profile view of a person, the average recall is 78%, while thermal detection yields an average recall of 76%, with a speedup of 2x on video of 240x320 resolution. The BEV testing reveals that higher resolutions are favored, with recall rates of 71% using BHOG features and 92% using Haar features; at lower resolutions the recall rates are 42% and 55% for BHOG and Haar features, respectively.
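
    A HOG descriptor accumulates gradient magnitudes into orientation bins per cell; BHOG varies the cell and block sizes and lets boosting pick the discriminative ones. A minimal single-cell histogram, sketched in Python with NumPy (unsigned gradients, nine bins, no block normalization — not the thesis's implementation):

```python
import numpy as np

def hog_cell_histogram(cell, n_bins=9):
    # Unsigned-gradient histogram for one cell: orientations are folded
    # into [0, 180) degrees and binned, weighted by gradient magnitude.
    gy, gx = np.gradient(cell.astype(float))
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0
    bins = np.minimum((angle / (180.0 / n_bins)).astype(int), n_bins - 1)
    hist = np.zeros(n_bins)
    for b, m in zip(bins.ravel(), magnitude.ravel()):
        hist[b] += m
    return hist
```

    A full descriptor would concatenate such histograms over a grid of cells and normalize them within overlapping blocks before feeding them to the boosted classifier.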

    Object Tracking and Mensuration in Surveillance Videos

    This thesis focuses on tracking and mensuration in surveillance videos. The first part of the thesis discusses several object tracking approaches based on the properties of the tracking targets. For airborne videos, where the targets are usually small and of low resolution, an approach is proposed that builds motion models for the foreground and background, with the foreground target simplified as a rigid object. For relatively high-resolution targets, non-rigid models are applied. An active-contour-based algorithm is introduced that decomposes tracking into three parts: estimating the affine transform parameters between successive frames using particle filters; detecting contour deformation using a probabilistic deformation map; and regularizing the deformation by projecting the updated model onto a trained shape subspace. An active appearance Markov chain (AAMC) is then presented, which integrates a statistical model of shape, appearance, and motion. In the AAMC model, a Markov chain represents the switching of motion phases (poses), and several pairwise active appearance model (P-AAM) components characterize the shape, appearance, and motion information for the different motion phases. The second part of the thesis covers video mensuration, for which we have proposed a height-measuring algorithm with less human supervision, more flexibility, and improved robustness. From videos acquired by an uncalibrated stationary camera, we first recover the vanishing line and the vertical point of the scene. We then apply a single-view mensuration algorithm to each of the frames to obtain height measurements. Finally, we use the least median of squares (LMedS) as the cost function and the Robbins-Monro stochastic approximation (RMSA) technique to obtain the optimal estimate.
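
    The LMedS cost mentioned at the end is what makes the per-frame height estimates robust: instead of minimizing the mean of squared residuals, one minimizes their median, so up to half the frames can be outliers. A small sketch (hypothetical code; the thesis minimizes this cost with Robbins-Monro stochastic approximation rather than the exhaustive search shown here):

```python
import statistics

def lmeds_estimate(measurements, candidates=None):
    # Least-Median-of-Squares: pick the candidate height whose squared
    # residuals to the per-frame measurements have the smallest median.
    if candidates is None:
        candidates = measurements
    def median_sq_residual(h):
        return statistics.median((m - h) ** 2 for m in measurements)
    return min(candidates, key=median_sq_residual)
```

    With a least-squares mean, a single gross measurement error would drag the estimate; the median cost simply ignores it.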

    Vision for Looking at Traffic Lights: Issues, Survey, and Perspectives


    Comparison of Forward Vehicle Detection Using Haar-like features and Histograms of Oriented Gradients (HOG) Technique for Feature Extraction in Cascade Classifier

    This paper presents the development of a vehicle detection algorithm using image processing techniques and compares the detection performance of two feature extractors. The main focus is to implement the vehicle detection system using an on-board camera installed on the host vehicle, which records the moving road environment, instead of a static camera fixed at certain locations. In this paper, a cascade classifier is trained with an image dataset of positive and negative images: the positive images consist of the rear area of a vehicle, and the negative images consist of road-scene background. Two feature extractors, Haar-like features and histograms of oriented gradients (HOG), are compared in this system. The image dataset used for training both feature extractors is fixed in dimension. The comparison studies accuracy and execution time based on detection performance. Both features performed well in detection accuracy, while the results indicate that Haar-like feature extraction is 26% faster than HOG.
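
    The execution-time side of such a comparison amounts to timing the same window set through each extractor. A generic benchmarking sketch (a hypothetical harness, not the paper's test setup), with the extractors stubbed out as plain functions:

```python
import time

def compare_extractors(windows, extractors):
    # Run each feature extractor over the same windows and record
    # per-extractor wall-clock time and output count.
    results = {}
    for name, extract in extractors.items():
        start = time.perf_counter()
        feats = [extract(w) for w in windows]
        results[name] = {"time": time.perf_counter() - start,
                         "n": len(feats)}
    return results
```

    Plugging in actual Haar-like and HOG extraction over identical windows is how a timing gap like the 26% reported above would be measured.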

    Vehicular Instrumentation and Data Processing for the Study of Driver Intent

    The primary goal of this thesis is to provide the processed experimental data needed to determine whether driver intentionality and driving-related actions can be predicted from quantitative and qualitative analysis of driver behaviour. Towards this end, an instrumented experimental vehicle was designed and developed, capable of recording several synchronized streams of data in a naturalistic driving environment: the surroundings of the vehicle, the driver's gaze with head pose, and the vehicle state. Several driving data sequences in both urban and rural environments were recorded with the instrumented vehicle. These sequences were automatically annotated for relevant artifacts such as lanes, vehicles, and safely driveable areas within road lanes. A framework and associated algorithms required for cross-calibrating the gaze tracking system with the world coordinate system mounted on the outdoor stereo system were also designed and implemented, allowing the driver's gaze to be mapped onto the surrounding environment. This instrumentation is currently being used for the study of driver intent, geared towards the development of driver maneuver prediction models.

    Multi-camera face detection and recognition applied to people tracking

    This thesis describes the design and implementation of a framework that can track and identify multiple people in a crowded scene captured by multiple cameras. A people detector is initially employed to estimate the positions of individuals. These position estimates are used by the face detector to prune the search space of possible face locations and minimize false positives. A face classifier is employed to assign identities to the trajectories. Apart from recognizing the people in the scene, the face information is exploited by the tracker to minimize identity switches; only sparse face recognitions are required to generate identity-preserving trajectories. Three face detectors are evaluated against the project requirements. The face model of a person is described by Local Binary Pattern (LBP) histogram features extracted from a number of patches of the face, captured by different cameras. The face model is shared between cameras, meaning that one camera can recognize a face relying on patches captured by a different camera. Three classifiers are tested for the recognition task, and an SVM is eventually employed. Due to the properties of the LBP, recognition is robust to illumination changes and facial expressions. The SVM is also trained from multiple views of each person's face, making the recognition robust to pose changes as well. The system is integrated with two trackers, the state-of-the-art Multi-Commodity Network Flow tracker and a frame-by-frame Kalman tracker. We validate our method on two datasets generated for this purpose. The integration of face information with the people tracker demonstrates excellent performance and significantly improves tracking results on crowded scenes, while providing the identities of the people in the scene.
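
    The LBP features underpinning the face model are simple to compute: each interior pixel is encoded by thresholding its eight neighbours against it, and a patch is summarized by the histogram of those codes. A basic sketch in Python with NumPy (the plain 8-neighbour variant; the thesis may use a different radius or the uniform-pattern variant):

```python
import numpy as np

def lbp_histogram(patch):
    # Basic 8-neighbour Local Binary Pattern over interior pixels,
    # returned as a normalised 256-bin histogram.
    codes = []
    for y in range(1, patch.shape[0] - 1):
        for x in range(1, patch.shape[1] - 1):
            center = patch[y, x]
            neighbours = [patch[y-1, x-1], patch[y-1, x], patch[y-1, x+1],
                          patch[y,   x+1], patch[y+1, x+1], patch[y+1, x],
                          patch[y+1, x-1], patch[y,   x-1]]
            code = 0
            for bit, n in enumerate(neighbours):
                if n >= center:
                    code |= 1 << bit
            codes.append(code)
    hist, _ = np.histogram(codes, bins=256, range=(0, 256))
    return hist / max(len(codes), 1)
```

    Because the codes depend only on the relative ordering of intensities, the histogram is largely invariant to monotonic illumination changes, which is the robustness property noted above.
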