
    Automatic face recognition using stereo images

    Face recognition is an important pattern recognition problem in the study of both natural and artificial learning. Compared to other biometrics, it is non-intrusive, non-invasive and requires no participation from the subjects. As a result, it has many applications, ranging from human-computer interaction and access control to law enforcement and crowd surveillance. In typical optical-image-based face recognition systems, the systematic variability arising from representing the three-dimensional (3D) shape of a face by a two-dimensional (2D) illumination intensity matrix is treated as random variability. Multiple examples of the face displaying varying pose and expressions are captured under different imaging conditions. The imaging environment, pose and expressions are strictly controlled, and the images undergo rigorous normalisation and pre-processing in a partially or fully automated system. Although these systems report high classification accuracies (>90%), they lack versatility and tend to fail when deployed outside laboratory conditions. Recently, more sophisticated 3D face recognition systems harnessing depth information have emerged. These systems usually employ specialist equipment such as laser scanners and structured-light projectors. Although more accurate than 2D optical-image-based recognition, these systems are equally difficult to deploy in a non-co-operative environment. Existing face recognition systems, both 2D and 3D, detract from the main advantages of face recognition and fail to fully exploit its non-intrusive capacity, either because they rely too heavily on subject co-operation, which is not always available, or because they cannot cope with noisy data. The main objective of this work was to investigate the role of depth information in face recognition in a noisy environment. A stereo-based system, inspired by human binocular vision, was devised using a pair of manually calibrated off-the-shelf digital cameras in a stereo setup to compute depth information. Depth values extracted from 2D intensity images using stereoscopy are extremely noisy, and as a result this approach to face recognition is rare; this was confirmed by the results of our experimental work. Noise in the set of correspondences, camera calibration and triangulation led to inaccurate depth reconstruction, which in turn led to poor classifier accuracy for both 3D surface matching and 2½D depth maps. Recognition experiments were performed on the Sheffield Dataset, consisting of 692 images of 22 individuals with varying pose, illumination and expressions.
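
    As a rough illustration of the depth-from-disparity step this abstract describes, the sketch below uses OpenCV's semi-global block matcher on a rectified stereo pair and converts disparity to depth via the standard triangulation relation Z = fB/d. The file names, focal length and baseline are assumptions made for illustration; the thesis relies on manual calibration and its own correspondence and triangulation pipeline, which this off-the-shelf sketch does not reproduce.

```python
# Hedged sketch: depth from a calibrated, rectified stereo pair.
# Focal length f (pixels), baseline B (metres) and file names are
# illustrative assumptions, not values from the thesis.
import cv2
import numpy as np

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Semi-global block matching; numDisparities must be a multiple of 16.
matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=7)
disparity = matcher.compute(left, right).astype(np.float32) / 16.0  # fixed-point -> pixels

f, B = 700.0, 0.12  # assumed focal length (px) and baseline (m)
depth = np.zeros_like(disparity)
valid = disparity > 0  # unmatched pixels carry disparity <= 0
depth[valid] = f * B / disparity[valid]  # triangulation: Z = f * B / d
```

    As the abstract notes, disparities from real correspondences are noisy, and because depth error grows quadratically with distance for a fixed disparity error, stereo-derived depth maps make for fragile face recognition features.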

    SilNet : Single- and Multi-View Reconstruction by Learning from Silhouettes

    The objective of this paper is 3D shape understanding from single and multiple images. To this end, we introduce a new deep-learning architecture and loss function, SilNet, that can handle multiple views in an order-agnostic manner. The architecture is fully convolutional, and for training we use a proxy task of silhouette prediction, rather than directly learning a mapping from 2D images to 3D shape as has been the target in most recent work. We demonstrate that with the SilNet architecture there is generalisation over the number of views -- for example, SilNet trained on 2 views can be used with 3 or 4 views at test time; and performance improves with more views. We introduce two new synthetic datasets: a blobby object dataset useful for pre-training, and a challenging and realistic sculpture dataset; and demonstrate on these datasets that SilNet has indeed learnt 3D shape. Finally, we show that SilNet exceeds the state of the art on the ShapeNet benchmark dataset, and use SilNet to generate novel views of the sculpture dataset. Comment: BMVC 2017; Best Poster.
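
    The key architectural idea described above, order-agnostic handling of a variable number of views, can be achieved by encoding each view with a shared network and fusing the per-view features with a symmetric (permutation-invariant) pooling operation. The PyTorch sketch below illustrates only that idea; it is not the published SilNet architecture, and all layer sizes are invented for illustration.

```python
# Illustrative order-agnostic multi-view network (NOT the SilNet of the
# paper): a shared per-view encoder, element-wise max over views, and a
# decoder trained on the silhouette-prediction proxy task.
import torch
import torch.nn as nn

class ViewPoolNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(           # weights shared across views
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(           # predicts silhouette logits
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
        )

    def forward(self, views):                   # views: (B, n_views, 3, H, W)
        b, n, c, h, w = views.shape
        feats = self.encoder(views.reshape(b * n, c, h, w))
        feats = feats.reshape(b, n, *feats.shape[1:])
        fused = feats.max(dim=1).values         # symmetric pooling over views
        return self.decoder(fused)              # same output for any view order

# Because the pooling is element-wise, a model trained with 2 views
# accepts 3 or 4 views at test time, mirroring the generalisation
# property reported in the abstract.
```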

    Recognizing Large Isolated 3-D Objects Through Next View Planning Using Inner Camera Invariants

    Automatic visual recognition using parallel machines

    Invariant features and quick matching algorithms are two major concerns in the area of automatic visual recognition. The former reduces the size of an established model database, and the latter shortens the computation time. This dissertation discusses both line invariants under perspective projection and a parallel implementation of a dynamic programming technique for shape recognition. The feasibility of using parallel machines is demonstrated through the dramatically reduced time complexity. In this dissertation, our algorithms are implemented on the AP1000 MIMD parallel machine. For processing an object with n features, the time complexity of the proposed parallel algorithm is O(n), while that of a uniprocessor is O(n²). Two applications, one for shape matching and the other for chain-code extraction, demonstrate the usefulness of our methods. Invariants from four general lines under perspective projection are also discussed. In contrast to approaches based on epipolar geometry, we investigate the invariants under isotropy subgroups. Theoretically, two independent invariants can be found for four general lines in 3D space. In practice, we show how to obtain these two invariants from the projective images of four general lines without the need for camera calibration. A projective-invariant recognition system based on a hypothesis-generation-testing scheme is implemented on the hypercube parallel architecture: object recognition is achieved by matching scene projective invariants to model projective invariants, a process called transfer, and the resulting hypotheses are then generated and tested in parallel.
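
    As a small illustration of one of the two applications named above, the sketch below extracts a Freeman 8-direction chain code from a binary shape by tracing its boundary with OpenCV and differencing successive boundary points. It is a serial sketch only; the dissertation's dynamic programming matcher and its AP1000 parallel implementation are not reproduced here.

```python
# Hedged sketch: Freeman chain-code extraction from a binary image.
# Codes 0..7 label the eight unit steps (dx, dy), with x right, y down.
import cv2
import numpy as np

STEP_TO_CODE = {(1, 0): 0, (1, -1): 1, (0, -1): 2, (-1, -1): 3,
                (-1, 0): 4, (-1, 1): 5, (0, 1): 6, (1, 1): 7}

def chain_code(binary):
    """binary: HxW array, nonzero for foreground; returns a list of codes."""
    contours, _ = cv2.findContours(binary.astype(np.uint8),
                                   cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    pts = contours[0][:, 0, :]            # ordered boundary points as (x, y)
    nxt = np.roll(pts, -1, axis=0)        # successor of each point (wraps around)
    return [STEP_TO_CODE[tuple(q - p)] for p, q in zip(pts, nxt)]
```

    The chain code is translation-invariant, and differencing successive codes modulo 8 additionally yields rotation invariance, which is why such codes are a common front end for shape matching.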

    Scalable and adaptable tracking of humans in multiple camera systems

    The aim of this thesis is to track objects on a network of cameras, both within (intra) and across (inter) cameras. The algorithms must be adaptable to change and are learnt in a scalable approach. Uncalibrated, spatially separated cameras are used, and therefore tracking must be able to cope with object occlusions, illumination changes, and gaps between cameras.

    Adaptive Vision Based Scene Registration for Outdoor Augmented Reality

    Augmented Reality (AR) involves adding virtual content into real scenes. Scenes are viewed using a Head-Mounted Display or other display type. In order to place content into the user's view of a scene, the user's position and orientation relative to the scene, commonly referred to as their pose, must be determined accurately. This allows the objects to be placed in the correct positions and to remain there when the user moves or the scene changes. It is achieved by tracking the user in relation to their environment using a variety of technologies. One technology which has proven to provide accurate results is computer vision. Computer vision involves a computer analysing images and achieving an understanding of them. This may be locating objects such as faces in the images, or, in the case of AR, determining the pose of the user. One of the ultimate goals of AR systems is to be capable of operating under any condition. For example, a computer vision system must be robust across a range of different scene types, and under unpredictable environmental conditions due to variable illumination and weather. The majority of existing literature tests algorithms under the assumption of ideal or 'normal' imaging conditions. To ensure robustness under as many circumstances as possible, it is also important to evaluate the systems under adverse conditions. This thesis seeks to analyse the effects that variable illumination has on computer vision algorithms. To enable this analysis, test data is required that isolates weather and illumination effects, without other factors such as changes in viewpoint that would bias the results. A new dataset is presented which also allows controlled viewpoint differences in the presence of weather and illumination changes. This is achieved by capturing video from a camera undergoing a repeatable motion sequence. Ground truth data is stored per frame, allowing images from the same position under differing environmental conditions to be easily extracted from the videos. An in-depth analysis of six detection algorithms and five matching techniques demonstrates the impact that non-uniform illumination changes can have on vision algorithms. Specifically, shadows can degrade performance, reduce confidence in the system, decrease reliability, or even completely prevent successful operation. An investigation into approaches to improve performance yields techniques that can help reduce the impact of shadows. A novel algorithm is presented that merges reference data captured at different times, resulting in reference data with minimal shadow effects. This can significantly improve performance and reliability when operating on images containing shadow effects. These advances improve the robustness of computer vision systems and extend the range of conditions in which they can operate, increasing the usefulness of the algorithms and the AR systems that employ them.
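
    The reference-merging idea described above can be conveyed with a deliberately simple stand-in: if several captures of the same viewpoint are already aligned (the dataset's repeatable camera motion makes this possible), a per-pixel median across captures suppresses shadows that appear in only a minority of them. The thesis's actual merging algorithm is novel and is not reproduced here; this sketch conveys only the intuition.

```python
# Hedged sketch of shadow-resistant reference building: a per-pixel
# median over aligned captures of the same view. Transient shadows
# present in fewer than half of the captures are voted out.
import numpy as np

def merge_references(aligned_images):
    """aligned_images: list of HxWx3 uint8 arrays of the same viewpoint."""
    stack = np.stack(aligned_images).astype(np.float32)
    return np.median(stack, axis=0).astype(np.uint8)
```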

    Hierarchical Classification of Scientific Taxonomies with Autonomous Underwater Vehicles

    Autonomous Underwater Vehicles (AUVs) have catalysed a significant shift in the way marine habitats are studied. It is now possible to deploy an AUV from a ship, and capture tens of thousands of georeferenced images in a matter of hours. There is a growing body of research investigating ways to automatically apply semantic labels to this data, with two goals. First, the task of manually labelling a large number of images is time-consuming and error-prone. Second, there is the potential to change AUV surveys from being geographically defined (based on a pre-planned route) to permitting the AUV to adapt the mission plan in response to semantic observations. This thesis focusses on frameworks that permit a unified machine learning approach with applicability to a wide range of geographic areas and to the diverse interests of marine scientists. This can be addressed through hierarchical classification, in which machine learning algorithms are trained to predict not just a binary or multi-class outcome, but a hierarchy of related, non-mutually-exclusive output labels, such as a scientific taxonomy. In order to investigate classification on larger hierarchies with greater geographic diversity, the BENTHOZ-2015 data set was assembled as part of a collaboration between five Australian research groups. Existing labelled data was re-mapped to the CATAMI hierarchy, in total more than 400,000 point labels conforming to a hierarchy of around 150 classes. The common hierarchical classification approach of building a network of binary classifiers was applied to the BENTHOZ-2015 data set, and a novel application of Bayesian Network theory and probability calibration was used as a theoretical foundation for the approach, resulting in improved classifier performance. This was extended to a more complex hidden-node Bayesian Network structure, which permits the inclusion of additional sensor modalities and tuning for better performance in particular geographic regions.
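
    A common way to realise the "network of binary classifiers" approach mentioned above is to train one binary classifier per taxonomy node on the samples where its parent label applies, then obtain a node's marginal probability as the product of conditional probabilities along its path to the root. The sketch below illustrates that scheme with a toy three-node taxonomy; the class names and the classifier choice are stand-ins, not the thesis's actual CATAMI models, and the calibration step is omitted.

```python
# Hedged sketch: hierarchical classification with one binary classifier
# per taxonomy node. Assumes every node has both positive and negative
# training examples; the taxonomy and classifier are illustrative only.
from sklearn.linear_model import LogisticRegression

TAXONOMY = {"biota": None, "sponges": "biota", "corals": "biota"}  # child -> parent

def train_node_classifiers(X, labels):
    """labels[i] is the set of taxonomy nodes that apply to sample i."""
    clfs = {}
    for node, parent in TAXONOMY.items():
        # Estimate P(node | parent) using only samples where the parent applies.
        idx = [i for i, lab in enumerate(labels) if parent is None or parent in lab]
        y = [node in labels[i] for i in idx]
        clfs[node] = LogisticRegression().fit([X[i] for i in idx], y)
    return clfs

def predict_marginals(clfs, x):
    """P(node) = product of P(n | parent(n)) along the path to the root."""
    probs = {}
    for node in TAXONOMY:
        p, n = 1.0, node
        while n is not None:
            p *= clfs[n].predict_proba([x])[0, 1]
            n = TAXONOMY[n]
        probs[node] = p
    return probs
```

    Multiplying conditionals in this way guarantees that a child's predicted probability never exceeds its parent's, so the output always respects the hierarchy; this is the consistency property that the abstract's Bayesian Network formulation makes rigorous.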