
    An Approach Of Features Extraction And Heatmaps Generation Based Upon CNNs And 3D Object Models

    The rapid advancements in artificial intelligence have enabled recent progress in self-driving vehicles. However, the dependence on 3D object models and their annotations, collected and owned by individual companies, has become a major obstacle to the development of new algorithms. This thesis proposes an approach that directly uses graphics models created from open-source datasets as the virtual representation of real-world objects. The approach uses machine learning techniques to extract 3D feature points and to create annotations from graphics models for the recognition of dynamic objects, such as cars, and for the verification of stationary and variable objects, such as buildings and trees. Moreover, it generates heat maps for eliminating stationary and variable objects from real-time images before the recognition of dynamic objects. The proposed approach helps to bridge the gap between the virtual and physical worlds and to facilitate the development of new algorithms for self-driving vehicles.
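
    The heat-map masking step can be illustrated with a short sketch. The following is a minimal, hypothetical example (the camera intrinsics, pose and 3D feature points are made up, and the Gaussian heat map is only one plausible formulation, not the thesis implementation): it projects pre-extracted 3D feature points of a stationary object into the image and accumulates a heat map that could be used to mask those regions before dynamic-object recognition.

        # Minimal sketch with assumed intrinsics, pose and model points.
        import numpy as np

        def project_points(points_3d, R, t, K):
            """Project Nx3 world points into pixel coordinates with a pinhole model."""
            cam = R @ points_3d.T + t[:, None]        # 3xN points in the camera frame
            uv = K @ cam
            return (uv[:2] / uv[2]).T                 # Nx2 pixel coordinates

        def heatmap_from_points(points_2d, shape, sigma=15.0):
            """Accumulate an isotropic Gaussian at each projected feature point."""
            h, w = shape
            ys, xs = np.mgrid[0:h, 0:w]
            heat = np.zeros(shape, dtype=np.float32)
            for u, v in points_2d:
                heat += np.exp(-((xs - u) ** 2 + (ys - v) ** 2) / (2 * sigma ** 2))
            return heat / heat.max() if heat.max() > 0 else heat

        # Made-up example: three feature points of a stationary "building" model.
        K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
        R, t = np.eye(3), np.array([0.0, 0.0, 5.0])
        model_points = np.array([[0.0, 0.0, 0.0], [1.0, 0.5, 0.2], [-1.0, 0.3, 0.1]])
        mask = heatmap_from_points(project_points(model_points, R, t, K), (480, 640))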

    Classification of Test Documents Based on Handwritten Student ID's Characteristics

    The bag of words (BoW) model is an efficient image representation technique for image categorization and annotation tasks. Building good feature vocabularies from automatically extracted image feature vectors produces discriminative feature words, which can improve the accuracy of image categorization tasks. In this paper we use feature vocabularies based on biometric characteristics for identification of handwritten student IDs and for classification of students' papers and various exam documents used at the University of Mostar. We describe an experiment in which OpenCV was used as the image processing and feature extraction tool. For classification, a neural network for recognition of handwritten digits (the student ID) was used. The proposed method was tested on the MNIST test database and achieved a recognition rate of 94.76%. The model was also tested on digits extracted from handwritten student exams, where an accuracy of 82% was achieved (92% of digits correctly classified).
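
    As a rough illustration of the digit-recognition step, the sketch below segments digit candidates from a scanned exam image with OpenCV and normalises them to the 28x28 MNIST input format. The file name "exam.png", the thresholds and the pre-trained classifier `model` are assumptions for illustration and do not reproduce the paper's exact pipeline.

        # Hypothetical digit-extraction step; "exam.png" and `model` are assumed.
        import cv2
        import numpy as np

        def extract_digit_rois(image_path, size=28):
            img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
            # Binarise with Otsu (inverted so ink becomes foreground) and find regions.
            _, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
            contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
            rois = []
            for c in sorted(contours, key=lambda c: cv2.boundingRect(c)[0]):  # left to right
                x, y, w, h = cv2.boundingRect(c)
                if w * h < 50:                        # skip small specks of noise
                    continue
                digit = cv2.resize(binary[y:y + h, x:x + w], (size, size))
                rois.append(digit.astype(np.float32) / 255.0)
            return np.array(rois)

        # digits = extract_digit_rois("exam.png")
        # student_id = "".join(str(d) for d in model.predict(digits.reshape(len(digits), -1)))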

    Biometric Person Identification Using Near-infrared Hand-dorsa Vein Images

    Biometric recognition is becoming more and more important with the increasing demand for security, and more usable with the improvement of computer vision and pattern recognition technologies. Hand vein patterns have been recognised as a good biometric measure for personal identification due to many excellent characteristics, such as uniqueness and stability, as well as difficulty to copy or forge. This thesis covers all the research and development aspects of a biometric person identification system based on near-infrared hand-dorsa vein images. Firstly, the design and realisation of an optimised vein image capture device is presented. In order to maximise the quality of the captured images at relatively low cost, the infrared illumination and imaging theory are discussed. A database containing 2040 images from 102 individuals, captured with this device, is then introduced. Secondly, image analysis and the customised image pre-processing methods are discussed. The consistency of the database images is evaluated using mean squared error (MSE) and peak signal-to-noise ratio (PSNR). Geometrical pre-processing, including shearing correction and region of interest (ROI) extraction, is introduced to improve image consistency. Image noise is evaluated using total variance (TV) values. Grey-level pre-processing, including grey-level normalisation, filtering and adaptive histogram equalisation, is applied to enhance the vein patterns. Thirdly, a gradient-based image segmentation algorithm is compared with popular algorithms from the literature, such as the Niblack and Threshold Image algorithms, to demonstrate its effectiveness in vein pattern extraction. Post-processing methods including morphological filtering and thinning are also presented. Fourthly, feature extraction and recognition methods are investigated, with several new approaches based on keypoints and local binary patterns (LBP) proposed. Through comprehensive comparison with other approaches based on structure and texture features, and performance evaluation on the database of 2040 images, the proposed approach based on multi-scale partition LBP is shown to provide the best recognition performance, with an identification rate of nearly 99%. Finally, the whole hand-dorsa vein identification system is presented, with a user interface for administration of user information and for person identification.
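
    To give an idea of what a multi-scale partition LBP descriptor looks like, here is a minimal sketch: the ROI is partitioned into blocks at several scales, a uniform LBP histogram is computed per block, and the histograms are concatenated. The 2x2/4x4 partitions, the LBP parameters and the nearest-neighbour matching shown in the comment are assumptions for illustration; the thesis parameters and matching metric may differ.

        # Sketch of a multi-scale, partition-based LBP descriptor (assumed parameters).
        import numpy as np
        from skimage.feature import local_binary_pattern

        def partition_lbp_descriptor(roi, partitions=(2, 4), P=8, R=1):
            lbp = local_binary_pattern(roi, P, R, method="uniform")
            n_bins = P + 2                                   # uniform patterns + one "non-uniform" bin
            features = []
            for parts in partitions:
                h, w = roi.shape[0] // parts, roi.shape[1] // parts
                for i in range(parts):
                    for j in range(parts):
                        block = lbp[i * h:(i + 1) * h, j * w:(j + 1) * w]
                        hist, _ = np.histogram(block, bins=n_bins, range=(0, n_bins))
                        features.append(hist / max(hist.sum(), 1))   # normalised block histogram
            return np.concatenate(features)

        # Identification: nearest enrolled template, e.g. by Euclidean or chi-square distance.
        # query = partition_lbp_descriptor(vein_roi)
        # identity = min(gallery, key=lambda g: np.linalg.norm(query - g.descriptor))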

    Integration of LiDAR and photogrammetric data for enhanced aerial triangulation and camera calibration

    The integration of complementary airborne light detection and ranging (LiDAR) and photogrammetric data continues to receive attention from the relevant research communities. Such an approach requires the optimised registration of the two data types within a common coordinate reference frame and thus enables the cross-calibration of one information source against another. This research assumes airborne LiDAR as a reference dataset against which in-flight camera system calibration and validation can be performed. The novel methodology involves the production of dense photogrammetric point clouds derived using the simultaneous adjustment of GNSS/IMU data and a dense set of photogrammetric tie points. The quality of the generated photogrammetric dataset is further improved by introducing self-calibration additional parameters in the combined adjustment. A robust least squares surface matching algorithm is then used to minimise the Euclidean distances between the two datasets. After successful matching, well-distributed LiDAR-derived control points (LCPs) are automatically identified and extracted. Adjustment of the photogrammetric data is then repeated using the extracted LCPs in a self-calibrating bundle adjustment. The methodology was tested using two datasets acquired with different photogrammetric digital sensor systems: a Microsoft UltraCamX large format camera and an Applanix DSS322 medium format camera. Systematic sensitivity testing included the influence of the number and weighting of LCPs required to achieve an optimised adjustment. For the UltraCamX block it was found that when the number of control points exceeded 80, the accuracy of the adjustment stabilised at c. 2 cm in all axes, regardless of point weighting. Results were also compared with those from a reference calibration using surveyed ground control points in the test area, with good agreement found between the two. Similar results were obtained for the DSS322 block, with block accuracy stabilising at 100 LCPs. Moreover, for the DSS322 camera, introducing self-calibration greatly improved the accuracy of the aerial triangulation.
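
    The registration idea can be sketched as iteratively minimising point-to-nearest-point Euclidean distances between the photogrammetric cloud and the LiDAR reference, i.e. a basic rigid ICP. This simplified point-based variant, with assumed iteration settings, is only a stand-in for the robust least squares surface matching algorithm used in the research.

        # Simplified rigid alignment of a photogrammetric cloud to a LiDAR reference.
        import numpy as np
        from scipy.spatial import cKDTree

        def best_rigid_transform(src, dst):
            """Least squares rotation R and translation t mapping src onto dst (Kabsch)."""
            c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
            U, _, Vt = np.linalg.svd((src - c_src).T @ (dst - c_dst))
            R = Vt.T @ U.T
            if np.linalg.det(R) < 0:                 # avoid reflections
                Vt[-1] *= -1
                R = Vt.T @ U.T
            return R, c_dst - R @ c_src

        def icp(photogrammetric, lidar, iterations=30):
            tree = cKDTree(lidar)
            moved = photogrammetric.copy()
            for _ in range(iterations):
                _, idx = tree.query(moved)           # nearest LiDAR point for each point
                R, t = best_rigid_transform(moved, lidar[idx])
                moved = moved @ R.T + t
            return moved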

    Efficient 3D Segmentation, Registration and Mapping for Mobile Robots

    Sometimes simple is better! For certain situations and tasks, simple but robust methods can achieve the same or better results in the same or less time than related sophisticated approaches. In the context of robots operating in real-world environments, key challenges are perceiving objects of interest and obstacles as well as building maps of the environment and localizing therein. The goal of this thesis is to carefully analyze such problem formulations, to deduce valid assumptions and simplifications, and to develop simple solutions that are both robust and fast. All approaches make use of sensors capturing 3D information, such as consumer RGBD cameras. Comparative evaluations show the performance of the developed approaches. For identifying objects and regions of interest in manipulation tasks, a real-time object segmentation pipeline is proposed. It exploits several common assumptions of manipulation tasks, such as objects resting on horizontal support surfaces and being well separated. It achieves real-time performance by using particularly efficient approximations in the individual processing steps, subsampling the input data where possible, and processing only relevant subsets of the data. The resulting pipeline segments 3D input data at up to 30 Hz. In order to obtain complete segmentations of the 3D input data, a second pipeline is proposed that approximates the sampled surface, smooths the underlying data, and segments the smoothed surface into coherent regions belonging to the same geometric primitive. It uses different primitive models and can reliably segment input data into planes, cylinders and spheres. A thorough comparative evaluation shows state-of-the-art performance while computing such segmentations in near real-time. The second part of the thesis addresses the registration of 3D input data, i.e., consistently aligning input captured from different view poses. Several methods are presented for different types of input data. For the particular application of mapping with micro aerial vehicles, where the 3D input data is particularly sparse, a pipeline is proposed that uses the same approximate surface reconstruction to exploit the measurement topology, together with a surface-to-surface registration algorithm that robustly aligns the data. Optimization of the resulting graph of determined view poses then yields globally consistent 3D maps. For sequences of RGBD data, this pipeline is extended to include additional subsampling steps and an initial alignment of the data in local windows of the pose graph. In both cases, comparative evaluations show a robust and fast alignment of the input data.
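
    The "objects on a horizontal support surface" assumption can be sketched as follows, using Open3D (an assumption, as are the thresholds; the thesis pipeline uses its own, more efficient approximations): a dominant plane is removed with RANSAC and the remaining points are clustered into object candidates.

        # Illustrative tabletop segmentation of an RGBD-derived point cloud.
        import numpy as np
        import open3d as o3d

        def segment_tabletop_objects(pcd, plane_dist=0.01, cluster_eps=0.02):
            # Fit the dominant (support) plane with RANSAC and remove its inliers.
            _, inliers = pcd.segment_plane(distance_threshold=plane_dist,
                                           ransac_n=3, num_iterations=500)
            objects = pcd.select_by_index(inliers, invert=True)
            # Cluster the remaining points; each non-negative label is one object candidate.
            labels = np.array(objects.cluster_dbscan(eps=cluster_eps, min_points=20))
            if labels.size == 0:
                return []
            return [objects.select_by_index(np.where(labels == k)[0])
                    for k in range(labels.max() + 1)]

        # pcd = o3d.io.read_point_cloud("frame.pcd")
        # candidates = segment_tabletop_objects(pcd)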

    Real-Time Multi-Fisheye Camera Self-Localization and Egomotion Estimation in Complex Indoor Environments

    In this work, a real-time capable multi-fisheye camera self-localization and egomotion estimation framework is developed. The thesis covers all aspects, ranging from omnidirectional camera calibration to the development of a complete multi-fisheye camera SLAM system based on a generic multi-camera bundle adjustment method.
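
    As background for the fisheye geometry involved, the sketch below shows an equidistant fisheye projection, the building block of a reprojection residual in a multi-camera bundle adjustment. The focal length and principal point are made-up values, and the thesis calibrates a more general omnidirectional camera model rather than this simple one.

        # Equidistant fisheye projection sketch with assumed intrinsics.
        import numpy as np

        def project_equidistant(point_cam, f=300.0, cx=640.0, cy=480.0):
            """Map a 3D point in camera coordinates to fisheye pixel coordinates."""
            x, y, z = point_cam
            theta = np.arctan2(np.hypot(x, y), z)        # angle from the optical axis
            phi = np.arctan2(y, x)                       # azimuth in the image plane
            r = f * theta                                # equidistant model: r = f * theta
            return np.array([cx + r * np.cos(phi), cy + r * np.sin(phi)])

        # Reprojection residual for one observation, as used in a bundle adjustment:
        # residual = observed_pixel - project_equidistant(R @ world_point + t)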

    Analyzing Handwritten and Transcribed Symbols in Disparate Corpora

    Cuneiform tablets are among the oldest textual artifacts, used for more than three millennia, and are comparable in amount and relevance to texts written in Latin or ancient Greek. These tablets are typically found in the Middle East and were written by imprinting wedge-shaped impressions into wet clay. Motivated by the increased demand for computerized analysis of documents within the Digital Humanities, we develop the foundation for quantitative processing of cuneiform script. A 3D scan of a cuneiform tablet and a manually created line tracing are two completely different representations of the same type of text source. Each representation is typically processed with its own tool-set, and textual analysis is therefore limited to a certain type of digital representation. To homogenize these data sources, a unifying minimal wedge feature description is introduced. It is extracted by pattern matching and subsequent conflict resolution, as cuneiform is written densely with highly overlapping wedges. Similarity metrics for cuneiform signs based on distinct assumptions are presented: (i) an implicit model represents cuneiform signs as undirected mathematical graphs and measures the similarity of signs with graph kernels; (ii) an explicit model approaches the problem of recognition through an optimal assignment between the wedge configurations of two signs. Further, methods for spotting cuneiform script are developed, combining the feature descriptors for cuneiform wedges with prior work on segmentation-free word spotting using part-structured models. The ink-ball model is adapted by treating wedge feature descriptors as individual parts. The similarity metrics and the adapted spotting model are both evaluated on a real-world dataset, outperforming the state of the art in cuneiform sign similarity and spotting. To prove the applicability of these methods for computational cuneiform analysis, a novel approach is presented for mining frequent constellations of wedges, resulting in spatial n-grams. Furthermore, a method for automated transliteration of tablets is evaluated by employing structured and sequential learning on a dataset of parallel sentences. Finally, the conclusion outlines how the presented methods enable the development of new, objective and reproducible tools and computational analyses for quantitative processing of cuneiform script.
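
    The explicit model can be illustrated with a small sketch of an optimal assignment between the wedge configurations of two signs. The wedge features used here (2D position plus orientation), the cost function and the penalty for unmatched wedges are assumptions for illustration and differ from the feature description used in the thesis.

        # Sign similarity via optimal assignment of wedges (illustrative features and cost).
        import numpy as np
        from scipy.optimize import linear_sum_assignment

        def sign_distance(wedges_a, wedges_b, w_angle=0.5):
            """Each wedge is (x, y, orientation in radians); lower distance = more similar."""
            d_pos = np.linalg.norm(wedges_a[:, None, :2] - wedges_b[None, :, :2], axis=-1)
            d_ang = np.abs(wedges_a[:, None, 2] - wedges_b[None, :, 2])
            cost = d_pos + w_angle * d_ang
            rows, cols = linear_sum_assignment(cost)      # optimal one-to-one matching
            # Crude penalty for wedges left unmatched when the signs differ in wedge count.
            unmatched = abs(len(wedges_a) - len(wedges_b))
            return cost[rows, cols].sum() + unmatched * cost.mean()

        sign1 = np.array([[0.0, 0.0, 0.0], [1.0, 0.5, 1.2]])
        sign2 = np.array([[0.1, 0.0, 0.1], [1.1, 0.4, 1.0], [2.0, 2.0, 0.5]])
        print(sign_distance(sign1, sign2))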