44 research outputs found

    Exploiting Cross Domain Relationships for Target Recognition

    Cross domain recognition extracts knowledge from one domain to recognize samples from another domain of interest. The key to solving problems under this umbrella is to find the latent connections between different domains. In this dissertation, three different cross domain recognition problems are studied by explicitly exploiting the relationships between domains according to the specific real-world problem. First, the problem of cross view action recognition is studied. The same action might seem quite different when observed from different viewpoints. Thus, the key point is how to use the training samples from a given camera view to perform recognition in a new view. In this work, reconstructable paths between different views are built to mirror labeled actions from a source view into a target view for learning an adaptable classifier. The path learning takes advantage of joint dictionary learning techniques and exploits hidden information in seemingly useless samples, making the recognition performance robust and effective. Second, the problem of person re-identification is studied, which tries to match pedestrian images in non-overlapping camera views based on appearance features. In this work, we propose to learn a random kernel forest to discriminatively assign a specific distance metric to each pair of local patches from the two images in matching. The forest is composed of multiple decision trees, which are designed to partition the overall space of local patch-pairs into substantial subspaces, where a simple but effective local metric kernel can be defined to minimize the distance of true matches. Third, the problem of multi-event detection and recognition in smart grid is studied. The signal of a multi-event might not be a straightforward combination of single-event signals because of the correlation among devices.
In this work, a concept of "root-pattern" is proposed that can be extracted from a collection of single-event signals and is also transferable to the analysis of the constituent components of multi-cascading-event signals based on an over-complete dictionary, which is designed according to the "root-patterns" with temporal information subtly embedded. The correctness and effectiveness of the proposed approaches have been evaluated by extensive experiments.

    A Vector Signal Processing Approach to Color

    Surface (Lambertian) color is a useful visual cue for analyzing material composition of scenes. This thesis adopts a signal processing approach to color vision. It represents color images as fields of 3D vectors, from which we extract region and boundary information. The first problem we face is one of secondary imaging effects that make image color different from surface color. We demonstrate a simple but effective polarization-based technique that corrects for these effects. We then propose a systematic approach to scalarizing color that allows us to augment classical image processing tools and concepts for multi-dimensional color signals.

    Computational Topology Methods for Shape Modelling Applications

    This thesis deals with computational topology, a recent branch of research that involves both mathematics and computer science, and tackles the problem of discretizing Morse theory for functions defined on a triangle mesh. The application context of Morse theory in general, and Reeb graphs in particular, is the analysis of geometric shapes and the extraction of skeletal structures that synthetically represent a shape, preserving its topological properties and main morphological characteristics. In Computer Graphics, shapes, that is, one-, two- or higher-dimensional connected, compact spaces having a visual appearance, are typically approximated by digital models. Since topology focuses on the qualitative properties of a space, such as its connectedness and how many and what type of holes it has, topology is the best tool to describe the shape of a mathematical model at a high level of abstraction. Geometry, conversely, is mainly related to the quantitative characteristics of a shape. Thus, the combination of topology and geometry creates a new generation of tools that provide a computational description of the most representative features of the shape along with their relationships. Extracting qualitative information, that is, information related to the semantics of the shape and its morphological structure, from discrete models is a central goal in shape modeling. In this thesis a conceptual model is proposed which represents a given surface based on topological coding that defines a sketch of the surface, discarding irrelevant details and classifying its topological type. The approach is based on Morse theory and Reeb graphs, which provide a very useful shape abstraction method for the analysis and structuring of the information contained in the geometry of the discrete shape model.
To fully develop the method, both theoretical and computational aspects have been considered, related to the definition and the extension of the Reeb graph to the discrete domain. For the definition and automatic construction of the conceptual model, a new method has been developed that analyzes and characterizes a triangle mesh with respect to the behavior of a real and at least continuous function defined on the mesh. The proposed solution also handles degenerate critical points, such as non-isolated critical points. To do that, the surface model is characterized using a contour-based strategy, recognizing critical areas instead of critical points and coding the evolution of the contour levels in a graph-like structure, named Extended Reeb Graph (ERG), which is a high-level abstract model suitable for representing and manipulating piece-wise linear surfaces. The descriptive power of the ERG has also been augmented by adding geometric information to the topological information, and the relation between the extracted topological and morphological features and the real characteristics of the surface has been studied, giving an evaluation of the dimension of the discarded details. Finally, the effectiveness of our description framework has been evaluated in several application contexts.
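
For context, the classical point-based classification that the contour-based strategy above is contrasted with can be sketched as follows. This is not the thesis's ERG construction: it is the standard test for a piecewise-linear function on a triangle mesh, which counts sign changes of the function around a vertex's ring of neighbours. The function name `classify_vertex` and its arguments are hypothetical, and the sketch assumes an interior vertex with no ties in function value, which is exactly the degenerate case the abstract says needs extra handling.

```python
def classify_vertex(f_v, ring_values):
    """Classify an interior mesh vertex from function values on its link.

    f_v: function value at the vertex.
    ring_values: function values at the neighbours, in cyclic order.
    Assumption: no neighbour has exactly the value f_v (no degenerate ties).
    """
    # +1 where the neighbour lies above the vertex, -1 where it lies below.
    signs = [1 if f > f_v else -1 for f in ring_values]
    # Count sign changes while walking once around the closed ring.
    changes = sum(1 for a, b in zip(signs, signs[1:] + signs[:1]) if a != b)
    if changes == 0:
        return "minimum" if signs[0] > 0 else "maximum"
    if changes == 2:
        return "regular"
    # Four or more changes: the level set through the vertex branches.
    return "saddle"
```

When neighbouring vertices share the value at the vertex, the sign test is undefined, which illustrates why a strategy based on critical areas rather than critical points is attractive.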

    Feature Extraction Methods for Character Recognition

    Not Include

    Scene Reconstruction from Multi-Scale Input Data

    Geometry acquisition of real-world objects by means of 3D scanning or stereo reconstruction constitutes a very important and challenging problem in computer vision. 3D scanners and stereo algorithms usually provide geometry from one viewpoint only, and several of these scans need to be merged into one consistent representation. Scanner data generally has lower noise levels than data from stereo methods, and the scanning scenario is more controlled. In image-based stereo approaches, the aim is to reconstruct the 3D surface of an object solely from multiple photos of the object. In many cases, the stereo geometry is contaminated with noise and outliers, and exhibits large variations in scale. Approaches that fuse such data into one consistent surface must be resilient to such imperfections. In this thesis, we take a closer look at geometry reconstruction using both scanner data and the more challenging image-based scene reconstruction approaches. In particular, this work focuses on the uncontrolled setting where the input images are not constrained, may be taken with different camera models, under different lighting and weather conditions, and from vastly different points of view. A typical dataset contains many views that observe the scene from an overview perspective, and relatively few views capture small details of the geometry. What results from these datasets are surface samples of the scene with vastly different resolution. As we will show in this thesis, the multi-resolution, or "multi-scale", nature of the input is a relevant aspect for surface reconstruction, which has rarely been considered in the literature so far. Integrating scale as additional information in the reconstruction process can make a substantial difference in surface quality. We develop and study two different approaches for surface reconstruction that are able to cope with the challenges resulting from uncontrolled images.
The first approach implements surface reconstruction by fusion of depth maps using a multi-scale hierarchical signed distance function. The hierarchical representation allows fusion of multi-resolution depth maps without mixing geometric information at incompatible scales, which preserves detail in high-resolution regions. An incomplete octree is constructed by incrementally adding triangulated depth maps to the hierarchy, which leads to scattered samples of the multi-resolution signed distance function. A continuous representation of the scattered data is defined by constructing a tetrahedral complex, and a final, highly-adaptive surface is extracted by applying the Marching Tetrahedra algorithm. A second, point-based approach is based on a more abstract, multi-scale implicit function defined as a sum of basis functions. Each input sample contributes a single basis function which is parameterized solely by the sample's attributes, effectively yielding a parameter-free method. Because the scale of each sample controls the size of the basis function, the method automatically adapts to data redundancy for noise reduction and is highly resilient to the quality-degrading effects of low-resolution samples, thus favoring high-resolution surfaces. Furthermore, we present a robust, image-based reconstruction system for surface modeling: MVE, the Multi-View Environment. The implementation provides all steps involved in the pipeline: Calibration and registration of the input images, dense geometry reconstruction by means of stereo, a surface reconstruction step and post-processing, such as remeshing and texturing. In contrast to other software solutions for image-based reconstruction, MVE handles large, uncontrolled, multi-scale datasets as well as input from more controlled capture scenarios. The reason lies in the particular choice of the multi-view stereo and surface reconstruction algorithms. 
The resulting surfaces are represented using a triangular mesh, which is a piecewise linear approximation to the real surface. The individual triangles are often so small that they barely contribute any geometric information and can be ill-shaped, which can cause numerical problems. A surface remeshing approach is introduced which changes the surface discretization such that more favorable triangles are created. It distributes the vertices of the mesh according to a density function, which is derived from the curvature of the geometry. Such a mesh is better suited for further processing and has reduced storage requirements. We thoroughly compare the developed methods against the state of the art and also perform a qualitative evaluation of the two surface reconstruction methods on a wide range of datasets with different properties. The usefulness of the remeshing approach is demonstrated on both scanner and multi-view stereo data.

    Minimizing Computational Resources for Deep Machine Learning: A Compression and Neural Architecture Search Perspective for Image Classification and Object Detection

    Computational resources represent a significant bottleneck across all current deep learning computer vision approaches. Image and video data storage requirements for training deep neural networks have led to the widespread use of image and video compression, which naturally impacts the performance of neural network architectures during both training and inference. The prevalence of deep neural networks deployed on edge devices necessitates efficient network architecture design, while training neural networks requires significant time and computational resources, despite the acceleration of both hardware and software developments within the field of artificial intelligence (AI). This thesis addresses these challenges in order to minimize computational resource requirements across the entire end-to-end deep learning pipeline. We determine the extent to which data compression impacts neural network architecture performance, and by how much this performance can be recovered by retraining neural networks with compressed data. The thesis then focuses on the accessibility of the deployment of neural architecture search (NAS) to facilitate automatic network architecture generation for image classification suited to resource-constrained environments. A combined hard example mining and curriculum learning strategy is developed to minimize the image data processed during a given training epoch within the NAS search phase, without diminishing performance. We demonstrate the capability of the proposed framework across gradient-based, reinforcement learning, and evolutionary NAS approaches, and present a simple but effective method to extend the approach to the prediction-based NAS paradigm. The hard example mining approach within the proposed NAS framework depends upon the effectiveness of an autoencoder to regulate the latent space such that similar images have similar feature embeddings.
This thesis conducts a thorough investigation to satisfy this constraint within the context of image classification. Based upon the success of the overall proposed NAS framework, we subsequently extend the approach towards object detection. Although the resultant multi-label domain presents a more difficult challenge for hard example mining, we propose an extension to the autoencoder to capture the additional object location information encoded within the training labels. The generation of an implicit attention layer within the autoencoder network sufficiently improves its capability to ensure that similar images have similar embeddings, thus successfully transferring the proposed NAS approach to object detection. Finally, the thesis demonstrates the resilience to compression of the general two-stage NAS approach upon which our proposed NAS framework is based.

    Mapping Fingerprints to Unique Numbers

    As automated fingerprint recognition systems gain popularity, the proliferation of information about unchangeable biometric characteristics causes serious privacy and security concerns. This information may enable an impostor to create a matching fingerprint, and the stored information should therefore be considered extremely sensitive. This thesis explores a novel method for generating cancellable fingerprint templates that will impede the reproduction of a fingerprint from the stored template, and at the same time allow the same fingerprint to be reused in the case of a compromise. During enrollment, the proposed method aligns the minutiae points of a fingerprint to a reference coordinate system using the core and principal direction, and creates a hash value based on the set of minutiae points. It then generates Reed-Solomon error correction codes which enable the reproduction of the full set of minutiae points if a certain number of minutiae points are known. It then performs an irreversible Cartesian block transformation on the minutiae points. During the matching process, the minutiae points of the candidate print are similarly aligned, and transformed using the same Cartesian transformation. A standard matching algorithm is performed on the minutiae sets in the transformed space, which allows the Cartesian transformation to be reversed for the matching minutiae points in the enrolled template. Using the Reed-Solomon error-correction codes generated during enrollment, the entire minutiae point set of the enrolled print can be recreated, provided enough minutiae points could be correctly reversed. Thus, a matching candidate fingerprint allows an otherwise irreversible transformation on the enrolled print to be reversed. The same hash value created for the fingerprint during enrollment can thus be re-generated when a matching fingerprint is presented. A proof-of-concept implementation of the method is presented and tested. 
Although the recognition accuracy of the proposed method was found to be inferior to comparable traditional fingerprint recognition methods, the method nonetheles
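
The Cartesian block transformation mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name `cartesian_block_transform`, the cell size, the grid size, and the use of a SHA-256-seeded pseudo-random cell mapping are all assumptions made for illustration. The idea shown is only the general one: the aligned coordinate plane is cut into cells, and each cell is relocated according to a key-dependent mapping that may send several source cells to the same target cell, which is what makes the step hard to invert from the template alone.

```python
import hashlib
import random


def cartesian_block_transform(minutiae, key, cell=32, grid=8):
    """Relocate each minutia's grid cell using a key-dependent mapping.

    minutiae: list of (x, y) integer points, already aligned, with
              0 <= x, y < cell * grid.
    key:      string from which the (hypothetical) cell mapping is derived.
    """
    # Derive a reproducible seed from the key (illustrative choice).
    seed = int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")
    rng = random.Random(seed)
    n_cells = grid * grid
    # Drawn with replacement, so the mapping is generally many-to-one.
    mapping = [rng.randrange(n_cells) for _ in range(n_cells)]

    transformed = []
    for x, y in minutiae:
        source = (y // cell) * grid + (x // cell)
        target = mapping[source]
        tx, ty = target % grid, target // grid
        # Keep the offset inside the cell; only the cell itself moves.
        transformed.append((tx * cell + x % cell, ty * cell + y % cell))
    return transformed
```

Issuing a new key yields a different mapping and hence a different template for the same finger, which is the sense in which such a template is cancellable.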

    History of Computer Art

    This extensive text presents the history of Computer Art. The history of the artistic uses of computers and computing processes is reconstructed from its beginnings in the fifties to its present state. It points out hypertextual, modular and generative modes of using computing processes in Computer Art and features examples of early developments in media like cybernetic sculptures, video tools, computer graphics and animation (including music videos and demos), video and computer games, pervasive games, reactive installations, virtual reality, evolutionary art and net art. The functions of relevant art works are explained in more detail than is usual in such histories. From October 2011 to December 2012 the chapters were published successively in German. The English translation started in August 2013 and was completed in June 2014.