833 research outputs found

    Image Reconstruction from Bag-of-Visual-Words

    The objective of this work is to reconstruct an original image from Bag-of-Visual-Words (BoVW). Image reconstruction from features can be a means of identifying the characteristics of those features. Additionally, it enables us to generate novel images via features. Although BoVW is the de facto standard feature for image recognition and retrieval, successful image reconstruction from BoVW has not yet been reported. What complicates this task is that BoVW lacks spatial information about the visual words it contains. As described in this paper, to estimate the original arrangement, we propose an evaluation function that incorporates the naturalness of local adjacency and the global position, together with a method to obtain the related parameters using an external image database. To evaluate the performance of our method, we reconstruct images of 101 kinds of objects. Additionally, we apply our method to analyze object classifiers and to generate novel images via BoVW.
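To illustrate why the missing spatial information makes reconstruction hard, here is a minimal sketch in which two different arrangements of visual words produce the same BoVW histogram. The function name `bovw_histogram`, the integer word IDs, and the tiny grids are assumptions made for illustration; they do not come from the paper.

```python
from collections import Counter


def bovw_histogram(word_grid):
    """Count how often each visual-word ID occurs, discarding all positions."""
    return Counter(word for row in word_grid for word in row)


# Two hypothetical 2x3 grids of visual-word IDs with different layouts.
layout_a = [[3, 3, 7],
            [1, 7, 7]]
layout_b = [[7, 1, 7],
            [3, 7, 3]]

# The histograms are identical, so the bag alone cannot tell the layouts apart.
same_bag = bovw_histogram(layout_a) == bovw_histogram(layout_b)
```

Any reconstruction method therefore has to recover the arrangement from additional cues, which is the role the abstract assigns to local adjacency and global position.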

    Objects extraction and recognition for camera-based interaction : heuristic and statistical approaches

    In this thesis, heuristic and probabilistic methods are applied to a number of problems in camera-based interaction. The goal is to provide solutions for a vision-based system that is able to extract and analyze objects of interest in camera images and to use that information for various interactions in mobile usage. New methods and new combinations of existing methods are developed for different applications, including text extraction from complex scene images, bar code reading performed by camera phones, and face/facial feature detection and facial expression manipulation. The application-driven problems of camera-based interaction cannot be captured by a uniform and straightforward model that makes very strong simplifications of reality. The solutions we found to be efficient were to first apply heuristic but easy-to-implement approaches to reduce the complexity of the problems and search for possible means, and then to use established statistical learning approaches to deal with the remaining difficult but well-defined problems and achieve much better accuracy. The process can be evolved in some or all of the stages, and the combination of the approaches is problem-dependent. The contribution of this thesis resides in two aspects: firstly, new features and approaches are proposed either as heuristics or statistical means for concrete applications; secondly, engineering design combining several methods for system optimization is studied. Geometrical characteristics and the alignment of text, texture features of bar codes, and structures of faces can all be extracted as heuristics for object extraction and further recognition. The boosting algorithm is one of the proper choices to perform probabilistic learning and to achieve the desired accuracy. New feature selection techniques are proposed for constructing the weak learner and applying the boosting output in concrete applications. Subspace methods such as manifold learning algorithms are introduced and tailored for facial expression analysis and synthesis. A modified generalized learning vector quantization method is proposed to deal with the blurring of bar code images. Efficient implementations that combine the approaches at a rational joint point are presented and the results are illustrated.

    Pixel-level Image Fusion Algorithms for Multi-camera Imaging System

    This thesis work is motivated by the potential and promise of image fusion technologies in multi-sensor image fusion systems and applications. With a specific focus on pixel-level image fusion, the step that follows image registration, we develop a graphical user interface for multi-sensor image fusion software using Microsoft Visual Studio and the Microsoft Foundation Class library. In this thesis, we propose and present several image fusion algorithms with low computational cost, based upon spatial mixture analysis. The segment weighted average image fusion combines several low spatial resolution data sources from different sensors to create a large fused image with high resolution. This research includes developing a segment-based step, based upon a stepwise divide-and-combine process. In the second stage of the process, linear interpolation optimization is used to sharpen the image resolution. These image fusion algorithms are implemented on top of the graphical user interface we developed. Multiple-sensor image fusion is easily accommodated by the algorithm, and the results are demonstrated at multiple scales. By using quantitative measures such as mutual information, we obtain quantifiable experimental results. We also use the image morphing technique to generate a fused image sequence, to simulate the results of image fusion. While deploying our pixel-level image fusion algorithms, we observe several challenges in the popular image fusion methods. While the high computational cost and complex processing steps of these algorithms provide accurate fused results, they also make the algorithms hard to deploy in systems and applications that require real-time feedback, high flexibility and low computational ability.
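The abstract names mutual information as a quantitative measure of fusion quality. The sketch below shows the standard definition computed from the joint histogram of two equally sized grayscale images; it is not the thesis's implementation, and the function name `mutual_information` and the list-of-rows image format are assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(img_a, img_b):
    """Mutual information in bits between two equally sized grayscale images.

    Each image is a list of rows of integer pixel values (assumed format).
    """
    a = [p for row in img_a for p in row]
    b = [p for row in img_b for p in row]
    if not a or len(a) != len(b):
        raise ValueError("images must be non-empty and have the same size")
    n = len(a)
    joint = Counter(zip(a, b))   # counts of co-occurring pixel value pairs
    count_a = Counter(a)         # marginal counts for image A
    count_b = Counter(b)         # marginal counts for image B
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        # p_xy / (p_x * p_y) simplifies to c * n / (count_a[x] * count_b[y])
        mi += p_xy * math.log2(c * n / (count_a[x] * count_b[y]))
    return mi
```

A higher value between a source image and the fused image suggests that more of the source's information survived fusion; real images would usually be quantized into a limited number of intensity bins first.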

    Study on pattern recognition techniques based on pattern space analysis methodology

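    System: new; Ministry of Education report number: Otsu No. 2153; Degree type: Doctor of Engineering; Date conferred: 2008/2/25; Waseda University diploma number: Shin 471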

    Deep Learning based Fingerprint Presentation Attack Detection: A Comprehensive Survey

    The vulnerabilities of fingerprint authentication systems have raised security concerns when adapting them to highly secure access-control applications. Therefore, Fingerprint Presentation Attack Detection (FPAD) methods are essential for ensuring reliable fingerprint authentication. Owing to the lack of generation capacity of traditional handcrafted approaches, deep learning-based FPAD has become mainstream and has achieved remarkable performance in the past decade. Existing reviews have focused more on handcrafted than on deep learning-based methods and are now outdated. To stimulate future research, we concentrate only on recent deep learning-based FPAD methods. In this paper, we first briefly introduce the most common Presentation Attack Instruments (PAIs) and publicly available fingerprint Presentation Attack (PA) datasets. We then describe the existing deep learning-based FPAD methods by categorizing them into contact, contactless, and smartphone-based approaches. Finally, we conclude the paper by discussing the open challenges at the current stage and emphasizing potential future perspectives. Comment: 29 pages, submitted to ACM computing survey journal

    Processing mesh animations: from static to dynamic geometry and back

    Static triangle meshes are the representation of choice for artificial objects, as well as for digital replicas of real objects. They have proven themselves to be a solid foundation for further processing. Although triangle meshes are handy in general, it may seem that their discrete approximation of reality is a downside. But in fact, the opposite is true. The approximation of the real object's shape remains the same, even if we willfully change the vertex positions in the mesh, which allows us to optimize it in this way. Due to modern acquisition methods, such a step is always beneficial, often even required, prior to further processing of the acquired triangle mesh. Therefore, we present a general framework for optimizing surface meshes with respect to various target criteria. Because of the simplicity and efficiency of the setup it can be adapted to a variety of applications. Although this framework was initially designed for single static meshes, the application to a set of meshes is straightforward. For example, we convert a set of meshes into compatible ones and use them as basis for creating dynamic geometry. Consequently, we propose an interpolation method which is able to produce visually plausible interpolation results, even if the compatible input meshes differ by large rotations. The method can be applied to any number of input vertex configurations and due to the utilization of a hierarchical scheme, the approach is fast and can be used for very large meshes. Furthermore, we consider the opposite direction. Given an animation sequence, we propose a pre-processing algorithm that considerably reduces the number of meshes required to describe the sequence, thus yielding a compact representation. Our method is based on a clustering and classification approach, which can be utilized to automatically find the most prominent meshes of the sequence. 
The original meshes can then be expressed as linear combinations of these few representative meshes with only small approximation errors. Finally, we investigate the shape space spanned by those few meshes and show how to apply different interpolation schemes to create other shape spaces, which are not based on vertex coordinates. We conclude with a careful analysis of these shape spaces and their usability for a compact representation of an animation sequence.
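The idea of expressing a mesh as a linear combination of a few representative meshes can be sketched as an ordinary least-squares fit over flattened vertex coordinates. This covers only the linear-combination step, not the clustering and classification used to pick the representatives, and it is not taken from the thesis. The names `fit_weights`, `reconstruct`, and `solve_linear_system`, and the flat coordinate-list mesh format, are assumptions made for illustration.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("representative meshes are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def fit_weights(representatives, target):
    """Least-squares weights w minimizing ||sum_k w[k] * rep_k - target||."""
    gram = [[dot(ri, rj) for rj in representatives] for ri in representatives]
    rhs = [dot(ri, target) for ri in representatives]
    return solve_linear_system(gram, rhs)  # normal equations


def reconstruct(representatives, weights):
    """Blend the representative meshes coordinate by coordinate."""
    size = len(representatives[0])
    return [sum(w * rep[d] for w, rep in zip(weights, representatives))
            for d in range(size)]
```

For example, with two hypothetical triangles stored as `[x0, y0, x1, y1, x2, y2]`, fitting a frame that is a 25/75 blend of them recovers weights close to `[0.25, 0.75]`. Storing only the weights per frame, plus the few representatives, is what makes the representation compact.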

    Intellectual Property Rights in Virtual Environments: Considering the Rights of Owners, Programmers and Virtual Avatars

    A virtual environment is a computer-generated world that can be used for training, data visualization, recreation, and commerce. The visitors of virtual environments include not only humans but also virtual avatars. The avatars can take on a range of shapes, characteristics, and personalities, and can perform a variety of tasks within the virtual environment. As the behavior of avatars becomes more realistic, sophisticated and intelligent, and the avatars become more autonomous in their decision making, the question of whether virtual avatars should have legal rights separate from those of their owner becomes an issue. This paper discusses legal rights associated with the design and use of virtual avatars, commenting on the ownership rights of the creators of virtual avatars and the rights of avatars themselves should they gain intelligence and become independent decision makers and creators of intellectual property.