4,602 research outputs found

    From 3D Point Clouds to Pose-Normalised Depth Maps

    We consider the problem of generating either pairwise-aligned or pose-normalised depth maps from noisy 3D point clouds in relatively unrestricted poses. Our system is deployed in a 3D face alignment application and consists of the following four stages: (i) data filtering, (ii) nose tip identification and sub-vertex localisation, (iii) computation of the (relative) face orientation, and (iv) generation of either a pose-aligned or a pose-normalised depth map. We generate an implicit radial basis function (RBF) model of the facial surface, and this is employed within all four stages of the process. For example, in stage (ii), construction of novel invariant features is based on sampling this RBF over a set of concentric spheres to give a spherically-sampled RBF (SSR) shape histogram. In stage (iii), a second novel descriptor, called an isoradius contour curvature signal, is defined, which allows rotational alignment to be determined using a simple process of 1D correlation. We test our system on both the University of York (UoY) 3D face dataset and the Face Recognition Grand Challenge (FRGC) 3D data. For the more challenging UoY data, our SSR descriptors significantly outperform three variants of spin images, successfully identifying nose vertices at a rate of 99.6%. Nose localisation performance on the higher quality FRGC data, which has only small pose variations, is 99.9%. Our best system successfully normalises the pose of 3D faces at rates of 99.1% (UoY data) and 99.6% (FRGC data).

    Pattern Encoding on the Poincare Sphere

    This paper presents a convenient graphical tool for encoding visual patterns (such as image patches and image atoms) as point constellations in a space spanned by perceptual features and with a clear geometrical interpretation. General theory and a practical pattern encoding scheme are presented, inspired by encoding polarization states of a light wave on the Poincare sphere. This new pattern encoding scheme can be useful for many applications in image processing and computer vision. Here, three possible applications are illustrated: clustering perceptually similar patterns, visualizing properties of learned dictionaries of image atoms, and generating new dictionaries of image atoms from spherical codes. Comment: 26 pages, 23 figures

    Measuring filament orientation: a new quantitative, local approach

    The relative orientation between filamentary structures in molecular clouds and the ambient magnetic field provides insight into filament formation and stability. To calculate the relative orientation, a measurement of filament orientation is first required. We propose a new method to calculate the orientation of the one-pixel-wide filament skeleton that is output by filament identification algorithms such as FilFinder. We derive the local filament orientation from the direction of the intensity gradient in the skeleton image using the Sobel filter and a few simple post-processing steps. We call this the 'Sobel-gradient method'. The resulting filament orientation map can be compared quantitatively on a local scale with the magnetic field orientation map to then find the relative orientation of the filament with respect to the magnetic field at each point along the filament. It can also be used in constructing radial profiles for filament width fitting. The proposed method facilitates automation in the analysis of filament skeletons, which is imperative in this era of 'big data'. Comment: 12 pages, 7 figures. Accepted for publication in ApJS, August 201

    MSPPIR: Multi-source privacy-preserving image retrieval in cloud computing

    Content-Based Image Retrieval (CBIR) techniques have been widely researched and are in service with the help of cloud computing, as in Google Images. However, images often contain rich sensitive information, so privacy protection becomes a major problem because the cloud cannot always be fully trusted. Many privacy-preserving image retrieval schemes have been proposed, in which the image owner uploads encrypted images to the cloud, and the owner or an authorized user can execute secure retrieval with the help of the cloud. Nevertheless, few existing studies consider the multi-source scenario, which is more practical. In this paper, we analyze the difficulties in Multi-Source Privacy-Preserving Image Retrieval (MSPPIR). Then, using JPEG-format images as an example, we propose a scheme called JES-MSIR, a novel JPEG image Encryption Scheme designed for Multi-Source content-based Image Retrieval. JES-MSIR supports the requirements of MSPPIR, including constant-round secure retrieval from multiple sources and the union of multiple sources for better retrieval services. Experimental results and security analysis of the proposed scheme show its efficiency, security and accuracy. Comment: this version adds notations and repairs some mistakes

    Supervised Dictionary Learning and Sparse Representation-A Review

    Dictionary learning and sparse representation (DLSR) is a recent and successful mathematical model for data representation that achieves state-of-the-art performance in various fields such as pattern recognition, machine learning, computer vision, and medical imaging. The original formulation for DLSR is based on the minimization of the reconstruction error between the original signal and its sparse representation in the space of the learned dictionary. Although this formulation is optimal for solving problems such as denoising, inpainting, and coding, it may not lead to an optimal solution in classification tasks, where the ultimate goal is to make the learned dictionary and corresponding sparse representation as discriminative as possible. This motivated the emergence of a new category of techniques, called supervised dictionary learning and sparse representation (S-DLSR), which yields a dictionary and sparse representation better suited to classification tasks. Despite many research efforts on S-DLSR, the literature lacks a comprehensive view of these techniques, their connections, advantages and shortcomings. In this paper, we address this gap and provide a review of the recently proposed algorithms for S-DLSR. We first present a taxonomy of these algorithms in six categories, based on the approach taken to include label information in the learning of the dictionary and/or sparse representation. For each category, we draw connections between its algorithms and present a unified framework for them. We then provide guidelines for applied researchers on how to represent and learn the building blocks of an S-DLSR solution based on the problem at hand. This review provides a broad, yet deep, view of the state-of-the-art methods for S-DLSR and supports further research and development in this emerging area.
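For orientation, the reconstruction-error formulation mentioned in this abstract is commonly written as below. The notation is an assumption for illustration and is not taken from the abstract: $Y$ holds the training signals as columns, $D$ is the learned dictionary, $X$ holds the sparse codes, and $\lambda$ weights the sparsity term. The $\ell_1$ penalty is only one common choice of sparsity measure, and the columns of $D$ are usually constrained to bounded norm.

```latex
\min_{D,\,X} \; \tfrac{1}{2}\,\lVert Y - D X \rVert_F^2 \;+\; \lambda\,\lVert X \rVert_1
```

Supervised variants of the kind the review surveys add label information to this objective. The six specific categories are not described in the abstract, so they are not reproduced here.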

    MASSIVELY PARALLEL ALGORITHMS FOR POINT CLOUD BASED OBJECT RECOGNITION ON HETEROGENEOUS ARCHITECTURE

    With the advent of new commodity depth sensors, point cloud data processing plays an increasingly important role in object recognition and perception. However, the computational cost of point cloud data processing is extremely high due to the large data size, high dimensionality, and algorithmic complexity. To address the computational challenges of real-time processing, this work investigates the use of modern heterogeneous computing platforms and their supporting ecosystem, such as massively parallel architecture (MPA), computing clusters, compute unified device architecture (CUDA), and multithreaded programming, to accelerate point cloud based object recognition. These computing platforms do not yield high performance unless their specific features are properly utilized; failing that, performance can actually be inferior. To achieve high-speed performance in image descriptor computing, indexing, and matching in point cloud based object recognition, this work explores both coarse- and fine-grained parallelism, identifies the acceptable levels of algorithmic approximation, and analyzes various factors that affect performance. A set of heterogeneous parallel algorithms is designed and implemented in this work. These algorithms include exact and approximate scalable massively parallel image descriptors for descriptor computing, parallel construction of the k-dimensional tree (KD-tree) and the forest of KD-trees for descriptor indexing, and parallel approximate nearest neighbor search (ANNS) and buffered ANNS (BANNS) on the KD-tree and the forest of KD-trees for descriptor matching. The results show that the proposed massively parallel algorithms on heterogeneous computing platforms can significantly improve the execution time of feature computing, indexing, and matching. Meanwhile, this work demonstrates that heterogeneous computing architectures, with appropriate architecture-specific algorithm design and optimization, have distinct advantages in improving the performance of multimedia applications.

    Smart environment monitoring through micro unmanned aerial vehicles

    In recent years, improvements in small-scale Unmanned Aerial Vehicles (UAVs) in terms of flight time, automatic control, and remote transmission have been promoting the development of a wide range of practical applications. In aerial video surveillance, the monitoring of broad areas still poses many challenges because several tasks must be carried out in real time, including mosaicking, change detection, and object detection. In this thesis work, a small-scale UAV based vision system to maintain regular surveillance over target areas is proposed. The system works in two modes. The first mode monitors an area of interest over several flights. During the first flight, it creates an incremental geo-referenced mosaic of the area of interest and classifies all the known elements (e.g., persons) found on the ground using a previously trained, improved Faster R-CNN architecture. In subsequent reconnaissance flights, the system searches for any changes (e.g., disappearance of persons) that may occur in the mosaic using an algorithm based on histogram equalization and RGB-Local Binary Pattern (RGB-LBP). If changes are present, the mosaic is updated. The second mode performs real-time classification using, again, our improved Faster R-CNN model, which is useful for time-critical operations. Thanks to different design features, the system works in real time and performs mosaicking and change detection tasks at low altitude, thus allowing the classification even of small objects. The proposed system was tested using the whole set of challenging video sequences contained in the UAV Mosaicking and Change Detection (UMCD) dataset and other public datasets. The evaluation of the system with well-known performance metrics has shown remarkable results in terms of mosaic creation and updating, as well as in terms of change detection and object detection.

    Graph Fourier Transform: A Stable Approximation

    In Graph Signal Processing (GSP), data dependencies are represented by a graph whose nodes label the data and whose edges capture dependencies among nodes. The graph is represented by a weighted adjacency matrix $A$ that, in GSP, generalizes the Discrete Signal Processing (DSP) shift operator $z^{-1}$. The (right) eigenvectors of the shift $A$ (graph spectral components) diagonalize $A$ and lead to a graph Fourier basis $F$ that provides a graph spectral representation of the graph signal. The inverse of the (matrix of the) graph Fourier basis $F$ is the Graph Fourier transform (GFT), $F^{-1}$. Often, including in real world examples, this diagonalization is numerically unstable. This paper develops an approach to compute an accurate approximation to $F$ and $F^{-1}$, while ensuring their numerical stability, by means of solving a non-convex optimization problem. To address the non-convexity, we propose an algorithm, the stable graph Fourier basis algorithm (SGFA), that we prove to exponentially increase the accuracy of the approximating $F$ per iteration. Likewise, we can apply SGFA to $A^H$ and, hence, approximate the stable left eigenvectors for the graph shift $A$ and directly compute the GFT. We evaluate empirically the quality of SGFA by applying it to graph shifts $A$ drawn from two real world problems, the 2004 US political blogs graph and the Manhattan road map, carrying out a comprehensive study on tradeoffs between different SGFA parameters. We also confirm our conclusions by applying SGFA on very sparse and very dense directed Erdős-Rényi graphs. Comment: 16 pages, 17 figures. Originally submitted to IEEE Transactions on Signal Processing on 01-Aug-2019. Resubmitted on 12-Jan-2020. Accept with mandatory minor revisions. Resubmitted again on 30-April-202
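The definitions in this abstract (Fourier basis from the eigenvectors of the shift, GFT as its inverse) can be illustrated with a minimal NumPy sketch. This is the plain eigendecomposition only, not the SGFA algorithm the paper proposes. The helper name `graph_fourier_basis` and the directed 4-cycle example are assumptions chosen for illustration, and the example graph is deliberately well-conditioned.

```python
import numpy as np


def graph_fourier_basis(A):
    """Return (eigenvalues, F, F_inv) for a diagonalizable graph shift A.

    Hypothetical helper for illustration: the columns of F are right
    eigenvectors of A, and F_inv plays the role of the GFT.
    """
    eigvals, F = np.linalg.eig(A)
    F_inv = np.linalg.inv(F)  # direct inversion; can be unstable in general
    return eigvals, F, F_inv


# Assumed example: adjacency matrix of a directed cycle on 4 nodes.
N = 4
A = np.roll(np.eye(N), 1, axis=0)

eigvals, F, F_inv = graph_fourier_basis(A)

# Graph signal, its spectral representation, and the reconstruction.
signal = np.array([1.0, 2.0, 3.0, 4.0])
spectrum = F_inv @ signal
reconstructed = F @ spectrum

# Condition number of F: large values indicate the kind of numerical
# instability the abstract refers to.
cond_F = np.linalg.cond(F)
```

For the directed cycle the shift is a normal matrix with distinct eigenvalues, so `F` is well-conditioned. For general directed graphs `cond_F` can become very large, which is the situation the abstract says motivates a stable approximation.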

    Data fusion strategies for energy efficiency in buildings: Overview, challenges and novel orientations

    Recently, tremendous interest has been devoted to developing data fusion strategies for energy efficiency in buildings, where various kinds of information can be processed. However, applying the appropriate data fusion strategy to design an efficient energy efficiency system is not straightforward; it requires a priori knowledge of existing fusion strategies, their applications and their properties. In this regard, seeking to provide the energy research community with a better understanding of data fusion strategies in building energy saving systems, their principles, advantages, and potential applications, this paper proposes an extensive survey of existing data fusion mechanisms deployed to reduce excessive consumption and promote sustainability. We investigate their conceptualizations, advantages, challenges and drawbacks, and present a taxonomy of existing data fusion strategies and other contributing factors. Next, a comprehensive comparison of the state-of-the-art data fusion based energy efficiency frameworks is conducted using various parameters, including data fusion level, data fusion techniques, behavioral change influencer, behavioral change incentive, recorded data, platform architecture, IoT technology and application scenario. Moreover, a novel method for electrical appliance identification is proposed based on the fusion of 2D local texture descriptors, where 1D power signals are transformed into 2D space and treated as images. The empirical evaluation, conducted on three real datasets, shows promising performance, with up to 99.68% accuracy and a 99.52% F1 score. In addition, various open research challenges and future orientations to improve data fusion based energy efficiency ecosystems are explored.
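The idea of treating a 1D power signal as an image and describing it with a local texture descriptor can be sketched as follows. The abstract does not say how the signal is mapped to 2D or which descriptors are fused, so both choices here are assumptions: the signal is simply folded row by row, and a basic 8-neighbour local binary pattern (LBP) histogram stands in for the texture descriptor. The function names and the synthetic sine signal are hypothetical.

```python
import numpy as np


def signal_to_image(signal, width):
    """Fold a 1D signal into rows of `width` samples (assumed mapping)."""
    signal = np.asarray(signal, dtype=float)
    n_rows = len(signal) // width
    # Drop any trailing samples that do not fill a complete row.
    return signal[: n_rows * width].reshape(n_rows, width)


def lbp_histogram(image):
    """Normalised 256-bin histogram of basic 8-neighbour LBP codes."""
    h, w = image.shape
    centre = image[1:-1, 1:-1]
    codes = np.zeros(centre.shape, dtype=np.int64)
    # Neighbour offsets, clockwise from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = image[1 + dy : h - 1 + dy, 1 + dx : w - 1 + dx]
        # Set this bit wherever the neighbour is at least the centre value.
        codes |= (neighbour >= centre).astype(np.int64) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / hist.sum()


# Synthetic stand-in for a recorded power signal (not real data).
power = np.sin(np.linspace(0.0, 20.0 * np.pi, 400))
image = signal_to_image(power, width=20)
features = lbp_histogram(image)
```

The resulting `features` vector could be passed to any classifier. Fusing several such descriptors, as the abstract describes, would amount to combining multiple feature vectors of this kind, though the fusion rule itself is not given in the abstract.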