From 3D Point Clouds to Pose-Normalised Depth Maps
We consider the problem of generating either pairwise-aligned or pose-normalised depth maps from noisy 3D point clouds in relatively unrestricted poses. Our system is deployed in a 3D face alignment application and consists of the following four stages: (i) data filtering, (ii) nose tip identification and sub-vertex localisation, (iii) computation of the (relative) face orientation, (iv) generation of either a pose-aligned or a pose-normalised depth map. We generate an implicit radial basis function (RBF) model of the facial surface, and this is employed within all four stages of the process. For example, in stage (ii), construction of novel invariant features is based on sampling this RBF over a set of concentric spheres to give a spherically-sampled RBF (SSR) shape histogram. In stage (iii), a second novel descriptor, called an isoradius contour curvature signal, is defined, which allows rotational alignment to be determined using a simple process of 1D correlation. We test our system on both the University of York (UoY) 3D face dataset and the Face Recognition Grand Challenge (FRGC) 3D data. For the more challenging UoY data, our SSR descriptors significantly outperform three variants of spin images, successfully identifying nose vertices at a rate of 99.6%. Nose localisation performance on the higher-quality FRGC data, which has only small pose variations, is 99.9%. Our best system successfully normalises the pose of 3D faces at rates of 99.1% (UoY data) and 99.6% (FRGC data).
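The abstract does not specify the RBF kernel or the histogram binning, so the following is only a minimal sketch of the idea behind the SSR descriptor, assuming a Gaussian kernel and a toy per-radius statistic; the function names and the iso-level threshold are illustrative, not the paper's:

```python
import math

def rbf_value(p, centers, weights, sigma=1.0):
    """Implicit Gaussian-RBF surface model (an assumed kernel):
    f(p) = sum_i w_i * exp(-|p - c_i|^2 / (2*sigma^2))."""
    total = 0.0
    for c, w in zip(centers, weights):
        d2 = sum((pi - ci) ** 2 for pi, ci in zip(p, c))
        total += w * math.exp(-d2 / (2.0 * sigma ** 2))
    return total

def ssr_histogram(centers, weights, centre, radii, n_samples=64, level=0.5):
    """Sample the RBF over concentric spheres around `centre`; for each radius,
    record the fraction of samples whose RBF value exceeds `level`
    (a toy SSR-style per-radius statistic, not the paper's exact binning)."""
    hist = []
    for r in radii:
        hits = 0
        for k in range(n_samples):
            # near-uniform sphere sampling via a Fibonacci spiral
            theta = math.acos(1.0 - 2.0 * (k + 0.5) / n_samples)
            phi = math.pi * (1.0 + 5.0 ** 0.5) * k
            p = (centre[0] + r * math.sin(theta) * math.cos(phi),
                 centre[1] + r * math.sin(theta) * math.sin(phi),
                 centre[2] + r * math.cos(theta))
            if rbf_value(p, centers, weights) > level:
                hits += 1
        hist.append(hits / n_samples)
    return hist
```

With a single RBF centre at the origin, a small sphere lies entirely "inside" the iso-level while a large one lies entirely outside, giving per-radius fractions of 1.0 and 0.0 respectively.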
Pattern Encoding on the Poincaré Sphere
This paper presents a convenient graphical tool for encoding visual patterns
(such as image patches and image atoms) as point constellations in a space
spanned by perceptual features and with a clear geometrical interpretation.
General theory and a practical pattern encoding scheme are presented, inspired
by encoding polarization states of a light wave on the Poincaré sphere. This
new pattern encoding scheme can be useful for many applications in image
processing and computer vision. Here, three possible applications are
illustrated, in clustering perceptually similar patterns, visualizing
properties of learned dictionaries of image atoms and generating new
dictionaries of image atoms from spherical codes.
Comment: 26 pages, 23 figures
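For readers unfamiliar with the inspiration: the classical Poincaré-sphere encoding maps a polarization state, described by an orientation angle ψ and an ellipticity angle χ, to a point on the unit sphere via the Stokes parameters. A minimal sketch of that classical mapping (not the paper's own pattern-encoding scheme):

```python
import math

def poincare_point(psi, chi):
    """Map a polarization state with orientation angle psi and ellipticity
    angle chi (radians) to a unit point on the Poincare sphere using the
    Stokes-parameter convention:
    (S1, S2, S3) = (cos 2chi cos 2psi, cos 2chi sin 2psi, sin 2chi)."""
    return (math.cos(2 * chi) * math.cos(2 * psi),
            math.cos(2 * chi) * math.sin(2 * psi),
            math.sin(2 * chi))
```

Horizontal linear polarization (psi = chi = 0) lands at (1, 0, 0) on the equator; circular polarization (chi = ±π/4) lands at the poles.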
Measuring filament orientation: a new quantitative, local approach
The relative orientation between filamentary structures in molecular clouds
and the ambient magnetic field provides insight into filament formation and
stability. To calculate the relative orientation, a measurement of filament
orientation is first required. We propose a new method to calculate the
orientation of the one pixel wide filament skeleton that is output by filament
identification algorithms such as FilFinder. We derive the local
filament orientation from the direction of the intensity gradient in the
skeleton image using the Sobel filter and a few simple post-processing steps.
We call this the 'Sobel-gradient method'. The resulting filament orientation
map can be compared quantitatively on a local scale with the magnetic field
orientation map to then find the relative orientation of the filament with
respect to the magnetic field at each point along the filament. It can also be
used in constructing radial profiles for filament width fitting. The proposed
method facilitates automation in analysis of filament skeletons, which is
imperative in this era of 'big data'.
Comment: 12 pages, 7 figures. Accepted for publication in ApJS, August 201
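The gradient step of the Sobel-gradient method can be sketched in a few lines. This is only an illustration of computing the per-pixel intensity-gradient direction with the standard Sobel kernels; the authors' post-processing steps are omitted, and note that at skeleton boundaries the filament axis runs perpendicular to the gradient direction computed here:

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical derivatives.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def gradient_orientation(img):
    """Return, for each interior pixel of a 2D list `img`, the intensity-gradient
    direction in degrees folded into [0, 180); None where the gradient vanishes."""
    h, w = len(img), len(img[0])
    out = [[None] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(SOBEL_X[j][i] * img[y - 1 + j][x - 1 + i]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * img[y - 1 + j][x - 1 + i]
                     for j in range(3) for i in range(3))
            if gx or gy:
                out[y][x] = math.degrees(math.atan2(gy, gx)) % 180.0
    return out
```

On a horizontal step edge, the gradient direction comes out vertical (90°), as expected.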
MSPPIR: Multi-source privacy-preserving image retrieval in cloud computing
Content-Based Image Retrieval (CBIR) techniques have been widely researched
and deployed with the help of cloud computing, as in Google Images. However,
images often contain rich sensitive information, so privacy protection
becomes a serious problem, as the cloud cannot always be fully trusted.
Many privacy-preserving image retrieval schemes have been proposed, in which
the image owner can upload encrypted images to the cloud, and the owner or an
authorized user can then execute secure retrieval with the help of the cloud.
Nevertheless, few existing studies address the multi-source setting, which is
more practical. In this paper, we analyze the difficulties in Multi-Source
Privacy-Preserving Image Retrieval (MSPPIR). Then, taking JPEG-format images
as an example, we propose a scheme called JES-MSIR, a novel JPEG image
Encryption Scheme for Multi-Source content-based Image Retrieval. JES-MSIR
supports the requirements of MSPPIR, including constant-round secure retrieval
from multiple sources and the union of multiple sources for better retrieval
services. Experimental results and security analysis of the proposed scheme
show its efficiency, security and accuracy.
Comment: this version adds notations and repairs some mistakes
Supervised Dictionary Learning and Sparse Representation-A Review
Dictionary learning and sparse representation (DLSR) is a recent and
successful mathematical model for data representation that achieves
state-of-the-art performance in various fields such as pattern recognition,
machine learning, computer vision, and medical imaging. The original
formulation for DLSR is based on the minimization of the reconstruction error
between the original signal and its sparse representation in the space of the
learned dictionary. Although this formulation is optimal for solving problems
such as denoising, inpainting, and coding, it may not lead to an optimal solution
in classification tasks, where the ultimate goal is to make the learned
dictionary and corresponding sparse representation as discriminative as
possible. This motivated the emergence of a new category of techniques, which
is appropriately called supervised dictionary learning and sparse
representation (S-DLSR), which yields dictionaries and sparse representations
better suited to classification tasks. Despite many research efforts on
S-DLSR, the literature lacks a comprehensive view of these techniques, their
connections, advantages and shortcomings. In this paper, we address this gap
and provide a review of the recently proposed algorithms for S-DLSR. We first
present a taxonomy of these algorithms into six categories based on the
approach taken to include label information into the learning of the dictionary
and/or sparse representation. For each category, we draw connections between
the algorithms in this category and present a unified framework for them. We
then provide guidelines for applied researchers on how to represent and learn
the building blocks of an S-DLSR solution based on the problem at hand. This
review provides a broad, yet deep, view of the state-of-the-art methods for
S-DLSR and allows for the advancement of research and development in this
emerging area of research.
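To ground the reconstruction-error formulation mentioned above, classic (unsupervised) sparse coding seeks a sparse coefficient vector α minimizing ||x − Dα||²; matching pursuit is one simple greedy solver. A minimal pure-Python sketch, assuming unit-norm atoms, that illustrates the baseline DLSR model rather than any of the supervised S-DLSR variants the review covers:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matching_pursuit(x, dictionary, n_iter=10):
    """Greedy sparse coding of signal x over unit-norm atoms: repeatedly pick
    the atom most correlated with the residual and subtract its projection.
    Returns the coefficient vector alpha (one entry per atom)."""
    residual = list(x)
    alpha = [0.0] * len(dictionary)
    for _ in range(n_iter):
        scores = [dot(residual, atom) for atom in dictionary]
        k = max(range(len(dictionary)), key=lambda i: abs(scores[i]))
        if abs(scores[k]) < 1e-12:
            break  # residual is (numerically) orthogonal to every atom
        alpha[k] += scores[k]
        residual = [r - scores[k] * a for r, a in zip(residual, dictionary[k])]
    return alpha
```

With an orthonormal dictionary the residual reaches zero and α recovers the exact expansion coefficients; supervised variants add label-driven terms to this objective.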
MASSIVELY PARALLEL ALGORITHMS FOR POINT CLOUD BASED OBJECT RECOGNITION ON HETEROGENEOUS ARCHITECTURE
With the advent of new commodity depth sensors, point cloud data processing plays an increasingly important role in object recognition and perception. However, the computational cost of point cloud data processing is extremely high due to the large data size, high dimensionality, and algorithmic complexity. To address the computational challenges of real-time processing, this work investigates the possibilities of using modern heterogeneous computing platforms and their supporting ecosystems, such as massively parallel architecture (MPA), computing clusters, compute unified device architecture (CUDA), and multithreaded programming, to accelerate point cloud based object recognition. These computing platforms do not yield high performance unless their specific features are properly utilized; failing that, the result is actually inferior performance. To achieve high-speed performance in image descriptor computing, indexing, and matching in point cloud based object recognition, this work explores both coarse- and fine-grain parallelism, identifies the acceptable levels of algorithmic approximation, and analyzes various performance impactors. A set of heterogeneous parallel algorithms is designed and implemented in this work. These algorithms include exact and approximate scalable massively parallel image descriptors for descriptor computing, parallel construction of the k-dimensional tree (KD-tree) and the forest of KD-trees for descriptor indexing, and parallel approximate nearest neighbor search (ANNS) and buffered ANNS (BANNS) on the KD-tree and the forest of KD-trees for descriptor matching. The results show that the proposed massively parallel algorithms on heterogeneous computing platforms can significantly improve the execution-time performance of feature computing, indexing, and matching.
Meanwhile, this work demonstrates that heterogeneous computing architectures, with appropriate architecture-specific algorithm design and optimization, have the distinct advantage of improving the performance of multimedia applications.
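The KD-tree at the heart of the indexing and matching stages can be sketched sequentially. The work's contribution is the massively parallel construction and (buffered) approximate search; this minimal single-threaded, exact-search Python version only illustrates the underlying data structure:

```python
def build_kdtree(points, depth=0):
    """Recursively build a KD-tree over k-dimensional points (median split)."""
    if not points:
        return None
    axis = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return {"point": pts[mid],
            "left": build_kdtree(pts[:mid], depth + 1),
            "right": build_kdtree(pts[mid + 1:], depth + 1)}

def nearest(node, target, depth=0, best=None):
    """Exact nearest-neighbour search with branch pruning.
    Returns (squared_distance, point)."""
    if node is None:
        return best
    d2 = sum((a - b) ** 2 for a, b in zip(node["point"], target))
    if best is None or d2 < best[0]:
        best = (d2, node["point"])
    axis = depth % len(target)
    diff = target[axis] - node["point"][axis]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    best = nearest(near, target, depth + 1, best)
    if diff ** 2 < best[0]:  # the far side could still hold a closer point
        best = nearest(far, target, depth + 1, best)
    return best
```

Approximate variants like ANNS bound how many branches the search may visit, trading a little accuracy for large speedups, which is where the parallel and buffered designs in this work come in.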
Smart environment monitoring through micro unmanned aerial vehicles
In recent years, the improvements of small-scale Unmanned Aerial Vehicles (UAVs) in terms of flight time, automatic control, and remote transmission have promoted the development of a wide range of practical applications. In aerial video surveillance, the monitoring of broad areas still presents many challenges due to the need to achieve several tasks in real-time, including mosaicking, change detection, and object detection. In this thesis work, a small-scale UAV-based vision system to maintain regular surveillance over target areas is proposed. The system works in two modes. The first mode allows an area of interest to be monitored by performing several flights. During the first flight, it creates an incremental geo-referenced mosaic of the area of interest and classifies all the known elements (e.g., persons) found on the ground using a previously trained, improved Faster R-CNN architecture. In subsequent reconnaissance flights, the system searches for any changes (e.g., disappearance of persons) that may occur in the mosaic using a histogram equalization and RGB-Local Binary Pattern (RGB-LBP) based algorithm. If changes are present, the mosaic is updated. The second mode allows real-time classification to be performed using, again, our improved Faster R-CNN model, which is useful for time-critical operations. Thanks to several design features, the system works in real-time and performs mosaicking and change detection tasks at low altitude, thus allowing the classification even of small objects. The proposed system was tested using the whole set of challenging video sequences contained in the UAV Mosaicking and Change Detection (UMCD) dataset and other public datasets. The evaluation of the system by well-known performance metrics has shown remarkable results in terms of mosaic creation and updating, as well as change detection and object detection.
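The RGB-LBP descriptor builds on the basic 8-neighbour Local Binary Pattern, computed per colour channel. A minimal sketch of the single-channel LBP code (the thesis's exact variant, neighbourhood ordering, and thresholding convention may differ):

```python
def lbp_code(img, y, x):
    """8-neighbour Local Binary Pattern code of pixel (y, x): each neighbour
    whose value is >= the centre contributes one bit, clockwise from top-left."""
    c = img[y][x]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dy, dx) in enumerate(offsets):
        if img[y + dy][x + dx] >= c:
            code |= 1 << bit
    return code
```

A flat region yields code 255 (all neighbours equal the centre), while a bright isolated centre yields 0; histograms of these codes over image patches form the texture descriptor compared between flights.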
Graph Fourier Transform: A Stable Approximation
In Graph Signal Processing (GSP), data dependencies are represented by a
graph whose nodes label the data and whose edges capture dependencies among
nodes. The graph is represented by a weighted adjacency matrix A that, in
GSP, generalizes the Discrete Signal Processing (DSP) shift operator.
The (right) eigenvectors of the shift (the graph spectral components)
diagonalize A and lead to a graph Fourier basis V that provides a graph
spectral representation of the graph signal. The inverse of the (matrix of the)
graph Fourier basis is the Graph Fourier transform (GFT), F = V^{-1}. Often,
including in real-world examples, this diagonalization is numerically unstable.
This paper develops an approach to compute an accurate approximation to V and
F, while ensuring their numerical stability, by means of solving a non-convex
optimization problem. To address the non-convexity, we propose an
algorithm, the stable graph Fourier basis algorithm (SGFA), that we prove
exponentially increases the accuracy of the approximation per iteration.
Likewise, we can apply SGFA to the transpose of A and, hence, approximate the
stable left eigenvectors of the graph shift and directly compute the GFT. We
evaluate empirically the quality of SGFA by applying it to graph shifts drawn
from two real-world problems, the 2004 US political blogs graph and the
Manhattan road map, carrying out a comprehensive study of the tradeoffs between
different SGFA parameters. We also confirm our conclusions by applying SGFA to
very sparse and very dense directed Erdős-Rényi graphs.
Comment: 16 pages, 17 figures. Originally submitted to IEEE Transactions on
Signal Processing on 01-Aug-2019. Resubmitted on 12-Jan-2020. Accepted with
mandatory minor revisions. Resubmitted again on 30-April-202
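As a concrete special case of the GFT reducing to classical DSP: for the directed N-cycle graph, the shift's eigenvectors are exactly the DFT vectors, so the GFT coincides with the DFT and the diagonalization is perfectly conditioned; the paper's contribution addresses general directed graphs where this is not the case. A small sketch (normalization conventions vary):

```python
import cmath

def gft_cycle(signal):
    """Graph Fourier transform for the directed N-cycle graph shift.
    Its eigenvectors are the DFT vectors, so the GFT here is just the
    (unnormalized) DFT: X[k] = sum_n s[n] * exp(-2j*pi*k*n/N)."""
    n = len(signal)
    return [sum(signal[m] * cmath.exp(-2j * cmath.pi * k * m / n)
                for m in range(n)) for k in range(n)]
```

A unit impulse transforms to a flat spectrum and a constant signal concentrates at the zero frequency, exactly as in classical DSP.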
State-of-the-art on research and applications of machine learning in the building life cycle
Fueled by big data, powerful and affordable computing resources, and advanced algorithms, machine learning has been explored and applied in buildings research over the past decades and has demonstrated its potential to enhance building performance. This study systematically surveyed how machine learning has been applied at different stages of the building life cycle. By conducting a literature search on the Web of Knowledge platform, we found 9579 papers in this field and selected 153 papers for an in-depth review. The number of published papers is increasing year by year, with a focus on building design, operation, and control. However, no study was found that uses machine learning in building commissioning. There are successful pilot studies on fault detection and diagnosis of HVAC equipment and systems, load prediction, energy baseline estimation, load shape clustering, occupancy prediction, and learning occupant behaviors and energy use patterns. None of the existing studies has been adopted broadly by the building industry, due to common challenges including (1) lack of large-scale labeled data to train and validate the model, (2) lack of model transferability, which prevents a model trained on one data-rich building from being used in another building with limited data, (3) lack of strong justification of the costs and benefits of deploying machine learning, and (4) performance that might not be reliable and robust for the stated goals, as a method might work for some buildings but not generalize to others. Findings from the study can inform future machine learning research to improve occupant comfort, energy efficiency, demand flexibility, and resilience of buildings, as well as inspire young researchers in the field to explore multidisciplinary approaches that integrate building science, computing science, data science, and social science.
Data fusion strategies for energy efficiency in buildings: Overview, challenges and novel orientations
Recently, tremendous interest has been devoted to develop data fusion
strategies for energy efficiency in buildings, where various kinds of
information can be processed. However, applying the appropriate data fusion
strategy to design an effective energy efficiency system is not
straightforward; it requires a priori knowledge of existing fusion strategies,
their applications and their properties. In this regard, seeking to provide the
energy research community with a better understanding of data fusion strategies
in building energy saving systems, their principles, advantages, and potential
applications, this paper proposes an extensive survey of existing data fusion
mechanisms deployed to reduce excessive consumption and promote sustainability.
We investigate their conceptualizations, advantages, challenges and drawbacks,
as well as performing a taxonomy of existing data fusion strategies and other
contributing factors. Subsequently, a comprehensive comparison of the
state-of-the-art data fusion based energy efficiency frameworks is conducted
using various parameters, including data fusion level, data fusion techniques,
behavioral change influencer, behavioral change incentive, recorded data,
platform architecture, IoT technology and application scenario. Moreover, a
novel method for electrical appliance identification is proposed based on the
fusion of 2D local texture descriptors, where 1D power signals are transformed
into 2D space and treated as images. The empirical evaluation, conducted on
three real datasets, shows promising performance, in which up to 99.68%
accuracy and 99.52% F1 score have been attained. In addition, various open
research challenges and future orientations to improve data fusion based energy
efficiency ecosystems are explored.
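The abstract does not specify how the 1D power signals are transformed into 2D space; one hypothetical, minimal option is a row-major reshape into an "image" on which 2D local texture descriptors can then be computed (the paper's actual transform may be more elaborate):

```python
def signal_to_image(signal, width):
    """Row-major reshape of a 1D signal into a 2D grid of the given width,
    padding the final row by repeating the last sample. Purely illustrative:
    the paper's actual 1D-to-2D transform is not given in the abstract."""
    pad = (-len(signal)) % width
    padded = list(signal) + [signal[-1]] * pad
    return [padded[i:i + width] for i in range(0, len(padded), width)]
```

Once in 2D form, appliance-specific consumption patterns appear as textures that descriptors such as LBP variants can discriminate.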