Blind Hyperspectral-Multispectral Image Fusion via Graph Laplacian Regularization
Fusing a low-resolution hyperspectral image (HSI) and a high-resolution
multispectral image (MSI) of the same scene leads to a super-resolution image
(SRI), which is information-rich both spatially and spectrally. In this paper, we
super-resolve the HSI using the graph Laplacian defined on the MSI. Unlike many
existing works, we do not assume prior knowledge about the spatial degradation
from SRI to HSI, nor a perfectly aligned HSI and MSI pair. Our algorithm
progressively alternates between finding the blur kernel and fusing HSI with
MSI, generating accurate estimations of the blur kernel and the SRI at
convergence. Experiments on various datasets demonstrate the advantages of the
proposed algorithm in the quality of fusion and its capability in dealing with
unknown spatial degradation.
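To make the regularizer concrete, here is a minimal sketch of building a graph Laplacian from MSI pixel spectra, as the abstract's fusion scheme requires; the dense construction, Gaussian similarity kernel, and `sigma` value are illustrative assumptions, not the authors' implementation (which would use a sparse neighbourhood graph at scale):

```python
import numpy as np

def graph_laplacian_from_msi(msi, sigma=0.1):
    """Build a dense graph Laplacian from MSI pixel similarities.

    msi: (H, W, C) array. Returns the (H*W, H*W) Laplacian L = D - W,
    where W holds Gaussian similarities between pixel spectra.
    Illustrative dense toy version only.
    """
    pixels = msi.reshape(-1, msi.shape[-1])                       # (N, C) spectra
    sq_dists = ((pixels[:, None, :] - pixels[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq_dists / (2 * sigma ** 2))                      # similarity graph
    np.fill_diagonal(W, 0.0)                                      # no self-loops
    D = np.diag(W.sum(axis=1))                                    # degree matrix
    return D - W

# The regularization term x^T L x penalizes differences between similar
# pixels; it vanishes for a spatially constant image.
msi = np.random.rand(4, 4, 3)
L = graph_laplacian_from_msi(msi)
x = np.ones(L.shape[0])
print(float(x @ L @ x))   # ~0: rows of L sum to zero
```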
Cutting tool tracking and recognition based on infrared and visual imaging systems using principal component analysis (PCA) and discrete wavelet transform (DWT) combined with neural networks
The implementation of computerised condition monitoring systems for detecting cutting tools’ correct installation and for fault diagnosis is of high importance in modern manufacturing industries. The primary function of a condition monitoring system is to check the existence of the tool before starting any machining process and to ensure its health during operation. The aim of this study is to assess the detection of the existence of the tool in the spindle and of its health (i.e. normal or broken) using
infrared and vision systems as a non-contact methodology. The application of Principal Component Analysis (PCA) and Discrete Wavelet Transform (DWT) combined with neural networks is investigated using both types of data in order to establish an effective and reliable novel software program for tool tracking and health recognition. Infrared and visual cameras are used to locate and track the cutting tool during the machining process using suitable analysis and image-processing algorithms. The capabilities of PCA and DWT combined with neural networks are investigated in recognising the tool’s condition by comparing the characteristics of the tool to those of known conditions in the training set. The experimental results have shown high performance when using the infrared data in comparison to visual images for the selected image- and signal-processing algorithms.
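The PCA feature-extraction stage described above can be sketched as follows; the nearest-centroid rule stands in for the paper's neural-network classifier, and all function names and shapes are assumptions:

```python
import numpy as np

def pca_features(X, n_components=2):
    """Project row-vector samples X onto their top principal components.

    Minimal PCA via SVD: center the data, then project onto the
    leading right singular vectors.
    """
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def nearest_centroid_predict(train_feats, train_labels, test_feats):
    """Assign each test sample the label of the closest class centroid
    (a simple stand-in for the paper's neural-network stage)."""
    labels = sorted(set(train_labels))
    centroids = np.array([train_feats[np.array(train_labels) == c].mean(axis=0)
                          for c in labels])
    d = ((test_feats[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    return [labels[i] for i in d.argmin(axis=1)]
```

In the study's setting, the rows of `X` would be features from infrared or visual frames and the labels would be tool conditions such as "normal" or "broken".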
Fully-Coupled Two-Stream Spatiotemporal Networks for Extremely Low Resolution Action Recognition
A major emerging challenge is how to protect people's privacy as cameras and
computer vision are increasingly integrated into our daily lives, including in
smart devices inside homes. A potential solution is to capture and record just
the minimum amount of information needed to perform a task of interest. In this
paper, we propose a fully-coupled two-stream spatiotemporal architecture for
reliable human action recognition on extremely low resolution (e.g., 12x16
pixel) videos. We provide an efficient method to extract spatial and temporal
features and to aggregate them into a robust feature representation for an
entire action video sequence. We also consider how to incorporate high
resolution videos during training in order to build better low resolution
action recognition models. We evaluate on two publicly-available datasets,
showing significant improvements over the state-of-the-art.
Comment: 9 pages, 5 figures, published in WACV 201
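The clip-level aggregation of spatial and temporal features mentioned above can be illustrated with a toy stand-in, where raw intensities and frame differences replace the paper's learned two-stream CNN features (all names and shapes here are assumptions):

```python
import numpy as np

def two_stream_descriptor(video):
    """Aggregate a tiny spatial + temporal descriptor for a video clip.

    video: (T, H, W) grayscale frames. The 'spatial stream' is the
    per-frame flattened intensities and the 'temporal stream' is frame
    differences; both are mean-pooled over time and concatenated into
    one clip-level vector.
    """
    spatial = video.reshape(video.shape[0], -1)                  # (T, H*W)
    temporal = np.diff(video, axis=0).reshape(video.shape[0] - 1, -1)
    return np.concatenate([spatial.mean(axis=0), temporal.mean(axis=0)])
```

At 12x16-pixel resolution, even such a descriptor is only 384 dimensions, which is why robust pooling over the whole sequence matters.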
Structure Tensor Based Image Interpolation Method
Feature preserving image interpolation is an active area in image processing
field. In this paper a new direct edge directed image super-resolution
algorithm based on structure tensors is proposed. Using an isotropic Gaussian
filter, the structure tensor at each pixel of the input image is computed and
the pixels are classified into three distinct classes: uniform regions, corners,
and edges, according to the eigenvalues of the structure tensor. Owing to the
application of the isotropic Gaussian filter, the classification is robust to
noise present in the image. Based on the tangent eigenvector of the structure
tensor, the edge direction is determined and used for interpolation along the
edges. In comparison to some previous edge directed image interpolation
methods, the proposed method achieves higher quality in both subjective and
objective aspects. The proposed method also outperforms previous methods in the
case of noisy and JPEG-compressed images. Furthermore, without the need for
optimization in the process, the algorithm achieves higher speed.
Comment: Accepted for publication in AEU - International Journal of
Electronics and Communications
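A minimal sketch of the eigenvalue-based pixel classification described above; the smoothing details and the threshold `tau` are illustrative assumptions, not the paper's exact rule:

```python
import numpy as np

def structure_tensor_classify(img, sigma=1.0, tau=1e-3):
    """Classify pixels as uniform (0), edge (1), or corner (2) from the
    eigenvalues of the Gaussian-smoothed structure tensor."""
    Iy, Ix = np.gradient(img)                       # image gradients

    def smooth(a):
        # Separable isotropic Gaussian filtering for noise robustness.
        r = int(3 * sigma)
        t = np.arange(-r, r + 1)
        k = np.exp(-t ** 2 / (2 * sigma ** 2))
        k /= k.sum()
        a = np.apply_along_axis(lambda v: np.convolve(v, k, 'same'), 0, a)
        return np.apply_along_axis(lambda v: np.convolve(v, k, 'same'), 1, a)

    Jxx, Jxy, Jyy = smooth(Ix * Ix), smooth(Ix * Iy), smooth(Iy * Iy)
    tr = Jxx + Jyy
    det = Jxx * Jyy - Jxy ** 2
    disc = np.sqrt(np.maximum((tr / 2) ** 2 - det, 0))
    lam1, lam2 = tr / 2 + disc, tr / 2 - disc       # lam1 >= lam2 >= 0

    labels = np.zeros(img.shape, dtype=int)         # 0 = uniform region
    labels[lam1 > tau] = 1                          # 1 = edge (one large eigenvalue)
    labels[lam2 > tau] = 2                          # 2 = corner (both large)
    return labels, lam1, lam2
```

The tangent eigenvector (the one paired with `lam2`) then gives the direction along which to interpolate at edge pixels.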
Director Field Analysis (DFA): Exploring Local White Matter Geometric Structure in diffusion MRI
In Diffusion Tensor Imaging (DTI) or High Angular Resolution Diffusion
Imaging (HARDI), a tensor field or a spherical function field (e.g., an
orientation distribution function field), can be estimated from measured
diffusion weighted images. In this paper, inspired by the microscopic
theoretical treatment of phases in liquid crystals, we introduce a novel
mathematical framework, called Director Field Analysis (DFA), to study local
geometric structural information of white matter based on the reconstructed
tensor field or spherical function field: 1) We propose a set of mathematical
tools to process general director data, which consists of dyadic tensors that
have orientations but no direction. 2) We propose Orientational Order (OO) and
Orientational Dispersion (OD) indices to describe the degree of alignment and
dispersion of a spherical function in a single voxel or in a region,
respectively. 3) We also show how to construct a local orthogonal coordinate
frame in each voxel exhibiting anisotropic diffusion. 4) Finally, we define
three indices to describe three types of orientational distortion (splay, bend,
and twist) in a local spatial neighborhood, and a total distortion index to
describe distortions of all three types. To our knowledge, this is the first
work to quantitatively describe orientational distortion (splay, bend, and
twist) in general spherical function fields from DTI or HARDI data. The
proposed DFA and its related mathematical tools can be used to process not only
diffusion MRI data but also general director field data, and the proposed
scalar indices are useful for detecting local geometric changes of white matter
for voxel-based or tract-based analysis in both DTI and HARDI acquisitions. The
related codes and a tutorial for DFA will be released in DMRITool.
Comment: Accepted by Medical Image Analysis
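The Orientational Order index can be illustrated with the classic liquid-crystal order parameter that inspired the framework; whether this matches the paper's exact OO definition is an assumption, and the function name is ours:

```python
import numpy as np

def orientational_order(directors):
    """Orientational Order of a set of unit directors (sign-invariant).

    directors: (N, 3) unit vectors with orientation but no direction.
    Builds the mean dyadic tensor M = mean(n n^T); its principal
    eigenvector is the main director, and S = (3*lambda_max - 1) / 2 is
    the liquid-crystal order parameter: 1 for perfect alignment, 0 for
    an isotropic distribution.
    """
    M = np.einsum('ni,nj->ij', directors, directors) / len(directors)
    w, v = np.linalg.eigh(M)              # eigenvalues in ascending order
    return (3 * w[-1] - 1) / 2, v[:, -1]  # (order index, main director)
```

Because the dyadic tensor n n^T is unchanged under n -> -n, the measure respects the sign ambiguity of directors that the abstract emphasizes.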
Multimodal Recurrent Neural Networks with Information Transfer Layers for Indoor Scene Labeling
This paper proposes a new method called Multimodal RNNs for RGB-D scene
semantic segmentation. It is optimized to classify image pixels given two input
sources: RGB color channels and Depth maps. It simultaneously performs training
of two recurrent neural networks (RNNs) that are cross-connected through
information transfer layers, which are learnt to adaptively extract relevant
cross-modality features. Each RNN model learns its representations from its own
previous hidden states and transferred patterns from the other RNN's previous
hidden states; thus, both model-specific and cross-modality features are
retained. We exploit the structure of quad-directional 2D-RNNs to model the
short and long range contextual information in the 2D input image. We carefully
designed various baselines to efficiently examine our proposed model structure.
We test our Multimodal RNNs method on popular RGB-D benchmarks and show how it
outperforms previous methods significantly and achieves competitive results
with other state-of-the-art works.
Comment: 15 pages, 13 figures, IEEE TMM 201
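One step of the cross-connected update described above might be sketched as follows; the parameter names, shared shapes, and `tanh` nonlinearity are illustrative assumptions rather than the paper's exact formulation:

```python
import numpy as np

def coupled_rnn_step(h_rgb, h_depth, x_rgb, x_depth, params):
    """One step of two cross-connected RNNs (illustrative shapes/names).

    Each stream updates from its own input and hidden state plus a
    learned 'information transfer' T of the other stream's previous
    hidden state, so both model-specific and cross-modality features
    are retained.
    """
    W, U, T = params['W'], params['U'], params['T']
    new_rgb = np.tanh(W @ x_rgb + U @ h_rgb + T @ h_depth)
    new_depth = np.tanh(W @ x_depth + U @ h_depth + T @ h_rgb)
    return new_rgb, new_depth
```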
Robust Face Recognition with Structural Binary Gradient Patterns
This paper presents a computationally efficient yet powerful binary framework
for robust facial representation based on image gradients, termed
structural binary gradient patterns (SBGP). To discover underlying local
structures in the gradient domain, we compute image gradients from multiple
directions and simplify them into a set of binary strings. The SBGP is derived
from certain types of these binary strings that have meaningful local
structures and are capable of resembling fundamental textural information. They
detect micro orientational edges and possess strong orientation and locality
capabilities, thus enabling great discrimination. The SBGP also benefits from
the advantages of the gradient domain and exhibits profound robustness against
illumination variations. The binary strategy realized by pixel correlations in
a small neighborhood substantially simplifies the computational complexity and
achieves extremely efficient processing with only 0.0032s in Matlab for a
typical face image. Furthermore, the discrimination power of the SBGP can be
enhanced on a set of defined orientational image gradient magnitudes, further
enforcing locality and orientation. Results of extensive experiments on various
benchmark databases illustrate significant improvements of the SBGP based
representations over existing state-of-the-art local descriptors in terms
of discrimination, robustness, and complexity. Codes for the SBGP methods
will be available at
http://www.eee.manchester.ac.uk/research/groups/sisp/software/
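A simplified stand-in for the binary-string encoding described above; the paper's selection of "meaningful" strings and its orientational gradient magnitudes are omitted, and the plain 8-neighbour sign code is an assumption:

```python
import numpy as np

def binary_gradient_code(img):
    """Encode each interior pixel by the signs of its differences to the
    8 neighbours, packed into one 8-bit code per pixel."""
    img = np.asarray(img, dtype=float)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    h, w = img.shape
    center = img[1:h-1, 1:w-1]
    code = np.zeros((h - 2, w - 2), dtype=int)
    for bit, (dy, dx) in enumerate(offsets):
        neigh = img[1+dy:h-1+dy, 1+dx:w-1+dx]     # shifted view of neighbours
        code += (neigh > center).astype(int) << bit
    return code
```

Such sign-only comparisons in a small neighbourhood are what give this family of descriptors its low computational cost and robustness to monotonic illumination changes.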
A Survey on Object Detection in Optical Remote Sensing Images
Object detection in optical remote sensing images, being a fundamental but
challenging problem in the field of aerial and satellite image analysis, plays
an important role in a wide range of applications and has received significant
attention in recent years. While numerous methods exist, a thorough review of the
literature concerning generic object detection is still lacking. This paper
aims to provide a review of the recent progress in this field. Different from
several previously published surveys that focus on a specific object class such
as building and road, we concentrate on more generic object categories
including, but not limited to, roads, buildings, trees, vehicles, ships,
airports, and urban areas. Covering about 270 publications, we survey 1) template
matching-based object detection methods, 2) knowledge-based object detection
methods, 3) object-based image analysis (OBIA)-based object detection methods,
4) machine learning-based object detection methods, and 5) five publicly
available datasets and three standard evaluation metrics. We also discuss the
challenges of current studies and propose two promising research directions,
namely deep learning-based feature representation and weakly supervised
learning-based geospatial object detection. It is our hope that this survey
will help researchers gain a better understanding of this
research field.
Comment: This manuscript is the accepted version for ISPRS Journal of
Photogrammetry and Remote Sensing
Generic 3D Convolutional Fusion for image restoration
Recently, exciting strides have been made in the area of image
restoration, particularly for image denoising and single image
super-resolution. Deep learning techniques contributed to this significantly.
The top methods differ in their formulations and assumptions, so even if their
average performance may be similar, some work better on certain image types and
image regions than others. This complementarity motivated us to propose a novel
3D convolutional fusion (3DCF) method. Unlike other methods adapted to
different tasks, our method uses the exact same convolutional network
architecture to address both image denoising and single image
super-resolution. As a result, our 3DCF method achieves substantial
improvements (0.1dB-0.4dB PSNR) over the state-of-the-art methods that it
fuses, on standard benchmarks for both tasks. At the same time, the
method remains computationally efficient.
Fractional Local Neighborhood Intensity Pattern for Image Retrieval using Genetic Algorithm
In this paper, a new texture descriptor named "Fractional Local Neighborhood
Intensity Pattern" (FLNIP) has been proposed for content based image retrieval
(CBIR). It is an extension of the Local Neighborhood Intensity Pattern
(LNIP)[1]. FLNIP calculates the relative intensity difference between a
particular pixel and the center pixel of a 3x3 window by considering the
relationship with adjacent neighbors. In this work, the fractional change in
the local neighborhood involving the adjacent neighbors has been calculated
first with respect to one of the eight neighbors of the center pixel of a 3x3
window. Next, the fractional change has been calculated with respect to the
center itself. The two values of fractional change are next compared to
generate a binary bit pattern. Both sign and magnitude information are encoded
in a single descriptor as it deals with the relative change in magnitude in the
adjacent neighborhood i.e., the comparison of the fractional change. The
descriptor is applied on four multi-resolution images -- one being the raw
image and the other three being Gaussian-filtered images obtained by applying
Gaussian filters of different standard deviations to the raw image, to signify
the importance of exploring texture information at different resolutions in an
image. The four sets of distances obtained between the query and the target
image are then combined with a genetic algorithm based approach to improve the
retrieval performance by minimizing the distance between similar class images.
The performance of the method has been tested for image retrieval on four
popular databases. The precision and recall values observed on these databases
have been compared with recent state-of-the-art local patterns. The proposed method
has shown a significant improvement over many other existing methods.
Comment: MTAP, Springer (Minor Revision)
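The genetic-algorithm combination of the four distance sets might be sketched as below; the fitness function, mutation scheme, and every parameter are illustrative assumptions, not the paper's exact GA:

```python
import numpy as np

def evolve_distance_weights(dists, same_class, gens=60, pop=30, seed=0):
    """Evolve weights for combining four distance sets (toy GA sketch).

    dists: (4, N) distances for N query/target pairs; same_class: (N,)
    booleans marking same-class pairs. Fitness rewards weight vectors
    that make same-class combined distances small relative to
    different-class ones.
    """
    rng = np.random.default_rng(seed)
    P = rng.random((pop, 4))                        # initial population of weights

    def fitness(w):
        d = w @ dists                               # combined distances
        return d[~same_class].mean() - d[same_class].mean()

    for _ in range(gens):
        f = np.array([fitness(w) for w in P])
        parents = P[np.argsort(f)[-pop // 2:]]      # keep the fittest half
        children = parents + rng.normal(0, 0.05, parents.shape)  # mutate
        P = np.clip(np.vstack([parents, children]), 0, 1)

    f = np.array([fitness(w) for w in P])
    return P[f.argmax()]
```

The idea is simply that distance channels which separate same-class from different-class pairs should receive larger weights in the final retrieval ranking.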