Efficient Retrieval of Images with Irregular Patterns using Morphological Image Analysis: Applications to Industrial and Healthcare datasets
Image retrieval is the process of searching and retrieving images from a
database based on their visual content and features. Recently, much attention
has been directed towards the retrieval of irregular patterns within industrial
or medical images by extracting features from the images, such as deep
features, colour-based features, shape-based features and local features. This
has applications across a spectrum of industries, including fault inspection,
disease diagnosis, and maintenance prediction. This paper proposes an image
retrieval framework to search for images containing similar irregular patterns
by extracting a set of morphological features (DefChars) from images; the
datasets employed in this paper contain wind turbine blade images with defects,
chest computerised tomography scans with COVID-19 infection, heatsink images
with defects, and lake ice images. The proposed framework was evaluated with
different feature extraction methods (DefChars, resized raw image, local binary
pattern, and scale-invariant feature transforms) and distance metrics to
determine the most efficient parameters in terms of retrieval performance
across datasets. The retrieval results show that the proposed framework using
the DefChars and the Manhattan distance metric achieves a mean average
precision of 80% and a low standard deviation of 0.09 across classes of
irregular patterns, outperforming alternative feature-metric combinations
across all datasets. Furthermore, the low standard deviation between each class
highlights DefChars' capability for a reliable image retrieval task, even in
the presence of class imbalances or small-sized datasets.
Comment: 35 pages, 5 figures, 19 tables (17 tables in appendix), submitted to
Special Issue: Advances and Challenges in Multimodal Machine Learning 2nd
Edition, Journal of Imaging, MDP
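The retrieval and scoring steps named in the abstract above (ranking by Manhattan distance, then scoring with average precision) can be illustrated with a minimal sketch. The feature vectors, class labels, and function names below are hypothetical toy values chosen for illustration; they are not DefChars features and not the authors' implementation.

```python
def manhattan(a, b):
    """L1 (Manhattan) distance between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_by_distance(query, database):
    """Return database indices ordered from nearest to farthest from the query."""
    return sorted(range(len(database)), key=lambda i: manhattan(query, database[i]))


def average_precision(ranked_labels, query_label):
    """Mean of precision@k taken at every rank k where a relevant item appears."""
    hits = 0
    precisions = []
    for k, label in enumerate(ranked_labels, start=1):
        if label == query_label:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(precisions) if precisions else 0.0


# Hypothetical toy database: 2-D feature vectors with made-up class labels.
features = [[0.0, 1.0], [0.1, 0.9], [5.0, 5.0], [0.3, 1.2]]
labels = ["crack", "crack", "erosion", "crack"]

query, query_label = [0.0, 1.1], "crack"
ranking = rank_by_distance(query, features)
ap = average_precision([labels[i] for i in ranking], query_label)
```

Averaging `ap` over many queries would give a mean average precision; how the paper aggregates across classes is not specified here, so that step is left out.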
Aggregated Deep Local Features for Remote Sensing Image Retrieval
Remote Sensing Image Retrieval remains a challenging topic due to the special
nature of Remote Sensing Imagery. Such images contain a variety of
semantic objects, which clearly complicates the retrieval task. In this paper,
we present an image retrieval pipeline that uses attentive, local convolutional
features and aggregates them using the Vector of Locally Aggregated Descriptors
(VLAD) to produce a global descriptor. We study various system parameters such
as the multiplicative and additive attention mechanisms and descriptor
dimensionality. We propose a query expansion method that requires no external
inputs. Experiments demonstrate that even without training, the local
convolutional features and global representation outperform other systems.
After system tuning, we can achieve state-of-the-art or competitive results.
Furthermore, we observe that our query expansion method increases overall
system performance by about 3%, using only the top-three retrieved images.
Finally, we show how dimensionality reduction produces compact descriptors with
increased retrieval performance and fast retrieval computation times, e.g. 50%
faster than the current systems.
Comment: Published in Remote Sensing. The first two authors have equal
contributio
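The VLAD aggregation mentioned in the abstract above can be sketched as follows: each local descriptor is hard-assigned to its nearest centroid, the residuals are summed per centroid, and the concatenated result is L2-normalised. This is a plain VLAD sketch under simplifying assumptions; the attention weighting, learned features, and dimensionality reduction described in the abstract are omitted, and the centroids and descriptors are hypothetical toy values.

```python
import math


def vlad(descriptors, centroids):
    """Aggregate local descriptors into a single L2-normalised VLAD vector."""
    dim = len(centroids[0])
    residual_sums = [[0.0] * dim for _ in centroids]
    for d in descriptors:
        # Hard-assign the descriptor to its nearest centroid (squared Euclidean).
        nearest = min(
            range(len(centroids)),
            key=lambda k: sum((x - c) ** 2 for x, c in zip(d, centroids[k])),
        )
        # Accumulate the residual (descriptor minus centroid) for that centroid.
        for j in range(dim):
            residual_sums[nearest][j] += d[j] - centroids[nearest][j]
    # Concatenate the per-centroid sums and L2-normalise the result.
    flat = [v for row in residual_sums for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


# Hypothetical toy codebook (2 centroids) and 3 local descriptors.
toy_centroids = [[0.0, 0.0], [10.0, 10.0]]
toy_descriptors = [[1.0, 0.0], [0.0, 1.0], [11.0, 10.0]]
global_descriptor = vlad(toy_descriptors, toy_centroids)
```

The output length is the number of centroids times the descriptor dimension, which is why descriptor dimensionality is a natural parameter to study in such a pipeline.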
A Sub-block Based Image Retrieval Using Modified Integrated Region Matching
This paper proposes a content-based image retrieval (CBIR) system using the
local colour and texture features of selected image sub-blocks and global
colour and shape features of the image. The image sub-blocks are roughly
identified by segmenting the image into partitions of different configuration,
finding the edge density in each partition using edge thresholding followed by
morphological dilation. The colour and texture features of the identified
regions are computed from the histograms of the quantized HSV colour space and
Gray Level Co-occurrence Matrix (GLCM), respectively. The colour and texture
feature vectors are computed for each region. The shape features are computed
from the Edge Histogram Descriptor (EHD). A modified Integrated Region Matching
(IRM) algorithm is used for finding the minimum distance between the sub-blocks
of the query and target image. Experimental results show that the proposed
method provides better retrieval results than some of the
existing methods.
Comment: 7 page
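The texture step in the abstract above relies on a Gray Level Co-occurrence Matrix. A minimal sketch of building one and deriving a single texture statistic is given below. The choice of offset, the number of gray levels, the contrast statistic, and the toy image are all assumptions for illustration; the abstract does not state which GLCM parameters or statistics the authors use.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count how often gray level i has gray level j at pixel offset (dx, dy)."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            # Skip pairs whose second pixel falls outside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def contrast(matrix):
    """Contrast: co-occurrence frequencies weighted by squared level difference."""
    total = sum(sum(row) for row in matrix)
    weighted = sum(
        (i - j) ** 2 * count
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )
    return weighted / total if total else 0.0


# Hypothetical 3x3 region already quantised to 3 gray levels (0, 1, 2).
region = [
    [0, 0, 1],
    [1, 2, 2],
    [0, 1, 2],
]
cooccurrence = glcm(region, levels=3)
texture_contrast = contrast(cooccurrence)
```

In a sub-block scheme, a statistic like this would be computed once per identified region and placed in that region's texture feature vector.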
Exploiting Deep Features for Remote Sensing Image Retrieval: A Systematic Investigation
Remote sensing (RS) image retrieval is of great significance for geological
information mining. Over the past two decades, a large amount of research on
this task has been carried out, which mainly focuses on the following three
core issues: feature extraction, similarity metric and relevance feedback. Due
to the complexity and multiformity of ground objects in high-resolution remote
sensing (HRRS) images, there is still room for improvement in the current
retrieval approaches. In this paper, we analyze the three core issues of RS
image retrieval and provide a comprehensive review on existing methods.
Furthermore, with the goal of advancing the state of the art in HRRS image
retrieval, we focus on the feature extraction issue and delve into how to use
powerful deep representations to address this task. We conduct a systematic
investigation of the factors that may affect the performance
of deep features. By optimizing each factor, we acquire remarkable retrieval
results on publicly available HRRS datasets. Finally, we explain the
experimental phenomenon in detail and draw conclusions according to our
analysis. Our work can serve as a guide for research on
content-based RS image retrieval
Deep Cross-Modal Correlation Learning for Audio and Lyrics in Music Retrieval
Deep cross-modal learning has demonstrated excellent performance in cross-modal multimedia retrieval, with the aim of learning joint representations between different data modalities. Unfortunately, little research focuses on cross-modal correlation learning where the temporal structures of different data modalities, such as audio and lyrics, should be taken into account. Because music is inherently temporal in structure, we are motivated to learn the deep sequential correlation between audio and lyrics. In this work, we propose a deep cross-modal correlation learning architecture involving two-branch deep neural networks for the audio modality and the text modality (lyrics). Data in different modalities are converted to the same canonical space, where inter-modal canonical correlation analysis is used as an objective function to calculate the similarity of temporal structures. This is the first study that uses deep architectures for learning the temporal correlation between audio and lyrics. A pre-trained Doc2Vec model followed by fully-connected layers is used to represent lyrics. Two significant contributions are made in the audio branch, as follows: i) We propose an end-to-end network to learn cross-modal correlation between audio and lyrics, where feature extraction and correlation learning are performed simultaneously and a joint representation is learned by considering temporal structures. ii) As for feature extraction, we further represent an audio signal by a short sequence of local summaries (VGG16 features) and apply a recurrent neural network to compute a compact feature that better learns the temporal structures of music audio. Experimental results, using audio to retrieve lyrics or using lyrics to retrieve audio, verify the effectiveness of the proposed deep correlation learning architectures in cross-modal music retrieval
- …