289 research outputs found
Low-Rank Hypergraph Hashing for Large-Scale Remote Sensing Image Retrieval
[EN] As remote sensing (RS) images increase dramatically, the demand for remote sensing image retrieval (RSIR) is growing, and has received more and more attention. The characteristics of RS images, e.g., large volume, diversity and high complexity, make RSIR more challenging in terms of speed and accuracy. To reduce the retrieval complexity of RSIR, a hashing technique has been widely used for RSIR, mapping high-dimensional data into a low-dimensional Hamming space while preserving the similarity structure of data. In order to improve hashing performance, we propose a new hash learning method, named low-rank hypergraph hashing (LHH), to accomplish for the large-scale RSIR task. First, LHH employs a l(2-1) norm to constrain the projection matrix to reduce the noise and redundancy among features. In addition, low-rankness is also imposed on the projection matrix to exploit its global structure. Second, LHH uses hypergraphs to capture the high-order relationship among data, and is very suitable to explore the complex structure of RS images. Finally, an iterative algorithm is developed to generate high-quality hash codes and efficiently solve the proposed optimization problem with a theoretical convergence guarantee. Extensive experiments are conducted on three RS image datasets and one natural image dataset that are publicly available. The experimental results demonstrate that the proposed LHH outperforms the existing hashing learning in RSIR tasks.This research was supported in part by the Natural Science Foundation of China under Grant 61673220.Kong, J.; Sun, Q.; Mukherjee, M.; Lloret, J. (2020). Low-Rank Hypergraph Hashing for Large-Scale Remote Sensing Image Retrieval. Remote Sensing. 12(7):1-19. https://doi.org/10.3390/rs1207116411912
Impact of Feature Representation on Remote Sensing Image Retrieval
Remote sensing images are acquired using special platforms, sensors and are classified as aerial, multispectral and hyperspectral images. Multispectral and hyperspectral images are represented using large spectral vectors as compared to normal Red, Green, Blue (RGB) images. Hence, remote sensing image retrieval process from large archives is a challenging task. Remote sensing image retrieval mainly consist of feature representation as first step and finding out similar images to a query image as second step. Feature representation plays important part in the performance of remote sensing image retrieval process. Research work focuses on impact of feature representation of remote sensing images on the performance of remote sensing image retrieval. This study shows that more discriminative features of remote sensing images are needed to improve performance of remote sensing image retrieval process
An Unsupervised Multicode Hashing Method for Accurate and Scalable Remote Sensing Image Retrieval
© 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.Hashing methods have recently attracted great attention for approximate nearest neighbor search in massive remote sensing (RS) image archives due to their computational and storage effectiveness. The existing hashing methods in RS represent each image with a single-hash code that is usually obtained by applying hash functions to global image representations. Such an approach may not optimally represent the complex information content of RS images. To overcome this problem, in this letter, we present a simple yet effective unsupervised method that represents each image with primitive-cluster sensitive multi-hash codes (each of which corresponds to a primitive present in the image). To this end, the proposed method consists of two main steps: 1) characterization of images by descriptors of primitive-sensitive clusters and 2) definition of multi-hash codes from the descriptors of the primitive-sensitive clusters. After obtaining multi-hash codes for each image, retrieval of images is achieved based on a multi-hash-code-matching scheme. Any hashing method that provides single-hash code can be embedded within the proposed method to provide primitive-sensitive multi-hash codes. Compared with state-of-the-art single-code hashing methods in RS, the proposed method achieves higher retrieval accuracy under the same retrieval time, and thus it is more efficient for operational applications.EC/H2020/759764/EU/Accurate and Scalable Processing of Big Data in Earth Observation/BigEart
Deep Hashing Based on Class-Discriminated Neighborhood Embedding
Deep-hashing methods have drawn significant attention during the past years in the field of remote sensing (RS)
owing to their prominent capabilities for capturing the semantics
from complex RS scenes and generating the associated hash codes
in an end-to-end manner. Most existing deep-hashing methods
exploit pairwise and triplet losses to learn the hash codes with
the preservation of semantic-similarities which require the construction of image pairs and triplets based on supervised information (e.g., class labels). However, the learned Hamming spaces
based on these losses may not be optimal due to an insufficient
sampling of image pairs and triplets for scalable RS archives. To
solve this limitation, we propose a new deep-hashing technique
based on the class-discriminated neighborhood embedding, which
can properly capture the locality structures among the RS scenes
and distinguish images class-wisely in the Hamming space. An
extensive experimentation has been conducted in order to validate
the effectiveness of the proposed method by comparing it with
several state-of-the-art conventional and deep-hashing methods.
The related codes of this article will be made publicly available for
reproducible research by the community
Recommended from our members
Fast embedding for image classification & retrieval and its application to the hostel industry
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University LondonContent-based image classification and retrieval are the automatic processes of taking
an unseen image input and extracting its features representing the input image. Then,
for the classification task, this mathematically measured input is categorized according
to established criteria in the server and consequently shows the output as a result. On
the other hand, for the retrieval task, the extracted features of an unseen query image
are sent to the server to search for the most visually similar images to a given image
and retrieve these images as a result. Despite image features could be represented
by classical features, artificial intelligence-based features, Convolutional Neural
Networks (CNN) to be precise, have become powerful tools in the field. Nonetheless,
the high dimensional CNN features have been a challenge in particular for applications
on mobile or Internet of Things devices. Therefore, in this thesis, several fast
embeddings are explored and proposed to overcome the constraints of low memory,
bandwidth, and power. Furthermore, the first hostel image database is created with
three datasets, hostel image dataset containing 13,908 interior and exterior images of
hostels across the world, and Hostels-900 dataset and Hostels-2K dataset containing
972 images and 2,380 images, respectively, of 20 London hostel buildings. The results
demonstrate that the proposed fast embeddings such as the application of GHM-Rand
operator, GHM-Fix operator, and binary feature vectors are able to outperform or give
competitive results to those state-of-the-art methods with a lot less computational
resource. Additionally, the findings from a ten-year literature review of CBIR study in
the tourism industry could picturize the relevant research activities in the past decade
which are not only beneficial to the hostel industry or tourism sector but also to the
computer science and engineering research communities for the potential real-life
applications of the existing and developing technologies in the field
Energy Considerations in Blockchain-Enabled Applications
Blockchain-powered smart systems deployed in different industrial applications promise operational efficiencies and improved yields, while mitigating significant cybersecurity risks pertaining to the main application. Associated tradeoffs between availability and security arise at implementation, however, triggered by the additional resources (e.g., memory, computation) required by each blockchain-enabled host. This thesis applies an energy-reducing algorithmic engineering technique for Merkle Tree root and Proof of Work calculations, two principal elements of blockchain computations, as a means to preserve the promised security benefits but with less compromise to system availability. Using pyRAPL, a python library to measure computational energy, we experiment with both the standard and energy-reduced implementations of the Merkle Tree for different input sizes (in bytes) and of the Proof of Work for different difficulty levels. Our results show up to 98\% reduction in energy consumption is possible within the blockchain\u27s Merkle Tree construction module, such reductions typically increasing with larger input sizes. For Proof-of-Work calculations, our results show an average energy reduction of 20\% across typical difficulty levels. The proposed energy-reducing technique is potentially applicable to other key elements of blockchain computations, potentially affording even greener blockchain-powered systems than implied by only the Merkle Tree and Proof of Work results obtained thus far
A Survey on Evolutionary Computation for Computer Vision and Image Analysis: Past, Present, and Future Trends
Computer vision (CV) is a big and important field
in artificial intelligence covering a wide range of applications.
Image analysis is a major task in CV aiming to extract, analyse
and understand the visual content of images. However, imagerelated
tasks are very challenging due to many factors, e.g., high
variations across images, high dimensionality, domain expertise
requirement, and image distortions. Evolutionary computation
(EC) approaches have been widely used for image analysis with
significant achievement. However, there is no comprehensive
survey of existing EC approaches to image analysis. To fill
this gap, this paper provides a comprehensive survey covering
all essential EC approaches to important image analysis tasks
including edge detection, image segmentation, image feature
analysis, image classification, object detection, and others. This
survey aims to provide a better understanding of evolutionary
computer vision (ECV) by discussing the contributions of different
approaches and exploring how and why EC is used for
CV and image analysis. The applications, challenges, issues, and
trends associated to this research field are also discussed and
summarised to provide further guidelines and opportunities for
future research
Image-set, Temporal and Spatiotemporal Representations of Videos for Recognizing, Localizing and Quantifying Actions
This dissertation addresses the problem of learning video representations, which is defined here as transforming the video so that its essential structure is made more visible or accessible for action recognition and quantification. In the literature, a video can be represented by a set of images, by modeling motion or temporal dynamics, and by a 3D graph with pixels as nodes. This dissertation contributes in proposing a set of models to localize, track, segment, recognize and assess actions such as (1) image-set models via aggregating subset features given by regularizing normalized CNNs, (2) image-set models via inter-frame principal recovery and sparsely coding residual actions, (3) temporally local models with spatially global motion estimated by robust feature matching and local motion estimated by action detection with motion model added, (4) spatiotemporal models 3D graph and 3D CNN to model time as a space dimension, (5) supervised hashing by jointly learning embedding and quantization, respectively. State-of-the-art performances are achieved for tasks such as quantifying facial pain and human diving. Primary conclusions of this dissertation are categorized as follows: (i) Image set can capture facial actions that are about collective representation; (ii) Sparse and low-rank representations can have the expression, identity and pose cues untangled and can be learned via an image-set model and also a linear model; (iii) Norm is related with recognizability; similarity metrics and loss functions matter; (v) Combining the MIL based boosting tracker with the Particle Filter motion model induces a good trade-off between the appearance similarity and motion consistence; (iv) Segmenting object locally makes it amenable to assign shape priors; it is feasible to learn knowledge such as shape priors online from Web data with weak supervision; (v) It works locally in both space and time to represent videos as 3D graphs; 3D CNNs work effectively when inputted with temporally meaningful clips; (vi) the rich labeled images or videos help to learn better hash functions after learning binary embedded codes than the random projections. In addition, models proposed for videos can be adapted to other sequential images such as volumetric medical images which are not included in this dissertation
- …