74 research outputs found
Intelligent Image Retrieval Techniques: A Survey
AbstractIn the current era of digital communication, the use of digital images has increased for expressing, sharing and interpreting information. While working with digital images, quite often it is necessary to search for a specific image for a particular situation based on the visual contents of the image. This task looks easy if you are dealing with tens of images but it gets more difficult when the number of images goes from tens to hundreds and thousands, and the same content-based searching task becomes extremely complex when the number of images is in the millions. To deal with the situation, some intelligent way of content-based searching is required to fulfill the searching request with right visual contents in a reasonable amount of time. There are some really smart techniques proposed by researchers for efficient and robust content-based image retrieval. In this research, the aim is to highlight the efforts of researchers who conducted some brilliant work and to provide a proof of concept for intelligent content-based image retrieval techniques
Recommended from our members
Fast embedding for image classification & retrieval and its application to the hostel industry
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University LondonContent-based image classification and retrieval are the automatic processes of taking
an unseen image input and extracting its features representing the input image. Then,
for the classification task, this mathematically measured input is categorized according
to established criteria in the server and consequently shows the output as a result. On
the other hand, for the retrieval task, the extracted features of an unseen query image
are sent to the server to search for the most visually similar images to a given image
and retrieve these images as a result. Despite image features could be represented
by classical features, artificial intelligence-based features, Convolutional Neural
Networks (CNN) to be precise, have become powerful tools in the field. Nonetheless,
the high dimensional CNN features have been a challenge in particular for applications
on mobile or Internet of Things devices. Therefore, in this thesis, several fast
embeddings are explored and proposed to overcome the constraints of low memory,
bandwidth, and power. Furthermore, the first hostel image database is created with
three datasets, hostel image dataset containing 13,908 interior and exterior images of
hostels across the world, and Hostels-900 dataset and Hostels-2K dataset containing
972 images and 2,380 images, respectively, of 20 London hostel buildings. The results
demonstrate that the proposed fast embeddings such as the application of GHM-Rand
operator, GHM-Fix operator, and binary feature vectors are able to outperform or give
competitive results to those state-of-the-art methods with a lot less computational
resource. Additionally, the findings from a ten-year literature review of CBIR study in
the tourism industry could picturize the relevant research activities in the past decade
which are not only beneficial to the hostel industry or tourism sector but also to the
computer science and engineering research communities for the potential real-life
applications of the existing and developing technologies in the field
A graph-based approach for the retrieval of multi-modality medical images
Medical imaging has revolutionised modern medicine and is now an integral aspect of diagnosis and patient monitoring. The development of new imaging devices for a wide variety of clinical cases has spurred an increase in the data volume acquired in hospitals. These large data collections offer opportunities for search-based applications in evidence-based diagnosis, education, and biomedical research. However, conventional search methods that operate upon manual annotations are not feasible for this data volume. Content-based image retrieval (CBIR) is an image search technique that uses automatically derived visual features as search criteria and has demonstrable clinical benefits. However, very few studies have investigated the CBIR of multi-modality medical images, which are making a monumental impact in healthcare, e.g., combined positron emission tomography and computed tomography (PET-CT) for cancer diagnosis. In this thesis, we propose a new graph-based method for the CBIR of multi-modality medical images. We derive a graph representation that emphasises the spatial relationships between modalities by structurally constraining the graph based on image features, e.g., spatial proximity of tumours and organs. We also introduce a graph similarity calculation algorithm that prioritises the relationships between tumours and related organs. To enable effective human interpretation of retrieved multi-modality images, we also present a user interface that displays graph abstractions alongside complex multi-modality images. Our results demonstrated that our method achieved a high precision when retrieving images on the basis of tumour location within organs. The evaluation of our proposed UI design by user surveys revealed that it improved the ability of users to interpret and understand the similarity between retrieved PET-CT images. The work in this thesis advances the state-of-the-art by enabling a novel approach for the retrieval of multi-modality medical images
Digital Image Access & Retrieval
The 33th Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation with the bulk of the conference focusing on indexing and retrieval.published or submitted for publicatio
A holistic approach to structure from motion
This dissertation investigates the general structure from motion problem. That is, how to compute in an unconstrained environment 3D scene structure, camera motion and moving objects from video sequences. We present a framework which uses concatenated feed-back loops to overcome the main difficulty in the structure from motion problem: the chicken-and-egg dilemma between scene segmentation and structure recovery. The idea is that we compute structure and motion in stages by gradually computing 3D scene information of increasing complexity and using processes which operate on increasingly large spatial image areas. Within this framework, we developed three modules. First, we introduce a new constraint for the estimation of shape using image features from multiple views. We analyze this constraint and show that noise leads to unavoidable mis-estimation of the shape, which also predicts the erroneous shape perception in human. This insight provides a clear argument for the need for feed-back loops. Second, a novel constraint on shape is developed which allows us to connect multiple frames in the estimation of camera motion by matching only small image patches. Third, we present a texture descriptor for matching areas of extended sizes. The advantage of this texture descriptor, which is based on fractal geometry, lies in its invariance to any smooth mapping (Bi-Lipschitz transform) including changes of viewpoint, illumination and surface distortion. Finally, we apply our framework to the problem of super-resolution imaging. We use the 3D motion estimation together with a novel wavelet-based reconstruction scheme to reconstruct a high-resolution image from a sequence of low-resolution images
Recommended from our members
Interactive Imaging via Hand Gesture Recognition.
With the growth of computer power, Digital Image Processing plays a more and more important role in the modern world, including the field of industry, medical, communications, spaceflight technology etc. As a sub-field, Interactive Image Processing emphasizes particularly on the communications between machine and human. The basic flowchart is definition of object, analysis and training phase, recognition and feedback. Generally speaking, the core issue is how we define the interesting object and track them more accurately in order to complete the interaction process successfully.
This thesis proposes a novel dynamic simulation scheme for interactive image processing. The work consists of two main parts: Hand Motion Detection and Hand Gesture recognition. Within a hand motion detection processing, movement of hand will be identified and extracted. In a specific detection period, the current image is compared with the previous image in order to generate the difference between them. If the generated difference exceeds predefined threshold alarm, a typical hand motion movement is detected. Furthermore, in some particular situations, changes of hand gesture are also desired to be detected and classified. This task requires features extraction and feature comparison among each type of gestures. The essentials of hand gesture are including some low level features such as color, shape etc. Another important feature is orientation histogram. Each type of hand gestures has its particular representation in the domain of orientation histogram. Because Gaussian Mixture Model has great advantages to represent the object with essential feature elements and the Expectation-Maximization is the efficient procedure to compute the maximum likelihood between testing images and predefined standard sample of each different gesture, the comparability between testing image and samples of each type of gestures will be estimated by Expectation-Maximization algorithm in Gaussian Mixture Model. The performance of this approach in experiments shows the proposed method works well and accurately
Similarity reasoning for local surface analysis and recognition
This thesis addresses the similarity assessment of digital shapes, contributing to the analysis of surface characteristics that are independent of the global shape but are crucial to identify a model as belonging to the same manufacture, the same origin/culture or the same typology (color, common decorations, common feature elements, compatible style elements, etc.). To face this problem, the interpretation of the local surface properties is crucial.
We go beyond the retrieval of models or surface patches in a collection of models, facing the recognition of geometric patterns across digital models with different overall shape. To address this challenging problem, the use of both engineered and learning-based descriptions are investigated, building one of the first contributions towards the localization and identification of geometric patterns on digital surfaces. Finally, the recognition of patterns adds a further perspective in the exploration of (large) 3D data collections, especially in the cultural heritage domain.
Our work contributes to the definition of methods able to locally characterize the geometric and colorimetric surface decorations. Moreover, we showcase our benchmarking activity carried out in recent years on the identification of geometric features and the retrieval of digital models completely characterized by geometric or colorimetric patterns
- …