377 research outputs found
Intelligent Image Retrieval Techniques: A Survey
In the current era of digital communication, the use of digital images for expressing, sharing and interpreting information has increased. When working with digital images, it is often necessary to search for a specific image based on its visual contents. This task looks easy with tens of images, but it becomes more difficult as the number grows to hundreds and thousands, and extremely complex when the collection holds millions of images. To deal with this situation, an intelligent method of content-based searching is required that returns the right visual contents in a reasonable amount of time. Researchers have proposed a number of techniques for efficient and robust content-based image retrieval. This survey aims to highlight that work and to provide a proof of concept for intelligent content-based image retrieval techniques.
Retrieving Encrypted Images Using Convolution Neural Network and Fully Homomorphic Encryption
Content-based image retrieval (CBIR) is a technique used to retrieve images from an image database. However, CBIR suffers from low accuracy when retrieving images from a large image database, and it does not by itself ensure the privacy of images. This paper addresses the accuracy issue using deep learning, specifically a convolutional neural network (CNN), and provides the necessary privacy using the fully homomorphic encryption scheme of Cheon, Kim, Kim, and Song (CKKS). To achieve these aims, a system named RCNN_CKKS is proposed that includes two parts. The first part (offline processing) automatically extracts high-level features from the flatten layer of a CNN and stores them in a new dataset. In the second part (online processing), the client sends the encrypted image to the server, which relies on the trained CNN model to extract features of the received image. The extracted features are then compared with the stored features using the Hamming distance to retrieve all similar images. Finally, the server encrypts all retrieved images and sends them to the client. Deep-learning results on plain images were 97.94% for classification and 98.94% for image retrieval. The NIST test was used to check the security of CKKS when applied to the Canadian Institute for Advanced Research (CIFAR-10) dataset. From these results, the researchers conclude that deep learning is an effective method for image retrieval and that CKKS is appropriate for protecting image privacy.
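The comparison step described in this abstract can be illustrated with a minimal sketch. It is not the authors' RCNN_CKKS system: the CNN feature extractor and the CKKS encryption are omitted entirely, the feature vectors are made up, and the thresholding in `binarize` is an assumption introduced only so that a Hamming distance can be computed on bit vectors.

```python
from typing import Dict, List, Tuple


def binarize(features: List[float], threshold: float = 0.0) -> List[int]:
    """Map a real-valued feature vector to bits (assumed step, not from the paper)."""
    return [1 if value > threshold else 0 for value in features]


def hamming_distance(a: List[int], b: List[int]) -> int:
    """Count the positions at which two equal-length bit vectors differ."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def retrieve(query: List[int], index: Dict[str, List[int]],
             top_k: int = 3) -> List[Tuple[str, int]]:
    """Return the top_k stored images closest to the query, nearest first."""
    scored = [(name, hamming_distance(query, code)) for name, code in index.items()]
    # Sort by distance, then by name so that ties are broken deterministically.
    scored.sort(key=lambda pair: (pair[1], pair[0]))
    return scored[:top_k]


# Hypothetical stored features (the "offline" part) and a hypothetical query.
index = {
    "cat_01": binarize([0.9, -0.2, 0.4, -0.7]),
    "cat_02": binarize([0.8, -0.1, 0.3, 0.2]),
    "truck_01": binarize([-0.5, 0.6, -0.3, 0.9]),
}
query = binarize([0.7, -0.3, 0.5, -0.1])
results = retrieve(query, index, top_k=2)
print(results)
```

Sorting on the pair (distance, name) keeps the ranking stable when several stored images are equally close to the query.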
Image Retrieval Method Combining Bayes and SVM Classifier Based on Relevance Feedback with Application to Small-scale Datasets
A vast number of images has been generated owing to the diversity and digitalization of image acquisition devices. However, the gap between low-level visual features and high-level semantic representations remains a major concern that hinders retrieval accuracy. This study formulates a retrieval method based on a transfer learning model and the relevance feedback technique to optimize the trade-off between structural complexity and retrieval performance in small- and medium-scale content-based image retrieval (CBIR) systems. First, a pretrained deep learning model is fine-tuned to extract features from the target dataset. Then, the target dataset is divided into a relative and an irrelative image library using the Bayes classifier. Next, the support vector machine (SVM) classifier is used to retrieve similar images from the relative library. Finally, the relevance feedback technique is employed to update the parameters of both classifiers iteratively until the retrieval request is met. Results demonstrate that the proposed method achieves an F1-score of 95.87% for classification, which surpasses that of the suboptimal approach DCNN-BSVM by 6.76%. The proposed method also outperforms other approaches on the retrieval criteria of average precision, average recall, and mean average precision. The study indicates that the combined Bayes + SVM classifier achieves the best results more efficiently than either the Bayes or the SVM classifier alone under the transfer learning framework. In terms of feature extraction, transfer learning outperforms training from scratch. This study provides a reference for applications of small- and medium-scale CBIR systems with inadequate samples.
Extending the 5S Framework of Digital Libraries to support Complex Objects, Superimposed Information, and Content-Based Image Retrieval Services
Advanced services in digital libraries (DLs) have been developed and widely used to address the required capabilities of an assortment of systems as DLs expand into diverse application domains. These systems may require support for images (e.g., Content-Based Image Retrieval), Complex (information) Objects, and use of content at fine grain (e.g., Superimposed Information). Due to the lack of consensus on precise theoretical definitions for those services, implementation efforts often involve ad hoc development, leading to duplication and interoperability problems. This article presents a methodology to address those problems by extending a precisely specified minimal digital library (in the 5S framework) with formal definitions of the aforementioned services. The theoretical extensions of digital library functionality presented here are reinforced with practical case studies as well as scenarios for the individual and integrative use of services to balance theory and practice. This methodology has implications that other advanced services can be continuously integrated into our current extended framework whenever they are identified. The theoretical definitions and case study we present may impact future development efforts and a wide range of digital library researchers, designers, and developers.
An Intelligent Multi-Resolutional and Rotational Invariant Texture Descriptor for Image Retrieval Systems
Finding identical or similar images in large databases of rotated images with high retrieval accuracy and low retrieval time is a challenging task in content-based image retrieval (CBIR) systems. To address this problem, an intelligent and efficient technique is proposed for texture images. First, a new joint feature vector is created that combines the properties of the local binary pattern (LBP), which is robust to changes in illumination and rotation, and the discrete wavelet transform (DWT), which is multi-resolutional and multi-oriented with high directionality. Second, to increase the accuracy of the system, classifiers are applied to the combined LBP and DWT feature vector. Two machine learning classifiers are evaluated: the support vector machine (SVM) and the extreme learning machine (ELM). Both proposed methods, P1 (LBP+DWT+SVM) and P2 (LBP+DWT+ELM), are tested on the rotated Brodatz dataset of 1456 texture images and the MIT VisTex dataset of 640 images. In both experiments, the proposed methods perform much better than the simple combination of DWT+LBP and many other state-of-the-art methods in terms of precision and accuracy across different numbers of retrieved images. ELM shows some further improvement over SVM: when the top 25 images are retrieved, precision with the ELM classifier reaches up to 94% on the Brodatz database and up to 96% on the MIT VisTex database, which is superior to other existing texture retrieval methods.
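The LBP half of the joint feature vector can be sketched as follows. This is a textbook-style illustration under stated assumptions, not the descriptor from the paper: the 8-neighbour ordering, the `>=` comparison, and the minimum-over-bit-rotations mapping are choices made here, and the DWT features and the SVM/ELM classifiers are left out.

```python
from typing import List

Image = List[List[int]]

# The 8 neighbours of a pixel in clockwise order, starting at the top-left.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image: Image, row: int, col: int) -> int:
    """Set one bit per neighbour whose grey value is >= the centre pixel."""
    center = image[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def rotation_invariant(code: int) -> int:
    """Return the smallest value among the 8 circular bit rotations of the code."""
    best = code
    for _ in range(7):
        code = ((code >> 1) | ((code & 1) << 7)) & 0xFF
        best = min(best, code)
    return best


def lbp_histogram(image: Image) -> List[int]:
    """Histogram of rotation-invariant codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[rotation_invariant(lbp_code(image, r, c))] += 1
    return hist


patch = [[10, 20, 30],
         [40, 50, 60],
         [70, 80, 90]]
# The same patch rotated by 90 degrees clockwise.
rotated = [list(row) for row in zip(*patch[::-1])]
print(lbp_histogram(patch) == lbp_histogram(rotated))
```

On a pixel grid this mapping is exactly invariant only to rotations that permute the eight neighbours, such as the 90-degree turn used here; arbitrary angles would need interpolated sampling.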
Image Information Retrieval based on Edge Responses, Shape and Texture Features using Datamining Techniques
The present paper proposes a new technique that extracts significant structural, texture and local edge features from images. The local features are extracted by a steady local edge response that is robust to noise and illumination changes. The local edge response image is converted into a ternary pattern image based on a local threshold. The structural features are derived by extracting shapes in the form of textons. The texture features are derived by constructing a grey level co-occurrence matrix (GLCM) on the derived texton image. A new variant of the K-means clustering scheme is proposed for clustering images. The proposed method is compared with various image retrieval methods based on data mining techniques. The experimental results on the Wang dataset show the efficacy of the proposed method over the other methods.
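The GLCM step mentioned in this abstract can be sketched in a few lines. This is the generic co-occurrence construction, not the paper's pipeline: the small quantized image below merely stands in for the derived texton image, and the horizontal offset and the contrast statistic are assumptions chosen for illustration.

```python
from typing import List

Image = List[List[int]]


def glcm(image: Image, levels: int, dr: int = 0, dc: int = 1) -> List[List[int]]:
    """Count how often grey level i has grey level j at offset (dr, dc)."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            # Skip pairs whose second pixel falls outside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def contrast(matrix: List[List[int]]) -> float:
    """Contrast: squared grey-level difference weighted by normalized counts."""
    total = sum(sum(row) for row in matrix)
    weighted = sum((i - j) ** 2 * count
                   for i, row in enumerate(matrix)
                   for j, count in enumerate(row))
    return weighted / total


# Hypothetical 3-level image standing in for a texton image.
image = [[0, 0, 1],
         [1, 2, 2],
         [0, 1, 2]]
matrix = glcm(image, levels=3)
print(matrix, contrast(matrix))
```

Other offsets (vertical, diagonal) are obtained by changing `dr` and `dc`; texture descriptors typically combine statistics from several offsets.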
Image retrieval : a first step for a human centered approach
Image indexing using content analysis is known to be a difficult task involving the vision research domain. Using these tools in a retrieval system is generally frustrating for users, owing to a lack of interface development and to the difficulty users have in understanding the low-level features managed by the system. In this paper we propose a general approach for introducing a link between such systems and potential users. It includes image features based on visual perception models, a relevance feedback model, and a graphical interface for expressing the information need through user-system interaction.
Multi modal multi-semantic image retrieval
PhD thesis. The rapid growth in the volume of visual information, e.g. images and video, can
overwhelm users’ ability to find and access the specific visual information of interest
to them. In recent years, ontology knowledge-based (KB) image information retrieval
techniques have been adopted in an attempt to extract knowledge from these
images, enhancing the retrieval performance. A KB framework is presented to
promote semi-automatic annotation and semantic image retrieval using multimodal
cues (visual features and text captions). In addition, a hierarchical structure for the KB
allows metadata to be shared that supports multi-semantics (polysemy) for concepts.
The framework builds up an effective knowledge base pertaining to a domain specific
image collection, e.g. sports, and is able to disambiguate and assign high level
semantics to ‘unannotated’ images.
Local feature analysis of visual content, namely using Scale Invariant Feature
Transform (SIFT) descriptors, has been deployed in the ‘Bag of Visual Words’
model (BVW) as an effective method to represent visual content information and to
enhance its classification and retrieval. Local features are more useful than global
features, e.g. colour, shape or texture, as they are invariant to image scale, orientation
and camera angle. An innovative approach is proposed for the representation,
annotation and retrieval of visual content using a hybrid technique based upon the use
of an unstructured visual word and upon a (structured) hierarchical ontology KB
model. The structural model facilitates the disambiguation of unstructured visual
words and a more effective classification of visual content, compared to a vector
space model, through exploiting local conceptual structures and their relationships.
The key contributions of this framework in using local features for image
representation include: first, a method to generate visual words using the semantic
local adaptive clustering (SLAC) algorithm which takes term weight and spatial
locations of keypoints into account. Consequently, the semantic information is
preserved. Second, a technique is used to detect the domain-specific ‘non-informative
visual words’ which are ineffective at representing the content of visual data and
degrade its categorisation ability. Third, a method to combine an ontology model with
a visual word model to resolve synonym (visual heterogeneity) and polysemy
problems, is proposed. The experimental results show that this approach can discover
semantically meaningful visual content descriptions and recognise specific events,
e.g., sports events, depicted in images efficiently.
Since discovering the semantics of an image is an extremely challenging problem, one
promising approach to enhance visual content interpretation is to use any associated
textual information that accompanies an image, as a cue to predict the meaning of an
image, by transforming this textual information into a structured annotation for an
image, e.g. using XML, RDF, OWL or MPEG-7. Although text and image are distinct
types of information representation and modality, there are some strong, invariant,
implicit, connections between images and any accompanying text information.
Semantic analysis of image captions can be used by image retrieval systems to
retrieve selected images more precisely. To do this, Natural Language Processing
(NLP) is first used to extract concepts from image captions. Next, an
ontology-based knowledge model is deployed in order to resolve natural language
ambiguities. To deal with the accompanying text information, two methods to extract
knowledge from textual information have been proposed. First, metadata can be
extracted automatically from text captions and restructured with respect to a semantic
model. Second, the use of LSI in relation to a domain-specific ontology-based
knowledge model enables the combined framework to tolerate ambiguities and
variations (incompleteness) of metadata. The use of the ontology-based knowledge
model allows the system to find indirectly relevant concepts in image captions and
thus leverage these to represent the semantics of images at a higher level.
Experimental results show that the proposed framework significantly enhances image
retrieval and leads to narrowing of the semantic gap between lower-level machine-derived
and higher-level human-understandable conceptualisation.
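The ‘Bag of Visual Words’ representation referred to in this abstract can be illustrated with a minimal sketch. It does not implement the thesis's SLAC clustering, the non-informative word filter, or the ontology model: the vocabulary is given rather than learned, the 2-D descriptors are hypothetical stand-ins for SIFT descriptors, and plain nearest-centroid assignment is assumed.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor: Vector, vocabulary: List[Vector]) -> int:
    """Index of the visual word (cluster centre) closest to the descriptor."""
    return min(range(len(vocabulary)),
               key=lambda i: squared_distance(descriptor, vocabulary[i]))


def bovw_histogram(descriptors: List[Vector], vocabulary: List[Vector]) -> List[int]:
    """Count how many local descriptors of one image fall on each visual word."""
    hist = [0] * len(vocabulary)
    for descriptor in descriptors:
        hist[nearest_word(descriptor, vocabulary)] += 1
    return hist


# Hypothetical vocabulary of three visual words and four local descriptors.
vocabulary = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
descriptors = [[0.1, 0.0], [0.9, 1.1], [0.8, 0.9], [0.1, 0.9]]
histogram = bovw_histogram(descriptors, vocabulary)
print(histogram)
```

The resulting fixed-length histogram is what makes images with different numbers of keypoints comparable in a vector space model.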
- …