93,945 research outputs found
Multi modal multi-semantic image retrieval
PhDThe rapid growth in the volume of visual information, e.g. image, and video can
overwhelm users’ ability to find and access the specific visual information of interest
to them. In recent years, ontology knowledge-based (KB) image information retrieval
techniques have been adopted into in order to attempt to extract knowledge from these
images, enhancing the retrieval performance. A KB framework is presented to
promote semi-automatic annotation and semantic image retrieval using multimodal
cues (visual features and text captions). In addition, a hierarchical structure for the KB
allows metadata to be shared that supports multi-semantics (polysemy) for concepts.
The framework builds up an effective knowledge base pertaining to a domain specific
image collection, e.g. sports, and is able to disambiguate and assign high level
semantics to ‘unannotated’ images.
Local feature analysis of visual content, namely using Scale Invariant Feature
Transform (SIFT) descriptors, have been deployed in the ‘Bag of Visual Words’
model (BVW) as an effective method to represent visual content information and to
enhance its classification and retrieval. Local features are more useful than global
features, e.g. colour, shape or texture, as they are invariant to image scale, orientation
and camera angle. An innovative approach is proposed for the representation,
annotation and retrieval of visual content using a hybrid technique based upon the use
of an unstructured visual word and upon a (structured) hierarchical ontology KB
model. The structural model facilitates the disambiguation of unstructured visual
words and a more effective classification of visual content, compared to a vector
space model, through exploiting local conceptual structures and their relationships.
The key contributions of this framework in using local features for image
representation include: first, a method to generate visual words using the semantic
local adaptive clustering (SLAC) algorithm which takes term weight and spatial
locations of keypoints into account. Consequently, the semantic information is
preserved. Second a technique is used to detect the domain specific ‘non-informative
visual words’ which are ineffective at representing the content of visual data and
degrade its categorisation ability. Third, a method to combine an ontology model with
xi
a visual word model to resolve synonym (visual heterogeneity) and polysemy
problems, is proposed. The experimental results show that this approach can discover
semantically meaningful visual content descriptions and recognise specific events,
e.g., sports events, depicted in images efficiently.
Since discovering the semantics of an image is an extremely challenging problem, one
promising approach to enhance visual content interpretation is to use any associated
textual information that accompanies an image, as a cue to predict the meaning of an
image, by transforming this textual information into a structured annotation for an
image e.g. using XML, RDF, OWL or MPEG-7. Although, text and image are distinct
types of information representation and modality, there are some strong, invariant,
implicit, connections between images and any accompanying text information.
Semantic analysis of image captions can be used by image retrieval systems to
retrieve selected images more precisely. To do this, a Natural Language Processing
(NLP) is exploited firstly in order to extract concepts from image captions. Next, an
ontology-based knowledge model is deployed in order to resolve natural language
ambiguities. To deal with the accompanying text information, two methods to extract
knowledge from textual information have been proposed. First, metadata can be
extracted automatically from text captions and restructured with respect to a semantic
model. Second, the use of LSI in relation to a domain-specific ontology-based
knowledge model enables the combined framework to tolerate ambiguities and
variations (incompleteness) of metadata. The use of the ontology-based knowledge
model allows the system to find indirectly relevant concepts in image captions and
thus leverage these to represent the semantics of images at a higher level.
Experimental results show that the proposed framework significantly enhances image
retrieval and leads to narrowing of the semantic gap between lower level machinederived
and higher level human-understandable conceptualisation
Enhancing Automatic Annotation for Optimal Image Retrieval
Image search and retrieval based on content is very cumbersome task particularly when the image database is large. The accuracy of the retrieval as well as the processing speed are two important measures used for assessing and comparing the effectiveness of various systems.
Text retrieval is more mature and advanced than image content retrieval. In this dissertation, the focus is on converting image content into text tags that can be easily searched using standard search engines where the size and speed issues of the database have been already dealt with.
Therefore, image tagging becomes an essential tool for image retrieval from large image databases. Automation of image tagging has received considerable attention by many researchers in recent years. The optimal goal of image description is to automatically annotate images with tags that semantically represent the image content. The speed and accuracy of Image retrieval from large databases are few of the important domains that can benefit from automatic tagging.
In this work, several state of the art image classification and image tagging techniques are reviewed. We propose a new self-learning multilayered tagging framework that can address the limitations of current approaches and provide mutual accuracy improvement between the recognition layer and the annotation layer. Our results indicate that the proposed framework can improve the overall accuracy of information retrieval in a variety of image databases
Shape-Based Single Object Classification Using Ensemble Method Classifiers
Nowadays, more and more images are available. Annotation and retrieval of the images pose classification problems, where each class is defined as the group of database images labelled with a common semantic label. Various systems have been proposed for content-based retrieval, as well as for image classification and indexing. In this paper, a hierarchical classification framework has been proposed for bridging the semantic gap effectively and achieving multi-category image classification. A well-known pre-processing and post-processing method was used and applied to three problems; image segmentation, object identification and image classification. The method was applied to classify single object images from Amazon and Google datasets. The classification was tested for four different classifiers; BayesNetwork (BN), Random Forest (RF), Bagging and Vote. The estimated classification accuracies ranged from 20% to 99% (using 10-fold cross validation). The Bagging classifier presents the best performance, followed by the Random Forest classifier
An Efficient Perceptual of Content Based Image Retrieval System Using SVM and Evolutionary Algorithms
The CBIR tends to index and retrieve images based on their visual content. CBIR avoids several issues related to traditional ways that of retrieving images by keywords. Thus, a growing interest within the area of CBIR has been established in recent years. The performance of a CBIR system mainly depends on the particular image representation and similarity matching operate utilized. The CBIR tends to index and retrieve images supported their visual content. CBIR avoids several issues related to traditional ways that of retrieving images by keywords. Thus, a growing interest within the area of CBIR has been established in recent years. The performance of a CBIR system principally depends on the actual image illustration and similarity matching operate utilized. therefore a replacement CBIR system is projected which can give accurate results as compared to previously developed systems. This introduces the new composite framework for image classification in a content-based image retrieval system. The projected composite framework uses an evolutionary algorithm to select training samples for support vector machine (SVM). to style such a system, the most popular techniques of content-based image retrieval are reviewed initial. Our review reveals some limitations of the existing techniques, preventing them to accurately address some issues
Recommended from our members
Image processing and understanding based on graph similarity testing: algorithm design and software development
Image processing and understanding is a key task in the human visual system. Among all related topics, content based image retrieval and classification is the most typical and important problem. Successful image retrieval/classification models require an effective fundamental step of image representation and feature extraction. While traditional methods are not capable of capturing all structural information on the image, using graph to represent the image is not only biologically plausible but also has certain advantages.
Graphs have been widely used in image related applications. Traditional graph-based image analysis models include pixel-based graph-cut techniques for image segmentation, low-level and high-level image feature extraction based on graph statistics and other related approaches which utilize the idea of graph similarity testing. To compare the images through their graph representations, a graph similarity testing algorithm is essential. Most of the existing graph similarity measurement tools are not designed for generic tasks such as image classification and retrieval, and some other models are either not scalable or not always effective. Graph spectral theory is a powerful analytical tool for capturing and representing structural information of the graph, but to use it on image understanding remains a challenge.
In this dissertation, we focus on developing fast and effective image analysis models based on the spectral graph theory and other graph related mathematical tools. We first propose a fast graph similarity testing method based on the idea of the heat content and the mathematical theory of diffusion over manifolds. We then demonstrate the ability of our similarity testing model by comparing random graphs and power law graphs. Based on our graph analysis model, we develop a graph-based image representation and understanding framework. We propose the image heat content feature at first and then discuss several approaches to further improve the model. The first component in our improved framework is a novel graph generation model. The proposed model greatly reduces the size of the traditional pixel-based image graph representation and is shown to still be effective in representing an image. Meanwhile, we propose and discuss several low-level and high-level image features based on spectral graph information, including oscillatory image heat content, weighted eigenvalues and weighted heat content spectrum. Experiments show that the proposed models are invariant to non-structural changes on images and perform well in standard image classification benchmarks. Furthermore, our image features are robust to small distortions and changes of viewpoint. The model is also capable of capturing important image structural information on the image and performs well alone or in combination with other traditional techniques. We then introduce two real world software development projects using graph-based image processing techniques in this dissertation. Finally, we discuss the pros, cons and the intuition of our proposed model by demonstrating the properties of the proposed image feature and the correlation between different image features
Stochasticity: A Feature for the Structuring of Large and Heterogeneous Image Databases
International audienceThe paper addresses image feature characterization and the structuring of large and heterogeneous image databases through the stochasticity or randomness appearance. Measuring stochasticity involves finding suitable representations that can significantly reduce statistical dependencies of any order. Wavelet packet representations provide such a framework for a large class of stochastic processes through an appropriate dictionary of parametric models. From this dictionary and the Kolmogorov stochasticity index, the paper proposes semantic stochasticity templates upon wavelet packet sub-bands in order to provide high level classification and content-based image retrieval. The approach is shown to be relevant for texture images
- …