A Novel Semantic Statistical Model for Automatic Image Annotation Using the Relationship between the Regions Based on Multi-Criteria Decision Making
Automatic image annotation has emerged as an important research topic due to the semantic gap and its potential applications in image retrieval and management. In this paper we present an approach that combines regional contexts and visual topics for automatic image annotation. Regional contexts model the relationships between regions, whereas visual topics provide the global distribution of topics over an image. Conventional image annotation methods neglect the relationships between the regions in an image, yet these regions express the image's semantics, so considering the relationships between them helps annotate the images. The proposed model extracts regional contexts and visual topics from the image and combines them with an MCDM (Multi-Criteria Decision Making) approach based on the TOPSIS (Technique for Order Preference by Similarity to the Ideal Solution) method. Regional contexts and visual topics are learned by PLSA (Probabilistic Latent Semantic Analysis) from the training data. Experiments on 5k Corel images show that integrating these two kinds of information is beneficial to image annotation. DOI: http://dx.doi.org/10.11591/ijece.v4i1.459
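The TOPSIS step can be illustrated independently of the annotation pipeline. The sketch below ranks candidate alternatives by closeness to the ideal solution; the two criteria (a regional-context score and a visual-topic score), their weights, and the example values are made-up illustrations, not the paper's actual features.

```python
import numpy as np

def topsis(matrix, weights):
    """Rank alternatives (rows) against criteria (columns) with TOPSIS."""
    m = np.asarray(matrix, dtype=float)
    # Vector-normalize each criterion column.
    v = (m / np.linalg.norm(m, axis=0)) * np.asarray(weights, dtype=float)
    # Ideal and anti-ideal solutions (assuming all criteria are benefits).
    ideal, anti = v.max(axis=0), v.min(axis=0)
    d_pos = np.linalg.norm(v - ideal, axis=1)
    d_neg = np.linalg.norm(v - anti, axis=1)
    # Closeness coefficient: 1 = at the ideal, 0 = at the anti-ideal.
    return d_neg / (d_pos + d_neg)

# Toy example: three hypothetical candidate labels scored on two criteria
# (regional-context score, visual-topic score), equally weighted.
scores = topsis([[0.8, 0.2], [0.5, 0.5], [0.1, 0.9]], weights=[0.5, 0.5])
```

The closeness coefficients can then be used directly to rank candidate annotations.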
Automatic Image Annotation Based on Particle Swarm Optimization and Support Vector Clustering
With the progress of network technology, there are more and more digital images on the internet. However, most images are not semantically annotated, which makes them difficult to retrieve and use. In this paper, a new algorithm is proposed to annotate images automatically based on particle swarm optimization (PSO) and support vector clustering (SVC). The algorithm has two stages: first, the PSO algorithm is used to optimize the SVC; second, the trained SVC is used to annotate images automatically. In the experiments, three datasets are used to evaluate the algorithm, and the results show its effectiveness.
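As a rough illustration of the first stage, a minimal PSO loop can be sketched. The objective below is a stand-in sphere function; in the paper's setting it would instead be the SVC clustering error as a function of the kernel and regularization parameters being tuned.

```python
import numpy as np

rng = np.random.default_rng(0)

def pso_minimize(objective, dim, n_particles=20, iters=50, bounds=(-5.0, 5.0)):
    """Minimal particle swarm optimizer: returns the best position found."""
    lo, hi = bounds
    pos = rng.uniform(lo, hi, (n_particles, dim))
    vel = np.zeros_like(pos)
    pbest, pbest_val = pos.copy(), np.array([objective(p) for p in pos])
    gbest = pbest[pbest_val.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        # Inertia + cognitive pull (personal best) + social pull (global best).
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        vals = np.array([objective(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest

# Stand-in objective (sphere function); a real run would plug in the
# SVC training/validation error here.
best = pso_minimize(lambda p: float(np.sum(p ** 2)), dim=2)
```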
Image Tagging using Modified Association Rule based on Semantic Neighbors
With the rapid development of the internet, mobile devices, and social image-sharing websites, a large number of images are generated daily. This huge repository of images poses challenges for an image retrieval system. On image-sharing social websites such as Flickr, users can assign keywords/tags to images to describe their content. These tags play an important role in an image retrieval system. However, user-assigned tags are highly personalized, which brings many challenges for image retrieval. Thus, it is necessary to suggest appropriate tags for the images.
Existing methods for tag recommendation based on nearest neighbors ignore the relationships between tags. In this paper, a method is proposed for tag recommendation based on semantic neighbors using a modified association rule. Given an image, the method identifies its semantic neighbors using a random forest based on the weight assigned to each category. The tags associated with the semantic neighbors are used as candidate tags. The candidate tags are expanded by mining tags using modified association rules, where each semantic neighbor is considered a transaction. In the modified association rules, the probability of each tag is calculated using TF-IDF and a confidence value.
Experiments are conducted on the Flickr, NUS-WIDE, and Corel-5k datasets. The proposed method gives better performance than existing tag recommendation methods.
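A much-simplified version of the candidate-tag scoring can be sketched by treating each neighbor's tag list as a transaction and combining a rule-confidence term with an IDF-style rarity weight. The exact weighting in the paper's modified association rules differs; this is only an illustrative stand-in.

```python
import math
from collections import Counter

def score_candidate_tags(neighbor_tags, min_support=1):
    """Rank candidate tags mined from semantic neighbors' tag lists.

    Each neighbor's tag list acts as one transaction. A tag's score is
    confidence (fraction of transactions containing it) times an
    IDF-style weight that boosts rarer tags."""
    n = len(neighbor_tags)
    support = Counter(t for tags in neighbor_tags for t in set(tags))
    scores = {}
    for tag, cnt in support.items():
        if cnt < min_support:
            continue
        confidence = cnt / n              # P(tag | semantic neighbor)
        idf = math.log(n / cnt) + 1.0     # rarity weight
        scores[tag] = confidence * idf
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical tag lists of three semantic neighbors.
neighbors = [["beach", "sea", "sunset"], ["beach", "sea"], ["beach", "palm"]]
ranked = score_candidate_tags(neighbors)
```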
Robust Image Analysis by L1-Norm Semi-supervised Learning
This paper presents a novel L1-norm semi-supervised learning algorithm for robust image analysis, giving a new L1-norm formulation of Laplacian regularization, which is the key step of graph-based semi-supervised learning. Since our L1-norm Laplacian regularization is defined directly over the eigenvectors of the normalized Laplacian matrix, we can formulate semi-supervised learning as an L1-norm linear reconstruction problem that is solved effectively with sparse coding. By working with only a small subset of eigenvectors, we further develop a fast sparse coding algorithm for our L1-norm semi-supervised learning. Due to the sparsity induced by sparse coding, the proposed algorithm can handle noise in the data to some extent and thus has important applications to robust image analysis, such as noise-robust image classification and noise reduction for visual and textual bag-of-words (BOW) models. In particular, this paper is the first attempt to obtain a robust image representation by sparse co-refinement of visual and textual BOW models. The experimental results show the promising performance of the proposed algorithm. Comment: This is an extension of our long paper in ACM MM 201
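The core reconstruction idea can be sketched on a toy graph: keep the k smoothest eigenvectors of the normalized Laplacian and fit a sparse coefficient vector to the labeled entries with ISTA (iterative soft-thresholding). The eigenvalue-weighted L1 penalty below is a rough stand-in for the paper's formulation, not its actual algorithm, and the toy data are hypothetical.

```python
import numpy as np

def l1_laplacian_labels(W, y, labeled, k=4, lam=0.1, iters=500):
    """Sketch of L1-regularized label reconstruction over Laplacian eigenvectors.

    W: symmetric affinity matrix; y: +1/-1 labels (0 where unknown);
    labeled: boolean mask. Fits a sparse combination of the k smoothest
    eigenvectors of the normalized Laplacian to the labeled entries."""
    d = W.sum(axis=1)
    # Normalized Laplacian L = I - D^{-1/2} W D^{-1/2}.
    L = np.eye(len(W)) - (W / np.sqrt(d)[:, None]) / np.sqrt(d)[None, :]
    evals, evecs = np.linalg.eigh(L)
    U, pen = evecs[:, :k], lam * (evals[:k] + 1e-3)  # smoother vectors penalized less
    A, b = U[labeled], y[labeled]
    alpha = np.zeros(k)
    step = 1.0 / np.linalg.norm(A.T @ A, 2)
    for _ in range(iters):
        z = alpha - step * (A.T @ (A @ alpha - b))                    # gradient step
        alpha = np.sign(z) * np.maximum(np.abs(z) - step * pen, 0.0)  # soft-threshold
    return np.sign(U @ alpha)  # predicted labels for all points

# Toy graph: two well-separated 1-D clusters, one labeled point in each.
x = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
W = np.exp(-(x[:, None] - x[None, :]) ** 2)
y = np.array([1.0, 0.0, 0.0, -1.0, 0.0, 0.0])
pred = l1_laplacian_labels(W, y, labeled=(y != 0))
```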
Combining Image-Level and Segment-Level Models for Automatic Annotation
Abstract. For the task of assigning labels to an image to summarize its contents, many early attempts use segment-level information and try to determine which parts of the images correspond to which labels. Best performing methods use global image similarity and nearest neighbor techniques to transfer labels from training images to test images. However, global methods cannot localize the labels in the images, unlike segment-level methods. Also, they cannot take advantage of training images that are only locally similar to a test image. We propose several ways to combine recent image-level and segment-level techniques to predict both image and segment labels jointly. We cast our experimental study in an unified framework for both image-level and segment-level annotation tasks. On three challenging datasets, our joint prediction of image and segment labels outperforms either prediction alone on both tasks. This confirms that the two levels offer complementary information
Distance Learning for Nearest-Neighbor Image Annotation
Automatic image annotation is an important open problem in computer vision. For this task we propose TagProp, a weighted nearest-neighbor model. It is trained discriminatively and exploits training images to predict the labels of test images. The weights are computed from the rank of, or the distance to, each neighbor of an image. TagProp optimizes the distance that defines the neighborhoods by maximizing the log-likelihood of the predictions on the training set. We can thus optimally tune the combination of several visual similarities, ranging from global color histograms to local shape descriptors. We also propose word-specific modulation to increase the recall of rare words. We compare the performance of the different variants of our model to the state of the art on three image datasets. On the five measures considered, TagProp significantly improves on the state of the art.
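The weighted nearest-neighbor prediction at the heart of this family of models can be sketched as follows. Here the weights come from a fixed softmax over negative distances; TagProp proper instead learns the distance combination and weights by maximizing the training log-likelihood, so this is only an illustrative simplification with made-up numbers.

```python
import numpy as np

def weighted_nn_tag_predict(dists, neighbor_tags, temperature=1.0):
    """Predict per-tag probabilities for a test image from its K neighbors.

    dists: distances from the test image to its K neighbors.
    neighbor_tags: (K, n_tags) binary tag-presence matrix.
    Closer neighbors get larger weights via a softmax over -distance."""
    w = np.exp(-np.asarray(dists, dtype=float) / temperature)
    w /= w.sum()
    return w @ np.asarray(neighbor_tags, dtype=float)

# Hypothetical: 3 neighbors at increasing distance, 2 candidate tags.
probs = weighted_nn_tag_predict([0.1, 0.5, 2.0], [[1, 0], [1, 1], [0, 1]])
```

Because the first two (closest) neighbors carry the first tag, it receives a higher predicted probability than the second tag.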