
    Image Retrieval: Modelling Keywords via Low-level Features

    Advisors: Nicolas Tsapatsoulis. Date and location of PhD thesis defense: 29 April 2014, Cyprus University of Technology.

    With the advent of cheap digital recording and storage devices and the rapidly increasing popularity of online social networks that make extensive use of visual information, such as Facebook and Instagram, image retrieval has regained great attention among researchers in the areas of image indexing and retrieval. Image retrieval methods fall mainly into content-based and text-based frameworks. Although content-based image retrieval has attracted a large amount of research interest, the difficulties of querying by example push end users towards text queries. Searching by text queries yields more effective and accurate results that meet the needs of users while preserving their familiarity with the way traditional search engines operate. However, text-based image retrieval requires images to be annotated, i.e., associated with textual information. Much effort has been invested in automatic image annotation methods [1], since the manual assignment of keywords (which is necessary for text-based image retrieval) is a time-consuming and labour-intensive procedure [2].

    In automatic image annotation, a manually annotated set of data is used to train a system to identify the joint or conditional probability of an annotation occurring together with a certain distribution of feature vectors corresponding to image content [3]. Different models and machine learning techniques have been developed to learn the correlation between image features and textual words from examples of annotated images. The learned models of this correlation are then applied to predict keywords for unseen images [4]. In the literature on automatic semantic image annotation, proposed approaches tend to classify images using only abstract terms, or to use holistic image features for both abstract terms and object classes. The extraction and selection of low-level features, either holistic or from particular image areas, is of primary importance for automatic image annotation. This is true for both the content-based and the text-based retrieval paradigm: in the former case, the use of appropriate low-level features leads to accurate and effective object class models used in object detection, while in the latter case, the better the low-level features are, the easier the learning of keyword models is.

    The intent of image classification is to categorize the content of an input image into one of several keyword classes. A proper image annotation may contain more than one keyword relevant to the image content, so a reclassification process is required in this case, as well as whenever a new keyword class is added to the classification scheme. The creation of separate visual models for all keyword classes adds significant value to automatic image annotation, since several keywords can be assigned to the input image. As the number of keyword classes increases, the number of keywords assigned to the images also increases, and there is no need for reclassification. However, keyword modelling raises various issues, such as the large amount of manual effort required to develop the training data, the differences in interpretation of image contents, and the inconsistency of keyword assignments among different annotators.

    This thesis focuses on image retrieval using keywords from the perspective of machine learning. It covers different aspects of current research in this area, including low-level feature extraction, creation of training sets and development of machine learning methodologies. It also proposes the idea of addressing automatic image annotation by creating visual models, one for each available keyword, and presents several examples of the proposed idea by comparing different features and machine learning algorithms for creating visual models for keywords from the athletics domain. The idea of automatic image annotation through independent keyword visual models is divided into two main parts: training and automatic image annotation. In the first part, visual models for all available keywords are created using the one-against-all training paradigm, while in the second part, annotations are produced for a given image based on the output of these models once they are fed with a feature vector extracted from the input image. An accurate manually annotated dataset containing pairs of images and annotations is a prerequisite for successful automatic image annotation. Since manual annotations are likely to contain human judgment errors and subjectivity in interpreting the image, the thesis investigates the factors that influence the creation of manually annotated image datasets [5]. It also proposes the idea of modelling the knowledge of several people by creating visual models from such training data, aiming to significantly improve the ultimate efficiency of image retrieval systems [6]. Moreover, it proposes a new algorithm for the extraction of low-level features, the Spatial Histogram of Keypoints (SHiK) [7], which keeps the spatial information of localized keypoints in an effort to overcome the limitations caused by the non-fixed and huge dimensionality of the SIFT feature vector when used in machine learning frameworks. SHiK partitions the image into a fixed number of ordered sub-regions based on the Hilbert space-filling curve and counts the localized keypoints found inside each sub-region. The resulting spatial histogram is a compact and discriminative low-level feature vector that shows significantly improved performance on classification tasks.
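    A minimal sketch of the one-against-all keyword-modelling idea described above, assuming scikit-learn's LinearSVC as the per-keyword visual model and a zero decision threshold (both are illustrative choices, not the thesis's actual implementation):

```python
import numpy as np
from sklearn.svm import LinearSVC  # assumed learner; the thesis compares several classifiers

def train_keyword_models(features, annotations, keywords):
    """One-against-all training: one binary visual model per available keyword.

    features    -- (n_images, n_dims) array of low-level feature vectors
    annotations -- list of keyword sets, one per training image
    keywords    -- iterable of all keywords to model
    """
    models = {}
    for kw in keywords:
        labels = np.array([1 if kw in ann else 0 for ann in annotations])
        models[kw] = LinearSVC().fit(features, labels)  # this keyword's images vs. all others
    return models

def annotate(feature_vector, models, threshold=0.0):
    """Annotate an unseen image with every keyword whose model fires above the threshold."""
    x = np.asarray(feature_vector).reshape(1, -1)
    return [kw for kw, m in models.items() if m.decision_function(x)[0] > threshold]
```

    Because each keyword has its own model, adding a new keyword class only requires training one extra model rather than reclassifying the whole scheme, which is the property the abstract highlights.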

    A Federated Approach for Fine-Grained Classification of Fashion Apparel

    As online retail services proliferate and become pervasive in modern life, applications for classifying fashion apparel features from image data are becoming indispensable. Online retailers, from leading companies to start-ups, can leverage such applications to increase profit margins and enhance the consumer experience. Many notable schemes have been proposed to classify fashion items; however, the majority of them focus on classifying basic-level categories, such as T-shirts, pants, skirts, shoes, bags, and so forth. In contrast to most prior efforts, this paper aims to enable an in-depth classification of fashion item attributes within the same category. Beginning with a single dress, we seek to classify the type of dress hem, the hem length and the sleeve length. The proposed scheme consists of three major stages: (a) localization of a target item from an input image using semantic segmentation, (b) detection of human key points (e.g., the point of the shoulder) using a pre-trained CNN and a bounding box, and (c) three phases to classify the attributes using a combination of algorithmic approaches and deep neural networks. The experimental results demonstrate that the proposed scheme is highly effective, with every category achieving an average precision above 93.02%, and outperforms existing Convolutional Neural Network (CNN)-based schemes. Comment: 11 pages, 4 figures, 5 tables; submitted to IEEE Access (under review).
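    The attribute phases mix simple geometric rules over detected key points with deep networks. A toy sketch of what such an algorithmic rule might look like for the sleeve-length attribute is given below; the key-point names, the normalisation by shoulder-to-hip distance and the threshold values are assumptions made for illustration, not values from the paper:

```python
import numpy as np

def classify_sleeve_length(shoulder, sleeve_end, hip, thresholds=(0.15, 0.6, 1.0)):
    """Bucket sleeve length by sleeve extent relative to the shoulder-to-hip distance.

    shoulder, sleeve_end, hip -- (x, y) image coordinates of detected key points
    thresholds                -- assumed ratio cut-offs separating the four classes
    """
    torso = np.linalg.norm(np.asarray(hip, float) - np.asarray(shoulder, float))
    sleeve = np.linalg.norm(np.asarray(sleeve_end, float) - np.asarray(shoulder, float))
    ratio = sleeve / torso
    if ratio < thresholds[0]:
        return "sleeveless"
    if ratio < thresholds[1]:
        return "short sleeve"
    if ratio < thresholds[2]:
        return "three-quarter sleeve"
    return "long sleeve"
```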

    Advances in Image Processing, Analysis and Recognition Technology

    For many decades, researchers have been trying to make computer analysis of images as effective as human vision. For this purpose, many algorithms and systems have been created. The whole process covers various stages, including image processing, representation and recognition. The results of this work can be applied to many computer-assisted areas of everyday life; they improve particular activities and provide handy tools that are sometimes only for entertainment but quite often significantly increase our safety. Indeed, the range of practical applications of image processing algorithms is particularly wide. Moreover, the rapid growth of computational power and computer efficiency has allowed for the development of more sophisticated and effective algorithms and tools. Although significant progress has been made so far, many issues remain, calling for the development of novel approaches.

    Shortest Route at Dynamic Location with Node Combination-Dijkstra Algorithm

    Online transportation has become a basic requirement of the general public, supporting everyday activities such as commuting to work or school and travelling to tourist sights. Transportation service providers compete to deliver the best service so that consumers feel comfortable using it; one important concern is finding the shortest route when picking up a customer or delivering them to their destination. The Node Combination method minimizes memory usage and is more efficient than A* and Ant Colony approaches for Dijkstra-style shortest-route search, but it cannot store the history of nodes that have been passed. Consequently, the plain node combination algorithm is well suited to finding the shortest distance but not the shortest route. This paper modifies the node combination algorithm to find the shortest route for dynamic locations obtained from the transport fleet, displaying the nodes that form the shortest path, and implements it in a map-based geographic information system to make the system easy to use. Keywords: Shortest Path, Dijkstra Algorithm, Node Combination, Dynamic Location.
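    For reference, the sketch below shows the classical Dijkstra baseline with a predecessor map, which is exactly the route history that the plain node combination method does not keep; it is a generic illustration, not the authors' modified algorithm:

```python
import heapq

def dijkstra(graph, source):
    """Classical Dijkstra over an adjacency dict {node: [(neighbour, weight), ...]}.

    Returns shortest distances and a predecessor map from which routes can be rebuilt.
    """
    dist, prev = {source: 0.0}, {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    return dist, prev

def route(prev, target):
    """Rebuild the node sequence ending at target from the predecessor map."""
    path = [target]
    while path[-1] in prev:
        path.append(prev[path[-1]])
    return path[::-1]
```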

    Spatial histogram of keypoints

    Among a variety of feature extraction approaches, special attention has been given to the SIFT algorithm, which delivers good results for many applications. However, the non-fixed and huge dimensionality of the extracted SIFT feature vector causes certain limitations when it is used in machine learning frameworks. In this paper, we introduce the Spatial Histogram of Keypoints (SHiK), which keeps the spatial information of localized keypoints in an effort to overcome this limitation. The proposed technique partitions the image into a fixed number of ordered sub-regions based on the Hilbert space-filling curve and counts the localized keypoints found inside each sub-region. The resulting spatial histogram is a compact and discriminative low-level feature vector that shows significantly improved performance on classification tasks. The proposed method achieves high accuracy on different datasets and performs significantly better on scene datasets compared to the Spatial Pyramid Matching method.
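    A minimal sketch of an SHiK-style descriptor, assuming OpenCV's SIFT detector, an 8x8 grid of sub-regions and L1 normalisation (grid size and normalisation are not specified in the abstract and are chosen here for illustration): the image is divided into grid cells, the cells are ordered along a Hilbert space-filling curve, and the keypoints falling inside each cell are counted.

```python
import numpy as np
import cv2  # OpenCV, assumed available with the SIFT detector

def hilbert_index(n, x, y):
    """Index of grid cell (x, y) along a Hilbert curve covering an n x n grid (n a power of two)."""
    d, s = 0, n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:                      # rotate/reflect so sub-cells keep the Hilbert ordering
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d

def shik_histogram(image, grid=8):
    """Count SIFT keypoints per Hilbert-ordered sub-region and return a fixed-length histogram."""
    keypoints = cv2.SIFT_create().detect(image, None)
    h, w = image.shape[:2]
    hist = np.zeros(grid * grid, dtype=np.float32)
    for kp in keypoints:
        x, y = kp.pt
        cx = min(int(x * grid / w), grid - 1)   # column of the sub-region containing the keypoint
        cy = min(int(y * grid / h), grid - 1)   # row of the sub-region containing the keypoint
        hist[hilbert_index(grid, cx, cy)] += 1
    if hist.sum() > 0:
        hist /= hist.sum()                      # normalise so images with many keypoints compare fairly
    return hist
```

    The fixed length (grid x grid) is what makes the descriptor usable in standard machine learning frameworks, in contrast to the variable-size set of raw SIFT descriptors that the abstract points to as the limitation.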
