
    Colour-based image retrieval algorithms based on compact colour descriptors and dominant colour-based indexing methods

    Content-based image retrieval (CBIR) has been one of the most active research areas of the last two decades, yet the field is still young. This study addresses three CBIR performance problems: inaccurate image retrieval, high feature-extraction complexity, and degraded retrieval after database indexing. These problems hinder the use of CBIR on limited-resource devices (such as mobile devices). Therefore, the main objective of this thesis is to improve the performance of CBIR. Images' Dominant Colours (DCs) are selected as the key contributor for this purpose because of their compactness and their compatibility with the human visual system. Semantic image retrieval is proposed to solve the retrieval inaccuracy problem by concentrating on the objects in the images. The effect of the image background is reduced, giving more focus to the object, by assigning weights to the object and background DCs. Accuracy improves by up to 50% over the compared methods. A weighting-DCs framework is proposed to generalize this technique, and it is demonstrated by applying it to many colour descriptors. To reduce the high complexity of the colour Correlogram in terms of computation and memory space, a compact representation of the Correlogram is proposed. Additionally, the similarity measure of an existing DC-based Correlogram is adapted to improve its accuracy. Both methods are combined to produce a promising colour descriptor in terms of time and memory space complexity. As a result, accuracy increases by up to 30% over the existing methods and memory use falls to less than 10% of the original. A framework for converting the abundance of colours into a few DCs is proposed to generalize the DC concept. In addition, two DC-based indexing techniques are proposed to overcome the time problem, using the RGB and perceptual LUV colour spaces. Both methods reduce the search space to less than 25% of the database size while preserving the same accuracy.

    Spatial histograms of soft pairwise similar patches to improve the bag-of-visual-words model

    In the context of category-level scene classification, the bag-of-visual-words model (BoVW) is widely used for image representation. This model is appearance based and does not contain any information regarding the arrangement of the visual words in the 2D image space. To overcome this problem, recent approaches try to capture information about either the absolute or the relative spatial location of visual words. In the first category, the so-called Spatial Pyramid Representation (SPR) is very popular thanks to its simplicity and good results. Alternatively, adding information about occurrences of relative spatial configurations of visual words has proven effective, but at the cost of higher computational complexity, specifically when relative distances and angles are taken into account. In this paper, we introduce a novel way to incorporate both distance and angle information in the BoVW representation. The novelty is, first, to provide a computationally efficient representation that adds relative spatial information between visual words and, second, to use a soft pairwise voting scheme based on the distance in the descriptor space. Experiments on the challenging data sets MSRC-2, 15Scene, Caltech101, Caltech256 and Pascal VOC 2007 demonstrate that our method outperforms or is competitive with competing ones. We also show that it provides important complementary information to spatial pyramid matching and can improve the overall performance.
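
The limitation described in this abstract can be illustrated with a minimal sketch. It is not the paper's method: it only shows that a plain BoVW histogram cannot tell apart two images whose visual words are arranged differently, while a single level of a grid partition (the idea behind spatial pyramids) can. The function names, the `(word, x, y)` tuple layout, and the toy images are assumptions made for illustration.

```python
from collections import Counter


def bovw_histogram(words, vocab_size):
    """Count visual-word occurrences, ignoring where they occur."""
    counts = Counter(word for word, _, _ in words)
    return [counts.get(i, 0) for i in range(vocab_size)]


def spatial_grid_histogram(words, vocab_size, width, height, grid=2):
    """Concatenate one BoVW histogram per cell of a grid x grid partition.

    This corresponds to a single level of a spatial pyramid (hypothetical
    simplification: no per-level weighting, no normalisation).
    """
    cells = [[0] * vocab_size for _ in range(grid * grid)]
    for word, x, y in words:
        col = min(int(x * grid / width), grid - 1)
        row = min(int(y * grid / height), grid - 1)
        cells[row * grid + col][word] += 1
    return [count for cell in cells for count in cell]


# Two toy 100x100 "images": same visual words, swapped positions.
image_a = [(0, 10, 10), (1, 90, 90)]
image_b = [(0, 90, 90), (1, 10, 10)]

same_bovw = bovw_histogram(image_a, 2) == bovw_histogram(image_b, 2)
same_grid = (spatial_grid_histogram(image_a, 2, 100, 100)
             == spatial_grid_histogram(image_b, 2, 100, 100))
print(same_bovw, same_grid)
```

The grid histogram only encodes absolute position; the relative distance and angle information discussed in the abstract would need pairwise statistics between visual words, which this sketch does not attempt.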

    Advanced shape context for plant species identification using leaf image retrieval

    This paper presents a novel method for leaf species identification combining local and shape-based features. Our approach extends the shape context model in two ways. First, two different sets of points are distinguished when computing the shape contexts: the voting set, i.e. the points used to describe the coarse arrangement of the shape, and the computing set, containing the points where the shape contexts are computed. Second, this representation is enriched by introducing local features computed in the neighborhood of the computing points. Experiments show the effectiveness of our approach.

    Geometric and Photometric Data Fusion in Non-Rigid Shape Analysis

    In this paper, we explore the use of the diffusion geometry framework for the fusion of geometric and photometric information in local and global shape descriptors. Our construction is based on the definition of a diffusion process on the shape manifold embedded into a high-dimensional space where the embedding coordinates represent the photometric information. Experimental results show that such data fusion is useful in coping with different challenges of shape analysis where pure geometric and pure photometric methods fail.

    Improving Bags-of-Words model for object categorization

    In the past decade, Bags-of-Words (BOW) models have become popular for the task of object recognition, owing to their good performance and simplicity. Some of the most effective recent methods for computer-based object recognition work by detecting and extracting local image features, quantizing them according to a codebook rule such as k-means clustering, and classifying them with conventional classifiers such as Support Vector Machines and Naive Bayes. In this thesis, a Spatial Object Recognition Framework is presented that consists of the four main contributions of the research. The first contribution, frequent keypoint pattern discovery, works by combining pairs and triplets of frequent keypoints in order to discover intermediate representations for object classes. Based on the same frequent keypoints principle, algorithms for locating the region of interest in training images are then discussed. Extensions to the successful Spatial Pyramid Matching scheme, in order to better capture spatial relationships, are then proposed. The pairs frequency histogram and shapes frequency histogram work by capturing more refined spatial information between local image features. Finally, alternative techniques to Spatial Pyramid Matching for capturing spatial information are presented. The proposed techniques, variations of binned log-polar histograms, divide the image into grids of different scales and orientations, and thus explicitly capture the distribution of image features in both distance and orientation. Evaluations of the framework focus on several recent and popular datasets, covering image retrieval, object recognition, and object categorization. Overall, while the effectiveness of the framework is limited on some of the datasets, the proposed contributions are nevertheless powerful improvements of the BOW model.
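
As a rough illustration of what a binned log-polar histogram captures, the sketch below bins feature locations by log-scaled distance from a reference point and by angle around it. It is an assumption-laden toy, not the thesis's technique: the function name, the choice of the image centre as reference, the `log1p` ring spacing, and the bin counts are all hypothetical.

```python
import math


def log_polar_histogram(points, center, max_radius, n_rings=3, n_sectors=4):
    """Bin 2D points by log-scaled distance and by angle around `center`.

    Returns a flat list of n_rings * n_sectors counts, ring-major.
    Points farther than `max_radius` from the centre are ignored.
    """
    hist = [0] * (n_rings * n_sectors)
    cx, cy = center
    for x, y in points:
        dx, dy = x - cx, y - cy
        radius = math.hypot(dx, dy)
        if radius > max_radius:
            continue
        # Logarithmic ring index: inner rings are narrower than outer ones.
        scaled = math.log1p(radius) / math.log1p(max_radius)
        ring = min(int(scaled * n_rings), n_rings - 1)
        # Angle mapped to [0, 2*pi), then to a sector index.
        angle = math.atan2(dy, dx) % (2 * math.pi)
        sector = min(int(angle / (2 * math.pi) * n_sectors), n_sectors - 1)
        hist[ring * n_sectors + sector] += 1
    return hist


# Hypothetical keypoint locations in a 100x100 image; the last one lies
# outside the 50-pixel radius and is skipped.
keypoints = [(50, 50), (60, 55), (200, 200)]
histogram = log_polar_histogram(keypoints, center=(50, 50), max_radius=50)
print(histogram)
```

Unlike a rectangular grid, the bins here encode distance and orientation directly, which is the property the abstract attributes to the log-polar variants.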

    Statistical spatial color information modeling in images and applications

    Image processing, among its many applications, has proven particularly effective in quality control systems. Quality control systems such as those in the food, fruit and meat, and pharmaceutical industries, and in hardness testing, are highly dependent on the accuracy of the algorithms used to extract image feature vectors and process them. Thus, the need to build better quality systems is tied to progress in the field of image processing. Color histograms have been widely and successfully used in many computer vision and image processing applications. However, they do not include any spatial information. We propose statistical models to integrate both color and spatial information. Our first model is based on finite mixture models, which have been applied to different computer vision, image processing and pattern recognition tasks. The majority of the work done concerning finite mixture models has focused on mixtures for continuous data. However, many applications involve and generate discrete data, for which discrete mixtures are better suited. In this thesis, we investigate the problem of discrete data modeling using finite mixture models. We propose a novel, well-motivated mixture that we call a multinomial generalized Dirichlet mixture. Our second model is based on finite multiple-Bernoulli mixtures. For the estimation of the model's parameters, we use a maximum a posteriori (MAP) approach through deterministic annealing expectation maximization (DAEM). Smoothing priors on the components' parameters are introduced to stabilize the estimation. The selection of the number of clusters is based on stochastic complexity.

    Scene Segmentation and Object Classification for Place Recognition

    This dissertation tries to solve the place recognition and loop closing problem in a way similar to the human visual system. First, a novel image segmentation algorithm is developed. The algorithm is based on a Perceptual Organization model, which allows it to 'perceive' the special structural relations among the constituent parts of an unknown object and hence to group them together without object-specific knowledge. Then a new object recognition method is developed. Based on the fairly accurate segmentations generated by the image segmentation algorithm, an informative object description is built that includes not only the appearance (colors and textures) but also the parts layout and shape information. Then a novel feature selection algorithm is developed. It selects a subset of features that best describes the characteristics of an object class. Classifiers trained with the selected features can classify objects with high accuracy. In the next step, a subset of the salient objects in a scene is selected as landmark objects to label the place. The landmark objects are highly distinctive and widely visible. Each landmark object is represented by a list of SIFT descriptors extracted from the object surface. This object representation allows us to reliably recognize an object under certain viewpoint changes. To achieve efficient scene matching, an indexing structure is developed. Both the texture and color features of objects are used as indexing features. These features are viewpoint-invariant and hence can be used to effectively find candidate objects with surface characteristics similar to those of a query object. Experimental results show that the object-based place recognition and loop detection method can efficiently recognize a place in a large, complex outdoor environment.

    Exploring Language Mechanisms: The Mass-Count Distinction and The Potts Neural Network

    The aim of this thesis is to explore language mechanisms in two aspects: first, the statistical properties of syntax and semantics, and second, the neural mechanisms which could be of possible use in trying to understand how the brain learns those particular statistical properties. In the first part of the thesis (part A) we focus our attention on a detailed statistical study of the syntax and semantics of the mass-count distinction in nouns. We collected a database of how 1,434 nouns are used with respect to the mass-count distinction in six languages; additional informants characterised the semantics of the underlying concepts. Results indicate only weak correlations between semantics and syntactic usage. The classification, rather than being bimodal, is a graded distribution, and it is similar across languages, but syntactic classes do not map onto each other, nor do they reflect, beyond weak correlations, semantic attributes of the concepts. These findings are in line with the hypothesis that much of the mass/count syntax emerges from language- and even speaker-specific grammaticalisation. Further, in chapter 3 we test the ability of a simple neural network to learn the syntactic and semantic relations of nouns, in the hope that it may throw some light on the challenges in modelling the acquisition of the mass-count syntax. It is shown that even though a simple self-organising neural network is insufficient to learn a mapping implementing a syntactic-semantic link, the network was able to extract the concept of 'count', and to some extent that of 'mass' as well, without any explicit definition, from both the syntactic and the semantic data. The second part of the thesis (part B) is dedicated to studying the properties of the Potts neural network. The Potts neural network with its adaptive dynamics represents a simplified model of cortical mechanisms. Among other cognitive phenomena, it is intended to model language production by utilising the latching behaviour seen in the network. We expect that a model of language processing should robustly handle various syntactic-semantic correlations amongst the words of a language. With this aim, we test the effect on the storage capacity of the Potts network when the memories stored in it share non-trivial correlations. The increase in interference between stored memories due to correlations is studied, along with modifications in learning rules to reduce the interference. We find that when strongly correlated memories are incorporated in the storage capacity definition, the network is able to regain its storage capacity for low sparsity. Strong correlations also affect the latching behaviour of the Potts network, with the network unable to latch from one memory to another. However, latching is shown to be restored by modifying the learning rule. Lastly, we look at another feature of the Potts neural network, the indication that it may exhibit spin-glass characteristics. The network is consistently shown to exhibit multiple stable degenerate energy states other than that of pure memories. This is tested for different degrees of correlations in patterns, low and high connectivity, and different levels of global and local noise. We state some of the implications that the spin-glass nature of the Potts neural network may have on language processing.