3 research outputs found

    Automatic extraction of regions of interest from images based on visual attention models

    Get PDF
    This thesis presents a method for the extraction of regions of interest (ROIs) from images. By ROIs we mean the most prominent semantic objects in the images, of any size and located at any position in the image. The novel method is based on computational models of visual attention (VA), operates under a completely bottom-up and unsupervised way and does not present con-straints in the category of the input images. At the core of the architecture is de model VA proposed by Itti, Koch and Niebur and the one proposed by Stentiford. The first model takes into account color, intensity, and orientation features and provides coordinates corresponding to the points of attention (POAs) in the image. The second model considers color features and provides rough areas of attention (AOAs) in the image. In the proposed architecture, the POAs and AOAs are combined to establish the contours of the ROIs. Two implementations of this architecture are presented, namely 'first version' and 'improved version'. The first version mainly on traditional morphological operations and was applied in two novel region-based image retrieval systems. In the first one, images are clustered on the basis of the ROIs, instead of the global characteristics of the image. This provides a meaningful organization of the database images, since the output clusters tend to contain objects belonging to the same category. In the second system, we present a combination of the traditional global-based with region-based image retrieval under a multiple-example query scheme. In the improved version of the architecture, the main stages are a spatial coherence analysis between both VA models and a multiscale representation of the AOAs. Comparing to the first one, the improved version presents more versatility, mainly in terms of the size of the extracted ROIs. The improved version was directly evaluated for a wide variety of images from different publicly available databases, with ground truth in the form of bounding boxes and true object contours. The performance measures used were precision, recall, F1 and area overlap. Experimental results are of very high quality, particularly if one takes into account the bottom-up and unsupervised nature of the approach.UOL; CAPESEsta tese apresenta um método para a extração de regiões de interesse (ROIs) de imagens. No contexto deste trabalho, ROIs são definidas como os objetos semânticos que se destacam em uma imagem, podendo apresentar qualquer tamanho ou localização. O novo método baseia-se em modelos computacionais de atenção visual (VA), opera de forma completamente bottom-up, não supervisionada e não apresenta restrições com relação à categoria da imagem de entrada. Os elementos centrais da arquitetura são os modelos de VA propostos por Itti-Koch-Niebur e Stentiford. O modelo de Itti-Koch-Niebur considera as características de cor, intensidade e orientação da imagem e apresenta uma resposta na forma de coordenadas, correspondentes aos pontos de atenção (POAs) da imagem. O modelo Stentiford considera apenas as características de cor e apresenta a resposta na forma de áreas de atenção na imagem (AOAs). Na arquitetura proposta, a combinação de POAs e AOAs permite a obtenção dos contornos das ROIs. Duas implementações desta arquitetura, denominadas 'primeira versão' e 'versão melhorada' são apresentadas. A primeira versão utiliza principalmente operações tradicionais de morfologia matemática. Esta versão foi aplicada em dois sistemas de recuperação de imagens com base em regiões. No primeiro, as imagens são agrupadas de acordo com as ROIs, ao invés das características globais da imagem. O resultado são grupos de imagens mais significativos semanticamente, uma vez que o critério utilizado são os objetos da mesma categoria contidos nas imagens. No segundo sistema, á apresentada uma combinação da busca de imagens tradicional, baseada nas características globais da imagem, com a busca de imagens baseada em regiões. Ainda neste sistema, as buscas são especificadas através de mais de uma imagem exemplo. Na versão melhorada da arquitetura, os estágios principais são uma análise de coerência espacial entre as representações de ambos modelos de VA e uma representação multi-escala das AOAs. Se comparada à primeira versão, esta apresenta maior versatilidade, especialmente com relação aos tamanhos das ROIs presentes nas imagens. A versão melhorada foi avaliada diretamente, com uma ampla variedade de imagens diferentes bancos de imagens públicos, com padrões-ouro na forma de bounding boxes e de contornos reais dos objetos. As métricas utilizadas na avaliação foram presision, recall, F1 e area of overlap. Os resultados finais são excelentes, considerando-se a abordagem exclusivamente bottom-up e não-supervisionada do método

    Efficient Bayesian methods for clustering.

    Get PDF
    One of the most important goals of unsupervised learning is to discover meaningful clusters in data. Clustering algorithms strive to discover groups, or clusters, of data points which belong together because they are in some way similar. The research presented in this thesis focuses on using Bayesian statistical techniques to cluster data. We take a model-based Bayesian approach to defining a cluster, and evaluate cluster membership in this paradigm. Due to the fact that large data sets are increasingly common in practice, our aim is for the methods in this thesis to be efficient while still retaining the desirable properties which result from a Bayesian paradigm. We develop a Bayesian Hierarchical Clustering (BHC) algorithm which efficiently addresses many of the drawbacks of traditional hierarchical clustering algorithms. The goal of BHC is to construct a hierarchical representation of the data, incorporating both finer to coarser grained clusters, in such a way that we can also make predictions about new data points, compare different hierarchies in a principled manner, and automatically discover interesting levels of the hierarchy to examine. BHC can also be viewed as a fast way of performing approximate inference in a Dirichlet Process Mixture model (DPM), one of the cornerstones of nonparametric Bayesian Statistics. We create a new framework for retrieving desired information from large data collections, Bayesian Sets, using Bayesian clustering techniques. Unlike current retrieval methods, Bayesian Sets provides a principled framework which leverages the rich and subtle information provided by queries in the form of a set of examples. Whereas most clustering algorithms are completely unsupervised, here the query provides supervised hints or constraints as to the membership of a particular cluster. We call this "clustering on demand", since it involves forming a cluster once some elements of that cluster have been revealed. We use Bayesian Sets to develop a content-based image retrieval system. We also extend Bayesian Sets to a discriminative setting and use this to perform automated analogical reasoning. Lastly, we develop extensions of clustering in order to model data with more complex structure than that for which traditional clustering is intended. Clustering models traditionally assume that each data point belongs to one and only one cluster, and although they have proven to be a very powerful class of models, this basic assumption is somewhat limiting. For example, there may be overlapping regions where data points actually belong to multiple clusters, like movies which can each belong to multiple genres. We extend traditional mixture models to create a statistical model for overlapping clustering, the Infinite Overlapping Mixture Model (IOMM), in a non-parametric Bayesian setting, using the Indian Buffet Process (IBP). We also develop a Bayesian Partial Membership model (BPM), which allows data points to have partial membership in multiple clusters via a continuous relaxation of a finite mixture model

    Content-Based Image Retrieval Using Self-Organizing Maps

    Full text link
    corecore