9 research outputs found
Recognizing objects by piecing together the Segmentation Puzzle
We present an algorithm that recognizes objects of a given category using a small number of hand segmented images as references. Our method first over segments an input image into superpixels, and then finds a shortlist of optimal combinations of superpixels that best fit one of template parts, under affine transformations. Second, we develop a contextual interpretation of the parts, gluing image segments using top-down fiducial points, and checking overall shape similarity. In contrast to previous work, the search for candidate superpixel combinations is not exponential in the number of segments, and in fact leads to a very efficient detection scheme. Both the storage and the detection of templates only require space and time proportional to the length of the template boundary, allowing us to store potentially millions of templates, and to detect a template anywhere in a large image in roughly 0.01 seconds. We apply our algorithm on the Weizmann horse database, and show our method is comparable to the state of the art while offering a simpler and more efficient alternative compared to previous work
Hierarchical Object Parsing from Structured Noisy Point Clouds
Object parsing and segmentation from point clouds are challenging tasks
because the relevant data is available only as thin structures along object
boundaries or other features, and is corrupted by large amounts of noise. To
handle this kind of data, flexible shape models are desired that can accurately
follow the object boundaries. Popular models such as Active Shape and Active
Appearance models lack the necessary flexibility for this task, while recent
approaches such as the Recursive Compositional Models make model
simplifications in order to obtain computational guarantees. This paper
investigates a hierarchical Bayesian model of shape and appearance in a
generative setting. The input data is explained by an object parsing layer,
which is a deformation of a hidden PCA shape model with Gaussian prior. The
paper also introduces a novel efficient inference algorithm that uses informed
data-driven proposals to initialize local searches for the hidden variables.
Applied to the problem of object parsing from structured point clouds such as
edge detection images, the proposed approach obtains state of the art parsing
errors on two standard datasets without using any intensity information.Comment: 13 pages, 16 figure
Content-sensitive superpixel generation with boundary adjustment.
Superpixel segmentation has become a crucial tool in many image processing and computer vision applications. In this paper, a novel content-sensitive superpixel generation algorithm with boundary adjustment is proposed. First, the image local entropy was used to measure the amount of information in the image, and the amount of information was evenly distributed to each seed. It placed more seeds to achieve the lower under-segmentation in content-dense regions, and placed the fewer seeds to increase computational efficiency in content-sparse regions. Second, the Prim algorithm was adopted to generate uniform superpixels efficiently. Third, a boundary adjustment strategy with the adaptive distance further optimized the superpixels to improve the performance of the superpixel. Experimental results on the Berkeley Segmentation Database show that our method outperforms competing methods under evaluation metrics
Colour-based image retrieval algorithms based on compact colour descriptors and dominant colour-based indexing methods
Content based image retrieval (CBIR) is reported as one of the most active research
areas in the last two decades, but it is still young. Three CBIR’s performance problem in this study is inaccuracy of image retrieval, high complexity of feature extraction, and degradation of image retrieval after database indexing. This situation led to discrepancies to be applied on limited-resources devices (such as mobile devices). Therefore, the main objective of this thesis is to improve performance of CBIR. Images’ Dominant Colours (DCs) is selected as the key contributor for this purpose due to its compact property and its compatibility with the human visual system. Semantic image retrieval is proposed to solve retrieval inaccuracy problem by concentrating on the images’ objects. The effect of image background is reduced to provide more focus on the object by setting weights to the object and the background DCs. The accuracy improvement ratio is raised up to 50% over the compared methods. Weighting DCs framework is proposed to generalize this technique where it is demonstrated by applying it on many colour descriptors. For reducing high complexity of colour Correlogram in terms of computations and
memory space, compact representation of Correlogram is proposed. Additionally, similarity measure of an existing DC-based Correlogram is adapted to improve its accuracy. Both methods are incorporated to produce promising colour descriptor in terms of time and memory space complexity. As a result, the accuracy is increased up to 30% over the existing methods and the memory space is decreased to less than 10% of its original space. Converting the abundance of colours into a few DCs framework is proposed to generalize DCs concept. In addition, two DC-based
indexing techniques are proposed to overcome time problem, by using RGB and perceptual LUV colour spaces. Both methods reduce the search space to less than 25% of the database size with preserving the same accuracy
Bottom-up Object Segmentation for Visual Recognition
Automatic recognition and segmentation of objects in images is a central open problem in computer vision. Most previous approaches have pursued either sliding-window object detection or dense classification of overlapping local image patches. Differently, the framework introduced in this thesis attempts to identify the spatial extent of objects prior to recognition, using bottom-up computational processes and mid-level selection cues. After a set of plausible object hypotheses is identified, a sequential recognition process is executed, based on continuous estimates of the spatial overlap between the image segment hypotheses and each putative class. The object hypotheses are represented as figure-ground segmentations, and are extracted automatically, without prior knowledge of the properties of individual object classes, by solving a sequence of constrained parametric min-cut problems (CPMC) on a regular image grid. It is show that CPMC significantly outperforms the state of the art for low-level segmentation in the PASCAL VOC 2009 and 2010 datasets. Results beyond the current state of the art for image classification, object detection and semantic segmentation are also demonstrated in a number of challenging datasets including Caltech-101, ETHZ-Shape as well as PASCAL VOC 2009-11. These results suggest that a greater emphasis on grouping and image organization may be valuable for making progress in high-level tasks such as object recognition and scene understanding
Visual attention and swarm cognition for off-road robots
Tese de doutoramento, Informática (Engenharia Informática), Universidade de Lisboa, Faculdade de Ciências, 2011Esta tese aborda o problema da modelação de atenção visual no contexto de robôs autónomos todo-o-terreno. O objectivo de utilizar mecanismos de atenção visual é o de focar a percepção nos aspectos do ambiente mais relevantes à tarefa do robô. Esta tese mostra que, na detecção de obstáculos e de trilhos, esta capacidade promove robustez e parcimónia computacional. Estas são características chave para a rapidez e eficiência dos robôs todo-o-terreno. Um dos maiores desafios na modelação de atenção visual advém da necessidade de gerir o compromisso velocidade-precisão na presença de variações de contexto ou de tarefa. Esta tese mostra que este compromisso é resolvido se o processo de atenção visual for modelado como um processo auto-organizado, cuja operação é modulada pelo módulo de selecção de acção, responsável pelo controlo do robô. Ao fechar a malha entre o processo de selecção de acção e o de percepção, o último é capaz de operar apenas onde é necessário, antecipando as acções do robô. Para fornecer atenção visual com propriedades auto-organizadas, este trabalho obtém inspiração da Natureza. Concretamente, os mecanismos responsáveis pela capacidade que as formigas guerreiras têm de procurar alimento de forma auto-organizada, são usados como metáfora na resolução da tarefa de procurar, também de forma auto-organizada, obstáculos e trilhos no campo visual do robô. A solução proposta nesta tese é a de colocar vários focos de atenção encoberta a operar como um enxame, através de interacções baseadas em feromona. Este trabalho representa a primeira realização corporizada de cognição de enxame. Este é um novo campo de investigação que procura descobrir os princípios básicos da cognição, inspeccionando as propriedades auto-organizadas da inteligência colectiva exibida pelos insectos sociais. Logo, esta tese contribui para a robótica como disciplina de engenharia e para a robótica como disciplina de modelação, capaz de suportar o estudo do comportamento adaptável.Esta tese aborda o problema da modelação de atenção visual no contexto de robôs autónomos
todo-o-terreno. O objectivo de utilizar mecanismos de atenção visual é o de focar a percepção
nos aspectos do ambiente mais relevantes à tarefa do robô. Esta tese mostra que, na detecção de
obstáculos e de trilhos, esta capacidade promove robustez e parcimónia computacional. Estas
são características chave para a rapidez e eficiência dos robôs todo-o-terreno.
Um dos maiores desafios na modelação de atenção visual advém da necessidade de gerir o
compromisso velocidade-precisão na presença de variações de contexto ou de tarefa. Esta tese
mostra que este compromisso é resolvido se o processo de atenção visual for modelado como
um processo auto-organizado, cuja operação é modulada pelo módulo de selecção de acção,
responsável pelo controlo do robô. Ao fechar a malha entre o processo de selecção de acção e
o de percepção, o último é capaz de operar apenas onde é necessário, antecipando as acções do
robô.
Para fornecer atenção visual com propriedades auto-organizadas, este trabalho obtém inspi-
ração da Natureza. Concretamente, os mecanismos responsáveis pela capacidade que as formi-
gas guerreiras têm de procurar alimento de forma auto-organizada, são usados como metáfora
na resolução da tarefa de procurar, também de forma auto-organizada, obstáculos e trilhos no
campo visual do robô. A solução proposta nesta tese é a de colocar vários focos de atenção
encoberta a operar como um enxame, através de interacções baseadas em feromona.
Este trabalho representa a primeira realização corporizada de cognição de enxame. Este é
um novo campo de investigação que procura descobrir os princípios básicos da cognição, ins-
peccionando as propriedades auto-organizadas da inteligência colectiva exibida pelos insectos
sociais. Logo, esta tese contribui para a robótica como disciplina de engenharia e para a robótica
como disciplina de modelação, capaz de suportar o estudo do comportamento adaptável.Fundação para a Ciência e a Tecnologia (FCT,SFRH/BD/27305/2006); Laboratory of Agent Modelling (LabMag