136 research outputs found
Mesh and Pyramid Algorithms for Iconic Indexing
In this paper, parallel algorithms on meshes and pyramids for iconic indexing are presented. Our algorithms are asymptotically superior to previously known parallel algorithms
Affine invariant Matching Pursuit-based shape representation and recognition using scale-space
In this paper, we propose an analytical low-level representation of images, obtained by a decomposition process, namely the matching pursuit (MP) algorithm, as a new way of describing objects through a general continuous description using an affine invariant dictionary of basis functions. This description is used to recognize objects in images. In the learning phase, a template object is decomposed, and the extracted subset of basis functions, called a meta-atom, gives the description of our object. We then naturally extend this description into the linear scale-space using the definition of our basis functions, thus obtaining a more general representation of our object. We use this enhanced description as a predefined dictionary of the object to conduct an MP-based shape recognition (MPSR) task in the linear scale-space. The introduction of the scale-space approach improves the robustness of our method and makes it possible to avoid the local minima problems encountered when minimizing a non-convex energy function. We show results for the detection of complex synthetic shapes, as well as natural (aerial and medical) images
Observations on Cortical Mechanisms for Object Recognition and Learning
This paper sketches a hypothetical cortical architecture for visual 3D object recognition based on a recent computational model. The view-centered scheme relies on modules for learning from examples, such as Hyperbf-like networks. Such models capture a class of explanations we call Memory-Based Models (MBM) that contains sparse population coding, memory-based recognition, and codebooks of prototypes. Unlike the sigmoidal units of some artificial neural networks, the units of MBMs are consistent with the description of cortical neurons. We describe how an example of MBM may be realized in terms of cortical circuitry and biophysical mechanisms, consistent with psychophysical and physiological data
Variable Resolution & Dimensional Mapping For 3d Model Optimization
Three-dimensional computer models, especially geospatial architectural data sets, can be visualized in the same way humans experience the world, providing a realistic, interactive experience. Scene familiarization, architectural analysis, scientific visualization, and many other applications would benefit from finely detailed, high resolution, 3D models. Automated methods to construct these 3D models have traditionally produced data sets that are often low fidelity or inaccurate; otherwise, the models are initially highly detailed, but are very labor- and time-intensive to construct. Such data sets are often not practical for common real-time usage and are not easily updated. This thesis proposes Variable Resolution & Dimensional Mapping (VRDM), a methodology that has been developed to address some of the limitations of existing approaches to model construction from images. Key components of VRDM are texture palettes, which enable variable and ultra-high resolution images to be easily composited, and texture features, which allow image features to be integrated as image or geometry and have the ability to modify the geometric model structure to add detail. These components support a primary VRDM objective of facilitating model refinement with additional data. This can be done until the desired fidelity is achieved as practical limits of infinite detail are approached. Texture Levels, the third component, enable real-time interaction with a very detailed model, along with the flexibility of having alternate pixel data for a given area of the model; this is achieved through extra dimensions. Together these techniques have been used to construct models that can contain GBs of imagery data
Matching pursuit-based shape representation and recognition using scale-space
In this paper, we propose an analytical low-level representation of images, obtained by a decomposition process, namely the matching pursuit (MP) algorithm, as a new way of describing objects through a general continuous description using an affine invariant dictionary of basis functions (BFs). This description is used to recognize multiple objects in images. In the learning phase, a template object is decomposed, and the extracted subset of BFs, called a meta-atom, gives the description of the object. This description is then naturally extended into the linear scale-space using the definition of our BFs, thus providing a more general representation of the object. We use this enhanced description as a predefined dictionary of the object to conduct an MP-based shape recognition task in the linear scale-space. The introduction of the scale-space approach improves the robustness of our method: we avoid local minima issues encountered when minimizing a nonconvex energy function. We show results for the detection of complex synthetic shapes, as well as real world (aerial and medical) images. © 2007 Wiley Periodicals, Inc. Int J Imaging Syst Technol, 16, 162-180, 200
Development of an Autonomous Visual Perception System for Robots Using Object-Based Visual Attention
Contributions to the Content-Based Image Retrieval Using Pictorial Queries
Widespread access to digital cameras, personal computers, and the Internet has led to the creation of large volumes of data in digital format. In this context, tools designed to organize information and make it easier to search are becoming increasingly relevant. Images are a particular case of data that require specific description and indexing techniques. The area of computer vision that studies these techniques is called Content-Based Image Retrieval (CBIR). CBIR systems do not use text-based descriptions; instead, they rely on features extracted from the images themselves. In contrast to the more than 6000 languages spoken in the world, descriptions based on visual features represent a universal means of expression. Intense research in the field of CBIR systems has been applied in very diverse areas of knowledge. CBIR applications have thus been developed in connection with medicine, intellectual property protection, journalism, graphic design, Internet information search, preservation of cultural heritage, etc. One of the important points of a CBIR application lies in the design of the user functions. The user is responsible for formulating the queries from which the image search is performed. We have focused on those systems in which the query is formulated from a pictorial representation. We have proposed a taxonomy of query systems composed of four different paradigms: Query-by-Selection, Query-by-Iconic-Composition, Query-by-Sketch, and Query-by-Illustration. Each paradigm incorporates a different level of expressive potential for the user. From the simple selection of an image to the creation of a color illustration, it is the user who takes control of the system's input data.
Throughout the chapters of this thesis we have analyzed the influence that each query paradigm exerts on the internal processes of a CBIR system. In this way we have also proposed a set of contributions, which we have exemplified from a practical point of view by means of a final application