8 research outputs found

    A computational model of texture segmentation

    An algorithm for finding texture boundaries in images is developed on the basis of a computational model of human texture perception. The model consists of three stages: (1) the image is convolved with a bank of even-symmetric linear filters followed by half-wave rectification to give a set of responses; (2) inhibition, localized in space, within and among the neural response profiles suppresses weak responses where there are strong responses at the same or nearby locations; and (3) texture boundaries are detected as peaks in the gradients of the inhibited response profiles. The model is precisely specified, equally applicable to grey-scale and binary textures, and is motivated by detailed comparison with psychophysics and physiology. Its predictions of the discriminability of different texture pairs match experimental measurements of discriminability in human observers very well. From a machine-vision point of view, the scheme is a high-quality texture-edge detector that works equally well on images of artificial and natural scenes. The algorithm uses only simple, local, parallel operations, which makes it potentially real-time.
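The three stages described above can be sketched in a few lines. The filter bank, the inhibition threshold (here a fixed fraction of the strongest response at each location), and the pooling before the gradient step are all simplified illustrative choices, not the paper's exact parameters:

```python
import numpy as np

def convolve_same(image, kernel):
    # Same-size correlation with zero padding; for even-symmetric kernels
    # correlation and convolution coincide.
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image.astype(float), ((ph, ph), (pw, pw)))
    out = np.zeros(image.shape, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + image.shape[0], j:j + image.shape[1]]
    return out

def texture_boundaries(image, kernels, suppress=0.5):
    # Stage 1: filter with a bank of linear kernels and half-wave rectify,
    # keeping positive and negative lobes as separate response channels.
    responses = []
    for k in kernels:
        r = convolve_same(image, k)
        responses.append(np.maximum(r, 0.0))
        responses.append(np.maximum(-r, 0.0))
    responses = np.stack(responses)

    # Stage 2: spatially localized inhibition -- a response weaker than
    # `suppress` times the strongest response at that location is zeroed.
    strongest = responses.max(axis=0)
    inhibited = np.where(responses < suppress * strongest, 0.0, responses)

    # Stage 3: texture boundaries show up as peaks in the gradient
    # magnitude of the pooled inhibited responses.
    gy, gx = np.gradient(inhibited.sum(axis=0))
    return np.hypot(gx, gy)
```

Peaks of the returned gradient-magnitude map mark candidate texture edges.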

    Terrain Classification using Multiple Image Features

    A wide variety of image processing applications require segmentation and classification of images. The problem becomes complex when the images are obtained in an uncontrolled environment with non-uniform illumination. The selection of suitable features is a critical part of an image segmentation and classification process, where the basic objective is to identify image regions that are homogeneous but dissimilar to all spatially adjacent regions. This paper proposes an automatic method for the classification of a terrain using image features such as intensity, texture, and edge. The textural features are calculated using statistics of geometrical attributes of connected regions in a sequence of binary images obtained from a texture image. A pixel-wise image segmentation scheme using a multi-resolution pyramid is used to correct the segmentation process so as to obtain homogeneous image regions. Localisation of texture boundaries is done using a refined edge map obtained by convolution, thinning, thresholding, and linking. The individual regions are classified using a database generated from the features extracted from known samples of the actual terrain. The algorithm is used to classify airborne images of a terrain obtained from a sensor mounted on an aerial reconnaissance platform, and the results are presented.
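The classification step described above (intensity, texture, and edge features matched against a database built from known terrain samples) can be sketched as follows. The three toy features and the nearest-neighbour matching are illustrative stand-ins, not the paper's actual feature definitions:

```python
import numpy as np

def region_features(patch):
    # Toy per-region feature vector: mean intensity, variance as a crude
    # texture proxy, and edge density from the gradient magnitude.
    gy, gx = np.gradient(patch.astype(float))
    edges = np.hypot(gx, gy)
    return np.array([patch.mean(), patch.var(), (edges > edges.mean()).mean()])

def classify_region(patch, database):
    # Nearest-neighbour match against feature vectors of known samples;
    # `database` maps a terrain label to its reference feature vector.
    f = region_features(patch)
    return min(database, key=lambda label: np.linalg.norm(f - database[label]))
```

A homogeneous region is then labeled with the terrain class whose reference features lie closest in feature space.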

    Detecting and localizing edges composed of steps, peaks and roofs

    Caption title. Includes bibliographical references (p. 17-18). Research supported by the U.S. Army Research Office under grant DAAL01-86-K-0171. By Pietro Perona and Jitendra Malik.

    An intuitive model of perceptual grouping for HCI design

    Understanding and exploiting the abilities of the human visual system is an important part of the design of usable user interfaces and information visualizations. Good design enables quick, easy and veridical perception of the key components of that design. An important facet of human vision is its ability to perform "perceptual organization" seemingly effortlessly: it transforms individual feature estimates into the perception of coherent regions, structures, and objects. We perceive regions grouped by proximity and feature similarity, grouping of curves by good continuation, and grouping of regions of coherent texture. In this paper, we discuss a simple model for a broad range of perceptual grouping phenomena. It takes as input an arbitrary image and returns a structure describing the predicted visual organization of the image. We demonstrate that this model can capture aspects of traditional design rules and predicts visual percepts in classic perceptual grouping displays.
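Grouping by proximity, the first phenomenon listed above, can be illustrated with a minimal sketch: elements join the same group when they are transitively within some radius of one another. The fixed radius and the union-find bookkeeping are illustrative choices, not the paper's model:

```python
def group_by_proximity(points, radius):
    # Cluster 2-D points: two points share a group when they are within
    # `radius` of each other, directly or through a chain of neighbours.
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dx = points[i][0] - points[j][0]
            dy = points[i][1] - points[j][1]
            if dx * dx + dy * dy <= radius * radius:
                parent[find(i)] = find(j)  # merge the two groups

    clusters = {}
    for i in range(len(points)):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())
```

Each returned cluster is one predicted perceptual group of element indices.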

    Novel convolution kernels for computer vision and shape analysis based on electromagnetism

    Computer vision is a growing field with many new applications in automation and robotics, since it allows the analysis of images and shapes for the generation of numerical or analytical information. One of the most widely used methods of information extraction is image filtering through convolution kernels, with each kernel specialized for a specific application. The objective of this paper is to present novel convolution kernels, based on principles of electromagnetic potentials and fields, for general use in computer vision, and to demonstrate their usage for shape and stroke analysis. Such filtering possesses unique geometrical properties that can be interpreted using well-understood physics theorems. Therefore, this paper focuses on the development of the electromagnetic kernels and on their application to images for shape and stroke analysis. It also presents several interesting features of electromagnetic kernels, such as resolution, size, and orientation independence, robustness to noise and deformation, long-distance stroke interaction, and the ability to work with 3D images.
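As a rough illustration of the idea, pixel intensities can be treated as point charges whose potentials are summed over the whole image, so that distant strokes still interact. The 1/r law and the finite self-potential constant below are illustrative assumptions, not the paper's exact kernels:

```python
import numpy as np

def potential_field(charges):
    # Treat each nonzero pixel as a point charge and accumulate a 1/r
    # electric-potential-like field over the entire image. The effective
    # kernel support spans the whole image, giving the long-distance
    # interaction highlighted in the abstract.
    h, w = charges.shape
    yy, xx = np.mgrid[0:h, 0:w]
    field = np.zeros((h, w))
    for y, x in zip(*np.nonzero(charges)):
        r = np.hypot(xx - x, yy - y)
        r[y, x] = 0.5  # finite self-potential at the charge pixel itself
        field += charges[y, x] / r
    return field
```

The resulting field is smooth and falls off with distance, which is what makes thresholds on it usable for shape and stroke analysis.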

    Green Function and Electromagnetic Potential for Computer Vision and Convolutional Neural Network Applications

    For advanced computer vision (CV) tasks such as classification, scene segmentation, and salient object detection, extracting features from images is mandatory. One of the most widely used tools for feature extraction is the convolutional kernel, with each kernel specialized for detecting a specific feature. In recent years, the convolutional neural network (CNN) became the standard method of feature detection, since it allows thousands of kernels to be optimized at the same time. However, a limitation of the CNN is that all its kernels are small (usually between 3x3 and 7x7), which limits the receptive field. Another limitation is that feature merging is done via weighted additions and pooling, which cannot merge spatial-domain features with gradient-domain features, since these are not located at the same pixel coordinates. The objective of this thesis is to develop electromagnetic (EM) convolutions and Green's function (GF) convolutions for use in computer vision and in convolutional neural networks (CNN).
    These new kernels do not have the limitations of the standard CNN kernels: since they can be bigger than the image, they allow an unlimited receptive field and interaction between any pixels in the image. They allow merging spatial-domain features with gradient-domain features by integrating any vector field. Additionally, they can transform any vector field of features into its least-error conservative field, meaning that the field of features becomes smooth, irrotational, and conservative (line-integrable). First, we developed different symmetrical and asymmetrical convolutional kernels based on EM and GF that are both resolution and rotation invariant. Then, we developed the first method of determining the probability of being inside partial edges, which allows extrapolating thin edge features into the full 2D space. Furthermore, the current thesis proves that GF kernels are the least-error gradient and Laplacian solvers, and they are empirically demonstrated to be faster than the fastest competing method and easier to implement. Consequently, using the fast gradient solver, we developed the first method that directly combines edges with saliency maps in the gradient domain, then solves the gradient to go back to the saliency domain. The improvement of the saliency maps over the F-measure is on average 6.6 times better than the nearest competing algorithm on a selected dataset. Then, to improve the saliency maps further, we developed the DSS-GIS model, which combines edges with salient regions deep inside the network.
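The least-error gradient solve described above can be illustrated with a generic spectral Poisson solver (Frankot-Chellappa-style FFT integration), used here as a stand-in for the thesis's Green's-function kernels: given any vector field (gx, gy), it recovers the scalar image whose gradient is closest to that field in the least-squares sense.

```python
import numpy as np

def integrate_gradient(gx, gy):
    # Least-squares integration of a (possibly non-conservative) vector
    # field: divide by the Laplacian's spectrum in the Fourier domain.
    # This is a standard spectral Poisson solve, not the thesis's exact
    # Green's-function formulation.
    h, w = gx.shape
    wx = 2 * np.pi * np.fft.fftfreq(w).reshape(1, w)
    wy = 2 * np.pi * np.fft.fftfreq(h).reshape(h, 1)
    Gx, Gy = np.fft.fft2(gx), np.fft.fft2(gy)
    denom = wx ** 2 + wy ** 2
    denom[0, 0] = 1.0  # avoid 0/0 at the DC term
    F = (-1j * wx * Gx - 1j * wy * Gy) / denom
    F[0, 0] = 0.0      # the mean of the result is unconstrained; pin it to 0
    return np.real(np.fft.ifft2(F))
```

Because the division projects the input onto conservative fields, the output is exactly the smooth, line-integrable field described in the abstract, recovered up to an additive constant.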

    Attentional Selection in Object Recognition

    A key problem in object recognition is selection, namely, the problem of identifying regions in an image within which to start the recognition process, ideally by isolating regions that are likely to come from a single object. Such a selection mechanism has been found to be crucial in reducing the combinatorial search involved in the matching stage of object recognition. Even though selection helps recognition, the problem has largely remained unsolved because of the difficulty of isolating regions belonging to objects under complex imaging conditions involving occlusions, changing illumination, and varying object appearances. This thesis presents a novel approach to the selection problem by proposing a computational model of visual attentional selection as a paradigm for selection in recognition. In particular, it proposes two modes of attentional selection, namely, "attracted" and "pay attention" modes, as being appropriate for data-driven and model-driven selection in recognition. An implementation of this model has led to new ways of extracting color, texture and line group information in images, and to their subsequent use in isolating areas of the scene likely to contain the model object. Among the specific results in this thesis are: a method of specifying color by perceptual color categories for fast color region segmentation and color-based localization of objects, and a result showing that the recognition of texture patterns on model objects is possible under changes in orientation and occlusions without detailed segmentation. The thesis also presents an evaluation of the proposed model by integrating it with a 3D-from-2D object recognition system and recording the improvement in performance.
    These results indicate that attentional selection can significantly reduce the computational bottleneck in object recognition, both through a reduction in the number of features and through a reduction in the number of matches during recognition using the information derived during selection. Finally, these studies revealed a surprising additional use of selection, namely, in the partial solution of the pose of a 3D object.
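The idea of specifying color by perceptual color categories can be sketched as a coarse hue quantizer: each pixel maps to one of a handful of named categories, which makes region segmentation a cheap label comparison. The category names and hue boundaries below are illustrative assumptions, not the thesis's calibrated categories:

```python
def color_category(r, g, b):
    # Map one RGB pixel (components in [0, 1]) to a coarse perceptual
    # colour category. Achromatic pixels are split off by low saturation;
    # chromatic ones are binned by hue.
    mx, mn = max(r, g, b), min(r, g, b)
    if mx - mn < 0.1:
        return "white" if mx >= 0.8 else "gray"
    # Standard RGB-to-hue conversion, in degrees.
    if mx == r:
        h = (60 * (g - b) / (mx - mn)) % 360
    elif mx == g:
        h = 60 * (b - r) / (mx - mn) + 120
    else:
        h = 60 * (r - g) / (mx - mn) + 240
    bins = [("red", 0, 30), ("yellow", 30, 90), ("green", 90, 150),
            ("cyan", 150, 210), ("blue", 210, 270),
            ("magenta", 270, 330), ("red", 330, 360)]
    for name, lo, hi in bins:
        if lo <= h < hi:
            return name
    return "red"
```

Two pixels then belong to the same color region whenever their category labels match, with no per-pixel distance computation.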

    Developing an online support tool to assist students in higher education with project proposals

    The research presented in this thesis investigates ways to assist students with writing their project proposals. There is limited literature on the problems students have when writing project proposals in Higher Education. In particular, most of the literature has concentrated on the writing aspects, rhetorical aspects, and structure of a scientific article. Even though various studies on the assessment of undergraduate individual and group project work have been done, the project proposal has not been given much attention. Therefore, assessment of the proposal stage of the undergraduate final year project became the focus of this study, conducted over three years. This three-phase study directly involved the three main stakeholders (students, supervisors and coordinators) in the overall process. In Phase 1, the existence of proposal problems was investigated and identified from the perceptions of students and supervisors, and possible solutions to those problems were identified. In Phase 2, I gathered the requirements of the stakeholders, which provided the framework for, and initiated, the design and development of an eGuide, a self-paced online guide; the implementation and evaluation of the eGuide were then conducted in this phase. Finally, in Phase 3, the study emphasised improvement to practice, focusing on the Degree final year project by utilizing the cyclic approach of action research. Questionnaires and focus groups were used to gather information from students and supervisors, both to identify the problems they perceived with the student project proposal process and to assess the effectiveness of the online support tool, the eGuide. In the development of the eGuide, it proved necessary to design and pilot a robust rubric for students and supervisors to structure the project proposal process.
    The eGuide was evaluated for its effectiveness by the various users, followed by an action research approach to make further improvements to the Degree final year project curriculum. The assessment criteria evolved further to become a marking template with a very effective feedback tool. The study had a stimulating effect on practice, shaping how supervision of the project proposal was conducted and how the project proposal was assessed. The practical outcomes of the study ultimately benefit not only the students, who were the focus in the first place, but also the supervisors and coordinators. The study provides further avenues for future research in this area.