746 research outputs found

    Supervised and unsupervised segmentation of textured images by efficient multi-level pattern classification

    This thesis proposes new, efficient methodologies for supervised and unsupervised image segmentation based on texture information. For the supervised case, it proposes a pixel classification technique based on a multi-level strategy that iteratively refines the resulting segmentation. This strategy uses pattern recognition methods based on prototypes (determined by clustering algorithms) and support vector machines. To obtain the best performance, the thesis also includes an algorithm for automatic parameter selection and methods to reduce the computational cost of the segmentation process. For the unsupervised case, the methodology is adapted by adding an initial pattern discovery stage, which transforms the original unsupervised problem into a supervised one. Several sets of experiments on a wide variety of images validate the developed techniques.
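A minimal sketch of the prototype-based part of the supervised strategy described above, assuming each pixel already has a texture feature vector and each texture class has a few prototypes (for example, cluster centres). The names `nearest_prototype` and `classify_pixels`, the labels, and the toy values are hypothetical; the multi-level refinement and the support vector machine stage are not shown, and this is not the thesis's implementation.

```python
import math


def nearest_prototype(feature, prototypes):
    """Return the label of the prototype closest to `feature`.

    `prototypes` maps a texture label to a list of prototype vectors
    (assumed here to come from some clustering step).
    """
    best_label, best_dist = None, math.inf
    for label, vectors in prototypes.items():
        for proto in vectors:
            dist = math.dist(feature, proto)  # Euclidean distance
            if dist < best_dist:
                best_label, best_dist = label, dist
    return best_label


def classify_pixels(features, prototypes):
    """Label every pixel of a 2D grid of feature vectors."""
    return [[nearest_prototype(f, prototypes) for f in row] for row in features]


# Hypothetical two-texture example with 2D feature vectors.
prototypes = {"grass": [(0.1, 0.2), (0.2, 0.1)], "brick": [(0.9, 0.8)]}
features = [[(0.0, 0.1), (0.8, 0.9)],
            [(0.3, 0.2), (1.0, 0.7)]]
labels = classify_pixels(features, prototypes)
```

Here `labels` holds one texture label per pixel; a multi-level scheme would presumably re-run a classifier of this kind at finer resolutions near region boundaries.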

    Grounding semantics in robots for Visual Question Answering

    In this thesis I describe an operational implementation of an object detection and description system that is incorporated into an end-to-end Visual Question Answering system, and I evaluate it on two visual question answering datasets for compositional language and elementary visual reasoning

    Video metadata extraction in a videoMail system

    The world is swiftly adapting to visual communication. Online services such as YouTube and Vine show that video is no longer the domain of broadcast television alone. Video is used for entertainment, information, education and communication. The rapid growth of today’s video archives, for which editorial data is sparse, makes retrieval a major problem. Humans perceive a video as a complex interplay of cognitive concepts, so there is a need to build a bridge between numeric values and semantic concepts; this connection facilitates the retrieval of videos by humans. The critical aspect of this bridge is video annotation, which can be done manually or automatically. Manual annotation is tedious, subjective and expensive, so automatic annotation is being actively studied. This thesis focuses on the automatic annotation of multimedia content, namely the use of analysis techniques for information retrieval to automatically extract metadata from video in a videomail system. This includes the identification of text, people, actions, spaces and objects, including animals and plants. It thus becomes possible to align multimedia content with the text of the email message and to create applications for semantic video database indexing and retrieval

    Hypothesis-based image segmentation for object learning and recognition

    Denecke A. Hypothesis-based image segmentation for object learning and recognition. Bielefeld: Universität Bielefeld; 2010. This thesis addresses the figure-ground segmentation problem in the context of complex systems for automatic object recognition and for the online, interactive acquisition of visual representations. First, the problem of image segmentation is introduced in general terms, followed by its importance for object learning in current state-of-the-art systems. Second, a method using artificial neural networks is presented. This approach, based on Generalized Learning Vector Quantization, is investigated in challenging scenarios such as the real-time figure-ground segmentation of complex-shaped objects under continuously changing environmental conditions. The ability to fulfill these requirements characterizes the novelty of the approach compared to state-of-the-art methods. Finally, the technique is extended towards online adaptation of model complexity and the integration of several segmentation cues. This yields a framework for object segmentation that can improve current systems for visual object learning and recognition
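As a rough illustration of the Generalized Learning Vector Quantization criterion mentioned in the abstract above, the sketch below computes the standard GLVQ relative distance, mu(x) = (d_plus - d_minus) / (d_plus + d_minus), for one labelled sample. The function name, the "figure"/"ground" labels, and the toy prototypes are assumptions for illustration; the thesis's features, training procedure and extensions are not reproduced here.

```python
def glvq_relative_distance(x, label, prototypes):
    """Return mu(x) = (d_plus - d_minus) / (d_plus + d_minus).

    d_plus:  squared distance to the closest prototype with the same label.
    d_minus: squared distance to the closest prototype with a different label.
    mu lies in [-1, 1]; mu < 0 means x is closer to a correct prototype.
    """
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    d_plus = min(sq_dist(x, w) for w, c in prototypes if c == label)
    d_minus = min(sq_dist(x, w) for w, c in prototypes if c != label)
    return (d_plus - d_minus) / (d_plus + d_minus)


# Hypothetical prototypes: one for the background, one for the object.
prototypes = [((0.0, 0.0), "ground"), ((1.0, 1.0), "figure")]
mu = glvq_relative_distance((0.9, 0.8), "figure", prototypes)
```

A negative `mu` indicates that the sample would be assigned to the correct class; GLVQ training adjusts the prototypes to push this value down across the training set.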

    Animal sound classification using dissimilarity spaces

    The classifier system proposed in this work combines the dissimilarity spaces produced by a set of Siamese neural networks (SNNs), designed using four different backbones, with different clustering techniques to train support vector machines (SVMs) for automated animal audio classification. The system is evaluated on two animal audio datasets: one of cat vocalizations and one of bird vocalizations. The proposed approach uses clustering methods to determine a set of centroids (in both a supervised and an unsupervised fashion) from the spectrograms in the dataset. These centroids are used to generate the dissimilarity space through the Siamese networks. In addition to feeding the SNNs with spectrograms, experiments process the spectrograms using the heterogeneous auto-similarities of characteristics. Once the dissimilarity spaces are computed, each pattern is “projected” into the space to obtain a vector space representation; this descriptor is then passed to an SVM, which classifies a spectrogram by its dissimilarity vector. Results demonstrate that the proposed approach performs competitively (without ad hoc optimization of the clustering methods) on both animal vocalization datasets. To further demonstrate the power of the proposed system, the best standalone approach is also evaluated on the challenging Environmental Sound Classification (ESC50) dataset
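The projection step described above can be sketched as follows: a pattern is represented by the vector of its dissimilarities to each centroid. This is only an illustration under stated assumptions: Euclidean distance stands in for the dissimilarity that the Siamese networks would learn, the function name and toy centroids are hypothetical, and the SVM that consumes the vector is omitted.

```python
import math


def dissimilarity_vector(pattern, centroids, dissimilarity=math.dist):
    """Project `pattern` into the dissimilarity space spanned by `centroids`.

    Each output component is the dissimilarity between the pattern and one
    centroid. `dissimilarity` is a stand-in: the abstract's system uses the
    output of trained Siamese networks, not Euclidean distance.
    """
    return [dissimilarity(pattern, c) for c in centroids]


# Hypothetical 2D feature vectors standing in for spectrogram embeddings.
centroids = [(0.0, 0.0), (3.0, 4.0)]
vec = dissimilarity_vector((0.0, 0.0), centroids)
```

The resulting vector has one entry per centroid, so the number of centroids chosen by the clustering step fixes the dimensionality of the descriptor given to the classifier.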

    Content based retrieval of PET neurological images

    Medical image management has posed challenges to many researchers, especially when images have to be indexed and retrieved using visual content that is meaningful to clinicians. In this study, an image retrieval system has been developed for 3D brain PET (positron emission tomography) images. It has been found that PET neurological images can be retrieved based on their diagnostic status using only data pertaining to their content, predominantly the visual content. During the study, PET scans are spatially normalized using existing techniques, and their visual data is quantified. The mid-sagittal plane of each individual 3D PET scan is found and then used to detect abnormal asymmetries, such as tumours or physical injuries. All detected asymmetries are referenced to the Talairach and Tournoux anatomical atlas. The Cartesian coordinates of each detected lesion in Talairach space are used, along with the associated anatomical structure(s), as the indices within the content-based image retrieval system. The anatomical atlas is also used to isolate distinct anatomical areas that are related to a number of neurodegenerative disorders. After segmentation of the anatomical regions of interest, algorithms are applied to characterize the texture of brain intensity using Gabor filters and to compute the mean index ratio of activation levels. These measurements are combined to produce a single feature vector that is incorporated into the content-based image retrieval system. Experimental results on images with known diagnoses show that physical lesions such as head injuries and tumours can, to a certain extent, be detected correctly. Images with correctly detected and measured lesions are then retrieved from the image database when a query pertains to the measured locale. Images with neurodegenerative disorder patterns have been indexed and retrieved via texture-based features. For images from patients diagnosed with dementia, retrieval accuracy is increased by combining the texture feature and the mean index ratio value
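The asymmetry detection step in the abstract above can be illustrated with a small sketch, assuming a spatially normalized 2D slice whose mid-sagittal plane coincides with the central column. The function names, the threshold, and the toy intensities are hypothetical, and the study's actual detection method is not specified in the abstract.

```python
def asymmetry_map(slice_2d):
    """Absolute difference between each pixel and its left-right mirror.

    Assumes the mid-sagittal plane is the vertical axis through the centre
    of the slice, so column i mirrors onto column (width - 1 - i).
    """
    return [[abs(v - row[-1 - i]) for i, v in enumerate(row)] for row in slice_2d]


def asymmetric_pixels(slice_2d, threshold):
    """Return (row, column) positions whose asymmetry exceeds `threshold`."""
    amap = asymmetry_map(slice_2d)
    return [(r, c)
            for r, row in enumerate(amap)
            for c, diff in enumerate(row)
            if diff > threshold]


# Hypothetical slice: uniform intensity with one bright spot left of the midline.
slice_2d = [[5, 5, 5, 5],
            [5, 9, 5, 5],
            [5, 5, 5, 5]]
flagged = asymmetric_pixels(slice_2d, threshold=2)
```

Because the difference map is itself symmetric, both the abnormal position and its mirror image are flagged; deciding which side holds the lesion needs additional information, such as comparison against expected intensities.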