4 research outputs found

    Nearest-neighbor based metric functions for indoor scene recognition

    Get PDF
    Ankara : The Department of Computer Engineering and the Graduate School of Engineering and Science of Bilkent University, 2011.Thesis (Master's) -- Bilkent University, 2011.Includes bibliographical references leaves 39-44.Indoor scene recognition is a challenging problem in the classical scene recognition domain due to the severe intra-class variations and inter-class similarities of man-made indoor structures. State-of-the-art scene recognition techniques such as capturing holistic representations of an image demonstrate low performance on indoor scenes. Other methods that introduce intermediate steps such as identifying objects and associating them with scenes have the handicap of successfully localizing and recognizing the objects in a highly cluttered and sophisticated environment. We propose a classi cation method that can handle such di culties of the problem domain by employing a metric function based on the nearest-neighbor classi cation procedure using the bag-of-visual words scheme, the so-called codebooks. Considering the codebook construction as a Voronoi tessellation of the feature space, we have observed that, given an image, a learned weighted distance of the extracted feature vectors to the center of the Voronoi cells gives a strong indication of the image's category. Our method outperforms state-of-the-art approaches on an indoor scene recognition benchmark and achieves competitive results on a general scene dataset, using a single type of descriptor. In this study although our primary focus is indoor scene categorization, we also employ the proposed metric function to create a baseline implementation for the auto-annotation problem. With the growing amount of digital media, the problem of auto-annotating images with semantic labels has received signi cant interest from researches in the last decade. Traditional approaches where such content is manually tagged has been found to be too tedious and a time-consuming process. Hence, succesfully labeling images with keywords describing the semantics is a crucial task yet to be accomplished.Çakır, FatihM.S

    Nearest-Neighbor based Metric Functions for indoor scene recognition

    Get PDF
    Indoor scene recognition is a challenging problem in the classical scene recognition domain due to the severe intra-class variations and inter-class similarities of man-made indoor structures. State-of-the-art scene recognition techniques such as capturing holistic representations of an image demonstrate low performance on indoor scenes. Other methods that introduce intermediate steps such as identifying objects and associating them with scenes have the handicap of successfully localizing and recognizing the objects in a highly cluttered and sophisticated environment. We propose a classification method that can handle such difficulties of the problem domain by employing a metric function based on the Nearest-Neighbor classification procedure using the bag-of-visual words scheme, the so-called codebooks. Considering the codebook construction as a Voronoi tessellation of the feature space, we have observed that, given an image, a learned weighted distance of the extracted feature vectors to the center of the Voronoi cells gives a strong indication of the image's category. Our method outperforms state-of-the-art approaches on an indoor scene recognition benchmark and achieves competitive results on a general scene dataset, using a single type of descriptor. © 2011 Elsevier Inc. All rights reserved

    Contribution to study and implementation of a bio-inspired perception system based on visual and auditory attention

    Get PDF
    The main goal of these researches is the design of one artificial perception system allowing to identify events or scenes in a complex environment. The work carried out during this thesis focused on the study and the conception of a bio-inspired perception system based on the both visual and auditory saliency. The main contributions of this thesis are auditory saliency with sound recognition and visual saliency with object recognition. The auditory saliency is computed by merging information from the both temporal and spectral signals with a saliency map of a spectrogram. The visual perception system is based on visual saliency and recognition of foreground object. In addition, the originality of the proposed approach is the possibility to do an evaluation of the coherence between visual and auditory observations using the obtained information from the features extracted from both visual and auditory patters. The experimental results have proven the interest of this method in the framework of scene identification in a complex environmentL'objectif principal de cette thèse porte sur la conception d'un système de perception artificiel permettant d'identifier des scènes ou évènements pertinents dans des environnements complexes. Les travaux réalisés ont permis d'étudier et de mettre en œuvre d'un système de perception bio-inspiré basé sur l'attention visuelle et auditive. Les principales contributions de cette thèse concernent la saillance auditive associée à une identification des sons et bruits environnementaux ainsi que la saillance visuelle associée à une reconnaissance d'objets pertinents. La saillance du signal sonore est calculée en fusionnant des informations extraites des représentations temporelles et spectrales du signal acoustique avec une carte de saillance visuelle du spectrogramme du signal concerné. Le système de perception visuelle est quant à lui composé de deux mécanismes distincts. Le premier se base sur des méthodes de saillance visuelle et le deuxième permet d'identifier l'objet en premier plan. D'autre part, l'originalité de notre approche est qu'elle permet d'évaluer la cohérence des observations en fusionnant les informations extraites des signaux auditifs et visuels perçus. Les résultats expérimentaux ont permis de confirmer l'intérêt des méthodes utilisées dans le cadre de l'identification de scènes pertinentes dans un environnement complex
    corecore