6 research outputs found

    Indoor Outdoor Scene Classification in Digital Images

    In this paper, we present a method for classifying real-world digital images into indoor and outdoor scenes. The indoor class consists of four groups: bedroom, kitchen, laboratory and library. The outdoor class consists of four groups: landscape, roads, buildings and garden. The application targets a real-time system and uses a dedicated dataset. Input images are pre-processed: they are converted to grayscale and resized to 128x128 pixels. The pre-processed images are passed to Gabor filters, whose transfer functions are pre-computed and applied in the Fourier domain. The filtered signal is then passed to GIST feature extraction, and the images are classified using a kNN classifier. Most of the techniques have been based on the use of texture and color-space features. To date, we have achieved 80% accuracy in image classification.
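
    The pipeline described in this abstract can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation. The function names, the nearest-neighbour resizing, the filter-bank parameters (4 scales, 8 orientations, centre frequencies and bandwidths), the 4x4 averaging grid and k = 3 are all assumptions. Only the grayscale conversion, the 128x128 size, Gabor transfer functions applied in the Fourier domain, GIST-style features and kNN classification come from the abstract.

```python
import numpy as np

def to_gray_128(img, size=128):
    """Convert an RGB (H, W, 3) or gray (H, W) array to a size x size gray image."""
    img = np.asarray(img, dtype=np.float64)
    if img.ndim == 3:
        # ITU-R BT.601 luma weights for RGB -> gray
        img = img[..., :3] @ np.array([0.299, 0.587, 0.114])
    # Nearest-neighbour resampling (an assumption; any resizer would do)
    rows = np.arange(size) * img.shape[0] // size
    cols = np.arange(size) * img.shape[1] // size
    return img[np.ix_(rows, cols)]

def gabor_bank(size=128, n_scales=4, n_orient=8):
    """Precompute Gabor-like transfer functions directly in the Fourier domain."""
    fy, fx = np.meshgrid(np.fft.fftfreq(size), np.fft.fftfreq(size), indexing="ij")
    radius = np.hypot(fx, fy)
    angle = np.arctan2(fy, fx)
    bank = []
    for s in range(n_scales):
        f0 = 0.3 / (2 ** s)  # centre frequency in cycles/pixel (assumed values)
        for o in range(n_orient):
            theta = np.pi * o / n_orient
            # Angular distance to the filter orientation, wrapped to (-pi, pi]
            dtheta = np.angle(np.exp(1j * (angle - theta)))
            radial = np.exp(-((radius - f0) ** 2) / (2 * (0.35 * f0) ** 2))
            angular = np.exp(-(dtheta ** 2) / (2 * (np.pi / n_orient) ** 2))
            bank.append(radial * angular)
    return np.stack(bank)  # shape: (n_scales * n_orient, size, size)

def gist_descriptor(gray, bank, grid=4):
    """Filter in the Fourier domain, then average response magnitudes on a grid."""
    spectrum = np.fft.fft2(gray)
    responses = np.abs(np.fft.ifft2(spectrum[None] * bank))
    n, size, _ = responses.shape
    cell = size // grid
    blocks = responses.reshape(n, grid, cell, grid, cell).mean(axis=(2, 4))
    return blocks.ravel()  # length: n_filters * grid * grid

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training descriptors (Euclidean distance)."""
    dist = np.linalg.norm(train_x - query, axis=1)
    nearest = train_y[np.argsort(dist)[:k]]
    labels, counts = np.unique(nearest, return_counts=True)
    return labels[np.argmax(counts)]
```

    With these assumed settings each image yields a descriptor of 32 filters x 16 grid cells = 512 values. The bank is computed once and reused for every image, which is the point of pre-computing the transfer functions.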

    Understanding near-duplicate videos: a user-centric approach

    Popular content on video sharing web sites (e.g., YouTube) is usually duplicated. Most scholars define near-duplicate video clips (NDVC) based on non-semantic features (e.g., different image/audio quality), while a few also include semantic features (different videos of similar content). However, it is unclear what features contribute to the human perception of similar videos. Findings of two large-scale online surveys (N = 1003) confirm the relevance of both types of features. While some of our findings confirm the adopted definitions of NDVC, other findings are surprising. For example, videos that vary in visual content, by overlaying or inserting additional information, may not be perceived as near-duplicate versions of the original videos. Conversely, two different videos with distinct sounds, people, and scenarios were considered to be NDVC because they shared the same semantics (none of the pairs had additional information). Furthermore, the exact role played by semantics in relation to the features that make videos alike is still an open question. In most cases, participants preferred to see only one of the NDVC in the search results of a video search query, and they were more tolerant of changes in the audio track than in the video track. Finally, we propose a user-centric NDVC definition and present implications for how duplicate content should be dealt with by video sharing websites.

    Probabilistic methods based on visual words for semantic place recognition by a mobile robot.

    Human beings naturally organize their everyday space into discrete units. These units, called "semantic places", are characterized by their spatial extent and their functional unity, which distinguishes this field from conventional mapping work. For example, we are able to quickly recognize a given place (e.g. office 205) and its category (i.e. an office) solely from its visual appearance. Recent work in semantic place recognition seeks to endow robots with similar capabilities. Contrary to classical localization and mapping work, this problem is usually tackled as a supervised learning problem. Our contributions are twofold. First, we combine global image characterization, which captures the global organization of the image and is robust to local variations in the appearance of places, with visual-word methods, which rely on unsupervised classification of local descriptors. Our second, closely related contribution is to exploit the image stream provided by the robot by using Bayesian methods for temporal integration. Our first model does not use the natural temporal ordering of the images. Its temporal integration is therefore very simple, but it has difficulty detecting when the robot moves from one place to another. We thus develop several mechanisms to detect place transitions; these mechanisms are simple and do not require additional learning. A second model augments the classical Bayesian filtering approach by using the local order among images. We compare our methods to state-of-the-art algorithms on place (instance) recognition and place categorization tasks, using several datasets. We study the influence of system parameters and compare the different global characterization methods on the same dataset. These experiments show that our approach, while simple, leads to better results than the state of the art, especially on the place categorization task.
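
    The two kinds of temporal integration mentioned in this abstract can be illustrated with a small sketch. It is written under stated assumptions and is not the thesis's model: the function names, the toy likelihood values and the "sticky" transition matrix are hypothetical, and the second function is only the classical discrete Bayes filter, not the order-aware extension the abstract refers to. Per-image likelihoods p(image | place) are assumed to be given and strictly positive.

```python
import math

def normalize(weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]

def integrate_unordered(likelihoods, prior):
    """Posterior over places from the product of per-image likelihoods.

    The product does not depend on the order of the images.
    likelihoods[t][k] = p(image t | place k), assumed > 0.
    """
    log_post = [math.log(p) for p in prior]
    for lik in likelihoods:
        log_post = [lp + math.log(l) for lp, l in zip(log_post, lik)]
    top = max(log_post)  # subtract the max before exponentiating, for stability
    return normalize([math.exp(lp - top) for lp in log_post])

def bayes_filter(likelihoods, prior, transition):
    """Classical discrete Bayes filter.

    transition[i][j] = p(place j at time t | place i at time t-1).
    Returns the belief over places after each image.
    """
    belief = list(prior)
    history = []
    for lik in likelihoods:
        k = len(belief)
        # Prediction step: propagate the belief through the transition model
        predicted = [sum(belief[i] * transition[i][j] for i in range(k))
                     for j in range(k)]
        # Update step: weight by the likelihood of the current image
        belief = normalize([p * l for p, l in zip(predicted, lik)])
        history.append(belief)
    return history

# Toy sequence: three images that favour place 0, then three that favour place 1.
likelihoods = [[0.8, 0.2]] * 3 + [[0.2, 0.8]] * 3
prior = [0.5, 0.5]
sticky = [[0.9, 0.1], [0.1, 0.9]]  # assumed: the robot usually stays in place

unordered = integrate_unordered(likelihoods, prior)
beliefs = bayes_filter(likelihoods, prior, sticky)
```

    On this toy sequence the order-free posterior ends up undecided between the two places, because the evidence for each cancels out, while the filter's belief follows the move from place 0 to place 1. This is one way to see why a model that ignores image order struggles at place transitions.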

    Towards the introduction of human perception in a natural scene classification system

    In this paper we develop a method to optimize a machine-based semantic categorization of natural images according to human perception. First, the categories are determined through a psychophysical experiment. The similarity matrices obtained from human responses are analyzed by a multidimensional scaling technique called Curvilinear Component Analysis (CCA). The same is done with an automatic image indexing system based on similarities between the outputs of Gabor filters applied to the images. Then we show that, by using the human categorization to balance the filter outputs, the system's performance may be significantly improved.
