8 research outputs found

    Object Category Detection using Audio-visual Cues

    Categorization is one of the fundamental building blocks of cognitive systems. Object categorization has traditionally been addressed in the vision domain, even though cognitive agents are intrinsically multimodal. Indeed, biological systems combine several modalities in order to achieve robust categorization. In this paper we propose a multimodal approach to object category detection, using audio and visual information. The auditory channel is modeled on biologically motivated spectral features via a discriminative classifier. The visual channel is modeled by a state-of-the-art part-based model. Multimodality is achieved using two fusion schemes, one high-level and the other low-level. Experiments on six different object categories, under increasingly difficult conditions, show strengths and weaknesses of the two approaches, and clearly underline the open challenges for multimodal category detection.

    Efficient learning of relational object class models

    Efficient Learning of Relational Object Class Models

    We present an efficient method for learning part-based object class models. The models include location and scale relations between parts, as well as part appearance. Models are learnt from raw object and background images, represented as an unordered set of features extracted using an interest point detector. The object class is generatively modeled using a simple Bayesian network with a central hidden node containing location and scale information, and nodes describing object parts. The model’s parameters, however, are optimized to reduce a loss function which reflects training error, as in discriminative methods. Specifically, the optimization is done using a boosting-like technique with complexity linear in the number of parts and the number of features per image. This efficiency allows our method to learn relational models with many parts and features, and leads to improved results when compared with other methods. Extensive experimental results are described, using some common benchmark datasets and three sets of newly collected data, showing the relative advantage of our method.