249 research outputs found

    Fast PRISM: Branch and Bound Hough Transform for Object Class Detection

    Get PDF
    This paper addresses the task of efficient object class detection by means of the Hough transform. This approach has been made popular by the Implicit Shape Model (ISM) and has been adopted many times. Although ISM exhibits robust detection performance, its probabilistic formulation is unsatisfactory. The PRincipled Implicit Shape Model (PRISM) overcomes these problems by interpreting Hough voting as a dual implementation of linear sliding-window detection. It thereby gives a sound justification to the voting procedure and imposes minimal constraints. We demonstrate PRISM's flexibility by two complementary implementations: a generatively trained Gaussian Mixture Model as well as a discriminatively trained histogram approach. Both systems achieve state-of-the-art performance. Detections are found by gradient-based or branch and bound search, respectively. The latter greatly benefits from PRISM's feature-centric view. It thereby avoids the unfavourable memory trade-off and any on-line pre-processing of the original Efficient Subwindow Search (ESS). Moreover, our approach takes account of the features' scale value while ESS does not. Finally, we show how to avoid soft-matching and spatial pyramid descriptors during detection without losing their positive effect. This makes algorithms simpler and faster. Both are possible if the object model is properly regularised and we discuss a modification of SVMs which allows for doing s

    Spotlight the Negatives: A Generalized Discriminative Latent Model

    Full text link
    Discriminative latent variable models (LVM) are frequently applied to various visual recognition tasks. In these systems the latent (hidden) variables provide a formalism for modeling structured variation of visual features. Conventionally, latent variables are de- fined on the variation of the foreground (positive) class. In this work we augment LVMs to include negative latent variables corresponding to the background class. We formalize the scoring function of such a generalized LVM (GLVM). Then we discuss a framework for learning a model based on the GLVM scoring function. We theoretically showcase how some of the current visual recognition methods can benefit from this generalization. Finally, we experiment on a generalized form of Deformable Part Models with negative latent variables and show significant improvements on two different detection tasks.Comment: Published in proceedings of BMVC 201

    Recognizing flu-like symptoms from videos

    Get PDF
    © 2014 Hue Thi et al.; licensee BioMed Central Ltd. Background: Vision-based surveillance and monitoring is a potential alternative for early detection of respiratory disease outbreaks in urban areas complementing molecular diagnostics and hospital and doctor visit-based alert systems. Visible actions representing typical flu-like symptoms include sneeze and cough that are associated with changing patterns of hand to head distances, among others. The technical difficulties lie in the high complexity and large variation of those actions as well as numerous similar background actions such as scratching head, cell phone use, eating, drinking and so on. Results: In this paper, we make a first attempt at the challenging problem of recognizing flu-like symptoms from videos. Since there was no related dataset available, we created a new public health dataset for action recognition that includes two major flu-like symptom related actions (sneeze and cough) and a number of background actions. We also developed a suitable novel algorithm by introducing two types of Action Matching Kernels, where both types aim to integrate two aspects of local features, namely the space-time layout and the Bag-of-Words representations. In particular, we show that the Pyramid Match Kernel and Spatial Pyramid Matching are both special cases of our proposed kernels. Besides experimenting on standard testbed, the proposed algorithm is evaluated also on the new sneeze and cough set. Empirically, we observe that our approach achieves competitive performance compared to the state-of-the-arts, while recognition on the new public health dataset is shown to be a non-trivial task even with simple single person unobstructed view. Conclusions: Our sneeze and cough video dataset and newly developed action recognition algorithm is the first of its kind and aims to kick-start the field of action recognition of flu-like symptoms from videos. It will be challenging but necessary in future developments to consider more complex real-life scenario of detecting these actions simultaneously from multiple persons in possibly crowded environments
    • …
    corecore