37,392 research outputs found

    Social Scene Understanding: End-to-End Multi-Person Action Localization and Collective Activity Recognition

    Get PDF
    We present a unified framework for understanding human social behaviors in raw image sequences. Our model jointly detects multiple individuals, infers their social actions, and estimates the collective actions with a single feed-forward pass through a neural network. We propose a single architecture that does not rely on external detection algorithms but rather is trained end-to-end to generate dense proposal maps that are refined via a novel inference scheme. The temporal consistency is handled via a person-level matching Recurrent Neural Network. The complete model takes as input a sequence of frames and outputs detections along with the estimates of individual actions and collective activities. We demonstrate state-of-the-art performance of our algorithm on multiple publicly available benchmarks

    Event-based Face Detection and Tracking in the Blink of an Eye

    Full text link
    We present the first purely event-based method for face detection using the high temporal resolution of an event-based camera. We will rely on a new feature that has never been used for such a task that relies on detecting eye blinks. Eye blinks are a unique natural dynamic signature of human faces that is captured well by event-based sensors that rely on relative changes of luminance. Although an eye blink can be captured with conventional cameras, we will show that the dynamics of eye blinks combined with the fact that two eyes act simultaneously allows to derive a robust methodology for face detection at a low computational cost and high temporal resolution. We show that eye blinks have a unique temporal signature over time that can be easily detected by correlating the acquired local activity with a generic temporal model of eye blinks that has been generated from a wide population of users. We furthermore show that once the face is reliably detected it is possible to apply a probabilistic framework to track the spatial position of a face for each incoming event while updating the position of trackers. Results are shown for several indoor and outdoor experiments. We will also release an annotated data set that can be used for future work on the topic
    • …
    corecore