
    Vision-based toddler tracking at home

    This paper presents a vision-based toddler tracking system for detecting risk factors of a toddler's fall within the home environment. The risk factors have environmental and behavioral aspects and the research in this paper focuses on the behavioral aspects. Apart from common image processing tasks such as background subtraction, the vision-based toddler tracking involves human classification, acquisition of motion and position information, and handling of regional merges and splits. The human classification is based on dynamic motion vectors of the human body. The center of mass of each contour is detected and connected with the closest center of mass in the next frame to obtain position, speed, and directional information. This tracking system is further enhanced by dealing with regional merges and splits due to multiple object occlusions. In order to identify the merges and splits, two directional detections of closest region centers are conducted between every two successive frames. Merges and splits of a single object due to errors in the background subtraction are also handled. The tracking algorithms have been developed, implemented and tested
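The centre-of-mass linking and the two-directional closest-centre check described in this abstract can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function names (`nearest`, `link_frames`, `motion`), the use of plain Euclidean distance, and the rule "several regions pointing at the same centre means a merge or a split" are all hypothetical choices made for the example.

```python
import math


def nearest(point, candidates):
    """Index of the candidate centre closest to `point` (Euclidean distance)."""
    return min(range(len(candidates)),
               key=lambda j: math.dist(point, candidates[j]))


def link_frames(prev, curr):
    """Two-directional nearest-centre matching between successive frames.

    prev, curr: lists of (x, y) centres of mass, one per detected region.
    Returns (links, merges, splits):
      links  - {prev_index: curr_index} for mutual one-to-one matches
      merges - {curr_index: [prev_indices]} when several previous regions
               point forward to the same current region
      splits - {prev_index: [curr_indices]} when several current regions
               point backward to the same previous region
    """
    if not prev or not curr:
        return {}, {}, {}

    # Forward pass: each previous centre picks its closest current centre.
    forward = [nearest(p, curr) for p in prev]
    # Backward pass: each current centre picks its closest previous centre.
    backward = [nearest(c, prev) for c in curr]

    merges = {}
    for i, j in enumerate(forward):
        merges.setdefault(j, []).append(i)
    merges = {j: src for j, src in merges.items() if len(src) > 1}

    splits = {}
    for j, i in enumerate(backward):
        splits.setdefault(i, []).append(j)
    splits = {i: dst for i, dst in splits.items() if len(dst) > 1}

    # A plain link is a mutual match that is part of neither a merge nor a split.
    links = {i: j for i, j in enumerate(forward)
             if backward[j] == i and j not in merges and i not in splits}
    return links, merges, splits


def motion(p, c, dt=1.0):
    """Speed and direction (radians) of a centre moving from p to c in time dt."""
    dx, dy = c[0] - p[0], c[1] - p[1]
    return math.hypot(dx, dy) / dt, math.atan2(dy, dx)
```

For example, `link_frames([(0, 0), (10, 0)], [(5, 0)])` reports one merge, because both previous centres select the single current centre. A real tracker would also need a distance threshold and a way to handle regions entering or leaving the scene, which this sketch omits.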

    A Causal And-Or Graph Model for Visibility Fluent Reasoning in Tracking Interacting Objects

    Tracking humans who are interacting with other subjects or the environment remains unsolved in visual tracking, because the visibility of the humans of interest in videos is unknown and might vary over time. In particular, it is still difficult for state-of-the-art human trackers to recover complete human trajectories in crowded scenes with frequent human interactions. In this work, we consider the visibility status of a subject as a fluent variable, whose change is mostly attributed to the subject's interaction with its surroundings, e.g., crossing behind another object, entering a building, or getting into a vehicle. We introduce a Causal And-Or Graph (C-AOG) to represent the cause-effect relations between an object's visibility fluent and its activities, and develop a probabilistic graph model to jointly reason about visibility fluent changes (e.g., from visible to invisible) and track humans in videos. We formulate this joint task as an iterative search for a feasible causal graph structure that enables fast search algorithms, e.g., dynamic programming. We apply the proposed method to challenging video sequences to evaluate its capability to estimate visibility fluent changes of subjects and to track subjects of interest over time. Results with comparisons demonstrate that our method outperforms the alternative trackers and can recover complete trajectories of humans in complicated scenarios with frequent human interactions. Comment: accepted by CVPR 201
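To give a feel for how dynamic programming can recover a sequence of visibility states over time, here is a deliberately simplified sketch. It is not the C-AOG model from the abstract: it assumes a plain two-state chain (visible or invisible) with hypothetical transition probabilities and a hypothetical per-frame detector confidence, and decodes the most likely state sequence with the standard Viterbi recursion.

```python
import math

STATES = ("visible", "invisible")


def viterbi(obs_loglik, log_trans, log_prior):
    """Most likely state sequence for a two-state chain.

    obs_loglik: list of {state: log-likelihood} dicts, one per frame
    log_trans:  {prev_state: {next_state: log-probability}}
    log_prior:  {state: log-probability} for the first frame
    """
    score = {s: log_prior[s] + obs_loglik[0][s] for s in STATES}
    back = []  # back[t][s] = best predecessor of state s at frame t + 1

    for frame in obs_loglik[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda r: score[r] + log_trans[r][s])
            new_score[s] = score[best_prev] + log_trans[best_prev][s] + frame[s]
            pointers[s] = best_prev
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


def decode_visibility(confidences, stay=0.8):
    """Decode visibility from detector confidences strictly between 0 and 1.

    `stay` is an assumed probability of keeping the same state between frames.
    """
    obs = [{"visible": math.log(c), "invisible": math.log(1.0 - c)}
           for c in confidences]
    trans = {a: {b: math.log(stay if a == b else 1.0 - stay) for b in STATES}
             for a in STATES}
    prior = {s: math.log(0.5) for s in STATES}
    return viterbi(obs, trans, prior)
```

Calling `decode_visibility([0.9, 0.8, 0.1, 0.2, 0.9])` labels the two low-confidence frames in the middle as invisible and the rest as visible. The transition term stops the decoded state from switching on every noisy frame, which is the general benefit of reasoning over the whole sequence rather than frame by frame.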

    A perceptual comparison of empirical and predictive region-of-interest video

    When viewing multimedia presentations, a user only attends to a relatively small part of the video display at any one point in time. By shifting allocation of bandwidth from peripheral areas to those locations where a user’s gaze is more likely to rest, attentive displays can be produced. Attentive displays aim to reduce resource requirements while minimizing negative user perception—understood in this paper as not only a user’s ability to assimilate and understand information but also his/her subjective satisfaction with the video content. This paper introduces and discusses a perceptual comparison between two region-of-interest display (RoID) adaptation techniques. A RoID is an attentive display where bandwidth has been preallocated around measured or highly probable areas of user gaze. In this paper, video content was manipulated using two sources of data: empirical measured data (captured using eye-tracking technology) and predictive data (calculated from the physical characteristics of the video data). Results show that display adaptation causes significant variation in users’ understanding of specific multimedia content. Interestingly, RoID adaptation and the type of video being presented both affect user perception of video quality. Moreover, the use of frame rates less than 15 frames per second, for any video adaptation technique, caused a significant reduction in user perceived quality, suggesting that although users are aware of video quality reduction, it does impact level of information assimilation and understanding. Results also highlight that user level of enjoyment is significantly affected by the type of video yet is not as affected by the quality or type of video adaptation—an interesting implication in the field of entertainment