Seeing Tree Structure from Vibration
Humans recognize object structure from both appearance and motion;
often, motion helps to resolve ambiguities in object structure that arise when
we observe object appearance only. There are particular scenarios, however,
where neither appearance nor spatial-temporal motion signals are informative:
occluding twigs may look connected and have almost identical movements, though
they belong to different, possibly disconnected branches. We propose to tackle
this problem through spectrum analysis of motion signals, because vibrations of
disconnected branches, though visually similar, often have distinctive natural
frequencies. We propose a novel formulation of tree structure based on a
physics-based link model, and validate its effectiveness by theoretical
analysis, numerical simulation, and empirical experiments. With this
formulation, we use nonparametric Bayesian inference to reconstruct tree
structure from both spectral vibration signals and appearance cues. Our model
performs well in recognizing hierarchical tree structure from real-world videos
of trees and vessels.
Comment: ECCV 2018. The first two authors contributed equally to this work.
Project page: http://tree.csail.mit.edu
Unobtrusive and pervasive video-based eye-gaze tracking
Eye-gaze tracking has long been considered a desktop technology that finds its use inside the traditional office setting, where the operating conditions may be controlled. Nonetheless, recent advancements in mobile technology and a growing interest in capturing natural human behaviour have motivated an emerging interest in tracking eye movements within unconstrained real-life conditions, referred to as pervasive eye-gaze tracking. This critical review focuses on emerging passive and unobtrusive video-based eye-gaze tracking methods in recent literature, with the aim of identifying the different research avenues that are being followed in response to the challenges of pervasive eye-gaze tracking. Different eye-gaze tracking approaches are discussed in order to bring out their strengths and weaknesses, and to identify any limitations, within the context of pervasive eye-gaze tracking, that have yet to be considered by the computer vision community.
Peer-reviewed
Towards robots reasoning about group behavior of museum visitors: leader detection and group tracking
The final publication is available at IOS Press through http://dx.doi.org/10.3233/AIS-170467
Peer reviewed. Postprint (author's final draft).
Contextual cropping and scaling of TV productions
This is the author's accepted manuscript. The final publication is available at Springer via http://dx.doi.org/10.1007/s11042-011-0804-3. Copyright © Springer Science+Business Media, LLC 2011.
In this paper, an application is presented which automatically adapts SDTV (Standard Definition Television) sports productions to smaller displays through intelligent cropping and scaling. It crops regions of interest of sports productions based on a smart combination of production metadata and systematic video analysis methods. This approach allows a context-based composition of cropped images. It provides a differentiation between the original SD version of the production and the processed one adapted to the requirements for mobile TV. The system has been comprehensively evaluated by comparing the outcome of the proposed method with manually and statically cropped versions, as well as with non-cropped versions. The integration of the tool into post-production and live workflows is envisaged.
Action Recognition in Videos: from Motion Capture Labs to the Web
This paper presents a survey of human action recognition approaches based on
visual data recorded from a single video camera. We propose an organizing
framework that highlights the evolution of the area, with techniques
moving from heavily constrained motion capture scenarios towards more
challenging, realistic, "in the wild" videos. The proposed organization is
based on the representation used as input for the recognition task, emphasizing
the hypotheses assumed and thus the constraints imposed on the type of video
that each technique is able to address. Making these hypotheses and
constraints explicit makes the framework particularly useful for selecting a
method for a given application. Another advantage of the proposed organization
is that it allows the newest approaches to be categorized seamlessly alongside
traditional ones, while
providing an insightful perspective of the evolution of the action recognition
task up to now. That perspective is the basis for the discussion at the end of
the paper, where we also present the main open issues in the area.
Comment: Preprint submitted to CVIU, survey paper, 46 pages, 2 figures, 4 tables