Automatic semantic parsing of the ground-plane in scenarios recorded with multiple moving cameras
Nowadays, video surveillance scenarios usually rely
on manually annotated focus areas to constrain automatic video
analysis tasks. Whereas manual annotation simplifies several
stages of the analysis, its use hinders the scalability of the developed
solutions and might induce operational problems in scenarios
recorded with Multiple and Moving Cameras (MMC). To
tackle these problems, an automatic method for the cooperative
extraction of Areas of Interest (AoIs) is proposed. Each captured
frame is segmented into regions with semantic roles using a state-of-the-art method. Semantic evidence from different time instants,
cameras and points-of-view is then spatio-temporally aligned
on a common ground plane. Experimental results on widely-used
datasets recorded with multiple but static cameras suggest that
this process provides broader and more accurate AoIs than those
manually defined in the datasets. Moreover, the proposed method
naturally determines the projection of obstacles and functional
objects in the scene, paving the way toward systems focused on
the automatic analysis of human behaviour. To our knowledge,
this is the first study addressing this problem, as evidenced
by the lack of publicly available MMC benchmarks. To also cope
with this issue, we provide a new MMC dataset with associated
semantic scene annotations.

This study has been partially supported by the Spanish Government through its TEC2014-53176-R HAVideo project.
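The core alignment step described in the abstract, projecting per-camera semantic evidence onto a shared ground plane and accumulating it across views, can be sketched as follows. This is a minimal illustration under simplifying assumptions (a known camera-to-ground homography per camera and a discrete evidence grid); the function names and the voting scheme are hypothetical, not the paper's actual implementation.

```python
import numpy as np

def project_to_ground(points_px, H):
    """Map pixel coordinates to ground-plane coordinates using a
    3x3 camera-to-ground homography H (assumed known per camera)."""
    pts = np.hstack([points_px, np.ones((len(points_px), 1))])  # homogeneous coords
    mapped = pts @ H.T
    return mapped[:, :2] / mapped[:, 2:3]  # de-homogenize

def accumulate_evidence(grid_shape, ground_pts, votes, grid=None):
    """Accumulate semantic votes (e.g. 'walkable area') from one camera
    into a ground-plane grid shared by all cameras; calling this for
    every camera and frame fuses the multi-view evidence."""
    if grid is None:
        grid = np.zeros(grid_shape)
    for (x, y), v in zip(ground_pts, votes):
        xi, yi = int(round(x)), int(round(y))
        if 0 <= xi < grid_shape[1] and 0 <= yi < grid_shape[0]:
            grid[yi, xi] += v
    return grid

# Toy example: identity homography, so pixels map straight onto the grid.
H = np.eye(3)
pts = np.array([[2.0, 3.0], [5.0, 1.0]])
grid = accumulate_evidence((8, 8), project_to_ground(pts, H), [1.0, 1.0])
```

Cells whose accumulated vote exceeds a threshold would then form the extracted Areas of Interest; a real system would estimate each homography from calibration or scene correspondences rather than assume it.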