7 research outputs found

    Geodesic Distance Histogram Feature for Video Segmentation

    Full text link
    This paper proposes a geodesic-distance-based feature that encodes global information for improved video segmentation algorithms. The feature is a joint histogram of intensity and geodesic distances, where the geodesic distances are computed as the shortest paths between superpixels via their boundaries. We also incorporate adaptive voting weights and spatial pyramid configurations to include spatial information into the geodesic histogram feature and show that this further improves results. The feature is generic and can be used as part of various algorithms. In experiments, we test the geodesic histogram feature by incorporating it into two existing video segmentation frameworks. This leads to significantly better performance in 3D video segmentation benchmarks on two datasets

    Feature-aware uniform tessellations on video manifold for content-sensitive supervoxels

    Get PDF
    Over-segmenting a video into supervoxels has strong potential to reduce the complexity of computer vision applications. Content-sensitive supervoxels (CSS) are typically smaller in content-dense regionsand larger in content-sparse regions. In this paper, we propose to compute feature-aware CSS (FCSS) that are regularly shaped 3D primitive volumes well aligned with local object/region/motion boundaries in video.To compute FCSS, we map a video to a 3-dimensional manifold, in which the volume elements of video manifold give a good measure of the video content density. Then any uniform tessellation on manifold can induce CSS. Our idea is that among all possible uniform tessellations, FCSS find one whose cell boundaries well align with local video boundaries. To achieve this goal, we propose a novel tessellation method that simultaneously minimizes the tessellation energy and maximizes the average boundary distance.Theoretically our method has an optimal competitive ratio O(1). We also present a simple extension of FCSS to streaming FCSS for processing long videos that cannot be loaded into main memory at once. We evaluate FCSS, streaming FCSS and ten representative supervoxel methods on four video datasets and two novel video applications. The results show that our method simultaneously achieves state-of-the-art performance with respect to various evaluation criteria

    Individual and group dynamic behaviour patterns in bound spaces

    Get PDF
    The behaviour analysis of individual and group dynamics in closed spaces is a subject of extensive research in both academia and industry. However, despite recent technological advancements the problem of implementing the existing methods for visual behaviour data analysis in production systems remains difficult and the applications are available only in special cases in which the resourcing is not a problem. Most of the approaches concentrate on direct extraction and classification of the visual features from the video footage for recognising the dynamic behaviour directly from the source. The adoption of such an approach allows recognising directly the elementary actions of moving objects, which is a difficult task on its own. The major factor that impacts the performance of the methods for video analytics is the necessity to combine processing of enormous volume of video data with complex analysis of this data using and computationally resourcedemanding analytical algorithms. This is not feasible for many applications, which must work in real time. In this research, an alternative simulation-based approach for behaviour analysis has been adopted. It can potentially reduce the requirements for extracting information from real video footage for the purpose of the analysis of the dynamic behaviour. This can be achieved by combining only limited data extracted from the original video footage with a symbolic data about the events registered on the scene, which is generated by 3D simulation synchronized with the original footage. Additionally, through incorporating some physical laws and the logics of dynamic behaviour directly in the 3D model of the visual scene, this framework allows to capture the behavioural patterns using simple syntactic pattern recognition methods. The extensive experiments with the prototype implementation prove in a convincing manner that the 3D simulation generates sufficiently rich data to allow analysing the dynamic behaviour in real-time with sufficient adequacy without the need to use precise physical data, using only a limited data about the objects on the scene, their location and dynamic characteristics. This research can have a wide applicability in different areas where the video analytics is necessary, ranging from public safety and video surveillance to marketing research to computer games and animation. Its limitations are linked to the dependence on some preliminary processing of the video footage which is still less detailed and computationally demanding than the methods which use directly the video frames of the original footage
    corecore