2,654 research outputs found

    Segmentation of vectorial image features using shape gradients and information measures

    Get PDF
    International audienceIn this paper, we propose to focus on the segmentation of vectorial features (e.g. vector fields or color intensity) using region-based active contours. We search for a domain that minimizes a criterion based on homogeneity measures of the vectorial features. We choose to evaluate, within each region to be segmented, the average quantity of information carried out by the vectorial features, namely the joint entropy of vector components. We do not make any assumption on the underlying distribution of joint probability density functions of vector components, and so we evaluate the entropy using non parametric probability density functions. A local shape minimizer is then obtained through the evolution of a deformable domain in the direction of the shape gradient. The first contribution of this paper lies in the methodological approach used to differentiate such a criterion. This approach is mainly based on shape optimization tools. The second one is the extension of this method to vectorial data. We apply this segmentation method on color images for the segmentation of color homogeneous regions. We then focus on the segmentation of synthetic vector fields and show interesting results where motion vector fields may be separated using both their length and their direction. Then, optical flow is estimated in real video sequences and segmented using the proposed technique. This leads to promising results for the segmentation of moving video objects

    Region-based representations of image and video: segmentation tools for multimedia services

    Get PDF
    This paper discusses region-based representations of image and video that are useful for multimedia services such as those supported by the MPEG-4 and MPEG-7 standards. Classical tools related to the generation of the region-based representations are discussed. After a description of the main processing steps and the corresponding choices in terms of feature spaces, decision spaces, and decision algorithms, the state of the art in segmentation is reviewed. Mainly tools useful in the context of the MPEG-4 and MPEG-7 standards are discussed. The review is structured around the strategies used by the algorithms (transition based or homogeneity based) and the decision spaces (spatial, spatio-temporal, and temporal). The second part of this paper proposes a partition tree representation of images and introduces a processing strategy that involves a similarity estimation step followed by a partition creation step. This strategy tries to find a compromise between what can be done in a systematic and universal way and what has to be application dependent. It is shown in particular how a single partition tree created with an extremely simple similarity feature can support a large number of segmentation applications: spatial segmentation, motion estimation, region-based coding, semantic object extraction, and region-based retrieval.Peer ReviewedPostprint (published version

    Face Detection And Lip Localization

    Get PDF
    Integration of audio and video signals for automatic speech recognition has become an important field of study. The Audio-Visual Speech Recognition (AVSR) system is known to have accuracy higher than audio-only or visual-only system. The research focused on the visual front end and has been centered around lip segmentation. Experiments performed for lip feature extraction were mainly done in constrained environment with controlled background noise. In this thesis we focus our attention to a database collected in the environment of a moving car which hampered the quality of the imagery. We first introduce the concept of illumination compensation, where we try to reduce the dependency of light from over- or under-exposed images. As a precursor to lip segmentation, we focus on a robust face detection technique which reaches an accuracy of 95%. We have detailed and compared three different face detection techniques and found a successful way of concatenating them in order to increase the overall accuracy. One of the detection techniques used was the object detection algorithm proposed by Viola-Jones. We have experimented with different color spaces using the Viola-Jones algorithm and have reached interesting conclusions. Following face detection we implement a lip localization algorithm based on the vertical gradients of hybrid equations of color. Despite the challenging background and image quality, success rate of 88% was achieved for lip segmentation

    Advanced traffic video analytics for robust traffic accident detection

    Get PDF
    Automatic traffic accident detection is an important task in traffic video analysis due to its key applications in developing intelligent transportation systems. Reducing the time delay between the occurrence of an accident and the dispatch of the first responders to the scene may help lower the mortality rate and save lives. Since 1980, many approaches have been presented for the automatic detection of incidents in traffic videos. In this dissertation, some challenging problems for accident detection in traffic videos are discussed and a new framework is presented in order to automatically detect single-vehicle and intersection traffic accidents in real-time. First, a new foreground detection method is applied in order to detect the moving vehicles and subtract the ever-changing background in the traffic video frames captured by static or non-stationary cameras. For the traffic videos captured during day-time, the cast shadows degrade the performance of the foreground detection and road segmentation. A novel cast shadow detection method is therefore presented to detect and remove the shadows cast by moving vehicles and also the shadows cast by static objects on the road. Second, a new method is presented to detect the region of interest (ROI), which applies the location of the moving vehicles and the initial road samples and extracts the discriminating features to segment the road region. After detecting the ROI, the moving direction of the traffic is estimated based on the rationale that the crashed vehicles often make rapid change of direction. Lastly, single-vehicle traffic accidents and trajectory conflicts are detected using the first-order logic decision-making system. The experimental results using publicly available videos and a dataset provided by the New Jersey Department of Transportation (NJDOT) demonstrate the feasibility of the proposed methods. Additionally, the main challenges and future directions are discussed regarding (i) improving the performance of the foreground segmentation, (ii) reducing the computational complexity, and (iii) detecting other types of traffic accidents

    Semi-automatic video object segmentation for multimedia applications

    Get PDF
    A semi-automatic video object segmentation tool is presented for segmenting both still pictures and image sequences. The approach comprises both automatic segmentation algorithms and manual user interaction. The still image segmentation component is comprised of a conventional spatial segmentation algorithm (Recursive Shortest Spanning Tree (RSST)), a hierarchical segmentation representation method (Binary Partition Tree (BPT)), and user interaction. An initial segmentation partition of homogeneous regions is created using RSST. The BPT technique is then used to merge these regions and hierarchically represent the segmentation in a binary tree. The semantic objects are then manually built by selectively clicking on image regions. A video object-tracking component enables image sequence segmentation, and this subsystem is based on motion estimation, spatial segmentation, object projection, region classification, and user interaction. The motion between the previous frame and the current frame is estimated, and the previous object is then projected onto the current partition. A region classification technique is used to determine which regions in the current partition belong to the projected object. User interaction is allowed for object re-initialisation when the segmentation results become inaccurate. The combination of all these components enables offline video sequence segmentation. The results presented on standard test sequences illustrate the potential use of this system for object-based coding and representation of multimedia

    Prediction of image partitions using Fourier descriptors: application to segmentation-based coding schemes

    Get PDF
    This paper presents a prediction technique for partition sequences. It uses a region-by-region approach that consists of four steps: region parameterization, region prediction, region ordering, and partition creation. The time evolution of each region is divided into two types: regular motion and shape deformation. Both types of evolution are parameterized by means of the Fourier descriptors and they are separately predicted in the Fourier domain. The final predicted partition is built from the ordered combination of the predicted regions, using morphological tools. With this prediction technique, two different applications are addressed in the context of segmentation-based coding approaches. Noncausal partition prediction is applied to partition interpolation, and examples using complete partitions are presented. In turn, causal partition prediction is applied to partition extrapolation for coding purposes, and examples using complete partitions as well as sequences of binary images-shape information in video object planes (VOPs)-are presented.Peer ReviewedPostprint (published version

    Action Recognition in Videos: from Motion Capture Labs to the Web

    Full text link
    This paper presents a survey of human action recognition approaches based on visual data recorded from a single video camera. We propose an organizing framework which puts in evidence the evolution of the area, with techniques moving from heavily constrained motion capture scenarios towards more challenging, realistic, "in the wild" videos. The proposed organization is based on the representation used as input for the recognition task, emphasizing the hypothesis assumed and thus, the constraints imposed on the type of video that each technique is able to address. Expliciting the hypothesis and constraints makes the framework particularly useful to select a method, given an application. Another advantage of the proposed organization is that it allows categorizing newest approaches seamlessly with traditional ones, while providing an insightful perspective of the evolution of the action recognition task up to now. That perspective is the basis for the discussion in the end of the paper, where we also present the main open issues in the area.Comment: Preprint submitted to CVIU, survey paper, 46 pages, 2 figures, 4 table
    corecore