7 research outputs found

    MPEG-7 Description of Generic Video Objects for Scene Reconstruction

    Get PDF
    We present an MPEG-7 compliant description of generic video sequences aimed at their scalable transmission and reconstruction. The proposed method allows efficient and flexible video coding while keeping the advantages of textual descriptions in database applications. Visual objects are described in terms of their shape, color, texture and motion; these features can be extracted automatically and are sufficient for a wide range of applications. To permit partial sequence reconstruction, at least one simple qualitative as well as one quantitative descriptor is provided for each feature. In addition, we propose a structure for organizing the descriptors into objects and scenes, and outline possible applications of our method. Experimental results obtained with news and video surveillance sequences validate our method and highlight its main features.
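    The organization described above, with at least one qualitative and one quantitative descriptor per feature, could be sketched as a simple data structure. This is a minimal illustrative sketch; the field names below are assumptions for illustration, not actual MPEG-7 descriptor names.

```python
# Hypothetical organization of per-feature descriptors into objects and
# scenes, as outlined in the abstract. A decoder could fall back to the
# qualitative descriptor when only partial reconstruction is possible.
from dataclasses import dataclass, field

@dataclass
class FeatureDescription:
    qualitative: str        # e.g. "elongated", "reddish", "moving left"
    quantitative: dict      # e.g. contour points, histogram, motion params

@dataclass
class VideoObject:
    object_id: int
    # one FeatureDescription per feature: shape, color, texture, motion
    features: dict[str, FeatureDescription] = field(default_factory=dict)

@dataclass
class Scene:
    background: str
    objects: list[VideoObject] = field(default_factory=list)
```

For example, a news-sequence object might carry a qualitative color label alongside a quantitative histogram, letting a low-bandwidth client reconstruct the scene from labels alone.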

    On-line adaptive video sequence transmission based on generation and transmission of descriptions

    Full text link
    Proceedings of the 26th Picture Coding Symposium, PCS 2007, Lisbon, Portugal, November 2007. This paper presents a system to transmit the information from a static surveillance camera in an adaptive way, from low to higher bit-rates, based on the on-line generation of descriptions. The proposed system follows a server/client model: the server is placed in the surveillance area and the client on the user side. The server analyzes the video sequence to detect the regions of activity (motion analysis) and generates the corresponding descriptions (mainly MPEG-7 moving regions) together with the textures of the moving regions and the associated background image. Depending on the available bandwidth, different transmission levels are specified, ranging from sending only the generated descriptions to transmitting, in addition, all the associated images corresponding to the moving objects and the background. This work is partially supported by the Cátedra Infoglobal-UAM para Nuevas Tecnologías de Video Aplicadas a la Seguridad. It is also supported by the Ministerio de Ciencia y Tecnología of the Spanish Government under project TIN2004-07860 (MEDUSA) and by the Comunidad de Madrid under project P-TIC-0223-0505 (PROMULTIDIS).
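    The bandwidth-dependent level selection described above might look like the following sketch. The thresholds and payload names are assumptions for illustration; the paper does not specify exact values.

```python
# Hypothetical server-side selection of what to transmit, from descriptions
# only (lowest bit-rate) up to object textures plus the background image.

def select_transmission_level(bandwidth_kbps: float) -> list[str]:
    """Return the parts of the analyzed scene to send for a given bandwidth."""
    payload = ["mpeg7_descriptions"]        # always sent: moving-region metadata
    if bandwidth_kbps >= 64:                # assumed threshold
        payload.append("object_textures")   # textures of the moving regions
    if bandwidth_kbps >= 256:               # assumed threshold
        payload.append("background_image")  # associated background image
    return payload
```

At the lowest level the client reconstructs the scene from descriptions alone; as bandwidth grows, textures and finally the background image are added.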

    Design and Implementation of a Universal Multimedia Access Environment

    Get PDF
    The objective of Universal Multimedia Access (UMA) is to let any user, on any device, access multimedia information. Two approaches are commonly used to address UMA: storing variations of the same content and sending the most appropriate one, or storing the original content and adapting it on the fly. In this project, a UMA environment combining both approaches is proposed. Two tools serve this goal: an Annotation Tool, which describes media using MPEG-7, and a Client-Server Application, which handles the browsing and retrieval of media. After an overview of the designed UMA system, the function of the MPEG-7 annotation tool is explained. In particular, a descriptor list for content annotation is proposed; these descriptors cover both content and media features. The client-server application is then explained, with particular attention to the handling of user preferences and device capabilities. Finally, the UMA environment is tested on a personal computer simulating diverse devices and users. These tests show that the system behaves as expected and that extensions and improvements can be added.
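    The mixed strategy described above, serving a stored variation when one fits the device and adapting the original otherwise, can be sketched as follows. The matching rule and names are illustrative assumptions, not taken from the project.

```python
# Hypothetical sketch of the mixed UMA approach: prefer the best pre-stored
# variation that fits the device; fall back to on-the-fly adaptation.

def serve(variations: dict[int, str], device_width: int, original: str) -> str:
    """variations maps a frame width to a pre-stored file name."""
    fitting = [w for w in variations if w <= device_width]
    if fitting:
        return variations[max(fitting)]          # best pre-stored match
    # no stored variation fits: adapt the original on the fly (stub)
    return f"adapt({original}, {device_width})"
```

A device 480 pixels wide would receive a stored 320-pixel variation if that is the closest fit, while a device narrower than every stored variation would trigger on-the-fly transcoding.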

    MPEG-7 Description of Generic Video Objects for Scene Reconstruction

    No full text

    MPEG-7 description of generic video objects for scene reconstruction

    No full text

    Adaptive video delivery using semantics

    Get PDF
    The diffusion of network appliances such as cellular phones, personal digital assistants and hand-held computers has created the need to personalize the way media content is delivered to the end user. Moreover, recent devices, such as digital radio receivers with graphics displays, and new applications, such as intelligent visual surveillance, require novel forms of video analysis for content adaptation and summarization. To cope with these challenges, we propose an automatic method for the extraction of semantics from video, and we present a framework that exploits these semantics to provide adaptive video delivery. First, an algorithm that relies on motion information to extract multiple semantic video objects is proposed. The algorithm operates in two stages. In the first stage, a statistical change detector segments moving objects from the background; this process is robust to camera noise and needs no manual tuning along a sequence or across different sequences. In the second stage, feedback between an object partition and a region partition is used to track individual objects across frames. These interactions allow us to cope with multiple, deformable objects, occlusions, splitting, appearance and disappearance of objects, and complex motion. Subsequently, semantics are used to prioritize visual data in order to improve the performance of adaptive video delivery. The idea behind this approach is to organize the content so that a particular network or device does not inhibit the main content message. Specifically, we propose two new video adaptation strategies. The first combines semantic analysis with a traditional frame-based video encoder; the resulting background simplifications do not penalize overall quality at low bit-rates. The second uses metadata to efficiently encode the main content message.
The metadata-based representation of the objects' shape and motion suffices to convey the meaning and action of a scene when the objects are familiar. The impact of the different video adaptation strategies is then quantified with subjective experiments: we ask a panel of human observers to rate the quality of adapted video sequences on a normalized scale. From these results, we further derive an objective quality metric, the semantic peak signal-to-noise ratio (SPSNR), that weights different image areas by their relevance to the observer in order to reflect the focus of attention of the human visual system. Finally, we determine the adaptation strategy that provides maximum value for the end user by maximizing the SPSNR for the given client resources at the time of delivery. By combining semantic video analysis and adaptive delivery, the solution presented in this dissertation permits the distribution of video in complex media environments and supports a large variety of content-based applications.
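    A semantically weighted PSNR in the spirit of the SPSNR described above could be sketched as follows. The weighting scheme here is an assumption for illustration; the dissertation defines its own formulation.

```python
# Hypothetical sketch of a semantic PSNR: squared errors in semantically
# relevant regions (e.g. moving objects) receive a larger weight, so
# distortion on the main content message costs more than on the background.
import math

def spsnr(ref, dist, relevance, peak=255.0, object_weight=4.0):
    """ref, dist: flat lists of pixel values; relevance: flat list of
    booleans marking semantically important pixels."""
    num = den = 0.0
    for r, d, important in zip(ref, dist, relevance):
        w = object_weight if important else 1.0
        num += w * (r - d) ** 2
        den += w
    wmse = num / den                      # weighted mean squared error
    return float("inf") if wmse == 0 else 10 * math.log10(peak ** 2 / wmse)
```

With this shape of metric, an adaptation strategy that simplifies the background while preserving the objects scores higher than one that degrades both uniformly.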

    Shadow segmentation and tracking in real-world conditions

    Get PDF
    Visual information, in the form of images and video, comes from the interaction of light with objects, and illumination is a fundamental element of it. Detecting and interpreting illumination effects is part of our everyday visual experience: shading, for instance, allows us to perceive the three-dimensional nature of objects, and shadows are particularly salient cues for inferring depth information. Yet we make no conscious or unconscious effort to avoid shadows as if they were obstacles when we walk around, and when humans are asked to describe a picture, they generally omit illumination effects, such as shadows, shading, and highlights, giving instead a list of objects and their relative positions in the scene. Processing visual information in a way that is close to what the human visual system does, and is thus aware of illumination effects, is a challenging task for computer vision systems. Illumination phenomena in fact interfere with fundamental tasks in image analysis and interpretation, such as object extraction and description. On the other hand, illumination conditions are an important element to consider when creating new and richer visual content that combines objects from different sources, both natural and synthetic; when taken into account, illumination effects can play an important role in achieving realism. Among them, shadows are often an integral part of natural scenes and one of the elements contributing to the naturalness of synthetic scenes. In this thesis, the problem of extracting shadows from digital images is discussed, and a new method for the segmentation of cast shadows in still and moving images without human supervision is proposed.
The problem of separating moving cast shadows from moving objects in image sequences is relevant for an ever wider range of applications, from video analysis to video coding, and from video manipulation to interactive environments. Particular attention has therefore been dedicated to the segmentation of shadows in video, although the validity of the proposed approach is also demonstrated through its application to the detection of cast shadows in still color images. Shadows are a difficult phenomenon to model: their appearance changes with the appearance of the surface they are cast upon. It is therefore important to exploit multiple constraints derived from the spectral, geometric and temporal properties of shadows to develop effective extraction techniques. The proposed method combines an analysis of color information and of photometric invariant features with a spatio-temporal verification process. With regard to the use of color information for shadow analysis, a complete picture of the existing solutions is provided, pointing out their fundamental assumptions, the adopted color models, and the link with research problems such as computational color constancy and color invariance. The proposed spatial verification makes no assumption about scene geometry or object shape. The temporal analysis is based on a novel shadow tracking technique; on the basis of the tracking results, a temporal reliability estimation of shadows is proposed which allows discarding shadows that lack temporal coherence. The approach is general and can be applied to a wide class of applications and input data. The proposed cast shadow segmentation method has been evaluated on a variety of video data representing indoor and outdoor real-world environments.
The results confirm the validity of the approach, in particular its ability to deal with different types of content and its robustness to physically important independent variables, and demonstrate an improvement with respect to the state of the art. Applying the proposed shadow segmentation tool to the enhancement of video object segmentation, tracking and description, and to video composition, demonstrates the advantages of shadow-aware video processing.
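    One common spectral cue in cast-shadow analysis of the kind discussed above is that, under a shadow, a pixel darkens while its normalized rgb chromaticity stays roughly constant. The following sketch illustrates that cue only; the thresholds are assumptions, not the thesis's actual parameters, and the full method additionally uses geometric and temporal verification.

```python
# Hypothetical per-pixel shadow-candidate test: significant luminance drop
# relative to the background model, with near-constant chromaticity.

def is_shadow_candidate(bg, cur, lum_drop=0.3, chroma_tol=0.05):
    """bg, cur: (R, G, B) of the background model and the current frame."""
    lum_bg, lum_cur = sum(bg) / 3.0, sum(cur) / 3.0
    if lum_bg == 0 or lum_cur >= (1.0 - lum_drop) * lum_bg:
        return False                      # not enough darkening for a shadow

    def chroma(p):                        # normalized rgb chromaticity
        s = float(sum(p)) or 1.0
        return tuple(c / s for c in p)

    return all(abs(a - b) <= chroma_tol
               for a, b in zip(chroma(bg), chroma(cur)))
```

A pixel that darkens uniformly (e.g. gray background becoming darker gray) passes the test, while a pixel whose color balance changes, typical of a moving object, is rejected.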