    The aceToolbox: low-level audiovisual feature extraction for retrieval and classification

    In this paper we present an overview of the aceToolbox, a software platform developed within the aceMedia project that provides global and local low-level feature extraction from audio-visual content. The toolbox is based on the MPEG-7 eXperimental Model (XM), with extensions to provide descriptor extraction from arbitrarily shaped image segments, thereby supporting local descriptors reflecting real image content. We describe the architecture of the toolbox and give an overview of the descriptors supported to date. We also briefly describe the segmentation algorithm provided. We then demonstrate the usefulness of the toolbox in two content processing scenarios: similarity-based retrieval in large collections and scene-level classification of still images.

    Optimal packetisation of MPEG-4 using RTP over mobile networks

    The introduction of third-generation wireless networks should make real-time mobile video communications a reality. Delivery of such video is likely to be facilitated by the Real-time Transport Protocol (RTP). Careful packetisation of the video data is necessary to ensure the optimal trade-off between channel utilisation and error robustness. Theoretical analyses of two basic schemes for encapsulating MPEG-4 data within RTP packets are presented. Simulations over a GPRS (General Packet Radio Service) network are used to validate the analysis of the most efficient scheme. Finally, a motion-adaptive system for deriving MPEG-4 video packet sizes is presented. Further simulations demonstrate the benefits of the adaptive system.
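The trade-off between channel utilisation and error robustness mentioned in this abstract can be illustrated with a minimal sketch. It is not the analysis from the paper: it assumes a fixed 40-byte RTP/UDP/IPv4 header (12 + 8 + 20 bytes, no header compression or IP options), independent bit errors, and that any bit error loses the whole packet. All function names are hypothetical.

```python
# Illustrative model only: independent bit errors, no header compression.
RTP_UDP_IPV4_HEADER_BYTES = 12 + 8 + 20  # RTP fixed header + UDP + IPv4


def channel_utilisation(payload_bytes, header_bytes=RTP_UDP_IPV4_HEADER_BYTES):
    """Fraction of transmitted bytes that carry video payload."""
    return payload_bytes / (payload_bytes + header_bytes)


def packet_success_probability(payload_bytes, bit_error_rate,
                               header_bytes=RTP_UDP_IPV4_HEADER_BYTES):
    """Probability that every bit of the packet arrives intact (assumed model)."""
    total_bits = 8 * (payload_bytes + header_bytes)
    return (1.0 - bit_error_rate) ** total_bits


def expected_goodput_fraction(payload_bytes, bit_error_rate,
                              header_bytes=RTP_UDP_IPV4_HEADER_BYTES):
    """Utilisation weighted by the chance that the packet survives."""
    return (channel_utilisation(payload_bytes, header_bytes)
            * packet_success_probability(payload_bytes, bit_error_rate, header_bytes))


def best_payload_size(bit_error_rate, candidates=range(50, 1501, 50)):
    """Candidate payload size (bytes) with the highest expected goodput fraction."""
    return max(candidates,
               key=lambda size: expected_goodput_fraction(size, bit_error_rate))


if __name__ == "__main__":
    for ber in (1e-3, 1e-4, 1e-5):
        print(f"BER {ber:g}: best payload {best_payload_size(ber)} bytes")
```

Under this simplified model, larger packets amortise the header better but are more likely to be hit by an error, so the preferred payload size shrinks as the bit error rate rises.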

    Semantic multimedia remote display for mobile thin clients

    Current remote display technologies for mobile thin clients convert practically all types of graphical content into sequences of images rendered by the client. Consequently, important information concerning the content semantics is lost. The present paper addresses this limitation by developing a semantic multimedia remote display. The principle consists of representing the graphical content as a real-time interactive multimedia scene graph. The underlying architecture features novel components for scene-graph creation and management, as well as for user interactivity handling. The experimental setup considers the Linux X Window System and BiFS/LASeR multimedia scene technologies on the server and client sides, respectively. The implemented solution was benchmarked against currently deployed solutions (VNC and Microsoft RDP), by considering text editing and WWW browsing applications. The quantitative assessments demonstrate: (1) visual quality expressed by seven objective metrics, e.g., PSNR values between 30 and 42 dB or SSIM values larger than 0.9999; (2) downlink bandwidth gain factors ranging from 2 to 60; (3) real-time user event management expressed by network round-trip time reduction by factors of 4-6 and by uplink bandwidth gain factors from 3 to 10; (4) feasible CPU activity, larger than in the RDP case but reduced by a factor of 1.5 with respect to VNC-HEXTILE.
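For readers unfamiliar with the PSNR figures quoted in this abstract, the standard definition is PSNR = 10 log10(MAX^2 / MSE). The sketch below applies it to two flat sequences of pixel values. It assumes 8-bit samples (peak value 255); the function name and the flat-list input format are illustrative choices, not taken from the paper.

```python
import math


def psnr(reference, rendered, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Assumes 8-bit samples by default (max_value=255); inputs are flat
    sequences of numbers, which is an illustrative simplification.
    """
    if len(reference) != len(rendered) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between corresponding samples.
    mse = sum((a - b) ** 2 for a, b in zip(reference, rendered)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    original = [10, 20, 30, 40]
    displayed = [11, 19, 31, 39]  # every sample off by one grey level
    print(f"PSNR: {psnr(original, displayed):.2f} dB")
```

Higher values indicate that the displayed image is closer to the reference.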

    A video object generation tool allowing friendly user interaction

    In this paper we describe an interactive video object segmentation tool developed in the framework of the ACTS-AC098 MOMUSYS project. The Video Object Generator with User Environment (VOGUE) combines three different sets of automatic and semi-automatic tools (spatial segmentation, object tracking and temporal segmentation) with general-purpose tools for user interaction. The result is an integrated environment allowing the user-assisted segmentation of any sort of video sequence in a friendly and efficient manner.

    A Turbo-Detection Aided Serially Concatenated MPEG-4/TCM Videophone Transceiver

    A Turbo-detection aided serially concatenated inner Trellis Coded Modulation (TCM) scheme is combined with four different outer codes, namely a Reversible Variable Length Code (RVLC), a Non-Systematic Convolutional (NSC) code, a Recursive Systematic Convolutional (RSC) code or a Low Density Parity Check (LDPC) code. These four outer constituent codes are comparatively studied in the context of an MPEG-4 videophone transceiver. These serially concatenated schemes are also compared to a stand-alone LDPC coded MPEG-4 videophone system at the same effective overall coding rate. The performance of the proposed schemes is evaluated when communicating over uncorrelated Rayleigh fading channels. It was found that the serially concatenated TCM-NSC scheme was the most attractive one in terms of coding gain and decoding complexity among all the schemes considered in the context of the MPEG-4 videophone transceiver. By contrast, the serially concatenated TCM-RSC scheme was found to attain the highest iteration gain among the schemes considered.