
    A generalized Hausdorff distance based quality metric for point cloud geometry

    Reliable quality assessment of decoded point cloud geometry is essential to evaluate the compression performance of emerging point cloud coding solutions and to guarantee a target quality of experience. This paper proposes a novel point cloud geometry quality assessment metric based on a generalization of the Hausdorff distance. To achieve this goal, the so-called generalized Hausdorff distance for multiple rankings is exploited to identify the best-performing quality metric in terms of correlation with the MOS scores obtained from a subjective test campaign. The experimental results show that the quality metric derived from the classical Hausdorff distance leads to low objective-subjective correlation and thus fails to accurately evaluate the quality of decoded point clouds for emerging codecs. However, the quality metric derived from the generalized Hausdorff distance with an appropriately selected ranking outperforms the MPEG-adopted geometry quality metrics when decoded point clouds with different types of coding distortions are considered.

    Comment: This article is accepted to the 12th International Conference on Quality of Multimedia Experience (QoMEX).
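    The core idea can be sketched as follows: instead of taking the maximum nearest-neighbour distance between two point sets (the classical Hausdorff distance), one takes a lower-ranked order statistic of the sorted nearest-neighbour distances, which discards extreme outliers. A minimal sketch, assuming a quantile-style `rank` parameter; function names and the exact ranking scheme are illustrative, not the paper's notation:

```python
import numpy as np

def generalized_hausdorff(A, B, rank=1.0):
    """One-sided generalized Hausdorff distance from point set A to B.

    rank=1.0 reproduces the classical Hausdorff distance (the maximum
    nearest-neighbour distance); smaller ranks pick a lower quantile of
    the sorted distances, so isolated outlier points are ignored.
    """
    # nearest-neighbour distance from each point of A to the set B
    d = np.sqrt(((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)).min(axis=1)
    d_sorted = np.sort(d)
    k = max(int(np.ceil(rank * len(d_sorted))) - 1, 0)
    return d_sorted[k]

def symmetric_gh(A, B, rank=1.0):
    """Symmetric version: the larger of the two one-sided distances."""
    return max(generalized_hausdorff(A, B, rank),
               generalized_hausdorff(B, A, rank))
```

    With `rank=0.5`, a single stray point no longer dominates the score, which is exactly why a well-chosen ranking can correlate better with subjective quality than the classical maximum.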

    A DSP based H.264/SVC decoder for a multimedia terminal

    In this paper, the implementation of a DSP-based video decoder compliant with the H.264/SVC standard (ISO/IEC 14496-10, Annex G) is presented. A PC-based decoder implementation has been ported to a commercial DSP. Performance optimizations have been carried out, improving the performance of the initial version by about 40% and reaching real time for CIF sequences. Moreover, the performance has been characterized using H.264/SVC sequences with different kinds of scalability and different bitrates. This decoder will be the core of a multimedia terminal that will trade off energy against quality of experience.

    Towards Modelling of Visual Saliency in Point Clouds for Immersive Applications

    Modelling human visual attention is of great importance in the field of computer vision and has been widely explored for 3D imaging. Yet, in the absence of ground-truth data, it is unclear whether such predictions are in alignment with actual human viewing behavior in virtual reality environments. In this study, we work towards solving this problem by conducting an eye-tracking experiment in an immersive 3D scene that offers 6 degrees of freedom. A wide range of static point cloud models is inspected by human subjects, while their gaze is captured in real time. The visual attention information is used to extract fixation density maps that can be further exploited for saliency modelling. To obtain high-quality fixation points, we devise a scheme that utilizes every recorded gaze measurement from the two eye-cameras of our set-up. The obtained fixation density maps, together with the recorded gaze and head trajectories, are made publicly available to enrich visual saliency datasets for 3D models.
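    The abstract does not specify how fixations are turned into density maps; one common approach is to splat each recorded 3-D fixation onto nearby model points with a Gaussian kernel and normalize. A minimal sketch under that assumption (the function name and the kernel width `sigma` are illustrative choices, not the study's pipeline):

```python
import numpy as np

def fixation_density(points, fixations, sigma=0.1):
    """Per-point fixation density for a point cloud.

    points:    (N, 3) model vertices
    fixations: (M, 3) recorded 3-D fixation locations
    Each fixation contributes a Gaussian splat to nearby points; the
    result is normalized to [0, 1]. sigma controls the splat radius.
    """
    # squared distances between every model point and every fixation
    d2 = ((points[:, None, :] - fixations[None, :, :]) ** 2).sum(-1)
    density = np.exp(-d2 / (2.0 * sigma ** 2)).sum(axis=1)
    m = density.max()
    return density / m if m > 0 else density
```

    Points repeatedly fixated across subjects accumulate high density, which is the kind of per-point saliency signal a predictive model would be trained against.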

    Network-on-Multi-Chip (NoMC) with Monitoring and Debugging Support, Journal of Telecommunications and Information Technology, 2011, nr 3

    This paper summarizes recent research on network-on-multi-chip (NoMC) at Poznań University of Technology. The proposed network architecture supports hierarchical addressing and a multicast transmission mode. Such an approach provides new debugging functionality that is hardly attainable with classical hardware testing methodology. Multicast transmission also enables real-time packet monitoring. The introduced features of the NoC network make it possible to build a model of a hardware video codec that utilizes distributed processing on many FPGAs. The final performance of the designed network was assessed using a model of an AVC coder and multi-FPGA platforms. In such a system, the introduced multicast transmission mode yields an overall bandwidth gain of up to 30%. Moreover, synthesis results show that the basic network components, designed in Verilog, are suitable for and easily synthesizable on FPGA devices.

    Wavelet Image Compression for Mobile/Portable Applications


    Depth Image-Based Rendering With Advanced Texture Synthesis for 3-D Video


    Very low bit rate parametric audio coding

    [no abstract]

    Adaptive video delivery using semantics

    The diffusion of network appliances such as cellular phones, personal digital assistants and hand-held computers has created the need to personalize the way media content is delivered to the end user. Moreover, recent devices, such as digital radio receivers with graphics displays, and new applications, such as intelligent visual surveillance, require novel forms of video analysis for content adaptation and summarization. To cope with these challenges, we propose an automatic method for the extraction of semantics from video, and we present a framework that exploits these semantics in order to provide adaptive video delivery. First, an algorithm that relies on motion information to extract multiple semantic video objects is proposed. The algorithm operates in two stages. In the first stage, a statistical change detector produces the segmentation of moving objects from the background. This process is robust with regard to camera noise and does not need manual tuning along a sequence or for different sequences. In the second stage, feedback between an object partition and a region partition is used to track individual objects along the frames. These interactions allow us to cope with multiple, deformable objects, occlusions, splitting, appearance and disappearance of objects, and complex motion. Subsequently, semantics are used to prioritize visual data in order to improve the performance of adaptive video delivery. The idea behind this approach is to organize the content so that a particular network or device does not inhibit the main content message. Specifically, we propose two new video adaptation strategies. The first strategy combines semantic analysis with a traditional frame-based video encoder; background simplifications resulting from this approach do not penalize overall quality at low bitrates. The second strategy uses metadata to efficiently encode the main content message. The metadata-based representation of the objects' shape and motion suffices to convey the meaning and action of a scene when the objects are familiar. The impact of different video adaptation strategies is then quantified with subjective experiments. We ask a panel of human observers to rate the quality of adapted video sequences on a normalized scale. From these results, we further derive an objective quality metric, the semantic peak signal-to-noise ratio (SPSNR), that accounts for different image areas and for their relevance to the observer in order to reflect the focus of attention of the human visual system. Finally, we determine the adaptation strategy that provides maximum value for the end user by maximizing the SPSNR for given client resources at the time of delivery. By combining semantic video analysis and adaptive delivery, the solution presented in this dissertation permits the distribution of video in complex media environments and supports a large variety of content-based applications.
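    The SPSNR described above can be read as a PSNR whose squared errors are weighted per pixel by semantic relevance, so that distortion inside an attended object costs more than the same distortion in the background. A minimal sketch under that assumption; the exact weighting in the dissertation may differ, and the relevance map here is a hypothetical input:

```python
import numpy as np

def spsnr(ref, dist, weights, peak=255.0):
    """Semantically weighted PSNR sketch.

    ref, dist: same-shape image arrays
    weights:   per-pixel relevance map (e.g. high inside semantic
               objects, low in the background) -- an assumed input,
               not the dissertation's exact formulation.
    """
    err = (np.asarray(ref, dtype=np.float64)
           - np.asarray(dist, dtype=np.float64)) ** 2
    # weighted MSE: errors in salient regions count more
    wmse = (weights * err).sum() / weights.sum()
    return 10.0 * np.log10(peak ** 2 / wmse)
```

    Under this formulation, moving a fixed amount of error from a high-weight region to a low-weight one raises the score, which matches the intent of favoring the focus of attention.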