6 research outputs found

    Deep Learning-based Compressed Domain Multimedia for Man and Machine: A Taxonomy and Application to Point Cloud Classification

    In the current golden age of multimedia, human visualization is no longer the single main target, with the final consumer often being a machine which performs some processing or computer vision tasks. In both cases, deep learning plays a fundamental role in extracting features from the multimedia representation data, usually producing a compressed representation referred to as latent representation. The increasing development and adoption of deep learning-based solutions in a wide range of multimedia applications have opened an exciting new vision where a common compressed multimedia representation is used for both man and machine. The main benefits of this vision are two-fold: i) improved performance for the computer vision tasks, since the effects of coding artifacts are mitigated; and ii) reduced computational complexity, since prior decoding is not required. This paper proposes the first taxonomy for designing compressed domain computer vision solutions driven by the architecture and weights compatibility with an available spatio-temporal computer vision processor. The potential of the proposed taxonomy is demonstrated for the specific case of point cloud classification by designing novel compressed domain processors using the JPEG Pleno Point Cloud Coding standard under development and adaptations of the PointGrid classifier. Experimental results show that the designed compressed domain point cloud classification solutions can significantly outperform the spatio-temporal domain classification benchmarks when applied to the decompressed data, containing coding artifacts, and even surpass their performance when applied to the original uncompressed data.

    Constant Size Point Cloud Clustering: a Compact, Non-Overlapping Solution

    Point clouds have recently become a popular 3D representation model for many application domains, notably virtual and augmented reality. Since point cloud data is often very large, processing a point cloud may require that it be segmented into smaller clusters. For example, the input to deep learning-based methods like auto-encoders should be constant size point cloud clusters, which are ideally compact and non-overlapping. However, given the unorganized nature of point clouds, defining the specific data segments to code is not always trivial. This paper proposes a point cloud clustering algorithm which targets five main goals: i) clusters with a constant number of points; ii) compact clusters, i.e. with low dispersion; iii) non-overlapping clusters, i.e. not intersecting each other; iv) ability to scale with the number of points; and v) low complexity. After appropriate initialization, the proposed algorithm transfers points between neighboring clusters as a propagation wave, filling or emptying clusters until they achieve the same size. The proposed algorithm is unique since there is no other point cloud clustering method available in the literature offering the same clustering features for large point clouds at such low complexity.
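
    The idea of equalizing cluster sizes by passing points between neighbors can be illustrated with a deliberately simplified sketch. This is not the paper's algorithm: the function name `equalize_clusters`, the random seeding, the ordering of clusters along the first coordinate, and the single left-to-right sweep are all assumptions made for illustration, and the sketch does not guarantee compact or non-overlapping clusters.

```python
import numpy as np

def equalize_clusters(points, k, seed=0):
    """Toy constant-size clustering (hypothetical helper, not the paper's method).

    Returns one label per point so that every cluster holds len(points) // k points.
    """
    n = len(points)
    if n % k != 0:
        raise ValueError("number of points must be divisible by k")
    target = n // k

    # Assumed initialization: k random points as centers, ordered along the
    # first coordinate so that clusters i and i + 1 act as neighbors in a chain.
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(n, size=k, replace=False)]
    centers = centers[np.argsort(centers[:, 0])]

    # Initial assignment: each point joins its nearest center.
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    labels = np.argmin(dists, axis=1)

    # One sweep along the chain: each cluster is emptied or filled to the
    # target size, pushing the imbalance forward to the clusters after it.
    for i in range(k - 1):
        members = np.flatnonzero(labels == i)
        surplus = len(members) - target
        if surplus > 0:
            # Too many points: hand over those closest to the next center.
            order = np.argsort(dists[members, i + 1])
            labels[members[order[:surplus]]] = i + 1
        elif surplus < 0:
            # Too few points: pull the closest points from later clusters.
            pool = np.flatnonzero(labels > i)
            order = np.argsort(dists[pool, i])
            labels[pool[order[:-surplus]]] = i
    return labels
```

    For example, calling `equalize_clusters(points, 6)` on an array of 120 three-dimensional points yields six clusters of 20 points each. A single sweep suffices here only because the clusters form a simple chain; the abstract describes a more general propagation between neighboring clusters.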

    Coding of Still Pictures

    This document reports the performance results of the first version of the JPEG Pleno Point Cloud Coding Verification Model under Consideration, following the Call for Proposals on JPEG Pleno Point Cloud Coding issued in January 2022 [1].

    ISO/IEC JTC 1/SC 29/WG 1 (ITU-T SG16)

    This document describes the JPEG Pleno Point Cloud Coding [1] Verification Model (VM), consisting of a deep learning (DL)-based joint point cloud (PC) geometry and colour codec [2].

    Coding of Still Pictures

    This document describes a deep learning (DL)-based point cloud (PC) geometry codec and a DL-based PC joint geometry and colour codec, submitted to the Call for Proposals on JPEG Pleno Point Cloud Coding issued in January 2022 [1]. These proposals originated from research developed at Instituto de Telecomunicações (IT), in the context of the project Deep-PCR, entitled "Deep learning-based Point Cloud Representation" (PTDC/EEI-COM/1125/2021), financed by Fundação para a Ciência e Tecnologia (FCT).