84 research outputs found

    Metaverse: A Young Gamer's Perspective

    When developing technologies for the Metaverse, it is important to understand the needs and requirements of end users. Relatively little is known about the perspectives of the youngest audience, children aged ten and under, on the use of the Metaverse. This paper explores the Metaverse from the perspective of a young gamer. It examines their understanding of the Metaverse in relation to the physical world and other technologies they may be familiar with, looks at some of their expectations of the Metaverse, and then relates these to specific multimedia signal processing (MMSP) research challenges. The perspectives presented in the paper may be useful for planning more detailed subjective experiments involving young gamers, as well as for informing research on MMSP technologies targeted at these users.
    Comment: 6 pages, 5 figures, IEEE MMSP 202

    Oxidative stress is reduced in Wistar rats exposed to smoke from tobacco and treated with specific broad-band pulse electromagnetic fields

    There have been a number of attempts to reduce the oxidative radical burden of tobacco. A recently patented pulse electromagnetic technology has been shown to induce differential action of treated versus untreated tobacco products on the production of reactive oxygen species (ROS) in vivo. In a 90-day respiratory toxicity study, Wistar rats were exposed to cigarette smoke from processed and unprocessed tobacco, and biomarkers of oxidative stress were compared with pathohistological analysis of rat lungs. Superoxide dismutase (SOD) activity was decreased in a dose-dependent manner to 81% in rats exposed to smoke from normal cigarettes, compared to rats exposed to treated smoke or the control group. These results correspond to the pathohistological analysis of rat lungs, in which rats exposed to untreated smoke developed initial signs of emphysema, while rats exposed to treated smoke showed no pathology, as in the control group. The promise of inducing an improved health status in humans exposed to smoke from treated cigarettes merits further investigation.

    LCCM-VC: Learned Conditional Coding Modes for Video Compression

    End-to-end learning-based video compression has made steady progress over the last several years. However, unlike learning-based image coding, which has already surpassed its handcrafted counterparts, learning-based video coding still has some way to go. In this paper, we present learned conditional coding modes for video coding (LCCM-VC), a video coding model that achieves state-of-the-art results among learning-based video coding methods. Our model utilizes the conditional coding engines from the recent conditional augmented normalizing flows (CANF) pipeline and introduces additional coding modes to improve compression performance. The compression efficiency is especially good in the high-quality/high-bitrate range, which is important for broadcast and video-on-demand streaming applications. The implementation of LCCM-VC is available at https://github.com/hadihdz/lccm_vc
    Comment: 5 pages, 3 figures, IEEE ICASSP 202
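    The idea of selecting among coding modes can be illustrated with a toy per-block mode decision. This is only a hedged sketch with made-up mode names and a squared-error bit-cost proxy; it is not LCCM-VC's actual mode set or cost function.

    ```python
    # Toy per-block coding-mode selection: "skip" when the prediction already
    # matches the block, otherwise "code" the residual. Mode names and the
    # energy-based cost proxy are illustrative assumptions, not the paper's.
    import numpy as np

    def choose_mode(block, prediction):
        """Return (mode, cost_proxy) for one block given its prediction."""
        residual_energy = float(np.sum((block - prediction) ** 2))
        if residual_energy < 1e-3:
            return ("skip", 0.0)      # prediction suffices; send no residual
        return ("code", residual_energy)

    blk = np.ones((4, 4))
    mode, cost = choose_mode(blk, np.ones((4, 4)))    # perfect prediction
    mode2, cost2 = choose_mode(blk, np.zeros((4, 4))) # poor prediction
    ```

    A real codec would compare true rate-distortion costs per mode, but the control flow, i.e. picking the cheapest mode per block, has the same shape.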

    Adversarial Attacks and Defenses on 3D Point Cloud Classification: A Survey

    As a dominant AI technique, deep learning has successfully solved a wide range of tasks in 2D vision. Recently, deep learning on 3D point clouds has become increasingly popular for addressing various tasks in this field. Despite remarkable achievements, deep learning algorithms are vulnerable to adversarial attacks. These attacks are imperceptible to the human eye but can easily fool deep neural networks in the testing and deployment stage. To encourage future research, this survey summarizes the current progress on adversarial attack and defense techniques for point cloud classification. The paper first introduces the principles and characteristics of adversarial attacks and summarizes and analyzes adversarial example generation methods proposed in recent years. Additionally, it provides an overview of defense strategies, organized into data-focused and model-focused methods. Finally, it presents several current challenges and potential future research directions in this domain.
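    One common family of attacks surveyed in this literature perturbs point coordinates along the sign of the loss gradient, in the spirit of FGSM. The sketch below assumes the gradient has already been obtained from some differentiable classifier; it is a generic illustration, not a specific method from the survey.

    ```python
    # FGSM-style perturbation of point cloud coordinates: shift every point
    # by epsilon along the sign of the classification-loss gradient.
    # `grad` is assumed to come from backpropagation through a classifier.
    import numpy as np

    def fgsm_point_cloud(points, grad, epsilon=0.01):
        """points: (N, 3) xyz coordinates; grad: (N, 3) loss gradient."""
        return points + epsilon * np.sign(grad)

    # Toy example: 4 points at the origin, a made-up gradient.
    pts = np.zeros((4, 3))
    g = np.array([[1.0, -2.0, 0.0]] * 4)
    adv = fgsm_point_cloud(pts, g, epsilon=0.01)  # each point moves by ±0.01
    ```

    The small epsilon keeps the geometric change imperceptible to a human viewer while potentially flipping the network's prediction, which is exactly the threat model the survey describes.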

    Tensor Completion Methods for Collaborative Intelligence

    In the race to bring Artificial Intelligence (AI) to the edge, collaborative intelligence has emerged as a promising way to lighten the computation load on edge devices that run applications based on Deep Neural Networks (DNNs). Typically, a deep model is split at a given layer into edge and cloud sub-models. The deep feature tensor produced by the edge sub-model is transmitted to the cloud, where the remaining computationally intensive workload is performed by the cloud sub-model. The communication channel between the edge and cloud is imperfect, which will result in missing data in the deep feature tensor received at the cloud side, an issue that has mostly been ignored by existing literature on the topic. In this paper we study four methods for recovering missing data in the deep feature tensor. Three of the studied methods are existing, generic tensor completion methods, adapted here to recover deep feature tensor data, while the fourth method is newly developed specifically for deep feature tensor completion. Simulation studies show that the new method is 3–18 times faster than the other three methods, which is an important consideration in collaborative intelligence. For VGG16's sparse tensors, all methods produce statistically equivalent classification results across all loss levels tested. For ResNet34's non-sparse tensors, the new method offers statistically better classification accuracy (by 0.25%–6.30%) compared to other methods at matched execution speeds, and second-best accuracy among the four methods when they are allowed to run until convergence.
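    To make the idea of generic completion concrete, here is a minimal sketch of one classic approach, iterative truncated-SVD (low-rank) imputation, applied to a 2-D slice. This illustrates the generic baseline family only; the paper's new deep-feature-specific method is not reproduced here, and the function name is an assumption.

    ```python
    # Generic low-rank completion by iterative truncated SVD: fill missing
    # entries with a rank-r estimate, keep observed entries fixed, repeat.
    import numpy as np

    def complete_low_rank(X, mask, rank=1, iters=50):
        """Fill entries where mask is False with a rank-`rank` estimate of X."""
        Y = np.where(mask, X, 0.0)  # initialize missing entries with zeros
        for _ in range(iters):
            U, s, Vt = np.linalg.svd(Y, full_matrices=False)
            approx = (U[:, :rank] * s[:rank]) @ Vt[:rank]
            Y = np.where(mask, X, approx)  # keep observed, update missing
        return Y

    # Toy rank-1 matrix [[3, 4], [6, 8]] with the bottom-right entry missing.
    true = np.outer([1.0, 2.0], [3.0, 4.0])
    mask = np.array([[1, 1], [1, 0]], dtype=bool)
    rec = complete_low_rank(true, mask, rank=1)  # recovers the 8 closely
    ```

    Iterative SVD imputation like this is accurate but requires a full decomposition per iteration, which hints at why a faster purpose-built method matters for latency-sensitive collaborative inference.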

    Learned Scalable Video Coding For Humans and Machines

    Video coding has traditionally been developed to support services such as video streaming, videoconferencing, digital TV, and so on. The main intent was to enable human viewing of the encoded content. However, with the advances in deep neural networks (DNNs), encoded video is increasingly being used for automatic video analytics performed by machines. In applications such as automatic traffic monitoring, analytics such as vehicle detection, tracking, and counting would run continuously, while human viewing could be required occasionally to review potential incidents. To support such applications, a new paradigm for video coding is needed that will facilitate efficient representation and compression of video for both machine and human use in a scalable manner. In this manuscript, we introduce the first end-to-end learnable video codec that supports a machine vision task in its base layer, while its enhancement layer supports input reconstruction for human viewing. The proposed system is constructed based on the concept of conditional coding to achieve better compression gains. Comprehensive experimental evaluations conducted on four standard video datasets demonstrate that our framework outperforms both state-of-the-art learned and conventional video codecs in its base layer, while maintaining comparable performance on the human vision task in its enhancement layer. We will provide the implementation of the proposed system at www.github.com upon completion of the review process.
    Comment: 14 pages, 16 figures
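    The base/enhancement split can be illustrated with a deliberately simple numeric sketch: a coarse base representation serves the machine task, and the enhancement layer carries only the information the base lacks, conditioned on the base. All names and the quantizer below are illustrative assumptions, not the paper's architecture.

    ```python
    # Conceptual two-layer scalable coding sketch: the machine task consumes
    # only the base layer; human viewing adds the enhancement residual.
    import numpy as np

    rng = np.random.default_rng(0)
    frame = rng.random((8, 8))

    # "Base layer": a heavily quantized version, sufficient for a toy task.
    base = frame.round(1)        # quantize to one decimal place
    task_input = base            # the machine-vision branch sees only this

    # "Enhancement layer": code the residual conditioned on the base, so
    # only what the base is missing needs to be transmitted.
    residual = frame - base
    reconstruction = base + residual  # human viewing: base + enhancement
    ```

    The key property is that the enhancement residual is small (here bounded by the quantization step), which is what makes the two-layer representation cheaper than coding the task stream and the viewing stream independently.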

    Scalable Human-Machine Point Cloud Compression

    Due to the limited computational capabilities of edge devices, deep learning inference can be quite expensive. One remedy is to compress and transmit point cloud data over the network for server-side processing. Unfortunately, this approach can be sensitive to network factors, including the available bitrate. Luckily, the bitrate requirements can be reduced without sacrificing inference accuracy by using a codec specialized for the machine task. In this paper, we present a scalable codec for point cloud data that is specialized for the machine task of classification, while also providing a mechanism for human viewing. In the proposed scalable codec, the "base" bitstream supports the machine task, and an "enhancement" bitstream may be used for better input reconstruction for human viewing. We base our architecture on PointNet++ and test its efficacy on the ModelNet40 dataset. We show significant improvements over prior non-specialized codecs.
    Comment: 5 pages, 4 figures, 2024 Picture Coding Symposium (PCS)

    Scalable Video Coding for Humans and Machines

    Video content is watched not only by humans, but increasingly also by machines. For example, machine learning models analyze surveillance video for security and traffic monitoring, search through YouTube videos for inappropriate content, and so on. In this paper, we propose a scalable video coding framework that supports machine vision (specifically, object detection) through its base layer bitstream and human vision via its enhancement layer bitstream. The proposed framework includes components from both conventional and Deep Neural Network (DNN)-based video coding. The results show that, on object detection, the proposed framework achieves 13–19% bit savings compared to state-of-the-art video codecs, while remaining competitive in terms of MS-SSIM on the human vision task.
    Comment: 6 pages, 5 figures, IEEE MMSP 202
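    For readers unfamiliar with how such percentage bit savings are reported, the computation is simple arithmetic over bitstream sizes at matched quality. The numbers below are made up for illustration and are not the paper's measurements.

    ```python
    # Percent bit saving from bitstream sizes at matched quality
    # (illustrative numbers only).
    baseline_bits = 1000.0   # reference codec's bitstream size
    proposed_bits = 840.0    # proposed codec's bitstream size
    saving = 100.0 * (baseline_bits - proposed_bits) / baseline_bits
    # saving == 16.0, i.e., a 16% bit saving over the baseline
    ```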