Metaverse: A Young Gamer's Perspective
When developing technologies for the Metaverse, it is important to understand
the needs and requirements of end users. Relatively little is known about the
specific perspectives on the use of the Metaverse by the youngest audience:
children ten and under. This paper explores the Metaverse from the perspective
of a young gamer. It examines their understanding of the Metaverse in relation
to the physical world and other technologies they may be familiar with, looks
at some of their expectations of the Metaverse, and then relates these to the
specific multimedia signal processing (MMSP) research challenges. The
perspectives presented in the paper may be useful for planning more detailed
subjective experiments involving young gamers, as well as informing the
research on MMSP technologies targeted at these users.
Comment: 6 pages, 5 figures, IEEE MMSP 202
Oxidative stress is reduced in Wistar rats exposed to smoke from tobacco and treated with specific broad-band pulse electromagnetic fields
There have been a number of attempts to reduce the oxidative radical burden of tobacco. A recently patented technology, pulse electromagnetic technology, has been shown to induce differential action of treated tobacco products versus untreated products on the production of reactive oxygen species (ROS) in vivo. In a 90-day respiratory toxicity study, Wistar rats were exposed to cigarette smoke from processed and unprocessed tobacco, and biomarkers of oxidative stress were compared with pathohistological analysis of rat lungs. Superoxide dismutase (SOD) activity was decreased in a dose-dependent manner to 81% in rats exposed to smoke from normal cigarettes compared to rats exposed to treated smoke or the control group. These results correspond to the pathohistological analysis of rat lungs, in which rats exposed to untreated smoke developed initial signs of emphysema, while rats exposed to treated smoke showed no pathology, as in the control group. The promise of inducing an improved health status in humans exposed to smoke from treated cigarettes merits further investigation.
LCCM-VC: Learned Conditional Coding Modes for Video Compression
End-to-end learning-based video compression has made steady progress over the
last several years. However, unlike learning-based image coding, which has
already surpassed its handcrafted counterparts, learning-based video coding
still has some ways to go. In this paper we present learned conditional coding
modes for video coding (LCCM-VC), a video coding model that achieves
state-of-the-art results among learning-based video coding methods. Our model
utilizes conditional coding engines from the recent conditional augmented
normalizing flows (CANF) pipeline, and introduces additional coding modes to
improve compression performance. The compression efficiency is especially good
in the high-quality/high-bitrate range, which is important for broadcast and
video-on-demand streaming applications. The implementation of LCCM-VC is
available at https://github.com/hadihdz/lccm_vc
Comment: 5 pages, 3 figures, IEEE ICASSP 202
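To give intuition for why conditional coding (as in the CANF pipeline that LCCM-VC builds on) can beat classic residual coding, here is a minimal numpy sketch on a synthetic scalar "frame". It is not the LCCM-VC model; the signal model, predictor, and Gaussian codelength proxy are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Toy signal model: the current value x is correlated with its temporal
# prediction x_tilde, but the best linear predictor is 0.9 * x_tilde,
# not x_tilde itself.
x_tilde = rng.normal(0.0, 1.0, n)
x = 0.9 * x_tilde + rng.normal(0.0, 0.2, n)

def gaussian_bits(err, sigma):
    # Average ideal Gaussian codelength, -E[log2 N(err; 0, sigma)].
    # This is a differential-entropy proxy (it can be negative); only the
    # difference between the two coding schemes below is meaningful.
    return np.mean(0.5 * np.log2(2 * np.pi * sigma ** 2)
                   + err ** 2 / (2 * sigma ** 2 * np.log(2)))

# Residual coding: transmit r = x - x_tilde with a Gaussian model for r.
r = x - x_tilde
res_bits = gaussian_bits(r - r.mean(), r.std())

# Conditional coding: model x *given* x_tilde (here a least-squares fit
# x ~ a * x_tilde) and transmit the smaller conditional error.
a = (x_tilde @ x) / (x_tilde @ x_tilde)
e = x - a * x_tilde
cond_bits = gaussian_bits(e - e.mean(), e.std())

print(f"residual coding:    {res_bits:+.3f} bits/sample")
print(f"conditional coding: {cond_bits:+.3f} bits/sample")
```

Because conditioning can use the prediction in any way (here, rescaling it) rather than only subtracting it, the conditional error is never worse than the residual, so the conditional codelength is lower.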
Adversarial Attacks and Defenses on 3D Point Cloud Classification: A Survey
As a dominant AI technique, deep learning has successfully solved a wide range
of tasks in 2D vision. Recently, deep learning on 3D point clouds has become
increasingly popular for addressing various 3D vision tasks. Despite
remarkable achievements, deep learning algorithms are vulnerable to adversarial
attacks. These attacks are imperceptible to the human eye but can easily fool
deep neural networks in the testing and deployment stage. To encourage future
research, this survey summarizes the current progress on adversarial attack and
defense techniques on point cloud classification. This paper first introduces
the principles and characteristics of adversarial attacks and summarizes and
analyzes adversarial example generation methods in recent years. Additionally,
it provides an overview of defense strategies, organized into data-focused and
model-focused methods. Finally, it presents several current challenges and
potential future research directions in this domain.
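The core principle of such attacks (a perturbation small per coordinate, yet large enough in aggregate to flip the decision) can be illustrated with a minimal FGSM-style sketch on a toy linear classifier. This is not one of the 3D point-cloud attacks the survey covers; the weights, input, and step size are all hypothetical values chosen for illustration.

```python
import numpy as np

# Toy linear classifier: predicts sign(w @ x + b); labels are in {-1, +1}.
w = np.array([1.0, -2.0, 0.5])   # hypothetical weights
b = 0.1
x = np.array([0.4, -0.3, 1.2])   # a clean input
y = 1.0                          # its true label

def margin(v):
    # Positive margin = correctly classified; negative = misclassified.
    return y * (w @ v + b)

# For a hinge-style loss, the gradient w.r.t. the input is -y * w, so the
# FGSM step moves every coordinate by eps in the sign of that gradient.
eps = 0.6
x_adv = x + eps * np.sign(-y * w)

print(f"clean margin:       {margin(x):+.2f}")
print(f"adversarial margin: {margin(x_adv):+.2f}")
```

Each coordinate moves by only eps, but because the perturbation is aligned with the loss gradient, the margin swings from clearly positive to negative and the prediction flips.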
Tensor Completion Methods for Collaborative Intelligence
In the race to bring Artificial Intelligence (AI) to the edge, collaborative intelligence has emerged as a promising way to lighten the computation load on edge devices that run applications based on Deep Neural Networks (DNNs). Typically, a deep model is split at a given layer into edge and cloud sub-models. The deep feature tensor produced by the edge sub-model is transmitted to the cloud, where the remaining computationally intensive workload is performed by the cloud sub-model. The communication channel between the edge and cloud is imperfect, which will result in missing data in the deep feature tensor received at the cloud side, an issue that has mostly been ignored by existing literature on the topic. In this paper we study four methods for recovering missing data in the deep feature tensor. Three of the studied methods are existing, generic tensor completion methods, and are adapted here to recover deep feature tensor data, while the fourth method is newly developed specifically for deep feature tensor completion. Simulation studies show that the new method is 3–18 times faster than the other three methods, which is an important consideration in collaborative intelligence. For VGG16's sparse tensors, all methods produce statistically equivalent classification results across all loss levels tested. For ResNet34's non-sparse tensors, the new method offers statistically better classification accuracy (by 0.25%–6.30%) compared to other methods for matched execution speeds, and second-best accuracy among the four methods when they are allowed to run until convergence.
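To make the missing-data setting concrete, here is a short numpy sketch of the problem setup: entries of a hypothetical deep feature tensor are lost in transit, and a simple per-channel mean fill (a baseline for illustration, not the paper's completion method) is compared against zero-filling. The tensor shape and loss model are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical deep feature tensor (channels x height x width) produced by
# an edge sub-model; non-negative values, as after a ReLU.
T = np.maximum(rng.normal(0.5, 1.0, (8, 14, 14)), 0.0)

# Simulate an imperfect edge-to-cloud channel: a fraction of entries is lost.
loss_rate = 0.2
mask = rng.random(T.shape) > loss_rate        # True where data arrived
received = np.where(mask, T, np.nan)          # NaN marks missing entries

# Baseline completion: fill each missing entry with the mean of the
# received values in the same channel.
completed = received.copy()
for c in range(T.shape[0]):
    ch = completed[c]
    ch[np.isnan(ch)] = np.nanmean(received[c])

mse_zero_fill = np.mean((np.where(mask, T, 0.0) - T) ** 2)
mse_completed = np.mean((completed - T) ** 2)
print(f"zero fill MSE:    {mse_zero_fill:.4f}")
print(f"mean fill MSE:    {mse_completed:.4f}")
```

Even this crude fill beats zero-filling in MSE, because a channel's mean is a much better guess for a lost activation than zero; proper tensor completion methods exploit the tensor's low-rank structure to do substantially better still.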
Learned Scalable Video Coding For Humans and Machines
Video coding has traditionally been developed to support services such as
video streaming, videoconferencing, digital TV, and so on. The main intent was
to enable human viewing of the encoded content. However, with the advances in
deep neural networks (DNNs), encoded video is increasingly being used for
automatic video analytics performed by machines. In applications such as
automatic traffic monitoring, analytics such as vehicle detection, tracking,
and counting would run continuously, while human viewing could be required
occasionally to review potential incidents. To support such applications, a new
paradigm for video coding is needed that will facilitate efficient
representation and compression of video for both machine and human use in a
scalable manner. In this manuscript, we introduce the first end-to-end
learnable video codec that supports a machine vision task in its base layer,
while its enhancement layer supports input reconstruction for human viewing.
The proposed system is constructed based on the concept of conditional coding
to achieve better compression gains. Comprehensive experimental evaluations
conducted on four standard video datasets demonstrate that our framework
outperforms both state-of-the-art learned and conventional video codecs in its
base layer, while maintaining comparable performance on the human vision task
in its enhancement layer. We will provide the implementation of the proposed
system at www.github.com upon completion of the review process.
Comment: 14 pages, 16 figures
Scalable Human-Machine Point Cloud Compression
Due to the limited computational capabilities of edge devices, deep learning
inference can be quite expensive. One remedy is to compress and transmit point
cloud data over the network for server-side processing. Unfortunately, this
approach can be sensitive to network factors, including available bitrate.
Luckily, the bitrate requirements can be reduced without sacrificing inference
accuracy by using a machine task-specialized codec. In this paper, we present a
scalable codec for point-cloud data that is specialized for the machine task of
classification, while also providing a mechanism for human viewing. In the
proposed scalable codec, the "base" bitstream supports the machine task, and an
"enhancement" bitstream may be used for better input reconstruction performance
for human viewing. We base our architecture on PointNet++, and test its
efficacy on the ModelNet40 dataset. We show significant improvements over prior
non-specialized codecs.
Comment: 5 pages, 4 figures, 2024 Picture Coding Symposium (PCS)
Scalable Video Coding for Humans and Machines
Video content is watched not only by humans, but increasingly also by
machines. For example, machine learning models analyze surveillance video for
security and traffic monitoring, search through YouTube videos for
inappropriate content, and so on. In this paper, we propose a scalable video
coding framework that supports machine vision (specifically, object detection)
through its base layer bitstream and human vision via its enhancement layer
bitstream. The proposed framework includes components from both conventional
and Deep Neural Network (DNN)-based video coding. The results show that on
object detection, the proposed framework achieves 13-19% bit savings compared
to state-of-the-art video codecs, while remaining competitive in terms of
MS-SSIM on the human vision task.
Comment: 6 pages, 5 figures, IEEE MMSP 202