Self-supervised Multi-Modal Video Forgery Attack Detection
Video forgery attacks threaten surveillance systems by replacing the video
captures with unrealistic synthesis, which can be powered by the latest augmented
reality and virtual reality technologies. From the machine perception aspect,
visual objects often have RF signatures that are naturally synchronized with
them during recording. In contrast to video captures, the RF signatures are
more difficult to attack given their concealed and ubiquitous nature. In this
work, we investigate multimodal video forgery attack detection methods using
both vision and wireless modalities. Since wireless signal-based human
perception is environmentally sensitive, we propose a self-supervised training
strategy that enables the system to work without external annotation and thus
adapt to different environments. Our method achieves perfect human detection
accuracy and a high forgery attack detection accuracy of 94.38%, which is
comparable with supervised methods.
MDPose: Human Skeletal Motion Reconstruction Using WiFi Micro-Doppler Signatures
Motion tracking systems based on optical sensors often suffer from issues such
as poor lighting conditions, occlusion, and limited coverage, and may raise
privacy concerns. More recently, radio frequency (RF)-based approaches using
commercial WiFi devices have emerged, offering low-cost ubiquitous sensing
whilst preserving privacy. However, the output of an RF sensing system,
such as Range-Doppler spectrograms, cannot represent human motion intuitively
and usually requires further processing. In this study, MDPose, a novel
framework for human skeletal motion reconstruction based on WiFi micro-Doppler
signatures, is proposed. It provides an effective solution to track human
activities by reconstructing a skeleton model with 17 key points, which can
assist with the interpretation of conventional RF sensing outputs in a more
understandable way. Specifically, MDPose proceeds in incremental stages that
address a series of challenges. First, a denoising algorithm is implemented to
remove unwanted noise that may affect feature extraction and to enhance weak
Doppler signatures. Second, a convolutional neural network
(CNN)-recurrent neural network (RNN) architecture is applied to learn
temporal-spatial dependency from clean micro-Doppler signatures and restore key
points' velocity information. Finally, a pose optimising mechanism is employed
to estimate the initial state of the skeleton and to limit the increase of
error. We have conducted comprehensive tests in a variety of environments using
numerous subjects with a single receiver radar system to demonstrate the
performance of MDPose, and report a 29.4 mm mean absolute error over all key
point positions, which outperforms state-of-the-art RF-based pose estimation
systems.
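The last two steps of the MDPose abstract (recovering key-point velocities, then anchoring them to an estimated initial skeleton) and the reported error metric can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the fixed time step `dt`, simple forward accumulation of velocities, and the reading of "mean absolute error" as an average over all frames, key points, and coordinates are all assumptions made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in millimetres
Pose = List[Point]                  # one Point per key point


def integrate_velocities(initial_pose: Pose, velocities: List[Pose], dt: float) -> List[Pose]:
    """Accumulate per-key-point velocities (mm/s) into poses, starting from initial_pose.

    Hypothetical helper: each new pose is the previous pose plus velocity * dt.
    """
    poses = [list(initial_pose)]
    for frame in velocities:
        prev = poses[-1]
        poses.append([
            (px + vx * dt, py + vy * dt, pz + vz * dt)
            for (px, py, pz), (vx, vy, vz) in zip(prev, frame)
        ])
    return poses


def mean_absolute_error(predicted: List[Pose], reference: List[Pose]) -> float:
    """Average absolute difference over all frames, key points, and coordinates (assumed definition)."""
    total = 0.0
    count = 0
    for pred_pose, ref_pose in zip(predicted, reference):
        for p, r in zip(pred_pose, ref_pose):
            for a, b in zip(p, r):
                total += abs(a - b)
                count += 1
    return total / count
```

Because every pose is built on the previous one, any velocity error carries forward into later frames, which is one plausible reason a mechanism that limits error growth is needed.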
Understanding factors behind IoT privacy -- A user's perspective on RF sensors
While IoT sensors in physical spaces have provided utility and comfort in our
lives, their instrumentation in private and personal spaces has led to growing
concerns regarding privacy. The existing notion behind IoT privacy is that the
sensors whose data can easily be understood and interpreted by humans (such as
cameras) are more privacy-invasive than sensors that are not
human-understandable, such as RF (radio-frequency) sensors. However, given
recent advancements in machine learning, we can not only make sensitive
inferences on RF data but also translate between modalities. Thus, the existing
notions of privacy for IoT sensors need to be revisited. In this paper, our
goal is to understand what factors affect the privacy notions of a non-expert
user (someone who is not well-versed in privacy concepts). To this end, we
conduct an online study of 162 participants from the USA to find out what
factors affect the privacy perception of a user regarding an RF-based device or
a sensor. Our findings show that a user's perception of privacy not only
depends upon the data collected by the sensor but also on the inferences that
can be made on that data, familiarity with the device and its form factor as
well as the control a user has over the device design and its data policies.
When the data collected by the sensor is not human-interpretable, it is the
inferences that can be made on the data and not the data itself that users care
about when making informed decisions regarding device privacy.
MM-Fi: Multi-Modal Non-Intrusive 4D Human Dataset for Versatile Wireless Sensing
4D human perception plays an essential role in a myriad of applications, such
as home automation and metaverse avatar simulation. However, existing solutions,
which mainly rely on cameras and wearable devices, are either privacy-intrusive
or inconvenient to use. To address these issues, wireless sensing has emerged
as a promising alternative, leveraging LiDAR, mmWave radar, and WiFi signals
for device-free human sensing. In this paper, we propose MM-Fi, the first
multi-modal non-intrusive 4D human dataset with 27 daily or rehabilitation
action categories, to bridge the gap between wireless sensing and high-level
human perception tasks. MM-Fi consists of over 320k synchronized frames of five
modalities from 40 human subjects. Various annotations are provided to support
potential sensing tasks, e.g., human pose estimation and action recognition.
Extensive experiments have been conducted to compare the sensing capacity of
single and combined modalities across multiple tasks. We envision that MM-Fi
can contribute to wireless sensing research with respect to action recognition,
human pose estimation, multi-modal learning, cross-modal supervision, and
interdisciplinary healthcare research.
Comment: The paper has been accepted by NeurIPS 2023 Datasets and Benchmarks
Track. Project page: https://ntu-aiot-lab.github.io/mm-f
MUSE-Fi: Contactless MUti-person SEnsing Exploiting Near-field Wi-Fi Channel Variation
Having been studied for more than a decade, Wi-Fi human sensing still faces a
major challenge in the presence of multiple persons, simply because the limited
bandwidth of Wi-Fi fails to provide a sufficient range resolution to physically
separate multiple subjects. Existing solutions mostly avoid this challenge by
switching to radars with GHz bandwidth, at the cost of cumbersome deployments.
Therefore, whether Wi-Fi human sensing can handle multiple subjects remains an open
question. This paper presents MUSE-Fi, the first Wi-Fi multi-person sensing
system with physical separability. The principle behind MUSE-Fi is that, given
a Wi-Fi device (e.g., smartphone) very close to a subject, the near-field
channel variation caused by the subject significantly overwhelms variations
caused by other distant subjects. Consequently, focusing on the channel state
information (CSI) carried by the traffic in and out of this device naturally
allows for physically separating multiple subjects. Based on this principle, we
propose three sensing strategies for MUSE-Fi: i) uplink CSI, ii) downlink CSI,
and iii) downlink beamforming feedback, where we specifically tackle signal
recovery from sparse (per-user) traffic under realistic multi-user
communication scenarios. Our extensive evaluations clearly demonstrate that
MUSE-Fi is able to successfully handle multi-person sensing with respect to
three typical applications: respiration monitoring, gesture detection, and
activity recognition.
Comment: 15 pages. Accepted by ACM MobiCom 202
Winect: 3D Human Pose Tracking for Free-form Activity Using Commodity WiFi
WiFi human sensing has become increasingly attractive in enabling emerging human-computer interaction applications. The corresponding technique has gradually evolved from the classification of multiple activity types to more fine-grained tracking of 3D human poses. However, existing WiFi-based 3D human pose tracking is limited to a set of predefined activities. In this work, we present Winect, a 3D human pose tracking system for free-form activity using commodity WiFi devices. Our system tracks free-form activity by estimating a 3D skeleton pose that consists of a set of joints of the human body. In particular, we combine signal separation and joint movement modeling to achieve free-form activity tracking. Our system first identifies the moving limbs by leveraging the two-dimensional angle of arrival of the signals reflected off the human body and separates the entangled signals for each limb. Then, it tracks each limb and constructs a 3D skeleton of the body by modeling the inherent relationship between the movements of the limb and the corresponding joints. Our evaluation results show that Winect is environment-independent and achieves centimeter-level accuracy for free-form activity tracking under various challenging environments, including non-line-of-sight (NLoS) scenarios.