18,053 research outputs found
Security and Privacy Problems in Voice Assistant Applications: A Survey
Voice assistant applications have become omniscient nowadays. Two models that
provide the two most important functions for real-life applications (i.e.,
Google Home, Amazon Alexa, Siri, etc.) are Automatic Speech Recognition (ASR)
models and Speaker Identification (SI) models. According to recent studies,
security and privacy threats have also emerged with the rapid development of
the Internet of Things (IoT). The security issues researched include attack
techniques toward machine learning models and other hardware components widely
used in voice assistant applications. The privacy issues include technical-wise
information stealing and policy-wise privacy breaches. The voice assistant
application takes a steadily growing market share every year, but their privacy
and security issues never stopped causing huge economic losses and endangering
users' personal sensitive information. Thus, it is important to have a
comprehensive survey to outline the categorization of the current research
regarding the security and privacy problems of voice assistant applications.
This paper concludes and assesses five kinds of security attacks and three
types of privacy threats in the papers published in the top-tier conferences of
cyber security and voice domain.Comment: 5 figure
The Metaverse: Survey, Trends, Novel Pipeline Ecosystem & Future Directions
The Metaverse offers a second world beyond reality, where boundaries are
non-existent, and possibilities are endless through engagement and immersive
experiences using the virtual reality (VR) technology. Many disciplines can
benefit from the advancement of the Metaverse when accurately developed,
including the fields of technology, gaming, education, art, and culture.
Nevertheless, developing the Metaverse environment to its full potential is an
ambiguous task that needs proper guidance and directions. Existing surveys on
the Metaverse focus only on a specific aspect and discipline of the Metaverse
and lack a holistic view of the entire process. To this end, a more holistic,
multi-disciplinary, in-depth, and academic and industry-oriented review is
required to provide a thorough study of the Metaverse development pipeline. To
address these issues, we present in this survey a novel multi-layered pipeline
ecosystem composed of (1) the Metaverse computing, networking, communications
and hardware infrastructure, (2) environment digitization, and (3) user
interactions. For every layer, we discuss the components that detail the steps
of its development. Also, for each of these components, we examine the impact
of a set of enabling technologies and empowering domains (e.g., Artificial
Intelligence, Security & Privacy, Blockchain, Business, Ethics, and Social) on
its advancement. In addition, we explain the importance of these technologies
to support decentralization, interoperability, user experiences, interactions,
and monetization. Our presented study highlights the existing challenges for
each component, followed by research directions and potential solutions. To the
best of our knowledge, this survey is the most comprehensive and allows users,
scholars, and entrepreneurs to get an in-depth understanding of the Metaverse
ecosystem to find their opportunities and potentials for contribution
Joint Activity Detection, Channel Estimation, and Data Decoding for Grant-free Massive Random Access
In the massive machine-type communication (mMTC) scenario, a large number of
devices with sporadic traffic need to access the network on limited radio
resources. While grant-free random access has emerged as a promising mechanism
for massive access, its potential has not been fully unleashed. In particular,
the common sparsity pattern in the received pilot and data signal has been
ignored in most existing studies, and auxiliary information of channel decoding
has not been utilized for user activity detection. This paper endeavors to
develop advanced receivers in a holistic manner for joint activity detection,
channel estimation, and data decoding. In particular, a turbo receiver based on
the bilinear generalized approximate message passing (BiG-AMP) algorithm is
developed. In this receiver, all the received symbols will be utilized to
jointly estimate the channel state, user activity, and soft data symbols, which
effectively exploits the common sparsity pattern. Meanwhile, the extrinsic
information from the channel decoder will assist the joint channel estimation
and data detection. To reduce the complexity, a low-cost side information-aided
receiver is also proposed, where the channel decoder provides side information
to update the estimates on whether a user is active or not. Simulation results
show that the turbo receiver is able to reduce the activity detection, channel
estimation, and data decoding errors effectively, while the side
information-aided receiver notably outperforms the conventional method with a
relatively low complexity
Conditional Adapters: Parameter-efficient Transfer Learning with Fast Inference
We propose Conditional Adapter (CoDA), a parameter-efficient transfer
learning method that also improves inference efficiency. CoDA generalizes
beyond standard adapter approaches to enable a new way of balancing speed and
accuracy using conditional computation. Starting with an existing dense
pretrained model, CoDA adds sparse activation together with a small number of
new parameters and a light-weight training phase. Our experiments demonstrate
that the CoDA approach provides an unexpectedly efficient way to transfer
knowledge. Across a variety of language, vision, and speech tasks, CoDA
achieves a 2x to 8x inference speed-up compared to the state-of-the-art Adapter
approach with moderate to no accuracy loss and the same parameter efficiency
Manipulating Federated Recommender Systems: Poisoning with Synthetic Users and Its Countermeasures
Federated Recommender Systems (FedRecs) are considered privacy-preserving
techniques to collaboratively learn a recommendation model without sharing user
data. Since all participants can directly influence the systems by uploading
gradients, FedRecs are vulnerable to poisoning attacks of malicious clients.
However, most existing poisoning attacks on FedRecs are either based on some
prior knowledge or with less effectiveness. To reveal the real vulnerability of
FedRecs, in this paper, we present a new poisoning attack method to manipulate
target items' ranks and exposure rates effectively in the top-
recommendation without relying on any prior knowledge. Specifically, our attack
manipulates target items' exposure rate by a group of synthetic malicious users
who upload poisoned gradients considering target items' alternative products.
We conduct extensive experiments with two widely used FedRecs (Fed-NCF and
Fed-LightGCN) on two real-world recommendation datasets. The experimental
results show that our attack can significantly improve the exposure rate of
unpopular target items with extremely fewer malicious users and fewer global
epochs than state-of-the-art attacks. In addition to disclosing the security
hole, we design a novel countermeasure for poisoning attacks on FedRecs.
Specifically, we propose a hierarchical gradient clipping with sparsified
updating to defend against existing poisoning attacks. The empirical results
demonstrate that the proposed defending mechanism improves the robustness of
FedRecs.Comment: This paper has been accepted by SIGIR202
NF-Atlas: Multi-Volume Neural Feature Fields for Large Scale LiDAR Mapping
LiDAR Mapping has been a long-standing problem in robotics. Recent progress
in neural implicit representation has brought new opportunities to robotic
mapping. In this paper, we propose the multi-volume neural feature fields,
called NF-Atlas, which bridge the neural feature volumes with pose graph
optimization. By regarding the neural feature volume as pose graph nodes and
the relative pose between volumes as pose graph edges, the entire neural
feature field becomes both locally rigid and globally elastic. Locally, the
neural feature volume employs a sparse feature Octree and a small MLP to encode
the submap SDF with an option of semantics. Learning the map using this
structure allows for end-to-end solving of maximum a posteriori (MAP) based
probabilistic mapping. Globally, the map is built volume by volume
independently, avoiding catastrophic forgetting when mapping incrementally.
Furthermore, when a loop closure occurs, with the elastic pose graph based
representation, only updating the origin of neural volumes is required without
remapping. Finally, these functionalities of NF-Atlas are validated. Thanks to
the sparsity and the optimization based formulation, NF-Atlas shows competitive
performance in terms of accuracy, efficiency and memory usage on both
simulation and real-world datasets
ADS_UNet: A Nested UNet for Histopathology Image Segmentation
The UNet model consists of fully convolutional network (FCN) layers arranged
as contracting encoder and upsampling decoder maps. Nested arrangements of
these encoder and decoder maps give rise to extensions of the UNet model, such
as UNete and UNet++. Other refinements include constraining the outputs of the
convolutional layers to discriminate between segment labels when trained end to
end, a property called deep supervision. This reduces feature diversity in
these nested UNet models despite their large parameter space. Furthermore, for
texture segmentation, pixel correlations at multiple scales contribute to the
classification task; hence, explicit deep supervision of shallower layers is
likely to enhance performance. In this paper, we propose ADS UNet, a stage-wise
additive training algorithm that incorporates resource-efficient deep
supervision in shallower layers and takes performance-weighted combinations of
the sub-UNets to create the segmentation model. We provide empirical evidence
on three histopathology datasets to support the claim that the proposed ADS
UNet reduces correlations between constituent features and improves performance
while being more resource efficient. We demonstrate that ADS_UNet outperforms
state-of-the-art Transformer-based models by 1.08 and 0.6 points on CRAG and
BCSS datasets, and yet requires only 37% of GPU consumption and 34% of training
time as that required by Transformers.Comment: To be published in Expert Systems With Application
In-situ crack and keyhole pore detection in laser directed energy deposition through acoustic signal and deep learning
Cracks and keyhole pores are detrimental defects in alloys produced by laser
directed energy deposition (LDED). Laser-material interaction sound may hold
information about underlying complex physical events such as crack propagation
and pores formation. However, due to the noisy environment and intricate signal
content, acoustic-based monitoring in LDED has received little attention. This
paper proposes a novel acoustic-based in-situ defect detection strategy in
LDED. The key contribution of this study is to develop an in-situ acoustic
signal denoising, feature extraction, and sound classification pipeline that
incorporates convolutional neural networks (CNN) for online defect prediction.
Microscope images are used to identify locations of the cracks and keyhole
pores within a part. The defect locations are spatiotemporally registered with
acoustic signal. Various acoustic features corresponding to defect-free
regions, cracks, and keyhole pores are extracted and analysed in time-domain,
frequency-domain, and time-frequency representations. The CNN model is trained
to predict defect occurrences using the Mel-Frequency Cepstral Coefficients
(MFCCs) of the lasermaterial interaction sound. The CNN model is compared to
various classic machine learning models trained on the denoised acoustic
dataset and raw acoustic dataset. The validation results shows that the CNN
model trained on the denoised dataset outperforms others with the highest
overall accuracy (89%), keyhole pore prediction accuracy (93%), and AUC-ROC
score (98%). Furthermore, the trained CNN model can be deployed into an
in-house developed software platform for online quality monitoring. The
proposed strategy is the first study to use acoustic signals with deep learning
for insitu defect detection in LDED process.Comment: 36 Pages, 16 Figures, accepted at journal Additive Manufacturin
Advancing Model Pruning via Bi-level Optimization
The deployment constraints in practical applications necessitate the pruning
of large-scale deep learning models, i.e., promoting their weight sparsity. As
illustrated by the Lottery Ticket Hypothesis (LTH), pruning also has the
potential of improving their generalization ability. At the core of LTH,
iterative magnitude pruning (IMP) is the predominant pruning method to
successfully find 'winning tickets'. Yet, the computation cost of IMP grows
prohibitively as the targeted pruning ratio increases. To reduce the
computation overhead, various efficient 'one-shot' pruning methods have been
developed, but these schemes are usually unable to find winning tickets as good
as IMP. This raises the question of how to close the gap between pruning
accuracy and pruning efficiency? To tackle it, we pursue the algorithmic
advancement of model pruning. Specifically, we formulate the pruning problem
from a fresh and novel viewpoint, bi-level optimization (BLO). We show that the
BLO interpretation provides a technically-grounded optimization base for an
efficient implementation of the pruning-retraining learning paradigm used in
IMP. We also show that the proposed bi-level optimization-oriented pruning
method (termed BiP) is a special class of BLO problems with a bi-linear problem
structure. By leveraging such bi-linearity, we theoretically show that BiP can
be solved as easily as first-order optimization, thus inheriting the
computation efficiency. Through extensive experiments on both structured and
unstructured pruning with 5 model architectures and 4 data sets, we demonstrate
that BiP can find better winning tickets than IMP in most cases, and is
computationally as efficient as the one-shot pruning schemes, demonstrating 2-7
times speedup over IMP for the same level of model accuracy and sparsity.Comment: Thirty-sixth Conference on Neural Information Processing Systems
(NeurIPS 2022
Reinforcement Learning-based User-centric Handover Decision-making in 5G Vehicular Networks
The advancement of 5G technologies and Vehicular Networks open a new paradigm for Intelligent Transportation Systems (ITS) in safety and infotainment services in urban and highway scenarios. Connected vehicles are vital for enabling massive data sharing and supporting such services. Consequently, a stable connection is compulsory to transmit data across the network successfully. The new 5G technology introduces more bandwidth, stability, and reliability, but it faces a low communication range, suffering from more frequent handovers and connection drops. The shift from the base station-centric view to the user-centric view helps to cope with the smaller communication range and ultra-density of 5G networks. In this thesis, we propose a series of strategies to improve connection stability through efficient handover decision-making. First, a modified probabilistic approach, M-FiVH, aimed at reducing 5G handovers and enhancing network stability. Later, an adaptive learning approach employed Connectivity-oriented SARSA Reinforcement Learning (CO-SRL) for user-centric Virtual Cell (VC) management to enable efficient handover (HO) decisions. Following that, a user-centric Factor-distinct SARSA Reinforcement Learning (FD-SRL) approach combines time series data-oriented LSTM and adaptive SRL for VC and HO management by considering both historical and real-time data. The random direction of vehicular movement, high mobility, network load, uncertain road traffic situation, and signal strength from cellular transmission towers vary from time to time and cannot always be predicted. Our proposed approaches maintain stable connections by reducing the number of HOs by selecting the appropriate size of VCs and HO management. A series of improvements demonstrated through realistic simulations showed that M-FiVH, CO-SRL, and FD-SRL were successful in reducing the number of HOs and the average cumulative HO time. We provide an analysis and comparison of several approaches and demonstrate our proposed approaches perform better in terms of network connectivity
- …