
    Adversarial Attacks and Defenses in Machine Learning-Powered Networks: A Contemporary Survey

    Adversarial attacks and defenses in machine learning and deep neural networks have been gaining significant attention due to the rapidly growing applications of deep learning on the Internet and in related scenarios. This survey provides a comprehensive overview of recent advancements in adversarial attack and defense techniques, with a focus on deep neural network-based classification models. Specifically, we conduct a comprehensive classification of recent adversarial attack methods and state-of-the-art adversarial defense techniques based on attack principles, and present them in visually appealing tables and tree diagrams. This is based on a rigorous evaluation of the existing works, including an analysis of their strengths and limitations. We also categorize the methods into counter-attack detection and robustness enhancement, with a specific focus on regularization-based methods for enhancing robustness. New avenues of attack are also explored, including search-based, decision-based, drop-based, and physical-world attacks, and a hierarchical classification of the latest defense methods is provided, highlighting the challenges of balancing training costs with performance, maintaining clean accuracy, overcoming the effect of gradient masking, and ensuring method transferability. Finally, the lessons learned and open challenges are summarized, with future research opportunities recommended. Comment: 46 pages, 21 figures
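    The fast gradient sign method (FGSM) is one of the canonical gradient-based attacks such surveys cover. Below is a minimal pure-Python sketch on a logistic-regression model; the model weights, input, and epsilon are illustrative choices, not taken from the survey. The input is nudged by eps in the direction of the sign of the loss gradient with respect to the input.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logistic_loss(x, y, w, b):
    """Binary cross-entropy of a linear-logistic model on one example."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def fgsm(x, y, w, b, eps):
    """Fast Gradient Sign Method: perturb x by eps in the direction
    of the sign of the loss gradient with respect to the input."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    grad_x = [(p - y) * wi for wi in w]  # analytic dL/dx for logistic loss
    sign = lambda g: (g > 0) - (g < 0)
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad_x)]

# Toy example: a correctly classified point becomes harder to classify.
w, b = [2.0, -1.0], 0.0
x, y = [1.0, 0.5], 1  # true label 1; the model agrees before the attack
x_adv = fgsm(x, y, w, b, eps=0.3)
assert logistic_loss(x_adv, y, w, b) > logistic_loss(x, y, w, b)
```

    The same sign-of-gradient step, applied iteratively with projection, yields the stronger PGD-style attacks the survey also classifies.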

    RIDE: Rewarding Impact-Driven Exploration for Procedurally-Generated Environments

    Exploration in sparse reward environments remains one of the key challenges of model-free reinforcement learning. Instead of solely relying on extrinsic rewards provided by the environment, many state-of-the-art methods use intrinsic rewards to encourage exploration. However, we show that existing methods fall short in procedurally-generated environments, where an agent is unlikely to visit a state more than once. We propose a novel type of intrinsic reward which encourages the agent to take actions that lead to significant changes in its learned state representation. We evaluate our method on multiple challenging procedurally-generated tasks in MiniGrid, as well as on tasks with high-dimensional observations used in prior work. Our experiments demonstrate that this approach is more sample efficient than existing exploration methods, particularly for procedurally-generated MiniGrid environments. Furthermore, we analyze the learned behavior as well as the intrinsic reward received by our agent. In contrast to previous approaches, our intrinsic reward does not diminish during the course of training, and it rewards the agent substantially more for interacting with objects that it can control.
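    The impact-driven bonus described above can be sketched as follows. In the paper the state embedding is learned via forward and inverse dynamics models and the count is reset each episode; here the embeddings, state keys, and numbers are supplied directly for illustration.

```python
import math
from collections import Counter

def ride_reward(emb_t, emb_next, episodic_counts, state_key):
    """Impact-driven intrinsic reward: the change in the learned state
    embedding, discounted by the episodic visit count of the new state."""
    impact = math.sqrt(sum((a - b) ** 2 for a, b in zip(emb_next, emb_t)))
    episodic_counts[state_key] += 1
    return impact / math.sqrt(episodic_counts[state_key])

counts = Counter()  # reset at the start of every episode
# A transition that changes the representation a lot earns a high bonus...
r1 = ride_reward([0.0, 0.0], [3.0, 4.0], counts, state_key="s1")
# ...but revisiting the same state within the episode shrinks it.
r2 = ride_reward([0.0, 0.0], [3.0, 4.0], counts, state_key="s1")
assert r1 == 5.0 and r2 < r1
```

    The episodic discount prevents the agent from bouncing between two high-impact states, while the impact term itself does not decay over training.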

    Learning about the learning process: from active querying to fine-tuning

    The majority of academic machine learning research addresses the core model-fitting part of the machine learning workflow. However, prior to model fitting, data collection and annotation is an important step; and subsequent to it, knowledge transfer to different but related problems is also important. Recently, the core model-fitting step in this workflow has been upgraded using learning-to-learn methodologies, where learning algorithms are applied to improve the fitting algorithm itself in terms of computation or data efficiency. However, algorithms for data collection and knowledge transfer are still commonly hand-engineered. In this doctoral thesis, we upgrade the pre- and post-processing steps of the machine learning pipeline with the learning-to-learn paradigm. We first present novel learning-to-learn approaches that improve the algorithms for the pre-processing step in terms of label efficiency. The inefficiency of data annotation is a common issue in the field: to fit the desired model, a large amount of data is usually collected and annotated, much of which is useless. Active learning aims to address this by selecting the most suitable data for annotation. Since conventional active learning algorithms are hand-engineered and heuristically designed for a specific problem, they typically cannot be adapted across, or even within, datasets. The data efficiency of active learning can be improved either by learning active learning online within a specific problem, or by transferring active-learning knowledge between related problems. We begin by investigating a framework for learning active learning online, which learns to select the best criteria for a particular dataset as queries are made. It enables online adaptation as the state of the model and the dataset changes, while guaranteeing performance. Subsequently, we upgrade this framework to a data-driven learning-based approach by learning a transferable active-learning policy end-to-end. The framework is thus capable of directly optimising the accuracy of the underlying classifier, and can adapt to the statistics of any given dataset. More importantly, the learned active-learning policy is domain agnostic and generalises to new learning problems. We next turn to knowledge transfer from a well-learned problem to a novel target problem. We develop a new learning-to-learn technique to improve the effectiveness and efficiency of fine-tuning-based transfer learning. Conventional transfer learning approaches are heuristic: most commonly, small-learning-rate stochastic gradient descent is run starting from the source model, with the architecture kept constant. However, the typical transfer learning pipeline transfers knowledge from a general model or dataset to a more specific one. Thus, we propose a transfer learning algorithm for neural networks that simultaneously prunes the target network's architecture and updates its weights. This enables the model complexity to be reduced as training iterations increase, and both efficiency and efficacy are improved compared to conventional fine-tuning knowledge transfer.
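    As an example of the kind of hand-engineered criterion such a learned framework would choose among or replace, here is a minimal entropy-based uncertainty-sampling query; the unlabeled pool and its predicted probabilities are invented for illustration.

```python
import math

def entropy(probs):
    """Shannon entropy of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def uncertainty_query(pool_probs):
    """Hand-engineered active-learning criterion: query the unlabeled
    example whose predicted class distribution has maximum entropy."""
    return max(range(len(pool_probs)), key=lambda i: entropy(pool_probs[i]))

# Three unlabeled examples with model-predicted class probabilities.
pool = [[0.9, 0.1], [0.5, 0.5], [0.7, 0.3]]
assert uncertainty_query(pool) == 1  # the 50/50 example is most uncertain
```

    A learned active-learning policy, as in the thesis, would select among (or generalise beyond) criteria like this one based on the dataset's statistics rather than apply it unconditionally.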

    Deep Learning Techniques for Electroencephalography Analysis

    In this thesis we design deep learning techniques for training deep neural networks on electroencephalography (EEG) data, focusing on two problems, namely EEG-based motor imagery decoding and EEG-based affect recognition, and addressing the challenges associated with them. Regarding the problem of motor imagery (MI) decoding, we first consider the various kinds of domain shifts in the EEG signals caused by inter-individual differences (e.g. brain anatomy, personality and cognitive profile). These domain shifts render multi-subject training a challenging task and impede robust cross-subject generalization. We build a two-stage model ensemble architecture and propose two objectives to train it, combining the strengths of curriculum learning and collaborative training. Our subject-independent experiments on the large Physionet and OpenBMI datasets verify the effectiveness of our approach. Next, we explore the utilization of the spatial covariance of EEG signals through alignment techniques, with the goal of learning domain-invariant representations. We introduce a Riemannian framework that concurrently performs covariance-based signal alignment and data augmentation while training a convolutional neural network (CNN) on EEG time-series. Experiments on the BCI IV-2a dataset show that our method outperforms traditional alignment by inducing regularization on the weights of the CNN. We also study the problem of EEG-based affect recognition, inspired by works suggesting that emotions can be expressed in relative terms, i.e. through ordinal comparisons between different affective state levels. We propose treating data samples in a pairwise manner to infer the ordinal relation between their corresponding affective state labels, as an auxiliary training objective. We incorporate our objective in a deep network architecture which we jointly train on the tasks of sample-wise classification and pairwise ordinal ranking. We evaluate our method on the affective datasets of DEAP and SEED and obtain performance improvements over deep networks trained without the additional ranking objective.
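    The pairwise ordinal objective can be illustrated with a standard margin ranking loss; the scores, margin, and pair ordering below are illustrative toy values, not the thesis's exact formulation.

```python
def margin_ranking_loss(score_a, score_b, order, margin=1.0):
    """Pairwise ordinal objective: if sample A's affective level exceeds
    B's (order=+1), A's score should beat B's by at least `margin`;
    order=-1 encodes the opposite relation."""
    return max(0.0, -order * (score_a - score_b) + margin)

# Correctly ordered pair with a comfortable margin: zero loss.
assert margin_ranking_loss(2.5, 0.5, order=+1) == 0.0
# Mis-ordered pair: the loss grows with the size of the violation.
assert margin_ranking_loss(0.5, 2.5, order=+1) == 3.0
```

    Summed over sampled pairs and added to the sample-wise classification loss, a term like this lets the network learn from relative affect comparisons as well as absolute labels.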

    Federated Domain Generalization: A Survey

    Machine learning typically relies on the assumption that training and testing distributions are identical and that data is centrally stored for training and testing. However, in real-world scenarios, distributions may differ significantly and data is often distributed across different devices, organizations, or edge nodes. Consequently, it is imperative to develop models that can effectively generalize to unseen distributions where data is distributed across different domains. In response to this challenge, there has been a surge of interest in federated domain generalization (FDG) in recent years. FDG combines the strengths of federated learning (FL) and domain generalization (DG) techniques to enable multiple source domains to collaboratively learn a model capable of directly generalizing to unseen domains while preserving data privacy. However, generalizing the federated model under domain shift is a technically challenging problem that has so far received scant attention. This paper presents the first survey of recent advances in this area. First, we discuss the development from traditional machine learning to domain adaptation and domain generalization, leading to FDG, and provide the corresponding formal definition. Then, we categorize recent methodologies into four classes: federated domain alignment, data manipulation, learning strategies, and aggregation optimization, and present suitable algorithms in detail for each category. Next, we introduce commonly used datasets, applications, evaluations, and benchmarks. Finally, we conclude this survey by providing some potential research topics for the future.
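    The aggregation step underlying FL (and hence the aggregation-optimization class of FDG methods) can be sketched with the standard FedAvg weighted average; the client parameter vectors and local dataset sizes below are toy values.

```python
def fedavg(client_weights, client_sizes):
    """FedAvg aggregation rule: average client model parameters,
    weighting each client by its local dataset size. Raw data never
    leaves the clients; only parameters are shared."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[k] * n for w, n in zip(client_weights, client_sizes)) / total
        for k in range(dim)
    ]

# Two source domains holding 100 and 300 local examples respectively.
global_w = fedavg([[1.0, 0.0], [5.0, 4.0]], [100, 300])
assert global_w == [4.0, 3.0]
```

    FDG aggregation-optimization methods modify exactly this step, e.g. by reweighting domains so the global model generalizes to unseen ones rather than merely fitting the participating clients.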

    Uncertainty Estimation, Explanation and Reduction with Insufficient Data

    Human beings constantly make decisions under uncertainty, trading off swift action against collecting sufficient evidence. A generalized artificial intelligence (GAI) is naturally expected to navigate uncertainty while predicting precisely. In this thesis, we propose strategies that underpin machine learning under uncertainty from three perspectives: uncertainty estimation, explanation and reduction. Estimation quantifies the variability in the model inputs and outputs, enabling us to evaluate the model's predictive confidence. Explanation provides a tool to interpret the mechanism of uncertainties and to pinpoint the potential for uncertainty reduction, which focuses on stabilizing model training, especially when data is insufficient. We hope that this thesis can motivate related studies on quantifying predictive uncertainties in deep learning. It also aims to raise awareness among stakeholders in the fields of smart transportation and automated medical diagnosis, where data insufficiency induces high uncertainty. The thesis is organised into the following sections. Introduction: we justify the necessity of investigating AI uncertainties and clarify the challenges in the latest studies, followed by our research objectives. Literature review: we break the review of state-of-the-art methods down into uncertainty estimation, explanation and reduction, and make comparisons with related fields including meta-learning, anomaly detection and continual learning. Uncertainty estimation: we introduce the neural process, a variational framework that approximates Gaussian processes to handle uncertainty estimation; two variants from the neural process family are proposed to endow neural processes with scalability and continual learning. Uncertainty explanation: we inspect the functional distribution of neural processes to discover the global and local factors that affect the degree of predictive uncertainty. Uncertainty reduction: we validate the proposed uncertainty framework on two scenarios, urban irregular behaviour detection and neurological disorder diagnosis, where intrinsic data insufficiency undermines the performance of existing deep learning models. Conclusion: we provide promising directions for future work and conclude the thesis.
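    Ensemble disagreement is one simple way to realise the uncertainty-estimation perspective sketched above. The thesis itself uses neural processes; this stand-in and its numbers are purely illustrative of how predictive variance flags low-confidence inputs.

```python
import statistics

def predictive_uncertainty(member_predictions):
    """Estimate predictive mean and uncertainty (variance) from an
    ensemble of models: disagreement between members on the same input
    signals low confidence in the prediction."""
    mean = statistics.fmean(member_predictions)
    var = statistics.pvariance(member_predictions)
    return mean, var

# Members agree on an in-distribution input...
_, var_in = predictive_uncertainty([0.90, 0.92, 0.91])
# ...and disagree on an input from a data-scarce region.
_, var_out = predictive_uncertainty([0.2, 0.9, 0.5])
assert var_out > var_in
```

    Thresholding this variance is a simple route to the downstream uses the thesis targets, such as flagging irregular urban behaviour or deferring uncertain diagnoses to a clinician.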

    Deep Learning for Face Anti-Spoofing: A Survey

    Face anti-spoofing (FAS) has lately attracted increasing attention due to its vital role in securing face recognition systems from presentation attacks (PAs). As more and more realistic PAs with novel types spring up, traditional FAS methods based on handcrafted features become unreliable due to their limited representation capacity. With the emergence of large-scale academic datasets in the recent decade, deep learning based FAS achieves remarkable performance and dominates this area. However, existing reviews in this field mainly focus on handcrafted features, which are outdated and uninspiring for the progress of the FAS community. In this paper, to stimulate future research, we present the first comprehensive review of recent advances in deep learning based FAS. It covers several novel and insightful components: 1) besides supervision with a binary label (e.g., '0' for bonafide vs. '1' for PAs), we also investigate recent methods with pixel-wise supervision (e.g., pseudo depth maps); 2) in addition to traditional intra-dataset evaluation, we collect and analyze the latest methods specially designed for domain generalization and open-set FAS; and 3) besides the commercial RGB camera, we summarize deep learning applications under multi-modal (e.g., depth and infrared) or specialized (e.g., light field and flash) sensors. We conclude this survey by emphasizing current open issues and highlighting potential prospects. Comment: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
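    The pixel-wise supervision of point 1) can be sketched as a per-pixel regression loss against a pseudo depth target: all zeros for a flat attack (print or replay), a face-shaped depth map for a bonafide sample. The 2x2 maps below are toy values for illustration.

```python
def pixelwise_loss(pred, target):
    """Mean squared error over an HxW map. Unlike a single binary
    label, every pixel contributes a supervision signal."""
    n = len(pred) * len(pred[0])
    return sum(
        (p - t) ** 2
        for row_p, row_t in zip(pred, target)
        for p, t in zip(row_p, row_t)
    ) / n

# A spoof (flat print) is supervised with an all-zero depth target.
spoof_target = [[0.0, 0.0], [0.0, 0.0]]
pred = [[0.1, 0.0], [0.0, 0.3]]
assert abs(pixelwise_loss(pred, spoof_target) - 0.025) < 1e-9
```

    The dense target is what makes this supervision richer than the binary label: the network must explain *where* the face has depth, not just *whether* it is live.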

    Deep Learning Techniques for Mobility Prediction and Management in Mobile Networks

    Trajectory prediction is an important research topic in modern mobile networks (e.g., 5G and beyond 5G) to enhance the network quality of service by accurately predicting the future locations of mobile users, such as pedestrians and vehicles, based on their past mobility patterns. A trajectory is defined as the sequence of locations the user visits over time. The primary objective of this thesis is to improve the modeling of mobility data and establish personalized, scalable, collective-intelligent, distributed, and strategic trajectory prediction techniques that can effectively adapt to the dynamics of urban environments in order to facilitate the optimal delivery of mobility-aware network services. Our proposed approaches aim to increase the accuracy of trajectory prediction while minimizing communication and computational costs, leading to more efficient mobile networks. The thesis begins by introducing a personalized trajectory prediction technique using deep learning and reinforcement learning, which adapts the neural network architecture to capture the distinct characteristics of mobile users' data. Furthermore, it introduces advanced anticipatory handover management and dynamic service migration techniques that optimize network management using our high-performance trajectory predictor. This approach ensures seamless connectivity and proactively migrates network services, enhancing the quality of service in dense wireless networks. The second contribution of the thesis introduces cluster-level prediction to extend the reinforcement learning-based trajectory prediction, addressing scalability challenges in large-scale networks. Cluster-level trajectory prediction leverages users' similarities within clusters to train only a few representatives. This enables efficient transfer learning of pre-trained mobility models and reduces computational overhead, enhancing network scalability. The third contribution proposes a collaborative social-aware multi-agent trajectory prediction technique that accounts for the interactions between multiple intra-cluster agents in a dynamic urban environment, increasing prediction accuracy while decreasing algorithm complexity and computational resource usage. The fourth contribution proposes a federated learning-driven multi-agent trajectory prediction technique that leverages the collaborative power of multiple local data sources in a decentralized manner to enhance user privacy and improve the accuracy of trajectory prediction while jointly minimizing computational and communication costs. The fifth contribution proposes a game-theoretic non-cooperative multi-agent prediction technique that considers the strategic behaviors among competitive inter-cluster mobile users. The proposed approaches are evaluated on small-scale and large-scale location-based mobility datasets, where locations may be GPS coordinates or cellular base station IDs. Our experiments demonstrate that our proposed approaches outperform state-of-the-art trajectory prediction methods, making significant contributions to the field of mobile networks.
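    A minimal frequency-based (first-order Markov) baseline illustrates the trajectory-prediction task itself; the thesis's methods are deep- and reinforcement-learning based, and the location IDs below are invented stand-ins for GPS cells or base-station IDs.

```python
from collections import Counter, defaultdict

def fit_markov(trajectories):
    """Count first-order transitions between consecutive visited
    locations across all observed trajectories."""
    trans = defaultdict(Counter)
    for traj in trajectories:
        for a, b in zip(traj, traj[1:]):
            trans[a][b] += 1
    return trans

def predict_next(trans, current):
    """Predict the most frequently observed successor of the
    current location."""
    return trans[current].most_common(1)[0][0]

# Locations as symbolic cell IDs, as in cellular mobility datasets.
trips = [["home", "cafe", "work"],
         ["home", "cafe", "gym"],
         ["home", "cafe", "work"]]
model = fit_markov(trips)
assert predict_next(model, "cafe") == "work"
```

    Baselines like this ignore personalization, agent interactions, and long-range context, which is exactly the gap the thesis's learned predictors target for handover management and service migration.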