206 research outputs found

    Improving Mobility and Safety in Traditional and Intelligent Transportation Systems Using Computational and Mathematical Modeling

    In traditional transportation systems, park-and-ride (P&R) facilities have been introduced to mitigate congestion and improve mobility. The second chapter of this study develops a framework that integrates a demand model and an optimization model to study the optimal placement of P&R facilities. The results suggest that optimal placement of P&R facilities can improve network performance and reduce emissions and vehicle kilometers traveled. In intelligent transportation systems, autonomous vehicles are expected to bring smart mobility, reduce traffic congestion, and improve the safety of drivers and passengers by eliminating human error. The safe operation of these vehicles depends heavily on the data they receive from their external and on-board sensors. Like other cyber-physical systems, autonomous vehicles are subject to cyberattacks and may be affected by faulty sensors. The resulting anomalous data can compromise the safe operation of autonomous vehicles and may even lead to fatal crashes. Hence, in the third chapter, we develop an unsupervised/semi-supervised machine learning approach to detect such anomalous data. Specifically, this approach incorporates an additional autoencoder module into a generative adversarial network, which enables effective learning of the distribution of non-anomalous data. We term our approach GAN-enabled autoencoder for anomaly detection (GAAD). We evaluate the proposed approach using the Lyft Level 5 dataset and demonstrate its superior performance compared to state-of-the-art benchmarks. The prediction of a safe, collision-free trajectory is probably the most important factor preventing the full adoption of autonomous vehicles on public roads. Despite recent advances in machine-learning-based motion prediction for autonomous driving, the field is still in its early stages and needs more effective methods to accurately estimate the future states of surrounding agents. Hence, in the fourth chapter, we introduce a novel deep learning approach for predicting the future trajectory of surrounding vehicles using a high-resolution semantic map and aerial imagery. Our proposed approach leverages integrated spatial and temporal learning to predict future motion. We assess the efficacy of our proposed approach on the Lyft Level 5 prediction dataset and achieve comparable performance on various motion prediction metrics.
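
The abstract above does not say which motion prediction metrics it reports. As an illustration only, the sketch below computes two metrics that are widely used for trajectory prediction, average displacement error (ADE) and final displacement error (FDE). The function name and the toy trajectories are assumptions made for this example, not details taken from the study.

```python
import math

def displacement_errors(predicted, actual):
    """Return (ADE, FDE) for two equal-length lists of (x, y) positions."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("trajectories must be non-empty and equal in length")
    # Euclidean distance between prediction and ground truth at each timestep.
    errors = [math.hypot(px - ax, py - ay)
              for (px, py), (ax, ay) in zip(predicted, actual)]
    ade = sum(errors) / len(errors)  # mean error over the whole horizon
    fde = errors[-1]                 # error at the final timestep only
    return ade, fde

# Toy data (hypothetical): the prediction drifts away from the true path.
predicted = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
actual = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(predicted, actual)
```

ADE summarizes accuracy along the whole predicted path, while FDE isolates the endpoint, so a model can score well on one and poorly on the other.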

    Synthetic Sensor Data for Human Activity Recognition

    Human activity recognition (HAR) based on wearable sensors has emerged as an active topic of research in machine learning and human behavior analysis because of its applications in several fields, including health, security and surveillance, and remote monitoring. Machine learning algorithms are frequently applied in HAR systems to learn from labeled sensor data. The effectiveness of these algorithms generally relies on access to large amounts of accurately labeled training data. However, labeled data for HAR is hard to come by and is often heavily imbalanced in favor of one or a few dominant classes, which in turn leads to poor recognition performance. In this study we introduce a generative adversarial network (GAN)-based approach for HAR that we use to automatically synthesize balanced and realistic sensor data. GANs are robust generative networks, typically used to create synthetic images that cannot be distinguished from real images. Here we explore and construct a model for generating several types of human activity sensor data using a Wasserstein GAN (WGAN). We assess the synthetic data using two commonly used classifier models, a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) network. We evaluate the quality and diversity of the synthetic data by training on synthetic data and testing on real sensor data, and vice versa. We then use synthetic sensor data to oversample the imbalanced training set. We demonstrate the efficacy of the proposed method on two publicly available human activity datasets, the Sussex-Huawei Locomotion (SHL) dataset and the Smoking Activity Dataset (SAD). With a CNN activity classifier, training on WGAN-augmented data improves on the imbalanced case for both SHL (F1-score from 0.85 to 0.95) and SAD (F1-score from 0.70 to 0.77).
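
The oversampling step described above can be sketched roughly as follows. This is a minimal illustration, not the study's implementation: `synthesize` is a hypothetical stand-in for a trained WGAN generator, `fake_generator` just draws Gaussian noise so the example runs, and the class labels and sample shapes are invented.

```python
import random
from collections import Counter

def oversample_with_synthetic(samples, labels, synthesize):
    """Top up every minority class to the size of the largest class.

    `synthesize(label, n)` is a hypothetical stand-in for a trained
    generator; it must return n synthetic samples for the given class.
    """
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for label, count in counts.items():
        deficit = target - count
        if deficit > 0:
            out_samples.extend(synthesize(label, deficit))
            out_labels.extend([label] * deficit)
    return out_samples, out_labels

_rng = random.Random(0)

def fake_generator(label, n):
    # Placeholder only: 3-axis Gaussian noise instead of samples from a WGAN.
    return [[_rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(n)]

# Invented, imbalanced toy set: 8 windows of one class, 2 of another.
real_x = [[0.1, 0.2, 0.3]] * 8 + [[0.9, 0.8, 0.7]] * 2
real_y = ["walk"] * 8 + ["smoke"] * 2
balanced_x, balanced_y = oversample_with_synthetic(real_x, real_y, fake_generator)
balanced_counts = Counter(balanced_y)
```

Only the training split would be balanced this way; the test split should stay real and untouched so that the reported F1-scores reflect performance on genuine sensor data.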

    Adversarial Machine Learning Applied to Intrusion and Malware Scenarios: A Systematic Review

    Cyber-security is the practice of protecting computing systems and networks from digital attacks, which are a rising concern in the Information Age. With the growing pace at which new attacks are developed, conventional signature-based attack detection methods are often not enough, and machine learning presents itself as a potential solution. Adversarial machine learning is a research area that examines both the generation and detection of adversarial examples, which are inputs specially crafted to deceive classifiers. It has been studied extensively in image recognition, where minor modifications to an image cause a classifier to produce incorrect predictions. In other fields, however, such as intrusion and malware detection, the exploration of such methods is still growing. The aim of this survey is to explore works that apply adversarial machine learning concepts to intrusion and malware detection scenarios. We concluded that a wide variety of attacks were tested and proven effective in malware and intrusion detection, although their practicality was not tested in intrusion scenarios. Adversarial defenses were substantially less explored, although they were also shown to be effective at resisting adversarial attacks. We also concluded that, in contrast to malware scenarios, the variety of datasets in intrusion scenarios is still very small, and the most widely used dataset is greatly outdated.
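
To make the idea of an adversarial example concrete, here is a minimal sketch in the style of the fast gradient sign method, applied to a two-feature logistic classifier. It is an illustration chosen for this listing, not a method taken from the survey; the weights, input, and `epsilon` value are all made up.

```python
import math

def predict_proba(weights, bias, x):
    # Probability of class 1 under a logistic model.
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-score))

def sign(v):
    return (v > 0) - (v < 0)

def fgsm_perturb(weights, bias, x, y, epsilon):
    """Shift each feature by epsilon in the direction that increases the loss."""
    p = predict_proba(weights, bias, x)
    # Gradient of the logistic loss with respect to the input is (p - y) * w.
    grad = [(p - y) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]

# Hypothetical model and input; the true label is 1.
weights, bias = [1.0, -1.0], 0.0
x, y = [0.5, 0.2], 1
x_adv = fgsm_perturb(weights, bias, x, y, epsilon=0.2)
clean_label = int(predict_proba(weights, bias, x) >= 0.5)
adv_label = int(predict_proba(weights, bias, x_adv) >= 0.5)
```

Each feature moves by at most `epsilon`, yet the predicted label changes. In intrusion or malware settings the perturbation would additionally have to keep the traffic or binary valid, which is the practicality question the survey raises.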

    Adversarial games in machine learning : challenges and applications

    Many machine learning (ML) problems can be formulated as the minimization of a cost function, and a large optimization literature provides algorithms and convergence guarantees for problems of this type. Recently, however, several ML models have been proposed that cannot be formulated as the minimization of a single cost; instead they require defining a game between several players, each with its own objective. A successful example is the generative adversarial network (GAN), which formulates generative modeling as a game between two neural networks: a generator that tries to fool a discriminator, and a discriminator that tries to distinguish real samples from fake ones. The two players improve together, leading to a Nash equilibrium in which the images produced by the generator are indistinguishable from real images. Despite their success, GANs remain notoriously hard to train because of the adversarial nature of the game: they require careful tuning of hyperparameters and often exhibit unstable training dynamics. While several regularization techniques have been proposed to stabilize training, this thesis approaches these instabilities from an optimization perspective. We start by bridging the gap between the optimization and GAN literatures: we cast GANs as an instance of the variational inequality problem (VIP) and draw on the large literature on VIP to derive algorithms that train GANs more quickly and stably. To better understand the challenges of optimizing games, we then propose several tools for analyzing the optimization landscape of GANs. Using these tools, we show that rotational components are present in the neighborhood of equilibria, and we observe that GANs rarely converge to a Nash equilibrium, converging instead to locally stable equilibria (LSSP). Finally, inspired by the success of GANs, we propose a new family of games, which we call adversarial example games, in which a generator and a critic are trained simultaneously: the generator seeks to perturb examples so as to mislead the critic, while the critic seeks to be robust to those perturbations. We show that at the equilibrium of this game the generator can produce perturbations that transfer across a whole family of models and architectures.
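
The rotation around an equilibrium mentioned above can be seen on the simplest two-player game, f(x, y) = x * y, where x minimizes and y maximizes. The sketch below compares plain simultaneous gradient descent-ascent with the extragradient method, a classical algorithm from the variational inequality literature. It is a toy illustration under assumed settings (step size, start point, iteration count), not a reproduction of the thesis's algorithms or experiments.

```python
import math

def gda_step(x, y, lr):
    # Simultaneous gradient descent on x and ascent on y for f(x, y) = x * y.
    return x - lr * y, y + lr * x

def extragradient_step(x, y, lr):
    # Take a look-ahead step, then update the original point
    # using the gradient evaluated at the look-ahead point.
    x_half, y_half = x - lr * y, y + lr * x
    return x - lr * y_half, y + lr * x_half

def run(step, iterations=200, lr=0.1):
    x, y = 1.0, 1.0
    for _ in range(iterations):
        x, y = step(x, y, lr)
    return math.hypot(x, y)  # distance from the equilibrium at (0, 0)

gda_distance = run(gda_step)
eg_distance = run(extragradient_step)
```

On this game each descent-ascent step multiplies the squared distance to the equilibrium by 1 + lr², so the iterates spiral outward, whereas each extragradient step multiplies it by 1 - lr² + lr⁴, which is below 1 for 0 < lr < 1, so the iterates spiral inward.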

    Differentially private synthetic tabular data generation with a generative adversarial network and privacy amplification by subsampling

    Advances in computation have created high demand for large datasets, which in turn has sparked interest in using personal data collected by different institutions for secondary purposes such as research. However, in many domains like healthcare, privacy concerns often stand in the way of sharing data for novel use. One promising approach to anonymizing data so that it can be shared in a privacy-preserving way is to create synthetic data. Synthetic data is based on real data and attempts to mimic the properties of that real data while preserving utility for different tasks and protecting the privacy of those depicted in the original dataset. In this work, the differential privacy (DP) framework is adopted to train a generative adversarial network (GAN) in a privacy-preserving manner to create differentially private synthetic tabular data. The quality of the synthetic data is evaluated based on its usefulness for training new models, the extent to which realistic sample quality is retained, and the strength of the privacy guarantees achieved. The technical implementation adapts the state-of-the-art DP GAN model, the GS-WGAN by Chen, Orekondy, and Fritz, from the domain of images to that of tabular data. This model choice raises the novel question of whether privacy benefits similar to those reported with image data can be achieved with tabular data by applying privacy amplification by subsampling (PABS) to the GAN training process. The technical choices in this work also focus on theoretical synergies between the model architecture and privacy-preserving training, as well as the method's usability in a real-life scenario. The results show that the synthetic data generated preserves utility in training downstream classification models while attaining strong privacy guarantees. However, simultaneously retaining realistic sample quality proved to be difficult. The research presented in this thesis contributes to the field of differentially private synthetic data generation with GAN models by demonstrating that the application of PABS to GAN training is an effective way to achieve stronger privacy guarantees with tabular data. The results raise important questions over whether the use of downstream classification accuracy as a metric can lead to synthetic data biased towards this specific task, and whether DP synthetic data should be separately crafted for different tasks to avoid loss of utility.
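
The effect of privacy amplification by subsampling can be illustrated with the standard bound for a single mechanism: if a mechanism is ε-DP and each record enters the subsample with probability q, the subsampled mechanism is ε'-DP with ε' = ln(1 + q(e^ε - 1)). The sketch below evaluates that bound. It is an assumption that this textbook form is a useful stand-in here; the abstract does not state the exact accounting used in the thesis, and composition over many training steps is not modeled.

```python
import math

def amplified_epsilon(epsilon, sampling_rate):
    """Epsilon after running an epsilon-DP mechanism on a random subsample.

    Uses the standard bound eps' = ln(1 + q * (e^eps - 1)), where q is the
    probability that any given record is included in the subsample.
    """
    if epsilon < 0.0:
        raise ValueError("epsilon must be non-negative")
    if not 0.0 <= sampling_rate <= 1.0:
        raise ValueError("sampling_rate must be in [0, 1]")
    # log1p and expm1 keep the computation accurate for small values.
    return math.log1p(sampling_rate * math.expm1(epsilon))

# Illustrative values only: a 1% sampling rate applied to epsilon = 1.
eps_small_batch = amplified_epsilon(1.0, 0.01)
eps_full_batch = amplified_epsilon(1.0, 1.0)
```

With `sampling_rate` equal to 1 the bound returns the original ε, and smaller sampling rates give smaller effective ε, which is the sense in which subsampling strengthens the guarantee.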