456 research outputs found

    Active Sampling-based Binary Verification of Dynamical Systems

    Full text link
    Nonlinear, adaptive, or otherwise complex control techniques are increasingly relied upon to ensure the safety of systems operating in uncertain environments. However, the nonlinearity of the resulting closed-loop system complicates verification that the system does in fact satisfy those requirements at all possible operating conditions. While analytical proof-based techniques and finite abstractions can be used to provably verify the closed-loop system's response at different operating conditions, they often produce conservative approximations due to restrictive assumptions and are difficult to construct in many applications. In contrast, popular statistical verification techniques relax the restrictions and instead rely upon simulations to construct statistical or probabilistic guarantees. This work presents a data-driven statistical verification procedure that instead constructs statistical learning models from simulated training data to separate the set of possible perturbations into "safe" and "unsafe" subsets. Binary evaluations of closed-loop system requirement satisfaction at various realizations of the uncertainties are obtained through temporal logic robustness metrics, which are then used to construct predictive models of requirement satisfaction over the full set of possible uncertainties. As the accuracy of these predictive statistical models is inherently coupled to the quality of the training data, an active learning algorithm selects additional sample points in order to maximize the expected change in the data-driven model and thus, indirectly, minimize the prediction error. Various case studies demonstrate the closed-loop verification procedure and highlight improvements in prediction error over both existing analytical and statistical verification techniques.Comment: 23 page

    Deciding and verifying network properties locally with few output bits

    Get PDF
    International audienceGiven a boolean predicate on labeled networks (e.g., the network is acyclic, or the network is properly colored, etc.), deciding in a distributed manner whether a given labeled network satisfies that predicate typically consists, in the standard setting, of every node inspecting its close neighborhood, and outputting a boolean verdict, such that the network satisfies the predicate if and only if all nodes output true. In this paper, we investigate a more general notion of distributed decision in which every node is allowed to output a constant number b≥1b\geq 1 of bits, which are gathered by a central authority emitting a global boolean verdict based on these outputs, such that the network satisfies the predicate if and only if this global verdict equals true. We analyze the power and limitations of this extended notion of distributed decision

    A Comprehensive Survey on Trustworthy Graph Neural Networks: Privacy, Robustness, Fairness, and Explainability

    Full text link
    Graph Neural Networks (GNNs) have made rapid developments in the recent years. Due to their great ability in modeling graph-structured data, GNNs are vastly used in various applications, including high-stakes scenarios such as financial analysis, traffic predictions, and drug discovery. Despite their great potential in benefiting humans in the real world, recent study shows that GNNs can leak private information, are vulnerable to adversarial attacks, can inherit and magnify societal bias from training data and lack interpretability, which have risk of causing unintentional harm to the users and society. For example, existing works demonstrate that attackers can fool the GNNs to give the outcome they desire with unnoticeable perturbation on training graph. GNNs trained on social networks may embed the discrimination in their decision process, strengthening the undesirable societal bias. Consequently, trustworthy GNNs in various aspects are emerging to prevent the harm from GNN models and increase the users' trust in GNNs. In this paper, we give a comprehensive survey of GNNs in the computational aspects of privacy, robustness, fairness, and explainability. For each aspect, we give the taxonomy of the related methods and formulate the general frameworks for the multiple categories of trustworthy GNNs. We also discuss the future research directions of each aspect and connections between these aspects to help achieve trustworthiness

    Automated Machine Learning implementation framework in the banking sector

    Get PDF
    Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Business AnalyticsAutomated Machine Learning is a subject in the Machine Learning field, designed to give the possibility of Machine Learning use to non-expert users, it aroused from the lack of subject matter experts, trying to remove humans from these topic implementations. The advantages behind automated machine learning are leaning towards the removal of human implementation, fastening the machine learning deployment speed. The organizations will benefit from effective solutions benchmarking and validations. The use of an automated machine learning implementation framework can deeply transform an organization adding value to the business by freeing the subject matter experts of the low-level machine learning projects, letting them focus on high level projects. This will also help the organization reach new competence, customization, and decision-making levels in a higher analytical maturity level. This work pretends, firstly to investigate the impact and benefits automated machine learning implementation in the banking sector, and afterwards develop an implementation framework that could be used by banking institutions as a guideline for the automated machine learning implementation through their departments. The autoML advantages and benefits are evaluated regarding business value and competitive advantage and it is presented the implementation in a fictitious institution, considering all the need steps and the possible setbacks that could arise. Banking institutions, in their business have different business processes, and since most of them are old institutions, the main concerns are related with the automating their business process, improving their analytical maturity and sensibilizing their workforce to the benefits of the implementation of new forms of work. To proceed to a successful implementation plan should be known the institution particularities, adapt to them and ensured the sensibilization of the workforce and management to the investments that need to be made and the changes in all levels of their organizational work that will come from that, that will lead to a lot of facilities in everyone’s daily work

    Local Certification of Graph Decompositions and Applications to Minor-Free Classes

    Get PDF
    Local certification consists in assigning labels to the nodes of a network to certify that some given property is satisfied, in such a way that the labels can be checked locally. In the last few years, certification of graph classes received a considerable attention. The goal is to certify that a graph G belongs to a given graph class ?. Such certifications with labels of size O(log n) (where n is the size of the network) exist for trees, planar graphs and graphs embedded on surfaces. Feuilloley et al. ask if this can be extended to any class of graphs defined by a finite set of forbidden minors. In this work, we develop new decomposition tools for graph certification, and apply them to show that for every small enough minor H, H-minor-free graphs can indeed be certified with labels of size O(log n). We also show matching lower bounds using a new proof technique

    Multi-Objective Optimisation for Tuning Building Heating and Cooling Loads Forecasting Models

    Get PDF
    Machine learning (ML) has been recognised as a powerful method for modelling building energy consumption. The capability of ML to provide a fast and accurate prediction of energy loads makes it an ideal tool for decision-making tasks related to sustainable design and retrofit planning. However, the accuracy of these ML models is much dependant on the selection of the right hyper-parameters for specific building dataset. This paper proposes a method for optimising ML model for forecasting both heating and cooling loads. The technique employs multi-objective optimisation with evolutionary algorithms to search the space of possible parameters. The proposed approach not only tune one model to precisely predict building energy loads but also accelerates the process of model optimisation. The study utilises a simulated building energy data generated in EnergyPlus to demonstrate the efficiency of the proposed method, and compares the outcomes with the regular ML tuning procedure (i.e. grid search). The optimised model provides a reliable tool for building designers and engineers to explore a large space of the available building materials and technologies

    Deep Learning-Based Intrusion Detection Methods for Computer Networks and Privacy-Preserving Authentication Method for Vehicular Ad Hoc Networks

    Get PDF
    The incidence of computer network intrusions has significantly increased over the last decade, partially attributed to a thriving underground cyber-crime economy and the widespread availability of advanced tools for launching such attacks. To counter these attacks, researchers in both academia and industry have turned to machine learning (ML) techniques to develop Intrusion Detection Systems (IDSes) for computer networks. However, many of the datasets use to train ML classifiers for detecting intrusions are not balanced, with some classes having fewer samples than others. This can result in ML classifiers producing suboptimal results. In this dissertation, we address this issue and present better ML based solutions for intrusion detection. Our contributions in this direction can be summarized as follows: Balancing Data Using Synthetic Data to detect intrusions in Computer Networks: In the past, researchers addressed the issue of imbalanced data in datasets by using over-sampling and under-sampling techniques. In this study, we go beyond such traditional methods and utilize a synthetic data generation method called Con- ditional Generative Adversarial Network (CTGAN) to balance the datasets and in- vestigate its impact on the performance of widely used ML classifiers. To the best of our knowledge, no one else has used CTGAN to generate synthetic samples for balancing intrusion detection datasets. We use two widely used publicly available datasets and conduct extensive experiments and show that ML classifiers trained on these datasets balanced with synthetic samples generated by CTGAN have higher prediction accuracy and Matthew Correlation Coefficient (MCC) scores than those trained on imbalanced datasets by 8% and 13%, respectively. Deep Learning approach for intrusion detection using focal loss function: To overcome the data imbalance problem for intrusion detection, we leverage the specialized loss function, called focal loss, that automatically down-weighs easy ex- amples and focuses on the hard negatives by facilitating dynamically scaled-gradient updates for training ML models effectively. We implement our approach using two well-known Deep Learning (DL) neural network architectures. Compared to training DL models using cross-entropy loss function, our approach (training DL models using focal loss function) improved accuracy, precision, F1 score, and MCC score by 24%, 39%, 39%, and 60% respectively. Efficient Deep Learning approach to detect Intrusions using Few-shot Learning: To address the issue of imbalance the datasets and develop a highly effective IDS, we utilize the concept of few-shot learning. We present a Few-Shot and Self-Supervised learning framework, called FS3, for detecting intrusions in IoT networks. FS3 works in three phases. Our approach involves first pretraining an encoder on a large-scale external dataset in a selfsupervised manner. We then employ few-shot learning (FSL), which seeks to replicate the encoder’s ability to learn new patterns from only a few training examples. During the encoder training us- ing a small number of samples, we train them contrastively, utilizing the triplet loss function. The third phase introduces a novel K-Nearest neighbor algorithm that sub- samples the majority class instances to further reduce imbalance and improve overall performance. Our proposed framework FS3, utilizing only 20% of labeled data, out- performs fully supervised state-of-the-art models by up to 42.39% and 43.95% with respect to the metrics precision and F1 score, respectively. The rapid evolution of the automotive industry and advancements in wireless com- munication technologies will result in the widespread deployment of Vehicular ad hoc networks (VANETs). However, despite the network’s potential to enable intelligent and autonomous driving, it also introduces various attack vectors that can jeopardize its security. In this dissertation, we present efficient privacy-preserving authenticated message dissemination scheme in VANETs. Conditional Privacy-preserving Authentication and Message Dissemination Scheme using Timestamp based Pseudonyms: To authenticate a message sent by a vehicle using its pseudonym, a certificate of the pseudonym signed by the central authority is generally utilized. If a vehicle is found to be malicious, certificates associated with all the pseudonyms assigned to it must be revoked. Certificate revocation lists (CRLs) should be shared with all entities that will be corresponding with the vehicle. As each vehicle has a large pool of pseudonyms allocated to it, the CRL can quickly grow in size as the number of revoked vehicles increases. This results in high storage overheads for storing the CRL, and significant authentication overheads as the receivers must check their CRL for each message received to verify its pseudonym. To address this issue, we present a timestamp-based pseudonym allocation scheme that reduces the storage overhead and authentication overhead by streamlining the CRL management process
    • …
    corecore