7 research outputs found

    Trust-Based Cloud Machine Learning Model Selection For Industrial IoT and Smart City Services

    Get PDF
    With Machine Learning (ML) services now used in a number of mission-critical human-facing domains, ensuring the integrity and trustworthiness of ML models becomes all-important. In this work, we consider the paradigm where cloud service providers collect big data from resource-constrained devices for building ML-based prediction models that are then sent back to be run locally on the intermittently-connected resource-constrained devices. Our proposed solution comprises an intelligent polynomial-time heuristic that maximizes the level of trust of ML models by selecting and switching between a subset of the ML models from a superset of models in order to maximize the trustworthiness while respecting the given reconfiguration budget/rate and reducing the cloud communication overhead. We evaluate the performance of our proposed heuristic using two case studies. First, we consider Industrial IoT (IIoT) services, and as a proxy for this setting, we use the turbofan engine degradation simulation dataset to predict the remaining useful life of an engine. Our results in this setting show that the trust level of the selected models is 0.49% to 3.17% less compared to the results obtained using Integer Linear Programming (ILP). Second, we consider Smart Cities services, and as a proxy of this setting, we use an experimental transportation dataset to predict the number of cars. Our results show that the selected model's trust level is 0.7% to 2.53% less compared to the results obtained using ILP. We also show that our proposed heuristic achieves an optimal competitive ratio in a polynomial-time approximation scheme for the problem

    A Superficial Analysis Approach for Identifying Malicious Domain Names Generated by DGA Malware

    Get PDF
    Some of the most serious security threats facing computer networks involve malware. To prevent malware-related damage, administrators must swiftly identify and remove the infected machines that may reside in their networks. However, many malware families have domain generation algorithms (DGAs) to avoid detection. A DGA is a technique in which the domain name is changed frequently to hide the callback communication from the infected machine to the command-and-control server. In this article, we propose an approach for estimating the randomness of domain names by superficially analyzing their character strings. This approach is based on the following observations: human-generated benign domain names tend to reflect the intent of their domain registrants, such as an organization, product, or content. In contrast, dynamically generated malicious domain names consist of meaningless character strings because conflicts with already registered domain names must be avoided; hence, there are discernible differences in the strings of dynamically generated and human-generated domain names. Notably, our approach does not require any prior knowledge about DGAs. Our evaluation indicates that the proposed approach is capable of achieving recall and precision as high as 0.9960 and 0.9029, respectively, when used with labeled datasets. Additionally, this approach has proven to be highly effective for datasets collected via a campus network. Thus, these results suggest that malware-infected machines can be swiftly identified and removed from networks using DNS queries for detected malicious domains as triggers

    Benchmark-Based Reference Model for Evaluating Botnet Detection Tools Driven by Traffic-Flow Analytics

    Get PDF
    Botnets are some of the most recurrent cyber-threats, which take advantage of the wide heterogeneity of endpoint devices at the Edge of the emerging communication environments for enabling the malicious enforcement of fraud and other adversarial tactics, including malware, data leaks or denial of service. There have been significant research advances in the development of accurate botnet detection methods underpinned on supervised analysis but assessing the accuracy and performance of such detection methods requires a clear evaluation model in the pursuit of enforcing proper defensive strategies. In order to contribute to the mitigation of botnets, this paper introduces a novel evaluation scheme grounded on supervised machine learning algorithms that enable the detection and discrimination of different botnets families on real operational environments. The proposal relies on observing, understanding and inferring the behavior of each botnet family based on network indicators measured at flow-level. The assumed evaluation methodology contemplates six phases that allow building a detection model against botnet-related malware distributed through the network, for which five supervised classifiers were instantiated were instantiated for further comparisons—Decision Tree, Random Forest, Naive Bayes Gaussian, Support Vector Machine and K-Neighbors. The experimental validation was performed on two public datasets of real botnet traffic—CIC-AWS-2018 and ISOT HTTP Botnet. Bearing the heterogeneity of the datasets, optimizing the analysis with the Grid Search algorithm led to improve the classification results of the instantiated algorithms. An exhaustive evaluation was carried out demonstrating the adequateness of our proposal which prompted that Random Forest and Decision Tree models are the most suitable for detecting different botnet specimens among the chosen algorithms. They exhibited higher precision rates whilst analyzing a large number of samples with less processing time. The variety of testing scenarios were deeply assessed and reported to set baseline results for future benchmark analysis targeted on flow-based behavioral patterns

    Game-Theoretic Foundations for Forming Trusted Coalitions of Multi-Cloud Services in the Presence of Active and Passive Attacks

    Get PDF
    The prominence of cloud computing as a common paradigm for offering Web-based services has led to an unprecedented proliferation in the number of services that are deployed in cloud data centers. In parallel, services' communities and cloud federations have gained an increasing interest in the recent past years due to their ability to facilitate the discovery, composition, and resource scaling issues in large-scale services' markets. The problem is that the existing community and federation formation solutions deal with services as traditional software systems and overlook the fact that these services are often being offered as part of the cloud computing technology, which poses additional challenges at the architectural, business, and security levels. The motivation of this thesis stems from four main observations/research gaps that we have drawn through our literature reviews and/or experiments, which are: (1) leading cloud services such as Google and Amazon do not have incentives to group themselves into communities/federations using the existing community/federation formation solutions; (2) it is quite difficult to find a central entity that can manage the community/federation formation process in a multi-cloud environment; (3) if we allow services to rationally select their communities/federations without considering their trust relationships, these services might have incentives to structure themselves into communities/federations consisting of a large number of malicious services; and (4) the existing intrusion detection solutions in the domain of cloud computing are still ineffective in capturing advanced multi-type distributed attacks initiated by communities/federations of attackers since they overlook the attacker's strategies in their design and ignore the cloud system's resource constraints. This thesis aims to address these gaps by (1) proposing a business-oriented community formation model that accounts for the business potential of the services in the formation process to motivate the participation of services of all business capabilities, (2) introducing an inter-cloud trust framework that allows services deployed in one or disparate cloud centers to build credible trust relationships toward each other, while overcoming the collusion attacks that occur to mislead trust results even in extreme cases wherein attackers form the majority, (3) designing a trust-based game theoretical model that enables services to distributively form trustworthy multi-cloud communities wherein the number of malicious services is minimal, (4) proposing an intra-cloud trust framework that allows the cloud system to build credible trust relationships toward the guest Virtual Machines (VMs) running cloud-based services using objective and subjective trust sources, (5) designing and solving a trust-based maxmin game theoretical model that allows the cloud system to optimally distribute the detection load among VMs within a limited budget of resources, while considering Distributed Denial of Service (DDoS) attacks as a practical scenario, and (6) putting forward a resource-aware comprehensive detection and prevention system that is able to capture and prevent advanced simultaneous multi-type attacks within a limited amount of resources. We conclude the thesis by uncovering some persisting research gaps that need further study and investigation in the future

    Theoretical and Applied Foundations for Intrusion Detection in Single and Federated Clouds

    Get PDF
    Les systèmes infonuagiques deviennent de plus en plus complexes, plus dynamiques et hétérogènes. Un tel environnement produit souvent des données complexes et bruitées, empêchant les systèmes de détection d’intrusion (IDS) de détecter des variantes d’attaques connues. Une seule intrusion ou une attaque dans un tel système hétérogène peut se présenter sous des formes différentes, logiquement mais non synthétiquement similaires. Les IDS traditionnels sont incapables d’identifier ces attaques, car ils sont conçus pour des infrastructures spécifiques et limitées. Par conséquent, une détection précise dans le nuage ne sera absolument pas identifiée. Outre le problème de l’infonuagique, les cyber-attaques sont de plus en plus sophistiquées et difficiles à détecter. Il est donc extrêmement compliqué pour un unique IDS d’un nuage de détecter toutes les attaques, en raison de leurs implications, et leurs connaissances limitées et insuffisantes de celles-ci. Les solutions IDS actuelles de l’infonuagique résident dans le fait qu’elles ne tiennent pas compte des aspects dynamiques et hétérogènes de l’infonuagique. En outre, elles s’appuient fondamentalement sur les connaissances et l’expérience locales pour identifier les attaques et les modèles existants. Cela rend le nuage vulnérable aux attaques «Zero-Day». À cette fin, nous résolvons dans cette thèse deux défis associés à l’IDS de l’infonuagique : la détection des cyberattaques dans des environnements complexes, dynamiques et hétérogènes, et la détection des cyberattaques ayant des informations limitées et/ou incomplètes sur les intrusions et leurs conséquences. Dans cette thèse, nous sommes intéressés aux IDS génériques de l’infonuagique afin d’identifier les intrusions qui sont indépendantes de l’infrastructure utilisée. Par conséquent, à chaque fois qu’un pressentiment d’attaque est identifié, le système de détection d’intrusion doit être capable de reconnaître toutes les variantes d’une telle attaque, quelle que soit l’infrastructure utilisée. De plus, les IDS de l’infonuagique coopèrent et échangent des informations afin de faire bénéficier chacun des expertises des autres, pour identifier des modèles d’attaques inconnues.----------ABSTRACT: Cloud Computing systems are becoming more and more complex, dynamic and heterogeneous. Such an environment frequently produces complex and noisy data that make Intrusion Detection Systems (IDSs) unable to detect unknown variants of known attacks. A single intrusion or an attack in such a heterogeneous system could take various forms that are logically but not synthetically similar. This, in turn, makes traditional IDSs unable to identify these attacks, since they are designed for specific and limited infrastructures. Therefore, the accuracy of the detection in the cloud will be very negatively affected. In addition to the problem of the cloud computing environment, cyber attacks are getting more sophisticated and harder to detect. Thus, it is becoming increasingly difficult for a single cloud-based IDS to detect all attacks, because of limited and incomplete knowledge about attacks and implications. The problem of the existing cloud-based IDS solutions is that they overlook the dynamic and changing nature of the cloud. Moreover, they are fundamentally based on the local knowledge and experience to perform the classification of attacks and normal patterns. This renders the cloud vulnerable to “Zero-Day” attacks. To this end, we address throughout this thesis two challenges associated with the cloud-based IDS which are: the detection of cyber attacks under complex, dynamic and heterogeneous environments; and the detection of cyber attacks under limited and/or incomplete information about intrusions and implications. We are interested in this thesis in allowing cloud-based IDSs to be generic, in order to identify intrusions regardless of the infrastructure used. Therefore, whenever an intrusion has been identified, an IDS should be able to recognize all the different structures of such an attack, regardless of the infrastructure that is being used. Moreover, we are interested in allowing cloud-based IDSs to cooperate and share knowledge with each other, in order to make them benefit from each other’s expertise to cover unknown attack patterns. The originality of this thesis lies within two aspects: 1) the design of a generic cloud-based IDS that allows the detection under changing and heterogeneous environments and 2) the design of a multi-cloud cooperative IDS that ensures trustworthiness, fairness and sustainability. By trustworthiness, we mean that the cloud-based IDS should be able to ensure that it will consult, cooperate and share knowledge with trusted parties (i.e., cloud-based IDSs). By fairness, we mean that the cloud-based IDS should be able to guarantee that mutual benefits will be achieved through minimising the chance of cooperating with selfish IDSs. This is useful to give IDSs the motivation to participate in the community
    corecore