4,548 research outputs found

    Adversarial Robustness of Hybrid Machine Learning Architecture for Malware Classification

    The detection heuristic in contemporary machine learning Windows malware classifiers is typically based on the static properties of the sample. In contrast, simultaneous utilization of static and behavioral telemetry is only sparsely explored. We propose a hybrid model that employs dynamic malware analysis techniques, contextual information in the form of the executable's filesystem path on the system, and static representations used in modern state-of-the-art detectors. It does not require an operating system virtualization platform. Instead, it relies on kernel emulation for dynamic analysis. Our model reports an enhanced detection heuristic and identifies malicious samples even if none of the separate models expresses high confidence in categorizing the file as malevolent. For instance, given a 0.05% false positive rate, the individual static, dynamic, and contextual model detection rates are 18.04%, 37.20%, and 15.66%. However, we show that composite processing of all three achieves a detection rate of 96.54%, above the cumulative performance of the individual components. Moreover, simultaneous use of distinct malware analysis techniques addresses independent unit weaknesses, minimizing false positives and increasing adversarial robustness. Our experiments show a decrease in contemporary adversarial attack evasion rates from 26.06% to 0.35% when behavioral and contextual representations of the sample are employed in the detection heuristic.
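    The abstract reports detection rates at a fixed false positive rate. The sketch below illustrates how such a metric can be computed from classifier scores. It is not the authors' evaluation code: the function names and the toy scores are assumptions made for illustration, and a higher score is assumed to mean "more likely malicious".

```python
def threshold_at_fpr(benign_scores, target_fpr):
    """Pick a score threshold so that the fraction of benign samples
    scoring strictly above it does not exceed target_fpr."""
    ranked = sorted(benign_scores, reverse=True)
    # Number of benign samples we may allow above the threshold.
    allowed = int(target_fpr * len(ranked))
    if allowed >= len(ranked):
        return float("-inf")
    return ranked[allowed]


def detection_rate(malicious_scores, threshold):
    """Fraction of malicious samples scoring strictly above the threshold."""
    flagged = sum(1 for s in malicious_scores if s > threshold)
    return flagged / len(malicious_scores)


# Toy, made-up scores (hypothetical data, not from the paper).
benign = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.9]
malicious = [0.65, 0.8, 0.5, 0.95]

thr = threshold_at_fpr(benign, target_fpr=0.25)
rate = detection_rate(malicious, thr)
```

    With a very low target such as 0.05%, the benign set must contain at least a few thousand samples before even one false positive is permitted, so the threshold estimate is only meaningful on large validation sets.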

    Privacy and Robustness in Federated Learning: Attacks and Defenses

    As data are increasingly being stored in different silos and societies are becoming more aware of data privacy issues, the traditional centralized training of artificial intelligence (AI) models is facing efficiency and privacy challenges. Recently, federated learning (FL) has emerged as an alternative solution and continues to thrive in this new reality. Existing FL protocol designs have been shown to be vulnerable to adversaries within or outside of the system, compromising data privacy and system robustness. Besides training powerful global models, it is of paramount importance to design FL systems that have privacy guarantees and are resistant to different types of adversaries. In this paper, we conduct the first comprehensive survey on this topic. Through a concise introduction to the concept of FL, and a unique taxonomy covering: 1) threat models; 2) poisoning attacks and defenses against robustness; 3) inference attacks and defenses against privacy, we provide an accessible review of this important topic. We highlight the intuitions, key techniques as well as fundamental assumptions adopted by various attacks and defenses. Finally, we discuss promising future research directions towards robust and privacy-preserving federated learning. Comment: arXiv admin note: text overlap with arXiv:2003.02133; text overlap with arXiv:1911.11815 by other authors

    A taxonomy of attacks and a survey of defence mechanisms for semantic social engineering attacks

    Social engineering is used as an umbrella term for a broad spectrum of computer exploitations that employ a variety of attack vectors and strategies to psychologically manipulate a user. Semantic attacks are the specific type of social engineering attacks that bypass technical defences by actively manipulating object characteristics, such as platform or system applications, to deceive rather than directly attack the user. Commonly observed examples include obfuscated URLs, phishing emails, drive-by downloads, spoofed websites and scareware, to name a few. This paper presents a taxonomy of semantic attacks, as well as a survey of applicable defences. By contrasting the threat landscape and the associated mitigation techniques in a single comparative matrix, we identify the areas where further research can be particularly beneficial.

    A composable approach to design of newer techniques for large-scale denial-of-service attack attribution

    Since its early days, the Internet has witnessed not only phenomenal growth, but also a large number of security attacks, and in recent years, denial-of-service (DoS) attacks have emerged as one of the top threats. The stateless and destination-oriented Internet routing, combined with the ability to harness a large number of compromised machines and the relative ease and low costs of launching such attacks, has made this a hard problem to address. Additionally, the myriad requirements of scalability, incremental deployment, adequate user privacy protections, and appropriate economic incentives have further complicated the design of DDoS defense mechanisms. While the many research proposals to date have focussed differently on prevention, mitigation, or traceback of DDoS attacks, the lack of a comprehensive approach satisfying the different design criteria for successful attack attribution is indeed disturbing. Our first contribution here has been the design of a composable data model that has helped us represent the various dimensions of the attack attribution problem, particularly the performance attributes of accuracy, effectiveness, speed and overhead, as orthogonal and mutually independent design considerations. We have then designed custom optimizations along each of these dimensions, and have further integrated them into a single composite model, to provide strong performance guarantees. Thus, the proposed model has given us a single framework that can not only address the individual shortcomings of the various known attack attribution techniques, but also provide a more wholesome counter-measure against DDoS attacks. Our second contribution here has been a concrete implementation based on the proposed composable data model, having adopted a graph-theoretic approach to identify and subsequently stitch together individual edge fragments in the Internet graph to reveal the true routing path of any network data packet.
The proposed approach has been analyzed through theoretical and experimental evaluation across multiple metrics, including scalability, incremental deployment, speed and efficiency of the distributed algorithm, and finally the total overhead associated with its deployment. We have thereby shown that it is realistically feasible to provide strong performance and scalability guarantees for Internet-wide attack attribution. Our third contribution here has further advanced the state of the art by directly identifying individual path fragments in the Internet graph, having adopted a distributed divide-and-conquer approach employing simple recurrence relations as individual building blocks. A detailed analysis of the proposed approach on real-life Internet topologies with respect to network storage and traffic overhead has provided a more realistic characterization. Thus, not only does the proposed approach lend itself well to simplified operations at scale, but it can also provide robust network-wide performance and security guarantees for Internet-wide attack attribution. Our final contribution here has introduced the notion of anonymity in the overall attack attribution process to significantly broaden its scope. The highly invasive nature of widespread data gathering for network traceback continues to violate one of the key principles of Internet use today: the ability to stay anonymous and operate freely without retribution. In this regard, we have successfully reconciled these mutually divergent requirements to make it not only economically feasible and politically viable but also socially acceptable.
This work opens up several directions for future research: analysis of existing attack attribution techniques to identify further scope for improvements, incorporation of newer attributes into the design framework of the composable data model abstraction, and finally design of newer attack attribution techniques that comprehensively integrate the various attack prevention, mitigation and traceback techniques in an efficient manner.
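The idea of stitching individual edge fragments into a routing path, mentioned in this abstract, can be pictured with a small sketch. This is not the thesis's algorithm: `stitch_path`, the node names, and the assumption that each node forwards a given packet to at most one next hop are all hypothetical simplifications for illustration.

```python
def stitch_path(fragments, source):
    """Join (from_node, to_node) edge fragments into one ordered path
    starting at `source`. Assumes each node has at most one outgoing
    fragment for the packet being traced (hypothetical simplification)."""
    next_hop = {}
    for u, v in fragments:
        if u in next_hop:
            raise ValueError(f"node {u!r} has more than one outgoing fragment")
        next_hop[u] = v

    path = [source]
    seen = {source}
    # Follow next-hop links until no further fragment is known.
    while path[-1] in next_hop:
        nxt = next_hop[path[-1]]
        if nxt in seen:
            raise ValueError("fragments form a cycle")
        path.append(nxt)
        seen.add(nxt)
    return path


# Fragments may be collected in any order (made-up node names).
fragments = [("r3", "victim"), ("attacker", "r1"), ("r2", "r3"), ("r1", "r2")]
path = stitch_path(fragments, "attacker")
```

In a real traceback setting the fragments would be gathered from many routers and the origin would be unknown, so the walk would more likely run backwards from the victim; the forward walk here is only meant to show the joining step.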

    Applications in security and evasions in machine learning : a survey

    In recent years, machine learning (ML) has become an important means of providing security and privacy in various applications. ML is used to address serious issues such as real-time attack detection, data leakage vulnerability assessments and many more. ML extensively supports the demanding requirements of the current scenario of security and privacy across a range of areas such as real-time decision-making, big data processing, reduced cycle time for learning, cost-efficiency and error-free processing. Therefore, in this paper, we review the state-of-the-art approaches where ML can be applied more effectively to fulfill current real-world requirements in security. We examine different security applications' perspectives where ML models play an essential role and compare their accuracy results along different possible dimensions. This analysis of ML algorithms in security applications provides a blueprint for an interdisciplinary research area. Even with the use of current sophisticated technology and tools, attackers can evade ML models by carrying out adversarial attacks. Therefore, the vulnerability of ML models to adversarial attacks needs to be assessed at development time. Accordingly, as a supplement to this point, we also analyze the different types of adversarial attacks on ML models. To give a proper visualization of security properties, we present the threat model and defense strategies against adversarial attack methods. Moreover, we illustrate the adversarial attacks based on the attackers' knowledge about the model and address the points of the model at which possible attacks may be committed. Finally, we also investigate different types of properties of the adversarial attacks.

    SQL Injection Vulnerability Detection Using Deep Learning: A Feature-based Approach

    SQL injection (SQLi), a well-known exploitation technique, is a serious risk factor for database-driven web applications that are used to manage the core business functions of organizations. SQLi enables an unauthorized user to get access to sensitive information of the database, and subsequently, to the application’s administrative privileges. Therefore, the detection of SQLi is crucial for businesses to prevent financial losses. There are different rule-based and learning-based solutions to help with detection, and pattern recognition through support vector machines (SVMs) and random forest (RF) has recently become popular in detecting SQLi. However, these classifiers achieve only 97.33% accuracy on our dataset. In this paper, we propose a deep learning-based solution for detecting SQLi in web applications. The solution employs both correlation and chi-squared methods to rank the features from the dataset. A feed-forward network approach has been applied not only in feature selection but also in the detection process. Our solution provides 98.04% accuracy over 1,850+ recorded datasets, where it proves its superior efficiency over other existing machine learning solutions.
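    The chi-squared feature ranking mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the paper's pipeline: it assumes binary features and binary labels, uses the standard 2x2 contingency-table statistic, and the feature names (`has_quote`, `has_digit`) and data are invented for the example.

```python
def chi2_binary(feature, labels):
    """Chi-squared statistic of a 2x2 contingency table built from a
    binary feature column and binary class labels."""
    a = b = c = d = 0
    for x, y in zip(feature, labels):
        if x and y:
            a += 1          # feature present, malicious
        elif x and not y:
            b += 1          # feature present, benign
        elif not x and y:
            c += 1          # feature absent, malicious
        else:
            d += 1          # feature absent, benign
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0          # constant feature or single-class labels
    return n * (a * d - b * c) ** 2 / denom


def rank_features(columns, labels):
    """Return feature names ordered from most to least class-dependent."""
    scores = {name: chi2_binary(col, labels) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Made-up toy data: 1 = SQLi query, 0 = benign query.
labels = [1, 1, 1, 0, 0, 0]
columns = {
    "has_digit": [1, 0, 1, 1, 0, 1],   # hypothetical feature
    "has_quote": [1, 1, 1, 0, 0, 0],   # hypothetical feature
}
ranking = rank_features(columns, labels)
```

    A feature that is distributed identically across both classes scores zero, while one that perfectly separates the classes scores the sample count, which is why `has_quote` ranks first on this toy data.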