952 research outputs found

    Protecting your software updates

    Get PDF
    As described in many blog posts and the scientific literature, exploits for software vulnerabilities are often engineered on the basis of patches, which often involves the manual or automated identification of vulnerable code. The authors evaluate how this identification can be automated with the most frequently referenced diffing tools, demonstrating that for certain types of patches, these tools are indeed effective attacker tools. But they also demonstrate that by using binary code diversification, the effectiveness of the tools can be diminished severely, thus severely closing the attacker's window of opportunity

    Attack graph approach to dynamic network vulnerability analysis and countermeasures

    Get PDF
    A thesis submitted to the University of Bedfordshire, in partial fulfilment of the requirements for the degree of Doctor of PhilosophyIt is widely accepted that modern computer networks (often presented as a heterogeneous collection of functioning organisations, applications, software, and hardware) contain vulnerabilities. This research proposes a new methodology to compute a dynamic severity cost for each state. Here a state refers to the behaviour of a system during an attack; an example of a state is where an attacker could influence the information on an application to alter the credentials. This is performed by utilising a modified variant of the Common Vulnerability Scoring System (CVSS), referred to as a Dynamic Vulnerability Scoring System (DVSS). This calculates scores of intrinsic, time-based, and ecological metrics by combining related sub-scores and modelling the problem’s parameters into a mathematical framework to develop a unique severity cost. The individual static nature of CVSS affects the scoring value, so the author has adapted a novel model to produce a DVSS metric that is more precise and efficient. In this approach, different parameters are used to compute the final scores determined from a number of parameters including network architecture, device setting, and the impact of vulnerability interactions. An attack graph (AG) is a security model representing the chains of vulnerability exploits in a network. A number of researchers have acknowledged the attack graph visual complexity and a lack of in-depth understanding. Current attack graph tools are constrained to only limited attributes or even rely on hand-generated input. The automatic formation of vulnerability information has been troublesome and vulnerability descriptions are frequently created by hand, or based on limited data. The network architectures and configurations along with the interactions between the individual vulnerabilities are considered in the method of computing the Cost using the DVSS and a dynamic cost-centric framework. A new methodology was built up to present an attack graph with a dynamic cost metric based on DVSS and also a novel methodology to estimate and represent the cost-centric approach for each host’ states was followed out. A framework is carried out on a test network, using the Nessus scanner to detect known vulnerabilities, implement these results and to build and represent the dynamic cost centric attack graph using ranking algorithms (in a standardised fashion to Mehta et al. 2006 and Kijsanayothin, 2010). However, instead of using vulnerabilities for each host, a CostRank Markov Model has developed utilising a novel cost-centric approach, thereby reducing the complexity in the attack graph and reducing the problem of visibility. An analogous parallel algorithm is developed to implement CostRank. The reason for developing a parallel CostRank Algorithm is to expedite the states ranking calculations for the increasing number of hosts and/or vulnerabilities. In the same way, the author intends to secure large scale networks that require fast and reliable computing to calculate the ranking of enormous graphs with thousands of vertices (states) and millions of arcs (representing an action to move from one state to another). In this proposed approach, the focus on a parallel CostRank computational architecture to appraise the enhancement in CostRank calculations and scalability of of the algorithm. In particular, a partitioning of input data, graph files and ranking vectors with a load balancing technique can enhance the performance and scalability of CostRank computations in parallel. A practical model of analogous CostRank parallel calculation is undertaken, resulting in a substantial decrease in calculations communication levels and in iteration time. The results are presented in an analytical approach in terms of scalability, efficiency, memory usage, speed up and input/output rates. Finally, a countermeasures model is developed to protect against network attacks by using a Dynamic Countermeasures Attack Tree (DCAT). The following scheme is used to build DCAT tree (i) using scalable parallel CostRank Algorithm to determine the critical asset, that system administrators need to protect; (ii) Track the Nessus scanner to determine the vulnerabilities associated with the asset using the dynamic cost centric framework and DVSS; (iii) Check out all published mitigations for all vulnerabilities. (iv) Assess how well the security solution mitigates those risks; (v) Assess DCAT algorithm in terms of effective security cost, probability and cost/benefit analysis to reduce the total impact of a specific vulnerability

    A Survey of Learning-based Automated Program Repair

    Full text link
    Automated program repair (APR) aims to fix software bugs automatically and plays a crucial role in software development and maintenance. With the recent advances in deep learning (DL), an increasing number of APR techniques have been proposed to leverage neural networks to learn bug-fixing patterns from massive open-source code repositories. Such learning-based techniques usually treat APR as a neural machine translation (NMT) task, where buggy code snippets (i.e., source language) are translated into fixed code snippets (i.e., target language) automatically. Benefiting from the powerful capability of DL to learn hidden relationships from previous bug-fixing datasets, learning-based APR techniques have achieved remarkable performance. In this paper, we provide a systematic survey to summarize the current state-of-the-art research in the learning-based APR community. We illustrate the general workflow of learning-based APR techniques and detail the crucial components, including fault localization, patch generation, patch ranking, patch validation, and patch correctness phases. We then discuss the widely-adopted datasets and evaluation metrics and outline existing empirical studies. We discuss several critical aspects of learning-based APR techniques, such as repair domains, industrial deployment, and the open science issue. We highlight several practical guidelines on applying DL techniques for future APR studies, such as exploring explainable patch generation and utilizing code features. Overall, our paper can help researchers gain a comprehensive understanding about the achievements of the existing learning-based APR techniques and promote the practical application of these techniques. Our artifacts are publicly available at \url{https://github.com/QuanjunZhang/AwesomeLearningAPR}

    Constructing a vulnerability knowledge graph

    Get PDF
    Attackers exploiting vulnerabilities in software can cause severe damage to affected victims. Despite continuous efforts of security experts, the number of reported vulnerabilities is increasing. As of January 2022, the National Vulnerability Database consists of more than 160 000 vulnerability records of known vulnerabilities. These vulnerability records contain data such as vulnerability classification, severity metrics, affected software products, and textual descriptions describing the vulnerability. The National Vulnerability Database provides a high-quality data source for security analysts learning about known vulnerabilities. However, maintaining this database comes at a high labor cost for the security experts involved. Knowledge graphs is a semantic technology which has the potential to aid in this task. In our work we explore how knowledge graphs are used in the broader field of cyber security. We then propose our own vulnerability knowledge graph for vulnerability assessment where we combine techniques from NLP with Knowledge graph embedding. Although future work on constructing ground truth data is necessary to evaluate and benchmark our experiments, our initial results show entity prediction results of 0.76 in Hits@10 score.M-D

    Tracing Vulnerabilities Across Product Releases

    Get PDF
    When a software development team becomes aware of a vulnerability, it generally only knows that the last version of that software product is vulnerable. However, today most software products have more than one version being actively used at a time. Garnering information on which versions contain a vulnerability, and which do not, is crucial for the users, to know which versions of a software product are safe to use, and also for the developers, to know where to apply the patch. The patch, i.e. the fix of the vulnerability, contains valuable information in the form of changes made to the known vulnerable code to fix it. This information could be leveraged to analyze the presence of this known vulnerability across releases of a software product. The problem of tracing vulnerabilities in different releases has been addressed in two separate research projects. Both of these projects rely on the changed lines of code to fix a vulnerability, and conclude whether a version is vulnerable or not based on the presence of these lines of code. However, relying simply on lines of code fails to consider the changes in the source code context where the patch has been introduced from a version to a version. In addressing this problem, this research project will focus on representing the patch and the versions to be evaluated in a more flexible format such as an Abstract Syntax Tree (AST). This approach is more robust compared to the line-based approach, because ASTs abstract away these changes in the context and allow us to focus more efficiently on the structure and behavior of the code in the patch. As such, instead of using lines of code, the unit of comparison in our approach will be nodes in an AST. Moreover, our approach will generate comprehensive artifacts that could guide developers to more efficiently patch the different versions of their product. We implemented our approach in a Java tool named Patchilyzer and we tested it in 174 Tomcat versions for a total of 39 vulnerabilities

    A Survey on Automated Software Vulnerability Detection Using Machine Learning and Deep Learning

    Full text link
    Software vulnerability detection is critical in software security because it identifies potential bugs in software systems, enabling immediate remediation and mitigation measures to be implemented before they may be exploited. Automatic vulnerability identification is important because it can evaluate large codebases more efficiently than manual code auditing. Many Machine Learning (ML) and Deep Learning (DL) based models for detecting vulnerabilities in source code have been presented in recent years. However, a survey that summarises, classifies, and analyses the application of ML/DL models for vulnerability detection is missing. It may be difficult to discover gaps in existing research and potential for future improvement without a comprehensive survey. This could result in essential areas of research being overlooked or under-represented, leading to a skewed understanding of the state of the art in vulnerability detection. This work address that gap by presenting a systematic survey to characterize various features of ML/DL-based source code level software vulnerability detection approaches via five primary research questions (RQs). Specifically, our RQ1 examines the trend of publications that leverage ML/DL for vulnerability detection, including the evolution of research and the distribution of publication venues. RQ2 describes vulnerability datasets used by existing ML/DL-based models, including their sources, types, and representations, as well as analyses of the embedding techniques used by these approaches. RQ3 explores the model architectures and design assumptions of ML/DL-based vulnerability detection approaches. RQ4 summarises the type and frequency of vulnerabilities that are covered by existing studies. Lastly, RQ5 presents a list of current challenges to be researched and an outline of a potential research roadmap that highlights crucial opportunities for future work

    Toward Data-Driven Discovery of Software Vulnerabilities

    Get PDF
    Over the years, Software Engineering, as a discipline, has recognized the potential for engineers to make mistakes and has incorporated processes to prevent such mistakes from becoming exploitable vulnerabilities. These processes span the spectrum from using unit/integration/fuzz testing, static/dynamic/hybrid analysis, and (automatic) patching to discover instances of vulnerabilities to leveraging data mining and machine learning to collect metrics that characterize attributes indicative of vulnerabilities. Among these processes, metrics have the potential to uncover systemic problems in the product, process, or people that could lead to vulnerabilities being introduced, rather than identifying specific instances of vulnerabilities. The insights from metrics can be used to support developers and managers in making decisions to improve the product, process, and/or people with the goal of engineering secure software. Despite empirical evidence of metrics\u27 association with historical software vulnerabilities, their adoption in the software development industry has been limited. The level of granularity at which the metrics are defined, the high false positive rate from models that use the metrics as explanatory variables, and, more importantly, the difficulty in deriving actionable intelligence from the metrics are often cited as factors that inhibit metrics\u27 adoption in practice. Our research vision is to assist software engineers in building secure software by providing a technique that generates scientific, interpretable, and actionable feedback on security as the software evolves. In this dissertation, we present our approach toward achieving this vision through (1) systematization of vulnerability discovery metrics literature, (2) unsupervised generation of metrics-informed security feedback, and (3) continuous developer-in-the-loop improvement of the feedback. We systematically reviewed the literature to enumerate metrics that have been proposed and/or evaluated to be indicative of vulnerabilities in software and to identify the validation criteria used to assess the decision-informing ability of these metrics. In addition to enumerating the metrics, we implemented a subset of these metrics as containerized microservices. We collected the metric values from six large open-source projects and assessed metrics\u27 generalizability across projects, application domains, and programming languages. We then used an unsupervised approach from literature to compute threshold values for each metric and assessed the thresholds\u27 ability to classify risk from historical vulnerabilities. We used the metrics\u27 values, thresholds, and interpretation to provide developers natural language feedback on security as they contributed changes and used a survey to assess their perception of the feedback. We initiated an open dialogue to gain an insight into their expectations from such feedback. In response to developer comments, we assessed the effectiveness of an existing vulnerability discovery approach—static analysis—and that of vulnerability discovery metrics in identifying risk from vulnerability contributing commits

    A Relevance Model for Threat-Centric Ranking of Cybersecurity Vulnerabilities

    Get PDF
    The relentless and often haphazard process of tracking and remediating vulnerabilities is a top concern for cybersecurity professionals. The key challenge they face is trying to identify a remediation scheme specific to in-house, organizational objectives. Without a strategy, the result is a patchwork of fixes applied to a tide of vulnerabilities, any one of which could be the single point of failure in an otherwise formidable defense. This means one of the biggest challenges in vulnerability management relates to prioritization. Given that so few vulnerabilities are a focus of real-world attacks, a practical remediation strategy is to identify vulnerabilities likely to be exploited and focus efforts towards remediating those vulnerabilities first. The goal of this research is to demonstrate that aggregating and synthesizing readily accessible, public data sources to provide personalized, automated recommendations that an organization can use to prioritize its vulnerability management strategy will offer significant improvements over what is currently realized using the Common Vulnerability Scoring System (CVSS). We provide a framework for vulnerability management specifically focused on mitigating threats using adversary criteria derived from MITRE ATT&CK. We identify the data mining steps needed to acquire, standardize, and integrate publicly available cyber intelligence data sets into a robust knowledge graph from which stakeholders can infer business logic related to known threats. We tested our approach by identifying vulnerabilities in academic and common software associated with six universities and four government facilities. Ranking policy performance was measured using the Normalized Discounted Cumulative Gain (nDCG). Our results show an average 71.5% to 91.3% improvement towards the identification of vulnerabilities likely to be targeted and exploited by cyber threat actors. The ROI of patching using our policies resulted in a savings in the range of 23.3% to 25.5% in annualized unit costs. Our results demonstrate the efficiency of creating knowledge graphs to link large data sets to facilitate semantic queries and create data-driven, flexible ranking policies. Additionally, our framework uses only open standards, making implementation and improvement feasible for cyber practitioners and academia

    Multi-Granularity Detector for Vulnerability Fixes

    Full text link
    With the increasing reliance on Open Source Software, users are exposed to third-party library vulnerabilities. Software Composition Analysis (SCA) tools have been created to alert users of such vulnerabilities. SCA requires the identification of vulnerability-fixing commits. Prior works have proposed methods that can automatically identify such vulnerability-fixing commits. However, identifying such commits is highly challenging, as only a very small minority of commits are vulnerability fixing. Moreover, code changes can be noisy and difficult to analyze. We observe that noise can occur at different levels of detail, making it challenging to detect vulnerability fixes accurately. To address these challenges and boost the effectiveness of prior works, we propose MiDas (Multi-Granularity Detector for Vulnerability Fixes). Unique from prior works, Midas constructs different neural networks for each level of code change granularity, corresponding to commit-level, file-level, hunk-level, and line-level, following their natural organization. It then utilizes an ensemble model that combines all base models to generate the final prediction. This design allows MiDas to better handle the noisy and highly imbalanced nature of vulnerability-fixing commit data. Additionally, to reduce the human effort required to inspect code changes, we have designed an effort-aware adjustment for Midas's outputs based on commit length. The evaluation results demonstrate that MiDas outperforms the current state-of-the-art baseline in terms of AUC by 4.9% and 13.7% on Java and Python-based datasets, respectively. Furthermore, in terms of two effort-aware metrics, EffortCost@L and Popt@L, MiDas also outperforms the state-of-the-art baseline, achieving improvements of up to 28.2% and 15.9% on Java, and 60% and 51.4% on Python, respectively
    • …
    corecore