    Mining Bug Databases for Unidentified Software Vulnerabilities

    Identifying software vulnerabilities is becoming more important as critical and sensitive systems increasingly rely on complex software. Previous work has suggested that some bugs are identified as vulnerabilities only long after the bug has been made public. These are known as hidden impact vulnerabilities. This paper discusses the feasibility and necessity of mining common, publicly available bug databases for vulnerabilities that are yet to be identified. We present a bug database analysis of two well-known and frequently used software packages, namely the Linux kernel and MySQL. It is shown that for both Linux and MySQL, a significant portion of the vulnerabilities discovered between January 2006 and April 2011 were hidden impact vulnerabilities. It is also shown that the percentage of hidden impact vulnerabilities has increased in the last two years for both software packages. We then propose an improved hidden impact vulnerability identification methodology based on text mining bug databases, and conclude by discussing a few potential problems faced by such a classifier.

    Enhanced Bug Prediction in JavaScript Programs with Hybrid Call-Graph Based Invocation Metrics

    Bug prediction aims at finding source code elements in a software system that are likely to contain defects. Being aware of the most error-prone parts of the program, one can efficiently allocate the limited amount of testing and code review resources. Therefore, bug prediction can support software maintenance and evolution to a great extent. In this paper, we propose a function-level JavaScript bug prediction model based on static source code metrics, with the addition of hybrid (static and dynamic) code analysis based metrics of the number of incoming and outgoing function calls (HNII and HNOI). Our motivation is that JavaScript is a highly dynamic scripting language for which static code analysis might be very imprecise; therefore, using purely static source code features for bug prediction might not be enough. Based on a study where we extracted 824 buggy and 1943 non-buggy functions from the publicly available BugsJS dataset for the ESLint JavaScript project, we can confirm the positive impact of hybrid code metrics on the prediction performance of the ML models. Depending on the ML algorithm, the applied hyper-parameters, and the target measures we consider, hybrid invocation metrics bring a 2–10% increase in model performance (i.e., precision, recall, F-measure). Interestingly, replacing the static NOI and NII metrics with their hybrid counterparts HNOI and HNII in itself improves model performance; however, using them all together yields the best results.
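
The invocation metrics named in this abstract can be illustrated with a minimal Python sketch. It assumes, which the abstract does not state, that the hybrid call graph is the union of statically and dynamically discovered call edges, and that NII and NOI count distinct callers and callees per function. The function names and edges below are hypothetical.

```python
from collections import defaultdict


def invocation_counts(edges):
    """Count distinct incoming and outgoing invocations per function.

    `edges` is an iterable of (caller, callee) pairs.
    Returns two dicts: incoming counts and outgoing counts.
    """
    callers = defaultdict(set)  # callee -> set of functions that call it
    callees = defaultdict(set)  # caller -> set of functions it calls
    for caller, callee in edges:
        callees[caller].add(callee)
        callers[callee].add(caller)
    functions = set(callers) | set(callees)
    incoming = {f: len(callers[f]) for f in functions}
    outgoing = {f: len(callees[f]) for f in functions}
    return incoming, outgoing


# Hypothetical call edges: static analysis misses two dynamic calls.
static_edges = {("main", "parse"), ("main", "report")}
dynamic_edges = {("main", "parse"), ("parse", "visit"), ("visit", "report")}

# Assumption: the hybrid graph is the union of both edge sets.
hybrid_edges = static_edges | dynamic_edges

nii, noi = invocation_counts(static_edges)
hnii, hnoi = invocation_counts(hybrid_edges)

print("static NII of report:", nii["report"])
print("hybrid HNII of report:", hnii["report"])
```

In this toy graph the dynamically observed call from `visit` to `report` raises the incoming count of `report`, which is the kind of information a purely static metric would miss.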

    Mining Static Code Metrics for a Robust Prediction of Software Defect-Proneness

    Defect-proneness prediction is affected by multiple aspects, including sampling bias, non-metric factors, and model uncertainty. These aspects often contribute to prediction uncertainty and result in variance of prediction. This paper proposes two methods of data mining static code metrics to enhance defect-proneness prediction. Given how little non-metric or qualitative information can be extracted from software code, we first suggest using a robust unsupervised learning method, shared nearest neighbors (SNN), to extract the similarity patterns of the code metrics. These patterns indicate similar characteristics of the components of the same cluster that may result in the introduction of similar defects. Using the similarity patterns together with code metrics as predictors, defect-proneness prediction may be improved. The second method uses Occam's window and Bayesian model averaging to deal with model uncertainty: first, the datasets are used to train and cross-validate multiple learners, and then highly qualified models are selected and integrated into a robust prediction. From a study based on 12 datasets from NASA, we conclude that our proposed solutions can contribute to better defect-proneness prediction.
    Department of Computing. Refereed conference paper.
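
The model-selection step described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes each learner already has a posterior model weight (for example, derived from cross-validation) and a predicted defect probability for one component, and it applies the usual Occam's window rule of discarding models whose weight is far below the best one. The model names, weights, and the `ratio` cutoff are hypothetical.

```python
def occams_window_average(posteriors, predictions, ratio=20.0):
    """Average predictions over the models that fall inside Occam's window.

    posteriors:  model name -> posterior weight (need not be normalized)
    predictions: model name -> predicted probability of defect-proneness
    ratio:       a model is kept if best_weight / weight <= ratio
    Returns (averaged prediction, sorted list of kept model names).
    """
    best = max(posteriors.values())
    # Occam's window: drop models much less supported than the best model.
    kept = {m: w for m, w in posteriors.items() if w * ratio >= best}
    total = sum(kept.values())
    # Bayesian model averaging: weight each kept prediction by its
    # renormalized posterior weight.
    averaged = sum(w / total * predictions[m] for m, w in kept.items())
    return averaged, sorted(kept)


# Hypothetical weights and predictions for three learners.
posteriors = {"tree": 0.6, "bayes": 0.3, "knn": 0.01}
predictions = {"tree": 0.9, "bayes": 0.6, "knn": 0.1}

averaged, kept = occams_window_average(posteriors, predictions)
print(kept, round(averaged, 3))
```

Here the weakly supported `knn` model falls outside the window and does not contribute to the averaged prediction.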