458,717 research outputs found

    Built to Last or Built Too Fast? Evaluating Prediction Models for Build Times

    Full text link
    Automated builds are integral to the Continuous Integration (CI) software development practice. In CI, developers are encouraged to integrate early and often. However, long build times can be an issue when integrations are frequent. This research focuses on finding a balance between integrating often and keeping developers productive. We propose and analyze models that can predict the build time of a job. Such models can help developers to better manage their time and tasks. Also, project managers can explore different factors to determine the best setup for a build job that will keep the build wait time to an acceptable level. Software organizations transitioning to CI practices can use the predictive models to anticipate build times before CI is implemented. The research community can modify our predictive models to further understand the factors and relationships affecting build times.Comment: 4 paged version published in the Proceedings of the IEEE/ACM 14th International Conference on Mining Software Repositories (MSR) Pages 487-490. MSR 201

    EnsembleSVM: A Library for Ensemble Learning Using Support Vector Machines

    Full text link
    EnsembleSVM is a free software package containing efficient routines to perform ensemble learning with support vector machine (SVM) base models. It currently offers ensemble methods based on binary SVM models. Our implementation avoids duplicate storage and evaluation of support vectors which are shared between constituent models. Experimental results show that using ensemble approaches can drastically reduce training complexity while maintaining high predictive accuracy. The EnsembleSVM software package is freely available online at http://esat.kuleuven.be/stadius/ensemblesvm.Comment: 5 pages, 1 tabl

    Development and validation of computational models of cellular interaction

    Get PDF
    In this paper we take the view that computational models of biological systems should satisfy two conditions – they should be able to predict function at a systems biology level, and robust techniques of validation against biological models must be available. A modelling paradigm for developing a predictive computational model of cellular interaction is described, and methods of providing robust validation against biological models are explored, followed by a consideration of software issues

    Search-Based Predictive Modelling for Software Engineering: How Far Have We Gone?

    Get PDF
    In this keynote I introduce the use of Predictive Analytics for Software Engineering (SE) and then focus on the use of search-based heuristics to tackle long-standing SE prediction problems including (but not limited to) software development effort estimation and software defect prediction. I review recent research in Search-Based Predictive Modelling for SE in order to assess the maturity of the field and point out promising research directions. I conclude my keynote by discussing best practices for a rigorous and realistic empirical evaluation of search-based predictive models, a condicio sine qua non to facilitate the adoption of prediction models in software industry practices.Predictive analytics Predictive modelling Search-based software engineering Machine learning Software analytic

    A Hybrid Multi-Filter Wrapper Feature Selection Method for Software Defect Predictors

    Get PDF
    Software Defect Prediction (SDP) is an approach used for identifying defect-prone software modules or components. It helps software engineer to optimally, allocate limited resources to defective software modules or components in the testing or maintenance phases of software development life cycle (SDLC). Nonetheless, the predictive performance of SDP models reckons largely on the quality of dataset utilized for training the predictive models. The high dimensionality of software metric features has been noted as a data quality problem which negatively affects the predictive performance of SDP models. Feature Selection (FS) is a well-known method for solving high dimensionality problem and can be divided into filter-based and wrapper-based methods. Filter-based FS has low computational cost, but the predictive performance of its classification algorithm on the filtered data cannot be guaranteed. On the contrary, wrapper-based FS have good predictive performance but with high computational cost and lack of generalizability. Therefore, this study proposes a hybrid multi-filter wrapper method for feature selection of relevant and irredundant features in software defect prediction. The proposed hybrid feature selection will be developed to take advantage of filter-filter and filter-wrapper relationships to give optimal feature subsets, reduce its evaluation cycle and subsequently improve SDP models overall predictive performance in terms of Accuracy, Precision and Recall values

    Predicting Exploitation of Disclosed Software Vulnerabilities Using Open-source Data

    Full text link
    Each year, thousands of software vulnerabilities are discovered and reported to the public. Unpatched known vulnerabilities are a significant security risk. It is imperative that software vendors quickly provide patches once vulnerabilities are known and users quickly install those patches as soon as they are available. However, most vulnerabilities are never actually exploited. Since writing, testing, and installing software patches can involve considerable resources, it would be desirable to prioritize the remediation of vulnerabilities that are likely to be exploited. Several published research studies have reported moderate success in applying machine learning techniques to the task of predicting whether a vulnerability will be exploited. These approaches typically use features derived from vulnerability databases (such as the summary text describing the vulnerability) or social media posts that mention the vulnerability by name. However, these prior studies share multiple methodological shortcomings that inflate predictive power of these approaches. We replicate key portions of the prior work, compare their approaches, and show how selection of training and test data critically affect the estimated performance of predictive models. The results of this study point to important methodological considerations that should be taken into account so that results reflect real-world utility
    • …
    corecore