Analyze the Performance of Software by Machine Learning Methods for Fault Prediction Techniques
Software is used in daily life to an ever-growing extent, and software systems become harder to develop as they are integrated into more of it. Building highly effective software is therefore a significant challenge, and quality remains the most important of all the required characteristics of a software system. Nearly one-third of the total cost of software development goes toward testing, so it is always advantageous to find a bug early in the development process: a fault that escapes early detection drives up the cost of the project. Software fault prediction is intended to address this problem. There is a standing need for better prediction models that forecast faults before actual testing and thereby reduce the time and expense that flaws add to software projects. This paper discusses various machine learning techniques for classifying software bugs.
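As a rough illustration of the kind of pipeline such surveys cover, the sketch below trains a random forest on per-module static code metrics with a binary fault label. The dataset file and column names (module_metrics.csv, defective, etc.) are illustrative assumptions, not artifacts of the paper.

```python
# Minimal sketch of an ML-based fault-prediction classifier, in the style
# surveyed by the paper. Assumes a CSV of per-module static code metrics
# with a binary "defective" label; file and column names are illustrative.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

df = pd.read_csv("module_metrics.csv")   # hypothetical dataset
X = df.drop(columns=["defective"])       # metrics: loc, cyclomatic, ...
y = df["defective"]                      # 1 = fault-prone, 0 = clean

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)

# Report per-class precision/recall; fault data is typically imbalanced,
# so accuracy alone is a misleading summary.
print(classification_report(y_test, clf.predict(X_test)))
```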
Predictive Analytics and Software Defect Severity: A Systematic Review and Future Directions
Software testing identifies defects in software products whose effects multiply to varying degrees depending on their severity levels and how promptly they are rectified, which is why the topic attracts sustained research in the software engineering domain. In this paper, a systematic literature review (SLR) of machine learning-based software defect severity prediction over the last decade was conducted. The SLR was aimed at detecting germane areas central to efficient predictive analytics that are seldom captured in existing software defect severity prediction reviews. These areas include the analysis of techniques or approaches that significantly influence the threats to validity of proposed models, and the bias-variance tradeoff considerations in data science-based approaches. A population, intervention, and outcome model was adopted to sharpen the search terms during the literature selection process, and subsequent quality assurance scrutiny yielded fifty-two primary studies. A thorough systematic review of the final selected studies answered eleven main research questions and uncovered approaches that speak to the aforementioned areas of interest. The results indicate that while the machine learning approach is ubiquitous for predicting software defect severity, germane techniques central to better predictive analytics are infrequent in the literature. The study concludes by summarizing prominent trends in a mind map to stimulate future research in the software engineering industry.
A human-centric approach for adopting bug inducing commit detection using machine learning models
When developing new software, testing can take up half of the resources. Although a considerable amount of work has been done to automate software testing, fixing bugs after they reach the source repository is still a costly task from both management and financial perspectives. In recent times, the research community has proposed various methodologies to detect bugs just-in-time at the commit level. Unfortunately, these works, including state-of-the-art techniques, do not provide real-time solutions, which keeps developers from using them in their day-to-day programming tasks. Our study focuses on providing real-time support to developers by warning them about potential bug-inducing commits, helping to keep such commits out of the source repository. Keeping this goal in mind, we conducted a developer survey to understand what developers expect from bug-inducing commit detection tools. Motivated by their responses, we built a GUI-based plug-in that warns developers when they attempt a potentially buggy commit, which we accomplished by training machine learning models on relevant features. We also built a command-line tool for developers who prefer a command-line interface. Our proposed solution is designed to work with various machine learning models (e.g., random forest, decision tree, and logistic regression) and IDEs (e.g., Visual Studio, PyCharm, and WebStorm), letting developers work with a familiar interface without leaving the IDE. As a proof of concept, we implemented a VSCode plug-in and an accompanying command-line tool. Developers can customize these tools by choosing among various machine learning models and features; such customizability helps developers understand the toolchain better and lets them fit it to their specific use cases. Our user study shows that the toolchain offers satisfactory performance in detecting bug-inducing commits and provides a sound user experience. Among the tested models, the decision tree achieved the best performance with 79% accuracy and an F1-score of 0.70. In addition, we performed a user study with developers working in the software industry to validate the usability of our toolchain. We found that users can determine whether a commit is bug-inducing within a short period of time, and that they prefer our tool over the state of the art for detecting potential bugs before the commit operation. Alongside contributing a new multi-UI toolchain, our work enriches the research community's knowledge regarding developer usability of real-time bug detection tools.
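For a concrete picture of the kind of model behind such a toolchain, here is a minimal sketch of a just-in-time classifier: a decision tree (the paper's best performer) trained on commit-level features and queried before a commit lands. The feature set and the labeled CSV are illustrative assumptions drawn from the just-in-time defect prediction literature, not the authors' exact pipeline.

```python
# Sketch of a just-in-time bug-inducing-commit classifier, loosely following
# the setup described above (decision tree on commit-level features). The
# feature names and training CSV are illustrative, not the authors' pipeline.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

FEATURES = ["lines_added", "lines_deleted", "files_changed", "dev_experience"]

df = pd.read_csv("labeled_commits.csv")  # hypothetical labeled commit history
X_train, X_test, y_train, y_test = train_test_split(
    df[FEATURES], df["bug_inducing"], test_size=0.2, random_state=0
)

model = DecisionTreeClassifier(max_depth=5, random_state=0)
model.fit(X_train, y_train)

def warn_if_risky(commit_features: dict) -> None:
    """Called by an editor plug-in or CLI hook before the commit lands."""
    row = pd.DataFrame([commit_features], columns=FEATURES)
    if model.predict(row)[0] == 1:
        print("warning: this commit looks bug-inducing; consider a review")

warn_if_risky({"lines_added": 240, "lines_deleted": 15,
               "files_changed": 9, "dev_experience": 3})
```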
Toward Data-Driven Discovery of Software Vulnerabilities
Over the years, Software Engineering, as a discipline, has recognized the potential for engineers to make mistakes and has incorporated processes to prevent such mistakes from becoming exploitable vulnerabilities. These processes span the spectrum from using unit/integration/fuzz testing, static/dynamic/hybrid analysis, and (automatic) patching to discover instances of vulnerabilities to leveraging data mining and machine learning to collect metrics that characterize attributes indicative of vulnerabilities. Among these processes, metrics have the potential to uncover systemic problems in the product, process, or people that could lead to vulnerabilities being introduced, rather than identifying specific instances of vulnerabilities. The insights from metrics can be used to support developers and managers in making decisions to improve the product, process, and/or people with the goal of engineering secure software.
Despite empirical evidence of metrics' association with historical software vulnerabilities, their adoption in the software development industry has been limited. The level of granularity at which the metrics are defined, the high false positive rate from models that use the metrics as explanatory variables, and, more importantly, the difficulty in deriving actionable intelligence from the metrics are often cited as factors that inhibit metrics' adoption in practice. Our research vision is to assist software engineers in building secure software by providing a technique that generates scientific, interpretable, and actionable feedback on security as the software evolves. In this dissertation, we present our approach toward achieving this vision through (1) systematization of vulnerability discovery metrics literature, (2) unsupervised generation of metrics-informed security feedback, and (3) continuous developer-in-the-loop improvement of the feedback.
We systematically reviewed the literature to enumerate metrics that have been proposed and/or evaluated to be indicative of vulnerabilities in software and to identify the validation criteria used to assess the decision-informing ability of these metrics. In addition to enumerating the metrics, we implemented a subset of these metrics as containerized microservices. We collected the metric values from six large open-source projects and assessed the metrics' generalizability across projects, application domains, and programming languages. We then used an unsupervised approach from the literature to compute threshold values for each metric and assessed the thresholds' ability to classify risk from historical vulnerabilities. We used the metrics' values, thresholds, and interpretation to provide developers natural language feedback on security as they contributed changes, and used a survey to assess their perception of the feedback. We initiated an open dialogue to gain insight into their expectations of such feedback. In response to developer comments, we assessed the effectiveness of an existing vulnerability discovery approach (static analysis) and that of vulnerability discovery metrics in identifying risk from vulnerability-contributing commits.
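As a rough illustration of unsupervised, metrics-informed risk flagging, the sketch below marks files whose metric values sit above a high quantile of the project's distribution. This quantile heuristic is a generic stand-in, not the specific threshold technique the dissertation adopts, and the file and column names are assumptions.

```python
# Illustrative unsupervised thresholding over vulnerability discovery metrics:
# flag files whose metric exceeds a high quantile of the project distribution.
# A generic stand-in, not the dissertation's specific technique.
import pandas as pd

metrics = pd.read_csv("file_metrics.csv")  # hypothetical: one row per file

def risky_files(df: pd.DataFrame, metric: str, quantile: float = 0.9):
    """Return files whose metric value lies above the chosen quantile."""
    threshold = df[metric].quantile(quantile)
    return df.loc[df[metric] > threshold, ["path", metric]]

for m in ["churn", "num_developers", "nesting_depth"]:  # assumed columns
    flagged = risky_files(metrics, m)
    print(f"{m}: {len(flagged)} files above the 90th percentile")
```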
Improving software engineering processes using machine learning and data mining techniques
The availability of large amounts of data from software development has created an area of research called mining software repositories. Researchers mine data from software repositories both to improve understanding of software development and evolution, and to empirically validate novel ideas and techniques.
The large amount of data collected from software processes can then be leveraged for machine learning applications. Indeed, machine learning can have a large impact on software engineering, just as it has had in other fields, supporting developers and other actors in the software development process by automating or improving parts of their work. Automation can make some phases of the development process not only less tedious and cheaper, but also more efficient and less error-prone. Moreover, employing machine learning can reduce the complexity of difficult problems, enabling engineers to focus on more interesting problems rather than on the basics of development.
The aim of this dissertation is to show how the development and the use of machine learning and data mining techniques can support several software engineering phases, ranging from crash handling, to code review, to patch uplifting, to software ecosystem management.
To validate our thesis, we conducted several studies tackling different problems in an industrial open-source context, focusing on the case of Mozilla.
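As a concrete example of the raw material that mining software repositories provides to machine learning, the sketch below extracts per-commit churn from a local Git history using plain git log --numstat. It is a generic illustration, not the tooling used in the Mozilla studies.

```python
# Generic repository-mining sketch: extract per-commit churn from
# `git log --numstat`, the kind of data the ML applications above consume.
import subprocess
from collections import defaultdict

log = subprocess.run(
    ["git", "log", "--numstat", "--pretty=format:@%H"],
    capture_output=True, text=True, check=True
).stdout

churn = defaultdict(lambda: {"added": 0, "deleted": 0})
current = None
for line in log.splitlines():
    if line.startswith("@"):                  # commit boundary marker
        current = line[1:]
    elif line.strip():                        # numstat line: added<TAB>deleted<TAB>path
        added, deleted, _path = line.split("\t", 2)
        if added != "-":                      # "-" marks binary files
            churn[current]["added"] += int(added)
            churn[current]["deleted"] += int(deleted)

for sha, c in list(churn.items())[:5]:        # show the five most recent commits
    print(sha[:8], c)
```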
Involving External Stakeholders in Project Courses
Problem: The involvement of external stakeholders in capstone projects and project courses is desirable due to its potential positive effects on the students. Capstone projects particularly profit from the inclusion of an industrial partner, which makes the project relevant and helps students acquire professional skills. In addition, an increasing push towards education that is aligned with industry and incorporates industrial partners can be observed. However, the involvement of external stakeholders in teaching moments can create friction and could, in the worst case, lead to frustration among all involved parties. Contribution: We developed a model for analysing the involvement of external stakeholders in university courses both in a retrospective fashion, to gain insights from past course instances, and in a constructive fashion, to plan the involvement of external stakeholders. Key Concepts: The conceptual model and the accompanying guideline guide teachers in their analysis of stakeholder involvement. The model comprises several activities (define, execute, and evaluate the collaboration), and the guideline provides questions that teachers should answer for each of these activities. In constructive use, the model allows teachers to define an action plan based on an analysis of potential stakeholders and the pedagogical objectives. In retrospective use, the model allows teachers to identify issues that appeared during the project and their underlying causes. Drawing on ideas of the reflective practitioner, the model emphasises reflection on and interpretation of the observations made by the teacher and other groups involved in the courses. Key Lessons: Applying the model retrospectively to a total of eight courses shows that it is possible to reveal hitherto implicit risks and assumptions and to gain a better insight into the interaction...