1,167 research outputs found

    An Assessment of Eclipse Bugs' Priority and Severity Prediction Using Machine Learning

    The reliability and quality of software programs remain important and challenging aspects of software design. Software developers and system operators spend a great deal of time assessing and overcoming expected and unexpected errors that might negatively affect the users’ experience. One of the major concerns in software development is bug reports, which contain the severity and priority of these defects. For a long time, this task was performed manually by system operators, at great cost in effort and time. Therefore, in this paper, we present a novel automatic assessment tool using machine learning algorithms for assessing bug reports based on several features, such as hardware, product, assignee, OS, component, target milestone, votes, and versions. The aim is to build a tool that automatically classifies software bugs according to their severity and priority and makes predictions based on the most representative features and the bug report text. To perform this task, we used Multinomial Naive Bayes, Random Forest, Bagging, AdaBoost, SVC, KNN, and Linear SVM classifiers, together with Natural Language Processing techniques, to analyze the Eclipse dataset. The approach shows promising results for software bug detection and prediction.
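    The core of such a classifier can be sketched with a Multinomial Naive Bayes model over tokenised report text. The following is a minimal, self-contained illustration, not the paper's implementation: the toy reports, tokens, and severity labels are invented for the example.

```python
import math
from collections import Counter, defaultdict

def train_mnb(docs, labels, alpha=1.0):
    """Train a Multinomial Naive Bayes model on tokenised bug reports.

    docs: list of token lists; labels: one severity label per doc.
    Returns, per class: (log prior, Laplace-smoothed token log-likelihoods).
    """
    vocab = {t for d in docs for t in d}
    counts = defaultdict(Counter)          # label -> token counts
    class_n = Counter(labels)
    for d, y in zip(docs, labels):
        counts[y].update(d)
    model = {}
    for y in class_n:
        total = sum(counts[y].values())
        log_lik = {t: math.log((counts[y][t] + alpha) /
                               (total + alpha * len(vocab))) for t in vocab}
        model[y] = (math.log(class_n[y] / len(docs)), log_lik)
    return model

def predict_mnb(model, doc):
    """Return the severity label with the highest posterior score."""
    def score(y):
        prior, log_lik = model[y]
        # unseen tokens contribute nothing rather than crashing the lookup
        return prior + sum(log_lik.get(t, 0.0) for t in doc)
    return max(model, key=score)

# Hypothetical toy reports; real features would also include product,
# component, OS, and the other report fields listed above.
docs = [["crash", "on", "startup"], ["ui", "typo", "label"],
        ["crash", "data", "loss"], ["minor", "ui", "misalignment"]]
labels = ["critical", "trivial", "critical", "trivial"]
model = train_mnb(docs, labels)
print(predict_mnb(model, ["crash", "startup"]))   # → critical
```

    In practice each categorical report field would be encoded as extra tokens or one-hot features, and the same training data would feed the other classifiers for comparison.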

    Ground truth deficiencies in software engineering: when codifying the past can be counterproductive

    Many software engineering tools build and evaluate their models based on historical data to support development and process decisions. These models help us answer numerous interesting questions, but have their own caveats. In a real-life setting, the objective function of human decision-makers for a given task might be influenced by a whole host of factors that stem from their cognitive biases, subverting the ideal objective function required for an optimally functioning system. Relying on this data as ground truth may give rise to systems that end up automating software engineering decisions by mimicking past sub-optimal behaviour. We illustrate this phenomenon and suggest mitigation strategies to raise awareness

    Method-Level Bug Severity Prediction using Source Code Metrics and LLMs

    In the past couple of decades, significant research effort has been devoted to the prediction of software bugs. However, most existing work in this domain treats all bugs the same, which is not the case in practice. It is important for a defect prediction method to estimate the severity of the identified bugs so that the higher-severity ones get immediate attention. In this study, we investigate source code metrics, source code representation using large language models (LLMs), and their combination in predicting bug severity labels on two prominent datasets. We leverage several source code metrics at method-level granularity to train eight different machine-learning models. Our results suggest that Decision Tree and Random Forest models outperform the others on several evaluation metrics. We then use the pre-trained CodeBERT LLM to study the effectiveness of source code representations in predicting bug severity. Fine-tuning CodeBERT improves the bug severity prediction results significantly, by 29%-140% on several evaluation metrics, compared to the best classic prediction model trained on source code metrics. Finally, we integrate source code metrics into CodeBERT as an additional input, using our two proposed architectures, both of which further enhance the model's effectiveness.
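    The metric-based side of such a study can be illustrated with a one-level decision stump, a minimal stand-in for the paper's Decision Tree models. The metric choice (cyclomatic complexity, lines of code) and the toy data below are assumptions for illustration, not drawn from the paper's datasets.

```python
# Hypothetical method-level rows: (cyclomatic_complexity, lines_of_code),
# each paired with a bug severity label for that method.

def best_stump(rows, labels):
    """Exhaustively pick the (feature, threshold, side-labels) split
    that maximises training accuracy -- a depth-1 decision tree."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            for lo, hi in (("low", "high"), ("high", "low")):
                pred = [lo if r[f] <= t else hi for r in rows]
                acc = sum(p == y for p, y in zip(pred, labels)) / n
                if best is None or acc > best[0]:
                    best = (acc, f, t, lo, hi)
    return best

rows = [(2, 15), (3, 40), (11, 120), (14, 260), (4, 30), (18, 300)]
labels = ["low", "low", "high", "high", "low", "high"]
acc, feat, thresh, lo, hi = best_stump(rows, labels)

def predict(r):
    """Classify a new method's metrics with the learned split."""
    return lo if r[feat] <= thresh else hi
```

    A real Decision Tree recurses on both sides of each split; the stump shows only the split-selection step that the tree repeats.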

    A Comparison of Machine Learning Models to Prioritise Emails using Emotion Analysis for Customer Service Excellence

    There has been little research on machine learning for email prioritisation for customer service excellence. To fill this gap, we propose and assess the efficacy of various machine learning techniques for classifying emails into three degrees of priority, high, low, and neutral, based on the emotions inherent in the email content. It is expected that once emails are classified into those three categories, recipients will be able to respond to emails more efficiently and provide better customer service. We use the NRC Emotion Lexicon to construct a labeled email dataset of 517,401 messages. We then train and test four prominent machine learning models, MNB, SVM, LogR, and RF, and an ensemble of MNB, LSVC, and RF classifiers, on the labeled dataset. Our main findings suggest that machine learning may be used to classify emails based on their emotional content, although some models outperform others. During the testing phase, we also discovered that the LogR and LSVC models performed the best, with an accuracy of 72%, while the MNB classifier performed the poorest. Furthermore, classification performance differed depending on whether the dataset was balanced or imbalanced. We conclude that machine learning models that employ emotions for email classification are a promising avenue that should be explored further.
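    The lexicon-based labelling step can be sketched as follows. The real NRC Emotion Lexicon maps thousands of words to eight emotions; the tiny stand-in lexicon and the emotion-to-priority rule below are illustrative assumptions, not the paper's actual mapping.

```python
from collections import Counter

# Illustrative stand-in for the NRC Emotion Lexicon (word -> emotion).
LEXICON = {
    "furious": "anger", "refund": "anger", "broken": "anger",
    "worried": "fear", "urgent": "fear",
    "thanks": "joy", "great": "joy",
}
# Assumed rule: negative emotions raise priority, positive ones lower it.
PRIORITY = {"anger": "high", "fear": "high", "joy": "low"}

def label_email(text):
    """Assign high / low / neutral priority by majority emotion vote;
    ties and emails with no lexicon hits fall back to neutral."""
    words = (w.strip(".,!?") for w in text.lower().split())
    votes = Counter(PRIORITY[LEXICON[w]] for w in words if w in LEXICON)
    if votes["high"] > votes["low"]:
        return "high"
    if votes["low"] > votes["high"]:
        return "low"
    return "neutral"

print(label_email("This is broken and I am furious!"))  # → high
```

    Labels produced this way become the training targets for the supervised classifiers compared in the study.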

    Analysis of Human Affect and Bug Patterns to Improve Software Quality and Security

    The impact of software is ever increasing as more and more systems become software operated. Despite the usefulness of software, in many instances software failures have caused tremendous losses in lives and dollars. Software failures take place because of bugs (i.e., faults) in software systems. These bugs cause programs to malfunction or crash and expose security vulnerabilities exploitable by malicious hackers. Studies confirm that software defects and vulnerabilities appear in source code largely due to developers’ mistakes and errors. Human performance is affected by the underlying development process and by human affects, such as sentiment and emotion. This thesis examines these human affects of software developers, which have drawn recent interest in the community. For capturing developers’ sentimental and emotional states, we have developed several software tools (i.e., SentiStrength-SE, DEVA, and MarValous), novel tools facilitating automatic detection of sentiments and emotions from software engineering textual artifacts. Using such automated tools, developers’ sentimental variations are studied with respect to the underlying development tasks (e.g., bug-fixing, bug-introducing), development periods (i.e., days and times), team sizes, and project sizes. We expose opportunities for exploiting developers’ sentiments for higher productivity and improved software quality. While developers’ sentiments and emotions can be leveraged as a proactive and active safeguard in identifying and minimizing software bugs, this dissertation also includes in-depth studies of the relationships among various bug patterns, such as software defects, security vulnerabilities, and code smells, to find actionable insights for minimizing software bugs and improving software quality and security. Bug patterns are exposed through mining software repositories and bug databases.
    These bug patterns are crucial in localizing bugs and security vulnerabilities in a software codebase for fixing them, predicting the portions of software susceptible to failure or exploitation by hackers, devising techniques for automated program repair, and avoiding bug-prone code constructs and coding idioms. The software tools produced from this thesis are empirically evaluated using standard measurement metrics (e.g., precision, recall). The findings of all the studies are validated with appropriate tests for statistical significance. Finally, based on our experience and an in-depth analysis of the present state of the art, we expose avenues for further research and development towards a holistic approach for developing improved and secure software systems.

    Modeling Crowd Feedback in the Mobile App Market

    Mobile application (app) stores, such as Google Play and the Apple App Store, have recently emerged as a new model of online distribution platform. These stores have expanded in size in the past five years to host millions of apps, offering end-users of mobile software virtually unlimited options to choose from. In such a competitive market, no app is too big to fail. In fact, recent evidence has shown that most apps lose their users within the first 90 days after initial release. Therefore, app developers have to remain up-to-date with their end-users’ needs in order to survive. Staying close to the user not only minimizes the risk of failure, but also serves as a key factor in achieving market competitiveness as well as managing and sustaining innovation. However, establishing effective communication channels with app users can be a very challenging and demanding process. Specifically, users’ needs are often tacit, embedded in the complex interplay between the user, system, and market components of the mobile app ecosystem. Furthermore, such needs are scattered over multiple channels of feedback, such as app store reviews and social media platforms. To address these challenges, in this dissertation, we incorporate methods of requirements modeling, data mining, domain engineering, and market analysis to develop a novel set of algorithms and tools for automatically classifying, synthesizing, and modeling the crowd’s feedback in the mobile app market. Our analysis includes a set of empirical investigations and case studies, utilizing multiple large-scale datasets of mobile user data, in order to devise, calibrate, and validate our algorithms and tools. The main objective is to introduce a new form of crowd-driven software models that can be used by app developers to effectively identify and prioritize their end-users’ concerns, develop apps to meet these concerns, and uncover optimized pathways of survival in the mobile app ecosystem.

    Towards Developing and Analysing Metric-Based Software Defect Severity Prediction Model

    In a critical software system, testers have to spend an enormous amount of time and effort maintaining the software due to the continuous occurrence of defects. Among such defects, some severe ones may adversely affect the software. To reduce the tester's time and effort, many machine learning models have been proposed in the literature that use documented defect reports to automatically predict the severity of defective software modules. In contrast to these traditional approaches, in this work we propose a metric-based software defect severity prediction (SDSP) model that uses a self-training semi-supervised learning approach to classify the severity of defective software modules. The approach is trained on a mixture of unlabelled and labelled defect severity data. The self-training uses a decision tree classifier to assign pseudo-class labels to the unlabelled instances. The predictions are promising, since the self-training successfully assigns suitable class labels to the unlabelled instances. On the other hand, while numerous research studies have covered both the proposal of prediction approaches and the methodological aspects of defect severity prediction models, the gap in estimating project attributes from the prediction model remains unresolved. To bridge this gap, we propose five project-specific measures, the Risk-Factor (RF), the Percent of Saved Budget (PSB), the Loss in the Saved Budget (LSB), the Remaining Service Time (RST), and the Gratuitous Service Time (GST), to capture project outcomes from the predictions. Like the traditional measures, these are calculated from the observed confusion matrix and are used to analyse the impact that the prediction model has on the software project.
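    The self-training loop itself can be sketched in a few lines. Below, a nearest-centroid classifier on a single invented "metric" value stands in for the paper's decision-tree base learner, and the confidence rule (pseudo-label only instances clearly nearer one class centroid) is an assumption; only the overall pseudo-labelling scheme follows the abstract.

```python
def centroids(xs, ys):
    """Mean metric value per severity class."""
    out = {}
    for y in set(ys):
        vals = [x for x, yy in zip(xs, ys) if yy == y]
        out[y] = sum(vals) / len(vals)
    return out

def self_train(labelled, unlabelled, margin=2.0):
    """Grow the labelled set with confident pseudo-labels until no
    unlabelled instance clears the confidence margin."""
    xs = [x for x, _ in labelled]
    ys = [y for _, y in labelled]
    pool = list(unlabelled)
    while pool:
        cent = centroids(xs, ys)
        confident = []
        for x in pool:
            # distance to each class centroid, nearest first
            dists = sorted((abs(x - c), y) for y, c in cent.items())
            if dists[1][0] - dists[0][0] >= margin:
                confident.append((x, dists[0][1]))
        if not confident:
            break  # remaining instances are too ambiguous to pseudo-label
        for x, y in confident:
            xs.append(x)
            ys.append(y)
            pool.remove(x)
    return centroids(xs, ys), pool

# Invented data: small metric values are "minor", large ones "severe".
labelled = [(1.0, "minor"), (2.0, "minor"), (9.0, "severe"), (10.0, "severe")]
cent, leftover = self_train(labelled, [1.5, 9.5, 5.5])
```

    Here 1.5 and 9.5 are confidently pseudo-labelled, while the ambiguous 5.5 is left unlabelled, which mirrors why self-training assigns labels only where the base classifier is sure.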