63 research outputs found

    A Survey on Bug Triage Using Data Reduction Technique

    Most software companies have to deal with software bugs every day, and they spend most of their budget doing so. A key step in the bug-fixing process is bug triage, which aims to assign an expert developer to a new bug. To reduce the time and cost of manual work, we apply text classification techniques to conduct automatic bug triage. In the proposed system, we apply data reduction techniques to the bug data set to reduce its scale and improve its quality. We use instance selection and feature selection simultaneously to reduce the scale of the bug dimension and the word dimension and to improve the accuracy of bug triage. In this paper, we investigate the effect of five term selection methods on the accuracy of bug assignment. In addition, we re-balance the load between developers based on their experience.

    A Novel Way of Assessing Software Bug Severity Using Dictionary of Critical Terms

    Due to increasing demand for software and shrinking delivery schedules, assuring software quality is becoming a challenge. No software can claim to be error free, owing to the complexity of software and inadequate testing. A well-known principle of testing states that exhaustive testing is impossible. Hence, maintenance activities are required to ensure smooth functioning of the software. Many open source software projects provide bug tracking systems to aid the corrective maintenance task. These bug tracking systems allow users to report the bugs they encounter while operating the software. In software maintenance, severity prediction has recently gained much attention. Bugs with higher severity should be fixed before bugs with lower severity. A triager analyzes the bug reports and assesses their severity based on his or her knowledge and experience. But because of the large number of bug reports, assigning severity manually becomes a tedious job. Thus, there is a growing need to automate the whole process of severity prediction. The paper presents an approach for creating a dictionary of critical terms that indicate severity using two different feature selection methods, namely information gain and chi-square; bug reports are then classified using the Naïve Bayes Multinomial (NBM) and K-nearest neighbor (KNN) algorithms.
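    As a rough illustration of how information gain can rank candidate terms for such a dictionary, the sketch below scores each word by how much knowing its presence reduces uncertainty about the severity label. The toy reports, the labels, and the helper names (`entropy`, `information_gain`, `rank_terms`) are assumptions made for this example; they are not taken from the paper, which may compute and threshold the scores differently.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(docs, labels, term):
    """Reduction in label entropy from knowing whether `term` occurs in a report."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    total = len(labels)
    conditional = (
        (len(with_term) / total) * entropy(with_term)
        + (len(without_term) / total) * entropy(without_term)
    )
    return entropy(labels) - conditional


def rank_terms(docs, labels):
    """Return the vocabulary sorted by descending information gain (ties by name)."""
    vocab = set().union(*docs)
    return sorted(vocab, key=lambda t: (-information_gain(docs, labels, t), t))


# Hypothetical toy bug reports, already tokenised into sets of words.
reports = [
    {"crash", "startup"},
    {"crash", "data", "loss"},
    {"typo", "dialog"},
    {"typo", "tooltip"},
]
severity = ["severe", "severe", "minor", "minor"]

for term in rank_terms(reports, severity)[:3]:
    print(term, round(information_gain(reports, severity, term), 3))
```

    In this toy data, "crash" and "typo" separate the two classes perfectly and so rank first; in practice the top-ranked terms would be kept as dictionary entries and the rest discarded.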

    Automatic classification of software related microblogs

    Millions of people, including those in the software engineering communities, have turned to microblogging services, such as Twitter, as a means to quickly disseminate information. A number of past studies by Treude et al., Storey, and Yuan et al. have shown that a wealth of interesting information is stored in these microblogs. However, microblogs also contain a large amount of noisy content that is less relevant to software developers engineering software systems. In this work, we perform a preliminary study to investigate the feasibility of automatically classifying microblogs into two categories: relevant and irrelevant to engineering software systems. We extract features from the textual content of the microblogs and the titles of any URLs mentioned in the microblogs. These features are then used to learn a discriminative model for classifying relevant and irrelevant microblogs. We show that our trained model can achieve promising classification performance.

    Information gain based dimensionality selection for classifying text documents

    Selecting the optimal dimensions for various knowledge extraction applications is an essential component of data mining. Dimensionality selection techniques are used in classification applications to increase classification accuracy and reduce computational complexity. In text classification, where the dimensionality of the dataset is extremely high, dimensionality selection is even more important. This paper presents a novel genetic-algorithm-based methodology for dimensionality selection in text mining applications that uses information gain. The presented methodology uses the information gain of each dimension to change the mutation probability of chromosomes dynamically. Since the information gain is calculated a priori, the computational complexity is not affected. The presented method was tested on a specific text classification problem and compared with conventional genetic-algorithm-based dimensionality selection. The results show an improvement of 3% in the true positives and 1.6% in the true negatives over conventional dimensionality selection methods.

    Evaluation of Stratified K-Fold Cross Validation for Predicting Bug Severity in Game Review Classification

    Steam review data, whether positive or negative, provides a lot of information for the game development team. Both kinds of review matter, since even positive reviews carry crucial information: 7% of positive reviews contain bug reports. These bug reports are captured after the game is released, and many reports of common problems still exist. If players find an issue in the game, they can report it directly through the review feature provided by the online game platform. However, it takes a long time for the development team to manually analyze and classify the reviews. To address this problem, this study proposes an approach that automatically classifies Steam reviews by bug severity level. For this experiment, we analyzed reviews of two popular game titles, namely FIFA 23 and Apex Legends. We implemented three different classifiers, namely KNN, Decision Tree, and Naïve Bayes, and trained them on the dataset to classify the bug severity level. Because the dataset is imbalanced, we performed stratified k-fold cross-validation to reduce bias. Model performance was evaluated using accuracy, precision, recall, and F1 score. The experiment showed that game reviews of different titles achieved different accuracy scores. Review classification for FIFA 23 performed better than review classification for Apex Legends: the mean accuracy score was 72% for FIFA 23 with Decision Tree and 64% for Apex Legends with KNN.
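    A minimal sketch of the stratification idea named in the title follows: indices are grouped by class and dealt round-robin into k folds, so each test fold keeps roughly the same class proportions as the full dataset. The function name `stratified_k_fold`, the labels, and the class sizes are assumptions for illustration only; the study's actual tooling and fold count are not stated in the abstract, and a real run would normally shuffle within each class first.

```python
from collections import Counter, defaultdict


def stratified_k_fold(labels, k):
    """Yield (train_idx, test_idx) pairs whose test folds preserve class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    # Deal each class's indices round-robin so every fold gets a share of every class.
    for indices in by_class.values():
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(j for f, fold in enumerate(folds) if f != i for j in fold)
        yield train_idx, test_idx


# Hypothetical imbalanced dataset: 3 reviews with a bug report, 9 without.
labels = ["bug"] * 3 + ["no_bug"] * 9
splits = list(stratified_k_fold(labels, 3))

for train_idx, test_idx in splits:
    print(Counter(labels[i] for i in test_idx))
```

    With a plain (unstratified) split of the same data, a test fold could easily end up with no "bug" examples at all, which is the bias the stratified variant is meant to avoid.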