142 research outputs found

Data-Driven Application Maintenance: Views from the Trenches

Author: Misra Janardan
Podder Sanjay
Rawat Divya
Savagaonkar Milind
Sengupta Shubhashis
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 21/06/2018

In this paper we present our experience during the design, development, and pilot deployments of a data-driven, machine-learning-based application maintenance solution. We implemented a proof of concept to address a spectrum of interrelated problems encountered in application maintenance projects, including duplicate incident ticket identification, assignee recommendation, theme mining, and mapping of incidents to business processes. In the context of IT services, these problems are frequently encountered, yet there is a gap in bringing automation and optimization to them. Despite long-standing research on mining and analysis of software repositories, such research outputs are not well adopted in practice because of the constraints these solutions impose on users. We discuss the need for designing pragmatic solutions with low barriers to adoption and for addressing the right level of problem complexity with respect to the underlying business constraints and the nature of the data.

Comment: Earlier version of paper appearing in proceedings of the 4th International Workshop on Software Engineering Research and Industrial Practice (SER&IP), IEEE Press, pp. 48-54, 201

arXiv.org e-Print Archive

Crossref

In Pursuit of Optimal Workflow Within The Apache Software Foundation

Publication date: 01/01/2017

The following is a case study composed of three workflow investigations at the Apache Software Foundation (Apache), an organization based on open source software development (OSSD). I start with an examination of the workload inequality within Apache, particularly with regard to requirements writing. I established that the stronger a participant's experience indicators are, the more likely they are to propose a requirement that is not a defect and the more likely the requirement is to be implemented eventually. Requirements at Apache are divided into work tickets (tickets). In my second investigation, I report many insights into the distribution patterns of these tickets. The participants who create the tickets often had the best track records for determining who should participate in that ticket. Tickets that were at one point volunteered for (self-assigned) had a lower incidence of neglect but in some cases were also associated with severe delay. When a participant claims a ticket but postpones the work involved, the ticket exists without a solution for five to ten times as long, depending on the circumstances. I make recommendations that may reduce the incidence of tickets that are claimed but not implemented in a timely manner. After giving an in-depth explanation of how I obtained this data set through web crawlers, I describe the pattern mining platform I developed to make my data mining efforts highly scalable and repeatable. Lastly, I used process mining techniques to show that workflow patterns vary greatly within teams at Apache. I investigated a variety of process choices and how they might be influencing the outcomes of OSSD projects. I report a moderately negative association between how often a team updates the specifics of a requirement and how often requirements are completed.
I also verified that the prevalence of volunteerism indicators is positively associated with work completion; surprisingly, this correlation is stronger if I exclude the very large projects. I suggest that the largest projects at Apache may benefit from some level of traditional delegation in addition to the volunteerism that OSSD is normally associated with. Dissertation/Thesis: Doctoral Dissertation, Industrial Engineering 201

ASU Digital Repository

Improved method of searching the associative rules while developing the software

Author: Amirgaliyev Yedilkhan
Pryimak Nataliia V.
Savchuk Tamara O.
Slyusarenko Nina V.
Smailova Saule
Smolarz Andrzej
Publication venue: Electronics and Telecommunications Committee
Publication date: 01/01/2020

Because delivering good-quality software on time is a very important part of the software development process, organizing this process accurately is an important task. For this purpose, a new method of searching for associative rules was proposed. It is based on classifying all tasks into three groups according to their difficulty and then searching for associative rules among them, which helps to determine the time a specific developer needs to perform a specific task.
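
The abstract does not spell out the authors' algorithm, so the following is only a generic sketch of how a single rule of this kind could be scored with the standard support and confidence measures. The record fields (`developer`, `difficulty`, `duration`), their values, and the helper name `rule_stats` are all hypothetical and chosen purely for illustration.

```python
from typing import Dict, List, Tuple

# A task record is a flat mapping of attribute name to value.
# The attribute names used in the example below are assumptions.
Task = Dict[str, str]


def matches(task: Task, pattern: Task) -> bool:
    """True if the task has every attribute/value pair in the pattern."""
    return all(task.get(key) == value for key, value in pattern.items())


def rule_stats(tasks: List[Task], antecedent: Task, consequent: Task) -> Tuple[float, float]:
    """Return (support, confidence) of the rule antecedent -> consequent.

    support    = share of all tasks matching both sides of the rule
    confidence = share of antecedent-matching tasks that also match the consequent
    """
    with_antecedent = [t for t in tasks if matches(t, antecedent)]
    with_both = [t for t in with_antecedent if matches(t, consequent)]
    support = len(with_both) / len(tasks) if tasks else 0.0
    confidence = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    return support, confidence


# Hypothetical task history, already labelled with a difficulty group.
history = [
    {"developer": "A", "difficulty": "hard", "duration": "long"},
    {"developer": "A", "difficulty": "hard", "duration": "long"},
    {"developer": "A", "difficulty": "hard", "duration": "short"},
    {"developer": "B", "difficulty": "easy", "duration": "short"},
]

support, confidence = rule_stats(
    history,
    antecedent={"developer": "A", "difficulty": "hard"},
    consequent={"duration": "long"},
)
```

A rule with high confidence for a given developer and difficulty group could then serve as a rough indicator of how long a similar task is likely to take; how the published method selects and uses rules is not stated in the abstract.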

Biblioteka Nauki - repozytorium artykułów

International Journal of Electronics and Telecommunications (Warsaw University of Technology)

Recommending Issue Reports to Developers Using Machine Learning

Author: Cherinet Abel Mesfin
Publication date: 01/01/2019

The development of a software system is often done through an iterative process, and different change requests arise when bugs and defects are detected or new features need to be added. These requirements are recorded as issue reports and put in the backlog of the software project for developers to work on. The assignment of these issue reports to developers is done in different ways. One common approach is self-assignment, where the developers themselves pick the issue reports they are interested in and assign those reports to themselves.
Practising self-assignment in large projects can be challenging for developers because the backlog of a large project becomes loaded with many issue reports, which makes it hard for developers to filter out the issue reports in line with their interests. To tackle this problem, a machine learning-based recommender system is proposed in this thesis. This recommender system can learn from the history of the issue reports that each developer worked on previously and recommend new issue reports suited to each developer. To implement this recommender system, issue reports were collected from 6 different open-source projects, and different machine learning techniques were applied and compared in order to determine the most suitable one. To evaluate the performance of the recommender system, the Precision, Recall, F1-score and Mean Average Precision metrics were used. The results show that, from a backlog of 100 issue reports, recommending the top 10 issue reports to each developer achieves a recall ranging from 52.9% to 96%, which is 6 to 9.5 times better than picking 10 issue reports randomly. Comparable improvements were also achieved in the other metrics.
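
The four metrics named above can be computed per developer from a ranked recommendation list. The sketch below uses common textbook definitions and is not taken from the thesis; the function names, the issue identifiers, and the choice to divide precision by `k` are assumptions made for illustration.

```python
from typing import Sequence, Set, Tuple


def precision_recall_f1_at_k(
    ranked: Sequence[str], relevant: Set[str], k: int = 10
) -> Tuple[float, float, float]:
    """Precision@k, recall@k and F1@k for one developer's ranked issue list."""
    hits = sum(1 for issue in ranked[:k] if issue in relevant)
    precision = hits / k if k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Average of precision values at each rank where a relevant issue appears."""
    hits = 0
    total = 0.0
    for rank, issue in enumerate(ranked, start=1):
        if issue in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs: Sequence[Tuple[Sequence[str], Set[str]]]) -> float:
    """Mean of average precision over all developers (one run per developer)."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs) if runs else 0.0


# Hypothetical example: the recommender's ranking versus the issues
# one developer actually went on to work on.
ranking = ["ISSUE-1", "ISSUE-2", "ISSUE-3", "ISSUE-4"]
worked_on = {"ISSUE-1", "ISSUE-3", "ISSUE-9"}

p, r, f1 = precision_recall_f1_at_k(ranking, worked_on, k=2)
ap = average_precision(ranking, worked_on)
```

With `k=10` and a backlog of 100 reports, `recall` here corresponds to the quantity the abstract reports as ranging from 52.9% to 96%, although the exact evaluation protocol used in the thesis may differ.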

DSpace at Tartu University Library

Profiling Developers Through the Lens of Technical Debt

Author: Codabux Zadia
Cunningham Ward
Li Xiaozhou
Publication date: 08/09/2020

Context: Technical Debt needs to be managed to avoid disastrous consequences, and investigating developers' habits concerning technical debt management provides invaluable information for software development. Objective: This study aims to characterize how developers manage technical debt based on the code smells they induce and the refactorings they apply. Method: We mined a publicly available Technical Debt dataset for Git commit information, code smells, coding violations, and refactoring activities for each developer of a selected project. Results: By combining this information, we profile developers to recognize prolific coders, highlight activities that discriminate among developer roles (reviewer, lead, architect), and estimate coding maturity and technical debt tolerance.

arXiv.org e-Print Archive

Crossref

Improving Bug Triaging Using Software Analytics

Author: An Le
Publication date: 01/08/2015

Bug fixing has become a major activity in software development and maintenance. In this process, bug triaging plays an important role. It assists software managers in the allocation of their limited resources and allows developers to focus their efforts more efficiently on solving defects with high severity. Current bug triaging techniques applied in many software organisations may lead to misclassification of bugs and thus to delays in bug resolution, resulting in degradation of software quality and user frustration. An improved bug triaging strategy would help software managers make better decisions by assigning the right priority and severity to bugs, allowing developers to address critical bugs as soon as possible and ignore the trivial ones.
In this thesis, we leverage analytic approaches to conduct three empirical studies aimed at improving bug triaging techniques. The first study investigates the relation between bug fixes that need supplementary fixes and bugs that have been re-opened. We found that re-opened bugs account for 21.6% to 33.8% of all supplementary bug fixes. A considerable number of re-opened bugs (from 33.0% to 57.5%) had only one associated commit: their original bug reports were prematurely closed. The second study focuses on bugs that yield frequent crashes and impact large numbers of users. We found that these bugs were not prioritised by software managers, although they can seriously decrease user-perceived quality and even the reputation of a software organisation. Our third study examines commits that lead to crashes. We found that these commits are often submitted by less experienced developers and that they contain more added and deleted lines of code than other commits. If software organisations can detect the aforementioned problems early in the bug triaging phase, they can effectively increase their development productivity and users' satisfaction while decreasing software maintenance overhead. By using multiple regression and machine learning algorithms, we built statistical models to predict re-opened bugs among bugs that required supplementary bug fixes (with a precision up to 97.0% and a recall up to 65.3%), bugs with high crashing impact (with a precision up to 64.2% and a recall up to 98.3%), and commits inducing future crashes (with a precision up to 61.4% and a recall up to 95.0%). Software organisations can apply our proposed models to improve their bug triaging strategy by assigning bugs to the right developers, avoiding misclassification of bugs, reducing the negative impact of crash-related bugs, and addressing fault-prone code early, before it impacts a large user base.

PolyPublie

Mining software repositories for automatic software bug management from bug triaging to patch backporting

Author: TIAN Yuan
Publication venue: Singapore Management University
Publication date: 01/05/2017

Institutional Knowledge at Singapore Management University