11 research outputs found

    Automatic, high accuracy prediction of reopened bugs

    Towards the Use of the Readily Available Tests from the Release Pipeline as Performance Tests. Are We There Yet?

    Performance is one of the important aspects of software quality. In fact, performance issues exist widely in software systems, and fixing them is an essential step in the release cycle. Although performance testing is widely adopted in practice, it is still expensive and time-consuming; in particular, it is usually conducted in a dedicated testing environment after the system is built. These challenges make performance testing difficult to fit into the common DevOps process of software development. On the other hand, a large number of readily available tests are executed regularly within the release pipeline during software development. In this paper, we perform an exploratory study to determine whether such readily available tests are capable of serving as performance tests. In particular, we would like to see whether the performance of these tests can demonstrate the performance improvements obtained from fixing real-life performance issues. We collect 127 performance issues from Hadoop and Cassandra and evaluate the performance of the readily available tests on the commits before and after each performance issue fix. We find that most of the improvements from the fixes to performance issues can be demonstrated using the readily available tests in the release pipeline; however, only a very small portion of the tests can be used to demonstrate the improvements. By manually examining the tests, we identify eight reasons why a test cannot demonstrate a performance improvement even though it covers the changed source code of the issue fix. Finally, we build random classifiers to determine the important metrics influencing whether a readily available test can demonstrate the performance improvement from an issue fix. We find that the test code itself and the source code covered by the test are important factors, while the factors related to the code changes in the performance issue fixes have low importance. Practitioners should therefore focus on designing and improving the tests, rather than fine-tuning tests for individual performance issue fixes. Our findings can serve as a guideline for practitioners to reduce the effort spent on leveraging and designing tests that run in the release pipeline for performance assurance activities.
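    As a rough illustration of the before/after comparison the study describes, the following is a minimal sketch, assuming a hypothetical repository path, commit ids, and test command rather than the authors' actual subjects or tooling. It times one readily available test on the commits before and after a performance fix and checks whether the runtimes differ significantly.

```python
# Sketch: compare a readily available test's runtime before and after a
# performance-fix commit.  REPO, TEST_CMD, and the commit ids are
# hypothetical placeholders, not the study's actual setup.
import subprocess
import time
from statistics import median

from scipy.stats import mannwhitneyu  # non-parametric test for runtime samples

REPO = "/path/to/project"                           # hypothetical checkout
TEST_CMD = ["mvn", "-q", "test", "-Dtest=FooTest"]  # hypothetical test selector
RUNS = 30                                           # repeated runs to reduce noise


def run_test_at(commit: str) -> list[float]:
    """Check out `commit` and measure the test's wall-clock time RUNS times."""
    subprocess.run(["git", "-C", REPO, "checkout", "-q", commit], check=True)
    timings = []
    for _ in range(RUNS):
        start = time.perf_counter()
        subprocess.run(TEST_CMD, cwd=REPO, check=True, capture_output=True)
        timings.append(time.perf_counter() - start)
    return timings


before = run_test_at("abc123")  # commit just before the performance fix
after = run_test_at("def456")   # commit containing the fix

# The test "demonstrates" the improvement if post-fix runtimes are
# significantly lower than pre-fix runtimes.
stat, p = mannwhitneyu(after, before, alternative="less")
print(f"median before={median(before):.3f}s after={median(after):.3f}s p={p:.4f}")
print("improvement visible" if p < 0.05 else "no significant difference")
```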

    Bug characteristics in blockchain systems: A large-scale empirical study


    Automating intention mining

    Dataset available at https://github.com/tkdsheep/Intention-Mining-TSE

    Evaluating the Effectiveness of Code2Vec for Bug Prediction When Considering That Not All Bugs Are the Same

    Bug prediction is an area of research focused on predicting where in a software project future bugs will occur. The purpose of bug prediction models is to help companies spend their quality assurance resources more efficiently by prioritizing the testing of the most defect-prone entities. Most bug prediction models are only concerned with predicting whether an entity has a bug, or how many bugs an entity will have, which implies that all bugs have the same importance. In reality, bugs can have vastly different origins, impacts, priorities, and costs; therefore, bug prediction models could potentially be improved if they were able to give an indication of which bugs to prioritize based on an organization's needs. This paper evaluates a possible method for predicting bug attributes related to cost by analyzing over 33,000 bugs from 11 different projects. If bug attributes related to cost can be predicted, then bug prediction models can use the approach to improve the granularity of their results. The cost metrics in this study are bug priority, the experience of the developer who fixed the bug, and the size of the bug fix. First, it is shown that bugs differ along each cost metric, and that prioritizing buggy entities along each of these metrics produces very different results. We then evaluate two methods of predicting cost metrics: traditional deep learning models and semantic learning models. The analysis found evidence that traditional independent variables show potential as predictors of cost metrics. The semantic learning model was not as successful, but may show more effectiveness in future iterations.
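    To make the idea of a cost-metric predictor concrete, here is a minimal sketch that treats one cost metric (bug priority) as a classification target over simple "traditional" features. The feature names and toy records are illustrative assumptions, and the random-forest baseline stands in for the paper's own deep learning and semantic (Code2Vec) models.

```python
# Sketch: predict a cost metric (bug priority) from traditional per-bug
# features.  The features and data below are made up for illustration and
# are not the paper's dataset or models.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Hypothetical per-bug records: size of the fix, files touched, the fixer's
# prior commit count, and the priority label we want to predict.
bugs = pd.DataFrame({
    "loc_changed":   [12, 340, 5, 88, 23, 410, 7, 150],
    "files_touched": [1, 9, 1, 4, 2, 11, 1, 6],
    "fixer_commits": [250, 12, 800, 40, 500, 5, 900, 60],
    "priority":      ["minor", "major", "minor", "major",
                      "minor", "major", "minor", "major"],
})

X = bugs[["loc_changed", "files_touched", "fixer_commits"]]
y = bugs["priority"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
```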

    Analyzing bug report and technical dependency management in open-source software projects

    Modern software development relies on open-source software to facilitate reuse and reduce redundant work. Software developers use open-source packages in their projects without having insight into how these components are developed and maintained. The aim of this thesis is to develop approaches for analyzing issue and dependency management in software projects. Software projects organize their work with issue trackers, tools for tracking issues such as development tasks, bug reports, and feature requests. By analyzing issue handling in more than 4,000 open-source projects, we found that many issues are left open for long periods of time, which can result in bugs and vulnerabilities not being fixed in a timely manner. This thesis proposes a method for predicting the amount of time it takes to resolve an issue, using the historical data available in issue trackers. Methods for predicting issue lifetime can help software project managers prioritize issues and allocate resources accordingly. Another problem studied in this thesis is how software dependencies are used. Software developers often include third-party open-source packages in their project code as dependencies, and these dependencies can in turn have their own dependencies, so a complex network of dependency relationships exists among open-source software packages. This thesis analyzes the structure and evolution of the dependency networks of three popular programming languages and proposes an approach to measure their growth and evolution. It demonstrates that dependency network analysis can quantify the likelihood of acquiring vulnerabilities through software packages and how that likelihood changes over time. The approaches and findings developed here can help bring transparency into open-source projects with respect to how issues are handled and how dependencies are updated.
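    As a rough illustration of the dependency-network analysis described above, the following minimal sketch builds a tiny directed dependency graph and computes which packages are exposed to a known-vulnerable package through transitive dependencies. The package names and edges are invented for illustration; the thesis works on real, ecosystem-scale networks.

```python
# Sketch: estimate how exposure to a vulnerable package propagates through a
# dependency network.  Packages and edges are made up for illustration.
import networkx as nx

# Directed edge A -> B means "A depends on B".
deps = nx.DiGraph([
    ("app", "web-framework"),
    ("app", "json-lib"),
    ("web-framework", "http-client"),
    ("http-client", "tls-lib"),
    ("json-lib", "tls-lib"),
])

vulnerable = {"tls-lib"}  # packages with a known vulnerability

# A package is exposed if any of its transitive dependencies is vulnerable.
exposed = {
    pkg
    for pkg in deps.nodes
    if any(dep in vulnerable for dep in nx.descendants(deps, pkg))
}

share = len(exposed) / deps.number_of_nodes()
print(f"exposed packages: {sorted(exposed)} ({share:.0%} of the network)")
```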

    Chaff from the wheat: Characterizing and determining valid bug reports
