Search CORE

11 research outputs found

Evaluating the Effectiveness of Code2Vec for Bug Prediction When Considering That Not All Bugs Are the Same

Author: Baron Kilby
Publication venue: 'University of Waterloo'
Publication date: 14/09/2020
Field of study

Bug prediction is an area of research focused on predicting where in a software project future bugs will occur. The purpose of bug prediction models is to help companies spend their quality assurance resources more efficiently by prioritizing the testing of the most defect prone entities. Most bug prediction models are only concerned with predicting whether an entity has a bug, or how many bugs an entity will have, which implies that all bugs have the same importance. In reality, bugs can have vastly different origins, impacts, priorities, and costs; therefore, bug prediction models could potentially be improved if they were able to give an indication of which bugs to prioritize based on an organization’s needs. This paper evaluates a possible method for predicting bug attributes related to cost by analyzing over 33,000 bugs from 11 different projects. If bug attributes related to cost can be predicted, then bug prediction models can use the approach to improve the granularity of their results. The cost metrics in this study are bug priority, the experience of the developer who fixed the bug, and the size of the bug fix. First, it is shown that bugs differ along each cost metric, and prioritizing buggy entities along each of these metrics will produce very different results. We then evaluate two methods of predicting cost metrics: traditional deep learning models, and semantic learning models. The results of the analysis found evidence that traditional independent variables show potential as predictors of cost metrics. The semantic learning model was not as successful, but may show more effectiveness in future iterations

University of Waterloo's Institutional Repository

Improving the Correctness of Automated Program Repair

Author: Yang Jinqiu
Publication venue: 'University of Waterloo'
Publication date: 24/04/2018
Field of study

Developers spend much of their time fixing bugs in software programs. Automated program repair (APR) techniques aim to alleviate the burden of bug fixing from developers by generating patches at the source-code level. Recently, Generate-and-Validate (G&V) APR techniques show great potential to repair general bugs in real-world applications. Recent evaluations show that G&V techniques repair 8–17.7% of the collected bugs from mature Java or C open-source projects. Despite the promising results, G&V techniques may generate many incorrect patches and are not able to repair every single bug. This thesis makes contributions to improve the correctness of APR by improving the quality assurance of the automatically-generated patches and generating more correct patches by leveraging human knowledge. First, this thesis investigates whether improving the test-suite-based validation can precisely identify incorrect patches that are generated by G&V, and whether it can help G&V generate more correct patches. The result of this investigation, Opad, which combines new fuzz-generated test cases and additional oracles (i.e., memory oracles), is proposed to identify incorrect patches and help G&V repair more bugs correctly. The evaluation of Opad shows that the improved test-suite-based validation identifies 75.2% incorrect patches from G&V techniques. With the integration of Opad, SPR, one of the most promising G&V techniques, repairs one additional bug. Second, this thesis proposes novel APR techniques to repair more bugs correctly, by leveraging human knowledge. Thus, APR techniques can repair new types of bugs that are not currently targeted by G&V APR techniques. Human knowledge in bug-fixing activities is noted in the forms such as commits of bug fixes, developers’ expertise, and documentation pages. Two techniques (APARE and Priv) are proposed to target two types of defects respectively: project-specific recurring bugs and vulnerability warnings by static analysis. APARE automatically learns fix patterns from historical bug fixes (i.e., originally crafted by developers), utilizes spectrum-based fault-localization technique to identify highly-likely faulty methods, and applies the learned fix patterns to generate patches for developers to review. The key innovation of APARE is to utilize a percentage semantic-aware matching algorithm between fix patterns and faulty locations. For the 20 recurring bugs, APARE generates 34 method fixes, 24 of which (70.6%) are correct; 83.3% (20 out of 24) are identical to the fixes generated by developers. In addition, APARE complements current repair systems by generating 20 high-quality method fixes that RSRepair and PAR cannot generate. Priv is a multi-stage remediation system specifically designed for static-analysis security-testing (SAST) techniques. The prototype is built and evaluated on a commercial SAST product. The first stage of Priv is to prioritize workloads of fixing vulnerability warnings based on shared fix locations. The likely fix locations are suggested based on a set of rules. The rules are concluded and developed through the collaboration with two security experts. The second stage of Priv provides additional essential information for improving the efficiency of diagnosis and fixing. Priv offers two types of additional information: identifying true database/attribute-related warnings, and providing customized fix suggestions per warning. The evaluation shows that Priv suggests identical fix locations to the ones suggested by developers for 50–100% of the evaluated vulnerability findings. Priv identifies up to 2170 actionable vulnerability findings for the evaluated six projects. The manual examination confirms that Priv can generate patches of high-quality for many of the evaluated vulnerability warnings

University of Waterloo's Institutional Repository

Software Maintenance At Commit-Time

Author: Nayrolles Mathieu
Publication venue
Publication date: 26/06/2018
Field of study

Software maintenance activities such as debugging and feature enhancement are known to be challenging and costly, which explains an ever growing line of research in software maintenance areas including mining software repository, default prevention, clone detection, and bug reproduction. The main goal is to improve the productivity of software developers as they undertake maintenance tasks. Existing tools, however, operate in an offline fashion, i.e., after the changes to the systems have been made. Studies have shown that software developers tend to be reluctant to use these tools as part of a continuous development process. This is because they require installation and training, hindering their integration with developers’ workflow, which in turn limits their adoption. In this thesis, we propose novel approaches to support software developers at commit-time. As part of the developer’s workflow, a commit marks the end of a given task. We show how commits can be used to catch unwanted modifications to the system, and prevent the introduction of clones and bugs, before these modifications reach the central code repository. We also propose a bug reproduction technique that is based on model checking and crash traces. Furthermore, we propose a new way for classifying bugs based on the location of fixes that can serve as the basis for future research in this field of study. The techniques proposed in this thesis have been tested on over 400 open and closed (industrial) systems, resulting in high levels of precision and recall. They are also scalable and non-intrusive

Concordia University Research Repository

Обробка повідомлень про доставку екіпажів в інформаційній мережі авіакомпанії

Author: Sydorchuk Maksym Yuriiovych
Сидорчук Максим Юрійович
Publication venue: National Аviation University
Publication date: 01/12/2020
Field of study

Робота публікується згідно наказу ректора від 29.12.2020 р. №580/од "Про розміщення кваліфікаційних робіт вищої освіти в репозиторії НАУ". Керівник проекту: к.т.н., доцент Сураєв Вадим ФедоровичIn the modern world and Ukraine, aviation ranks first in the transportation of passengers and cargo and is undoubtedly the most convenient and fastest mode of transport. The main task of aviation is to ensure the safety of passengers. Flight safety is affected by several factors: the reliability of aircraft and ground equipment, the quality of flight training, climate, the quality of the control system. Analyzing plane crashes, scientists say that the main factor is the mistakes of pilots or controllers.У сучасному світі та Україні авіація посідає перше місце у перевезенні пасажирів та вантажів і, безсумнівно, є найзручнішим та найшвидшим видом транспорту. Основним завданням авіації є забезпечення безпеки пасажирів. На безпеку польотів впливає кілька факторів: надійність літаків і наземного обладнання, якість льотної підготовки, клімат, якість системи управління. Аналізуючи авіакатастрофи, вчені кажуть, що головним фактором є помилки пілотів або контролерів

Інституційний репозитарій erNAU (Electronic Repository of the National Aviation University)

An Approach for Guiding Developers to Performance and Scalability Solutions

Author: Heger Christoph
Publication venue: KIT Scientific Publishing, Karlsruhe
Publication date: 01/01/2018
Field of study

The quality of enterprise software applications plays a crucial role for the satisfaction of the users and the economic success of the enterprises. Software applications with unsatisfying performance and scalability are perceived by its users as low in quality, as less interesting and less attractive, and cause frustration when preventing the users from attaining their goals. This book proposes an approach for a recommendation system that enables developers who are novices in software perfor

Mining and untangling change genealogies

Author: Herzig Kim Sebastian
Publication venue: Fakultät 6 - Naturwissenschaftlich-Technische Fakultät I. Fachrichtung 6.2 - Informatik
Publication date: 01/01/2012
Field of study

Developers change source code to add new functionality, fix bugs, or refactor their code. Many of these changes have immediate impact on quality or stability. However, some impact of changes may become evident only in the long term. This thesis makes use of change genealogy dependency graphs modeling dependencies between code changes capturing how earlier changes enable and cause later ones. Using change genealogies, it is possible to: (a) applyformalmethodslikemodelcheckingonversionarchivestorevealtemporal process patterns. Such patterns encode key features of the software process and can be validated automatically: In an evaluation of four open source histories, our prototype would recommend pending activities with a precision of 60—72%. (b) classify the purpose of code changes. Analyzing the change dependencies on change genealogies shows that change genealogy network metrics can be used to automatically separate bug fixing from feature implementing code changes. (c) build competitive defect prediction models. Defect prediction models based on change genealogy network metrics show competitive prediction accuracy when compared to state-of-the-art defect prediction models. As many other approaches mining version archives, change genealogies and their applications rely on two basic assumptions: code changes are considered to be atomic and bug reports are considered to refer to corrective maintenance tasks. In a manual examination of more than 7,000 issue reports and code changes from bug databases and version control systems of open- source projects, we found 34% of all issue reports to be misclassified and that up to 15% of all applied issue fixes consist of multiple combined code changes serving multiple developer maintenance tasks. This introduces bias in bug prediction models confusing bugs and features. To partially solve these issues we present an approach to untangle such combined changes with a mean success rate of 58—90% after the fact.Softwareentwickler ändern Source-Code um neue Funktionalität hinzuzufügen, Bugs zu beheben oder um ihren Code zu restrukturieren. Viele dieser Änderungen haben einen direkten Einfluss auf Qualität und Stabilität des Softwareprodukts. Jedoch kommen einige dieser Einflüsse erst zu einem späteren Zeitpunkt zur Geltung. Diese Arbeit verwendet Genealogien zwischen Code-Änderungen um zu erfassen, wie frühere Änderungen spätere Änderungen erfordern oder ermöglichen. Die Verwendung von Änderungs-Genealogien ermöglicht: (a) die Anwendung formaler Methoden wie Model-Checking auf Versionsarchive um temporäre Prozessmuster zu erkennen. Solche Prozessmuster verdeutlichen Hauptmerkmale eines Softwareentwicklungsprozesses: In einer Evaluation auf vier Open-Source Projekten war unser Prototyp im Stande noch ausstehende Änderungen mit einer Präzision von 60–72% vorherzusagen. (b) die Absicht einer Code-Änderung zu bestimmen. Analysen von Änderungsabhängigkeiten zeigen, dass Netzwerkmetriken auf Änderungsgenealogien geeignet sind um fehlerbehebende Änderungen von Änderungen die eine Funktionalität hinzufügen zu trennen. (c) konkurrenzfähige Fehlervorhersagen zu erstellen. Fehlervorhersagen basierend auf Genealogie-Metriken können sich mit anerkannten Fehlervorhersagemodellen messen. Änderungs-Genealogien und deren Anwendungen basieren, wie andere Data-Mining Ansätze auch, auf zwei fundamentalen Annahmen: Code-Änderungen beabsichtigen die Lösung nur eines Problems und Bug-Reports weisen auf Fehler korrigierende Tätigkeiten hin. Eine manuelle Inspektion von mehr als 7.000 Issue-Reports und Code-Änderungen hat ergeben, dass 34% aller Issue-Reports falsch klassifiziert sind und dass bis zu 15% aller fehlerbehebender Änderungen mehr als nur einem Entwicklungs-Task dienen. Dies wirkt sich negativ auf Vorhersagemodelle aus, die nicht mehr klar zwischen Bug-Fixes und anderen Änderungen unterscheiden können. Als Lösungsansatz stellen wir einen Algorithmus vor, der solche nicht eindeutigen Änderungen mit einer Erfolgsrate von 58–90% entwirrt

Automated software systems generation for process-oriented organizations

Author: Duarte Francisco J.
Publication venue
Publication date: 06/10/2014
Field of study

Tese de doutoramento do Programa Doutoral em Tecnologias e Sistemas de InformaçãoCada vez mais, as organizações suportam as suas operações em sistemas de software. Torna-se, portanto, muito relevante o correto mapeamento das operações nos sistemas de software. Esta tese foca-se em organizações orientadas a processos de negócio, devido à relevância dada pelas normas de qualidade, pelos modelos de excelência, e pelos requisitos dos clientes, a esse tipo de estruturação interna das organizações. Nas organizações orientadas a processos de negócio existem diversos fatores, como o tempo envolvido nos projetos de implementação de processos de negócio em software, as diferenças existentes entre os modelos de processos de negócio e a sua implementação real, ou a quantidade e o tipo de recursos envolvidos nesses projetos, que fazem com que os projetos de desenvolvimento de software sejam demasiado dispendiosos, demorem demasiado tempo, e não garantam que o produto de software resultante seja o mais adequado à realidade da organização que o vai usar. Esta tese propõe que os sistemas de informação e de software devam ser desenvolvidos, desde o início, incorporando os modelos das organizações onde irão ser usados. Além disso, e como existem disponíveis modelos de referência de processos de negócio, esta tese também propõe o seu uso explícito aquando da recolha de requisitos. Assim, o objetivo principal da tese é propor uma metodologia que se inicie com modelos de processos de negócio e que termine com a geração de sistemas de software, para organizações orientadas a processos de negócio. A metodologia denomina-se BIM e é formalizada através do metamodelo EPF. Dada a abrangência dos temas a tratar, a tese foi conduzida tendo em atenção que o processo de desenvolvimento de software para suportar organizações orientadas a processos pode ser otimizado. Para melhor mostrar os diversos passos e resultados intermédios, usamos a metodologia de investigação Action Research. A tese propõe que as atividades de investigação sejam terminadas quando uma dada condição de paragem seja atingida, e para isso usa uma avaliação baseada num conjunto de indicadores para os resultados do produto e do processo, e uma adaptação do modelo de excelência EFQM para a forma como foi executado o processo de desenvolvimento. O foco das Action são os sistemas de software MES, essenciais na ligação entre sistemas de software embebido e sistemas ERP. Nesta tese, as Action iniciam-se com modelos de processos e com arquiteturas de software standard, e terminam com uma proposta de modelo de processo e com arquiteturas de software e tecnologias adaptadas à execução de processos de negócio. A tese propõe também alguns conceitos como IAvO (extensão de modelos de processos de negócio), OBO (componentes de software intermutáveis e não-proprietários), OA (aspetos organizacionais), e PF (framework de processos) para aumentarem a eficiência e eficácia na implementação em software de processos de negócio.Increasingly, organizations support their operations by using software systems, turning very relevant the proper mapping of operations into software systems. This thesis focuses on organizations oriented to business processes, due to the importance that quality norms, excellence models, and customer requirements put on this type of internal structures of organizations. Process-oriented organizations have characteristics, such as the time needed to implement business processes in software, the differences between the business process models and the real business processes, or the quantity and type of the required resources, that lead to development projects too expensive, taking too long to complete, and that do not assure that the resulting product is the most adequate to the reality of the client organization. This thesis proposes that the development of information and software system embodies, since the early stages, the models of the organization where they operate. In addition, and since business process reference models are available, the thesis also proposes to use explicitly such reference models by requirements collection time. Thus, the main goal of the thesis is to propose a methodology that picks business process reference models and ends with software systems, for process-oriented organizations. The methodology is denominated BIM and is formalized by using the EPF metamodel. Due to the wide scope of the studied areas, the thesis is tailored considering that the development process for processoriented organizations can be optimized. To express better the intermediate steps and results, we use the Action Research methodology. The thesis proposes that the research activities terminate when a stopping condition is met, based on a set of indicators for the product, and a tailoring of the EFQM model for the development process. The Actions are focused on MES, crucial for the linking of embedded software systems with ERP systems. In this thesis, the Actions start by using standard process models and software architectures, and end by using a proposed process model, and software architectures and technologies adapted to the execution of business software. The thesis also proposes new concepts like IAvO (extension to business process reference models), OBO (interchangeable and nonproprietary software components), AO (organizational aspects), and PF (process framework) to increase the efficiency and the effectiveness of the implementation of business processes in software