    Effective Segmentation of Large Execution Traces Using Probabilistic and Gaussian Mixture Models

    Software maintenance is known to be a costly and time-consuming activity. Software engineers need to spend a considerable amount of time understanding a system before maintaining it. There are many reasons for this, including the lack of good documentation and the move of the system's original developers to other projects or companies. Dynamic analysis techniques, more particularly trace analysis, are used to alleviate the program comprehension problem by offering software engineers a set of techniques that can help them understand the behavioural aspects of software systems. Execution traces, however, can be extremely large, which makes them cumbersome to analyze effectively. There is a need to develop techniques that help software engineers understand the content of large traces despite their massive size. In this thesis, we present SumTrace, a novel trace analysis technique. SumTrace takes a trace as input and automatically segments it into smaller and more manageable groups that reflect the execution phases of the traced scenario. The execution phases are summarized to help software engineers quickly understand different parts of the trace without having to analyze its entire content. SumTrace relies on a combination of probabilistic and Gaussian mixture models. We applied SumTrace to the segmentation of large traces generated from two software systems. The results are very promising. SumTrace is also fast, since it requires only one pass through a trace.

    Phase Flow Diagram: A New Execution Trace Visualization Technique

    Software maintenance tasks are known to be costly and challenging. The main challenge is that software maintainers must understand how the software system works before making any changes to it. This is difficult because adequate documentation is often lacking, if it exists at all. Program analysis techniques aim to reduce the impact of this problem. In this thesis, we focus on the ones that permit the understanding of the behavioural aspects of software. These techniques operate on execution traces generated from the system under study. Traces are difficult to work with because of their size. One way to reduce their complexity is to automatically divide their content into meaningful clusters, each representing a particular execution phase. This is known as trace segmentation. Trace segmentation research is relatively new. The focus has been on building robust algorithms that achieve acceptable accuracy. In this thesis, we introduce a new trace visualization technique called Phase Flow Diagram to represent the execution phases and the relationships between them in a visual manner. The diagram has a number of notations that software engineers can use to represent a trace as a flow of execution phases instead of mere events. We introduce a supporting tool for the diagram. The new diagram and the tool are validated through a user study involving several users.

    Automatic bug triaging techniques using machine learning and stack traces

    When a software system crashes, users have the option to report the crash using automated bug tracking systems. These tools capture software crash and failure data (e.g., stack traces, memory dumps, etc.) from end-users. These data are sent in the form of bug (crash) reports to the software development teams to uncover the causes of the crash and provide adequate fixes. The reports are first assessed (usually in a semi-automatic way) by a group of software analysts, known as triagers. Triagers assign priority to the bugs and redirect them to the software development teams in order to provide fixes. The triaging process, however, is usually very challenging. The problem is that many of these reports are caused by similar faults. Studies have shown that one way to improve the bug triaging process is to automatically detect duplicate (or similar) reports. This way, triagers would not need to spend time on reports caused by faults that have already been handled. Another issue is related to the prioritization of bug reports. Triagers often rely on the information provided by the customers (the report submitters) to prioritize bug reports. However, this task can be quite tedious and requires tool support. Next, triagers route the bug report to the responsible development team based on the subsystem that caused the crash. Since having knowledge of all the subsystems of an ever-evolving industrial system is impractical, having a tool to automatically identify defective subsystems can significantly reduce the manual bug triaging effort. The main goal of this research is to investigate techniques and tools to help triagers process bug reports. We start by studying the effect of the presence of stack traces in analyzing bug reports. Next, we present a framework to help triagers in each step of the bug triaging process. We propose a new and scalable method to automatically detect duplicate bug reports using stack traces and bug report categorical features. 
We then propose a novel approach for predicting bug severity using stack traces and categorical features, and finally, we discuss a new method for predicting faulty product and component fields of bug reports. We evaluate the effectiveness of our techniques using bug reports from two large open-source systems. Our results show that stack traces and machine learning methods can be used to automate the bug triaging process, and hence increase the productivity of bug triagers while reducing the cost and effort associated with manual triaging of bug reports.

    Trace Abstraction Framework and Techniques

    Understanding the behavioural aspects of software systems can help in a variety of software engineering tasks such as debugging, feature enhancement, performance analysis, and security. Software behaviour is typically represented in the form of execution traces. Traces, however, have historically been difficult to analyze due to their overwhelming size. Trace analysis techniques, more particularly trace abstraction and simplification techniques, have emerged to overcome the challenges of working with large traces. Existing trace analysis tools rely on visualization techniques of some sort to help software engineers make sense of trace content. Many of these techniques have been studied and found to be limited in many ways. In this thesis, we present a novel approach for trace analysis inspired by the way the human brain and perception systems operate. The idea is to mimic the psychological processes that have been developed over the years to explain how our perception system deals with huge volumes of visual data. We show how similar mechanisms can be applied to the abstraction and simplification of large traces. As part of this framework, we present a novel trace analysis technique that automatically divides the content of a large trace, generated from the execution of a target system, into meaningful segments that correspond to the system’s main execution phases such as initializing variables, performing a specific computation, etc. We also propose a trace sampling technique that not only reduces the size of a trace but also results in a sampled trace that is representative of the original trace by ensuring that the desired characteristics of an execution are distributed similarly in both the sampled and the original trace. Our approach is based on stratified sampling and uses the concept of execution phases as strata. Finally, we propose an approach to automatically identify the most relevant trace components of each execution phase. 
This approach also enables an efficient representation of the flow of phases by detecting redundant phases using a cosine similarity metric. The techniques presented in this thesis have been validated by applying them to a variety of target systems. The obtained results demonstrate the effectiveness and usefulness of our methods.
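
    The two ideas named in this abstract, stratified sampling with execution phases as strata and redundant-phase detection with cosine similarity, can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes a trace has already been segmented into phases, represents each phase as a plain list of event names, compares phases by event-frequency vectors, and uses hypothetical function names and an arbitrary 0.9 similarity threshold.

```python
import math
import random
from collections import Counter


def stratified_sample(phases, fraction, seed=0):
    """Sample every phase (stratum) at the same rate, preserving event order.

    Assumption: `phases` is a list of lists of event names.
    """
    rng = random.Random(seed)
    sampled = []
    for phase in phases:
        # Keep at least one event per non-empty phase so no phase disappears,
        # and never ask for more events than the phase contains.
        k = min(len(phase), max(1, round(len(phase) * fraction))) if phase else 0
        keep = sorted(rng.sample(range(len(phase)), k))
        sampled.append([phase[i] for i in keep])
    return sampled


def cosine_similarity(phase_a, phase_b):
    """Cosine similarity between the event-frequency vectors of two phases."""
    freq_a, freq_b = Counter(phase_a), Counter(phase_b)
    dot = sum(freq_a[e] * freq_b[e] for e in freq_a.keys() & freq_b.keys())
    norm_a = math.sqrt(sum(c * c for c in freq_a.values()))
    norm_b = math.sqrt(sum(c * c for c in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def redundant_phases(phases, threshold=0.9):
    """Return index pairs (i, j), i < j, whose similarity meets the threshold."""
    return [
        (i, j)
        for i in range(len(phases))
        for j in range(i + 1, len(phases))
        if cosine_similarity(phases[i], phases[j]) >= threshold
    ]


# Hypothetical trace with three phases; the first and third contain the same event.
phases = [["init"] * 10, ["read", "parse"] * 10, ["init"] * 4]
sampled = stratified_sample(phases, fraction=0.5)
duplicates = redundant_phases(phases)
```

    Sampling each phase at the same rate keeps the relative size of the phases in the sampled trace, which is one simple way to read "distributed similarly"; the characteristics the thesis actually preserves may differ.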