42 research outputs found

    Exploiting Parts-of-Speech for Effective Automated Requirements Traceability

    Context: Requirement traceability (RT) is defined as the ability to describe and follow the life of a requirement. RT helps developers ensure that relevant requirements are implemented and that the source code is consistent with its requirements with respect to a set of traceability links called trace links. Previous work leverages Parts Of Speech (POS) tagging of software artifacts to recover trace links among them. These studies work on the premise that discarding one or more POS tags improves the accuracy of Information Retrieval (IR) techniques. Objective: First, we show empirically that excluding one or more POS tags could negatively impact the accuracy of existing IR-based traceability approaches, namely the Vector Space Model (VSM) and the Jensen-Shannon Model (JSM). Second, we propose a method that improves the accuracy of IR-based traceability approaches. Method: We developed an approach, called ConPOS, to recover trace links using constraint-based pruning. ConPOS uses major POS categories and applies constraints to the recovered trace links, pruning them as a filtering step to significantly improve the effectiveness of IR-based techniques. We conducted an experiment to provide evidence that removing POS tags does not improve the accuracy of IR techniques. Furthermore, we conducted two empirical studies to evaluate the effectiveness of ConPOS in recovering trace links compared to existing peer RT approaches. Results: The results of the first empirical study show that removing one or more POS tags negatively impacts the accuracy of VSM and JSM. Furthermore, the results from the other empirical studies show that ConPOS provides 11%-107%, 8%-64%, and 15%-170% higher precision, recall, and mean average precision (MAP) than VSM and JSM. Conclusion: We showed that ConPOS outperforms existing IR-based RT approaches that discard some POS tags from the input documents.

    System and Application Performance Analysis Patterns Using Software Tracing

    Software systems have become increasingly complex, which makes it difficult to detect the root causes of performance degradation. Software tracing has been used extensively to analyze the system at run-time to detect performance issues and uncover their causes. Several studies use tracing and other dynamic analysis techniques for performance analysis. These studies focus on specific system characteristics such as latency, performance bugs, etc. In this thesis, we review the literature to build a catalogue of performance analysis patterns that can be detected using trace data. The goal is to help developers debug run-time and performance issues more efficiently. The patterns are formalized and implemented so that they can be readily referred to by developers while analyzing large execution traces. The thesis focuses on the traces of system calls generated by the Linux kernel. This is because no application is an island, and we cannot ignore the complex interactions that an application has with the operating system kernel if we are to detect potential performance issues.

    The effects of education on students' perception of modeling in software engineering

    Models in software engineering offer significant potential to improve the productivity of engineers and the quality of the artifacts they produce. Despite this potential, modeling adoption in practice remains rather low. Computer science and software engineering curricula may be one factor behind this low adoption. In this study, we investigate the effects of education on students’ perception of modeling. We conducted a survey in three separate institutions, in Canada, Israel, and the U.S. The survey covers various aspects of modeling and addresses students ranging from the first year of undergraduate studies to the final years of graduate studies. The survey’s findings suggest that undergraduate students’ perception of modeling declines as they progress in their studies. While graduate students tend to view modeling more favorably, their perception also declines over the years. The results also suggest that students prefer more modeling content to be integrated earlier in the curriculum.

    EnHMM: On the Use of Ensemble HMMs and Stack Traces to Predict the Reassignment of Bug Report Fields

    Bug reports (BR) contain vital information that can help triaging teams prioritize and assign bugs to developers who will provide the fixes. However, studies have shown that BR fields often contain incorrect information and need to be reassigned, which delays the bug fixing process. There exist approaches for predicting whether a BR field should be reassigned or not. These approaches mainly use BR descriptions and traditional machine learning algorithms (SVM, KNN, etc.). As such, they do not fully benefit from the sequential order of information in BR data, such as function call sequences in BR stack traces, which may be valuable for improving the prediction accuracy. In this paper, we propose a novel approach, called EnHMM, for predicting the reassignment of BR fields using ensemble Hidden Markov Models (HMMs), trained on stack traces. EnHMM leverages the natural ability of HMMs to represent sequential data to model the temporal order of function calls in BR stack traces. When applied to Eclipse and Gnome BR repositories, EnHMM achieves an average precision, recall, and F-measure of 54%, 76%, and 60% on the Eclipse dataset and 41%, 69%, and 51% on the Gnome dataset. We also found that EnHMM improves over the best single HMM by 36% for Eclipse and 76% for Gnome. Finally, when comparing EnHMM to Im.ML.KNN, a recent approach in the field, we found that EnHMM improves on the average F-measure of Im.ML.KNN by 6.80% and on its average recall by 36.09%. However, the average precision of EnHMM is lower than that of Im.ML.KNN (53.93% as opposed to 56.71%). Comment: Published in Proceedings of the 28th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER 2021), 11 pages, 7 figures

    Techniques to Simplify the Analysis of Execution Traces for Program Comprehension

    Understanding a large execution trace is not an easy task due to the size and complexity of typical traces. In this thesis, we present various techniques that tackle this problem. Firstly, we present a set of metrics for measuring various properties of an execution trace in order to assess the work required for understanding its content. We show the result of applying these metrics to thirty traces generated from three different software systems. We discuss how these metrics can be supported by tools to facilitate the exploration of traces based on their complexity. Secondly, we present a novel technique for manipulating traces called trace summarization, which consists of taking a trace as input and returning a summary of its main content as output. Trace summaries can be used to enable top-down analysis of traces as well as the recovery of the system's behavioural models. In this thesis, we present a trace summarization algorithm that is based on successive filtering of implementation details from traces. An analysis of the concept of implementation details such as utilities is also presented. Thirdly, we have developed a scalable exchange format called the Compact Trace Format (CTF) in order to enable sharing and reuse of traces. The design of CTF satisfies well-known requirements for a standard exchange format. Finally, this thesis includes a survey of eight trace analysis tools. A study of the advantages and limitations of the techniques supported by these tools is provided. The approaches presented in this thesis have been applied to real software systems. The obtained results demonstrate the effectiveness and usefulness of our techniques.

    Techniques for Reducing the Complexity of Object-Oriented Execution Traces

    Understanding the behavior of object-oriented systems is almost impossible by merely performing static analysis of the source code. Dynamic analysis approaches are better suited for this purpose. Run-time information is typically represented in the form of execution traces that contain object interactions. However, traces can be very large and hard to comprehend. Visualization tools need to implement efficient filtering techniques to remove unnecessary data and present only information that adds value to the comprehension process. This paper addresses this issue by presenting different filtering techniques. These techniques are based on removing utility methods and on using object-oriented concepts such as polymorphism and inheritance to hide low-level implementation details. We also experiment with 12 execution traces of an object-oriented system called WEKA and study the gain attained by these filtering techniques.

    Workshop on Program Comprehension through Dynamic Analysis
