
    Input Prioritization for Testing Neural Networks

    Deep neural networks (DNNs) are increasingly being adopted for sensing and control functions in a variety of safety- and mission-critical systems such as self-driving cars, autonomous air vehicles, medical diagnostics, and industrial robotics. Failures of such systems can lead to loss of life or property, which necessitates stringent verification and validation to provide high assurance. Though formal verification approaches are being investigated, testing remains the primary technique for assessing the dependability of such systems. Due to the nature of the tasks handled by DNNs, the cost of obtaining test oracle data---the expected output, a.k.a. label, for a given input---is high, which significantly impacts the amount and quality of testing that can be performed. Thus, prioritizing input data for testing DNNs in meaningful ways to reduce the cost of labeling can go a long way toward increasing testing efficacy. This paper proposes using gauges of the DNN's sentiment, derived from the computation performed by the model, as a means to identify inputs that are likely to reveal weaknesses. We empirically assessed the efficacy of three such sentiment measures for prioritization---confidence, uncertainty, and surprise---and compared their effectiveness in terms of fault-revealing capability and retraining effectiveness. The results indicate that sentiment measures can effectively flag inputs that expose unacceptable DNN behavior. For MNIST models, the average percentage of inputs correctly flagged ranged from 88% to 94.8%.
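
    As a rough illustration of how such sentiment measures can drive prioritization, the sketch below ranks unlabeled inputs by two of the measures discussed above, softmax confidence and predictive entropy (an uncertainty proxy). The function names and the combined score are illustrative assumptions, not the paper's implementation.

        import numpy as np

        def softmax(logits):
            # numerically stable softmax over the class axis
            z = logits - logits.max(axis=1, keepdims=True)
            e = np.exp(z)
            return e / e.sum(axis=1, keepdims=True)

        def prioritize(logits):
            # rank inputs so the least confident come first for labeling
            probs = softmax(logits)
            confidence = probs.max(axis=1)                          # max softmax probability
            entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)  # predictive entropy
            score = (1.0 - confidence) + entropy                    # higher = more suspicious
            return np.argsort(-score)

        # usage: logits produced by a model over unlabeled inputs, shape (n, n_classes)
        logits = np.random.randn(1000, 10)
        label_these_first = prioritize(logits)[:100]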

    International conference on software engineering and knowledge engineering: Session chair

    The Thirtieth International Conference on Software Engineering and Knowledge Engineering (SEKE 2018) will be held at the Hotel Pullman, San Francisco Bay, USA, from July 1 to July 3, 2018. SEKE 2018 will also be dedicated to the memory of Professor Lotfi Zadeh, a great scholar, pioneer, and leader in fuzzy set theory and soft computing. The conference aims at bringing together experts in software engineering and knowledge engineering to discuss relevant results in either software engineering or knowledge engineering or both. Special emphasis will be placed on the transfer of methods between both domains. The theme this year is soft computing in software engineering & knowledge engineering. Submissions of papers and demos are both welcome.

    Identifying a Customer Centered Approach for Urban Planning: Defining a Framework and Evaluating Potential in a Livability Context

    In transportation planning, public engagement is an essential requirement for informed decision-making. This is especially true for assessing abstract concepts such as livability, where it is challenging to define objective measures and to obtain input that can be used to gauge the performance of communities. This dissertation focuses on advancing a data-driven decision-making approach for the transportation planning domain in the context of livability. First, a conceptual model for a customer-centric framework for transportation planning is designed, integrating insight from multiple disciplines (chapter 1); then a data-mining approach to extracting features important for defining customer satisfaction in a livability context is described (chapter 2); and finally an appraisal of the potential of social media review mining for enhancing understanding of livability measures and increasing engagement in the planning process is undertaken (chapter 3). The results of this work also include a sentiment analysis and visualization package for interpreting an automated user-defined translation of qualitative measures of livability. The package evaluates users' satisfaction with neighborhoods through social media and enhances the traditional approaches to defining livability planning measures. This approach has the potential to capitalize on residents' interest in social media outlets and to increase public engagement in the planning process by encouraging users to participate in online neighborhood satisfaction reporting. The results inform future work for deploying a comprehensive approach to planning that draws on the marketing structure of transportation network products with residential nodes as the center of the structure.
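
    As a minimal sketch of the social media review mining idea, the snippet below scores neighborhood reviews with an off-the-shelf sentiment analyzer (NLTK's VADER); it stands in for, and is not, the dissertation's package, and the example reviews are invented.

        import nltk
        from nltk.sentiment.vader import SentimentIntensityAnalyzer

        nltk.download("vader_lexicon")  # one-time lexicon fetch
        sia = SentimentIntensityAnalyzer()

        reviews = [
            "Great walkable neighborhood, close to transit and parks.",
            "Constant traffic noise and nowhere safe to bike.",
        ]

        # mean compound polarity as a rough neighborhood satisfaction proxy
        scores = [sia.polarity_scores(r)["compound"] for r in reviews]
        print(f"mean sentiment: {sum(scores) / len(scores):+.2f}")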

    Understanding the Impact of Diversity in Software Bugs on Bug Prediction Models

    Nowadays, software systems are essential for businesses, users, and society. At the same time, such systems are growing in both complexity and size. In this context, developing high-quality software is a challenging and expensive activity for the software industry. Since software organizations are always limited by their budget, personnel, and time, it is not a trivial task to allocate testing and code-review resources to the areas that require the most attention. To overcome this problem, researchers have developed software bug prediction models that can help practitioners predict the most bug-prone software entities. Although software bug prediction is a very popular research area, its industrial adoption remains limited. In this thesis, we investigate three possible issues with the current state of the art in software bug prediction that affect the practical usability of prediction models. First, we argue that current bug prediction models implicitly assume that all bugs are the same, without taking their impact into consideration. We study the impact of bugs in terms of the experience of the developers required to fix them. Second, only a few studies investigate the impact of specific types of bugs. Therefore, we characterize a severe type of bug called Blocking bugs, and provide approaches to predict them early on. Third, false-negative files are buggy files that bug prediction models incorrectly classify as non-buggy. We argue that a large number of false-negative files makes bug prediction models less attractive for developers. In our thesis, we quantify the extent of false-negative files and manually inspect them in order to better understand their nature.
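
    To make the false-negative notion concrete, here is a small sketch (with invented file names) of how one might quantify the files a prediction model marked clean that later turned out to be buggy:

        def false_negative_files(all_files, predicted_buggy, actually_buggy):
            # files the model called non-buggy that were in fact buggy
            return [f for f in all_files
                    if f in actually_buggy and f not in predicted_buggy]

        all_files = ["core/io.py", "ui/view.py", "db/query.py"]
        predicted_buggy = {"ui/view.py"}
        actually_buggy = {"ui/view.py", "db/query.py"}

        fns = false_negative_files(all_files, predicted_buggy, actually_buggy)
        miss_rate = len(fns) / len(actually_buggy)  # share of buggy files missed
        print(fns, f"miss rate: {miss_rate:.0%}")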

    Towards the Use of the Readily Available Tests from the Release Pipeline as Performance Tests. Are We There Yet?

    Performance is one of the important aspects of software quality. In fact, performance issues exist widely in software systems, and fixing them is an essential step in the release cycle. Although performance testing is widely adopted in practice, it is still expensive and time-consuming. In particular, performance testing is usually conducted after the system is built, in a dedicated testing environment. This challenge makes performance testing difficult to fit into the common DevOps process in software development. On the other hand, there exists a large number of readily available tests that are executed regularly within the release pipeline during software development. In this paper, we perform an exploratory study to determine whether such readily available tests are capable of serving as performance tests. In particular, we would like to see whether the performance of these tests can demonstrate the performance improvements obtained from fixing real-life performance issues. We collect 127 performance issues from Hadoop and Cassandra and evaluate the performance of the readily available tests on the commits before and after the performance issue fixes. We find that most of the improvements from the fixes to performance issues can be demonstrated using the readily available tests in the release pipeline. However, only a very small portion of the tests can be used to demonstrate the improvements. By manually examining the tests, we identify eight reasons why a test cannot demonstrate a performance improvement even though it covers the changed source code of the issue fix. Finally, we build random forest classifiers to determine the important metrics influencing whether the readily available tests are able to demonstrate performance improvements from issue fixes. We find that the test code itself and the source code covered by the test are important factors, while factors related to the code changes in the performance issue fixes have low importance. Practitioners should focus on designing and improving the tests, instead of fine-tuning tests for different performance issue fixes. Our findings can serve as a guideline for practitioners to reduce the effort spent on leveraging and designing tests that run in the release pipeline for performance assurance activities.
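
    A bare-bones sketch of the evaluation step, measuring one readily available test's runtime at the commits before and after a fix, might look as follows; the commit ids and test command are placeholders, and a real study would repeat runs and apply statistical tests to control for noise.

        import subprocess
        import time

        def run_test_timed(commit, test_cmd):
            # check out a commit and measure the wall-clock time of one test run
            subprocess.run(["git", "checkout", commit], check=True)
            start = time.perf_counter()
            subprocess.run(test_cmd, check=True)
            return time.perf_counter() - start

        test = ["mvn", "test", "-Dtest=SomeReadilyAvailableTest"]  # placeholder
        before = run_test_timed("<commit-before-fix>", test)
        after = run_test_timed("<commit-after-fix>", test)
        print(f"runtime change: {(before - after) / before:+.1%}")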

    Selecting fault revealing mutants

    Mutant selection refers to the problem of choosing, among a large number of mutants, the (few) ones that should be used by testers. In view of this, we investigate the problem of selecting the fault revealing mutants, i.e., the mutants that are killable and lead to test cases that uncover unknown program faults. We formulate two variants of this problem: fault revealing mutant selection and fault revealing mutant prioritization. We argue and show that these problems can be tackled through a set of ‘static’ program features and propose a machine learning approach, named FaRM, that learns to select and rank killable and fault revealing mutants. Experimental results involving 1,692 real faults show the practical benefits of our approach in both examined problems. Our results show that FaRM achieves a good trade-off between application cost and effectiveness (measured in terms of faults revealed). We also show that FaRM outperforms all the existing mutant selection methods, i.e., random mutant sampling, selective mutation, and defect prediction (mutating the code areas pointed to by defect prediction). In particular, our results show that with respect to mutant selection, our approach reveals 23% to 34% more faults than any of the baseline methods, while with respect to mutant prioritization, it achieves a higher average percentage of revealed faults, with a median difference between 4% and 9% over random mutant orderings.
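
    A simplified sketch of the learning-to-rank idea (not FaRM itself; the features and data here are random stand-ins) could look like this: train a classifier on static features of past mutants labeled fault revealing or not, then rank new mutants by their predicted probability.

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier

        # rows: mutants; columns: 'static' features (e.g., mutation operator,
        # nesting depth, surrounding complexity) -- illustrative stand-ins
        X_train = np.random.rand(500, 8)
        y_train = np.random.randint(0, 2, 500)    # 1 = fault revealing historically

        clf = RandomForestClassifier(n_estimators=100, random_state=0)
        clf.fit(X_train, y_train)

        X_new = np.random.rand(50, 8)             # candidate mutants for a new program
        scores = clf.predict_proba(X_new)[:, 1]   # P(fault revealing)
        ranking = np.argsort(-scores)             # prioritization: best mutants first
        selection = ranking[:10]                  # selection: keep a small budget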

    Enhanced Approach for Bug Severity Prediction: Experimentation and Scope for Improvements

    Software development is an iterative process, where developers create, test, and refine their code until it is ready for release. Along the way, bugs and issues are inevitable. A bug can be any error identified in the requirement specification, design, or implementation of a project. These bugs need to be categorized and assigned to developers to be resolved. The number of bugs generated in any large-scale project is vast, and a bug can have a significant impact on the project or none at all, depending on its type. The aim of this study is to develop a deep learning-based bug severity prediction model that can accurately predict the severity levels of software bugs. This study aims to address the limitations of the current manual bug severity assessment process and provide an automated solution, using various classifiers (e.g., Naïve Bayes, logistic regression, KNN, and support vector machines) along with mutual information as the feature selection method, that can assist software development teams in assigning severity levels to bugs effectively. It seeks to improve the overall software development process by reducing the time and effort required for bug resolution and enhancing the quality and reliability of software.
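
    A compact sketch of the classifier-plus-mutual-information setup described above, using scikit-learn with one of the listed classifiers (Naïve Bayes) and toy bug reports invented for illustration:

        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.feature_selection import SelectKBest, mutual_info_classif
        from sklearn.naive_bayes import MultinomialNB
        from sklearn.pipeline import Pipeline

        # toy bug-report texts and severity labels; real work would use a tracker dump
        reports = [
            "crash on startup when config file is missing",
            "typo in settings dialog label",
            "data loss after failed sync",
            "minor misalignment of toolbar icons",
            "application freezes under heavy load",
            "tooltip text slightly truncated",
        ]
        severity = ["critical", "trivial", "critical", "trivial", "critical", "trivial"]

        model = Pipeline([
            ("tfidf", TfidfVectorizer()),
            ("select", SelectKBest(mutual_info_classif, k=10)),
            ("clf", MultinomialNB()),
        ])
        model.fit(reports, severity)
        print(model.predict(["app crashes when saving large files"]))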
