272 research outputs found

    On Improving (Non)Functional Testing

    Get PDF
    Software testing is commonly classified into two categories, nonfunctional testing and functional testing. The goal of nonfunctional testing is to test nonfunctional requirements, such as performance and reliability. Performance testing is one of the most important types of nonfunctional testing, one goal of which is to detect the phenomena that an Application Under Testing (AUT) exhibits unexpectedly worse performance (e.g., lower throughput) with some input data. During performance testing, a critical challenge is to understand the AUT’s behaviors with large numbers of combinations of input data and find the particular subset of inputs leading to performance bottlenecks. However, enumerating those particular inputs and identifying those bottlenecks are always laborious and intellectually intensive. In addition, for an evolving software system, some code changes may accidentally degrade performance between two software versions, it is even more challenging to find problematic changes (out of a large number of committed changes) may lead to performance regressions under certain test inputs. This dissertation presents a set of approaches to automatically find specific combinations of input data for exposing performance bottlenecks and further analyze execution traces to identify performance bottlenecks. In addition, this dissertation also provides an approach that automatically estimates the impact of code changes on performance degradation between two released software versions to identify the problematic ones likely leading to performance regressions. Functional testing is used to test the functional correctness of AUTs. Developers commonly write test suites for AUTs to test different functionalities and locate functional faults. During functional testing, developers rely on some strategies to order test cases to achieve certain objectives, such as exposing faults faster, which is known as Test Case Prioritization (TCP). TCP techniques are commonly classified into two categories, dynamic and static techniques. A set of empirical studies has been conducted to examine and understand different TCP techniques, but there is a clear gap in existing studies. No study has compared static techniques against dynamic techniques and comprehensively examined the impact of test granularity, program size, fault characteristics, and the similarities in terms of fault detection on TCP techniques. Thus, this dissertation presents an empirical study to thoroughly compare static and dynamic TCP techniques in terms of effectiveness, efficiency, and similarity of uncovered faults at different granularities on a large set of real-world programs, and further analyze the potential impact of program size and fault characteristics on TCP evaluation. Moreover, in the prior work, TCP techniques have been typically evaluated against synthetic software defects, called mutants. For this reason, it is currently unclear whether TCP performance on mutants would be representative of the performance achieved on real faults. to answer this fundamental question, this dissertation presents the first empirical study that investigates TCP performance when applied to both real-world faults and mutation faults for understanding the representativeness of mutants

    A Survey of Learning-based Automated Program Repair

    Full text link
    Automated program repair (APR) aims to fix software bugs automatically and plays a crucial role in software development and maintenance. With the recent advances in deep learning (DL), an increasing number of APR techniques have been proposed to leverage neural networks to learn bug-fixing patterns from massive open-source code repositories. Such learning-based techniques usually treat APR as a neural machine translation (NMT) task, where buggy code snippets (i.e., source language) are translated into fixed code snippets (i.e., target language) automatically. Benefiting from the powerful capability of DL to learn hidden relationships from previous bug-fixing datasets, learning-based APR techniques have achieved remarkable performance. In this paper, we provide a systematic survey to summarize the current state-of-the-art research in the learning-based APR community. We illustrate the general workflow of learning-based APR techniques and detail the crucial components, including fault localization, patch generation, patch ranking, patch validation, and patch correctness phases. We then discuss the widely-adopted datasets and evaluation metrics and outline existing empirical studies. We discuss several critical aspects of learning-based APR techniques, such as repair domains, industrial deployment, and the open science issue. We highlight several practical guidelines on applying DL techniques for future APR studies, such as exploring explainable patch generation and utilizing code features. Overall, our paper can help researchers gain a comprehensive understanding about the achievements of the existing learning-based APR techniques and promote the practical application of these techniques. Our artifacts are publicly available at \url{https://github.com/QuanjunZhang/AwesomeLearningAPR}

    How Evolutionary Visual Software Analytics Supports Knowledge Discovery

    Get PDF
    [EN] Evolutionary visual software analytics is a specialization of visual analytics. It is aimed at supporting software maintenance processes by aiding the understanding and comprehension of software evolution with the active participation of users. Therefore, it deals with the analysis of software projects that have been under development and maintenance for several years and which are usually formed by thousands of software artifacts,which are also associated to logs from communications, defect-tracking and software configuration management systems. Accordingly, evolutionary visual software analytics aims to assist software developers and software project managers by means of an integral approach that takes into account knowledge extraction techniques as well as visual representations that make use of interaction techniques and linked views. Consequently,this paper discusses the implementation of an architecture based on the evolutionary visual software analytics process and how it supports knowledge discovery during software maintenance tasks.[ES] Analítica de software visual evolutivos es una especialización de la analítica visual. Está dirigido a apoyar los procesos de mantenimiento de software, ayudando al entendimiento y la comprensión de la evolución del software, con la participación activa de los usuarios. Por lo tanto, tiene que ver con el análisis de los proyectos de software que han estado bajo desarrollo y mantenimiento por varios años y que por lo general están formados por miles de artefactos de software, que también están asociadas a los registros de las comunicaciones, seguimiento de defectos y sistemas de gestión de configuración de software. En consecuencia, la analítica de software visual evolutivos tiene como objetivo ayudar a los desarrolladores de software y administradores de proyectos de software a través de un enfoque integral que tenga en cuenta las técnicas de extracción de conocimiento, así como representaciones visuales que hacen uso de técnicas de interacción y vistas enlazadas. En consecuencia, en este documento se analiza la implementación de una arquitectura basada en el proceso de analítica de software visual evolutivos y la forma en que apoya el descubrimiento de conocimiento durante las tareas de mantenimiento de softwar

    Workflow Provenance: from Modeling to Reporting

    Get PDF
    Workflow provenance is a crucial part of a workflow system as it enables data lineage analysis, error tracking, workflow monitoring, usage pattern discovery, and so on. Integrating provenance into a workflow system or modifying a workflow system to capture or analyze different provenance information is burdensome, requiring extensive development because provenance mechanisms rely heavily on the modelling, architecture, and design of the workflow system. Various tools and technologies exist for logging events in a software system. Unfortunately, logging tools and technologies are not designed for capturing and analyzing provenance information. Workflow provenance is not only about logging, but also about retrieving workflow related information from logs. In this work, we propose a taxonomy of provenance questions and guided by these questions, we created a workflow programming model 'ProvMod' with a supporting run-time library to provide automated provenance and log analysis for any workflow system. The design and provenance mechanism of ProvMod is based on recommendations from prominent research and is easy to integrate into any workflow system. ProvMod offers Neo4j graph database support to manage semi-structured heterogeneous JSON logs. The log structure is adaptable to any NoSQL technology. For each provenance question in our taxonomy, ProvMod provides the answer with data visualization using Neo4j and the ELK Stack. Besides analyzing performance from various angles, we demonstrate the ease of integration by integrating ProvMod with Apache Taverna and evaluate ProvMod usability by engaging users. Finally, we present two Software Engineering research cases (clone detection and architecture extraction) where our proposed model ProvMod and provenance questions taxonomy can be applied to discover meaningful insights

    A Planning Approach to Migrating Domain-specific Legacy Systems into Service Oriented Architecture

    Get PDF
    The planning work prior to implementing an SOA migration project is very important for its success. Up to now, most of this kind of work has been manual work. An SOA migration planning approach based on intelligent information processing methods is addressed to semi-automate the manual work. This thesis will investigate the principle research question: “How can we obtain SOA migration planning schemas (semi-) automatically instead of by traditional manual work in order to determine if legacy software systems should be migrated to SOA computation environment?”. The controlled experiment research method has been adopted for directing research throughout the whole thesis. Data mining methods are used to analyse SOA migration source and migration targets. The mined information will be the supplementation of traditional analysis results. Text similarity measurement methods are used to measure the matching relationship between migration sources and migration targets. It implements the quantitative analysis of matching relationships instead of common qualitative analysis. Concretely, an association rule and sequence pattern mining algorithms are proposed to analyse legacy assets and domain logics for establishing a Service model and a Component model. These two algorithms can mine all motifs with any min-support number without assuming any ordering. It is better than the existing algorithms for establishing Service models and Component models in SOA migration situations. Two matching strategies based on keyword level and superficial semantic levels are described, which can calculate the degree of similarity between legacy components and domain services effectively. Two decision-making methods based on similarity matrix and hybrid information are investigated, which are for creating SOA migration planning schemas. Finally a simple evaluation method is depicted. Two case studies on migrating e-learning legacy systems to SOA have been explored. The results show the proposed approach is encouraging and applicable. Therefore, the SOA migration planning schemas can be created semi-automatically instead of by traditional manual work by using data mining and text similarity measurement methods

    Privacy-preserving human mobility and activity modelling

    Get PDF
    The exponential proliferation of digital trends and worldwide responses to the COVID-19 pandemic thrust the world into digitalization and interconnectedness, pushing increasingly new technologies/devices/applications into the market. More and more intimate data of users are collected for positive analysis purposes of improving living well-being but shared with/without the user's consent, emphasizing the importance of making human mobility and activity models inclusive, private, and fair. In this thesis, I develop and implement advanced methods/algorithms to model human mobility and activity in terms of temporal-context dynamics, multi-occupancy impacts, privacy protection, and fair analysis. The following research questions have been thoroughly investigated: i) whether the temporal information integrated into the deep learning networks can improve the prediction accuracy in both predicting the next activity and its timing; ii) how is the trade-off between cost and performance when optimizing the sensor network for multiple-occupancy smart homes; iii) whether the malicious purposes such as user re-identification in human mobility modelling could be mitigated by adversarial learning; iv) whether the fairness implications of mobility models and whether privacy-preserving techniques perform equally for different groups of users. To answer these research questions, I develop different architectures to model human activity and mobility. I first clarify the temporal-context dynamics in human activity modelling and achieve better prediction accuracy by appropriately using the temporal information. I then design a framework MoSen to simulate the interaction dynamics among residents and intelligent environments and generate an effective sensor network strategy. To relieve users' privacy concerns, I design Mo-PAE and show that the privacy of mobility traces attains decent protection at the marginal utility cost. Last but not least, I investigate the relations between fairness and privacy and conclude that while the privacy-aware model guarantees group fairness, it violates the individual fairness criteria.Open Acces

    Comparative process mining:analyzing variability in process data

    Get PDF

    Comparative process mining:analyzing variability in process data

    Get PDF

    Multiple-Aspect Analysis of Semantic Trajectories

    Get PDF
    This open access book constitutes the refereed post-conference proceedings of the First International Workshop on Multiple-Aspect Analysis of Semantic Trajectories, MASTER 2019, held in conjunction with the 19th European Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2019, in Würzburg, Germany, in September 2019. The 8 full papers presented were carefully reviewed and selected from 12 submissions. They represent an interesting mix of techniques to solve recurrent as well as new problems in the semantic trajectory domain, such as data representation models, data management systems, machine learning approaches for anomaly detection, and common pathways identification
    corecore