
    Mining Sequences of Developer Interactions in Visual Studio for Usage Smells

    In this paper, we present a semi-automatic approach for mining a large-scale dataset of IDE interactions to extract usage smells, i.e., inefficient IDE usage patterns exhibited by developers in the field. The approach first mines frequent IDE usage patterns, filters them via a set of thresholds and manual review by the authors, and then corroborates (or disputes) them through a developer survey in order to form usage smells. In contrast with conventional mining of IDE usage data, our approach identifies time-ordered sequences of developer actions that are exhibited by many developers in the field. This pattern mining workflow is resilient to the ample noise present in IDE datasets due to the mix of actions and events that these datasets typically contain. We identify usage patterns and smells that contribute to the understanding of the usability of Visual Studio for debugging, code search, and active file navigation, and, more broadly, to the understanding of developer behavior during these software development activities. Among our findings is that developers are reluctant to use conditional breakpoints when debugging, due to perceived IDE performance problems as well as the lack of error checking when specifying the condition.
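
    The abstract sketches a pipeline (mine frequent time-ordered action sequences, filter by thresholds, survive log noise) without giving its details, so the following is only a minimal sketch under assumed names: hypothetical IDE event labels, a contiguous n-gram notion of "pattern", and an invented support threshold, none of which come from the paper.

```python
from collections import Counter
from itertools import islice

# Hypothetical action vocabulary; real IDE logs mix these with noisy
# non-action events (focus changes, timer ticks) that we filter out.
ACTIONS = {"SetBreakpoint", "StepOver", "StepInto", "Search", "OpenFile"}

def ngrams(seq, n):
    """Contiguous length-n subsequences of seq, in time order."""
    return zip(*(islice(seq, i, None) for i in range(n)))

def mine_usage_patterns(sessions, n=2, min_support=1.0):
    """sessions: time-ordered event lists, one per developer session.
    Returns patterns occurring in at least min_support * len(sessions) sessions."""
    counts = Counter()
    for session in sessions:
        actions = [e for e in session if e in ACTIONS]  # drop noisy events
        counts.update(set(ngrams(actions, n)))          # count once per session
    threshold = min_support * len(sessions)
    return {p: c for p, c in counts.items() if c >= threshold}

sessions = [
    ["SetBreakpoint", "StepOver", "StepOver", "WindowFocusLost", "StepInto"],
    ["Search", "OpenFile", "SetBreakpoint", "StepOver", "StepOver"],
]
# {('SetBreakpoint', 'StepOver'): 2, ('StepOver', 'StepOver'): 2}
print(mine_usage_patterns(sessions))
```

    Counting each pattern at most once per session keeps a single prolific developer from dominating the support counts, and filtering to known actions is one simple way to get the noise resilience the abstract mentions.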

    BlogForever: D3.1 Preservation Strategy Report

    This report describes preservation planning approaches and strategies recommended by the BlogForever project as a core component of a weblog repository design. More specifically, we start by discussing why we would want to preserve weblogs in the first place and what exactly it is that we are trying to preserve. We then review past and present work and highlight why current practices in web archiving do not adequately address the needs of weblog preservation. We make three distinct contributions in this volume: a) we propose transferable practical workflows for applying a combination of established metadata and repository standards in developing a weblog repository, b) we provide an automated, community-based approach to identifying significant properties of weblog content and discuss how it affects previous strategies, and c) we propose a sustainability plan that draws upon community knowledge through innovative repository design.

    What's next? : operational support for business process execution

    In the last decade, flexibility has become increasingly important in the area of business process management. Information systems that support the execution of a process are required to work in a dynamic environment that imposes changing demands on its execution. In academia and industry, a variety of paradigms and implementations have been developed to support flexibility. While these approaches address the industry's demand for flexibility, they also confront the user with many choices between different alternatives. As a consequence, methods to support users in selecting the best alternative during execution have become essential. In this thesis we introduce a formal framework for providing support to users based on historical evidence available in the execution log of the process. The thesis focuses on support by means of (1) recommendations, which provide the user with an ordered list of execution alternatives based on estimated utilities, and (2) predictions, which provide the user with general statistics for each execution alternative. Typically, estimations are not an average over all observations; rather, they are based on observations for "similar" situations. The main question is what similarity means in the context of business process execution. We introduce abstractions on execution traces to capture similarity between execution traces in the log. A trace abstraction considers selected trace characteristics rather than the exact trace; traces that have identical abstraction values are said to be similar. The challenge is to determine those abstractions (characteristics) that are good predictors for the parameter to be estimated in the recommendation or prediction. We analyse the dependency between the values of an abstraction and the mean of the parameter to be estimated by means of regression analysis; with regression we obtain a set of abstractions that explain the parameter to be estimated. Dependencies do not only play a role in providing predictions and recommendations to instances at run-time, but are also essential for simulating the effect of changes in the environment on the processes, both locally and globally. We use stochastic simulation models to simulate the effect of changes in the environment, in particular changed probability distributions caused by recommendations. The novelty of these models is that they include dependencies between abstraction values and simulation parameters, which are estimated from log data. We demonstrate that these models give better approximations of reality than traditional models. A framework for offering operational support has been implemented in the context of the process mining framework ProM.
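
    To make the abstraction-based similarity concrete, here is a minimal sketch (not the thesis's actual formalization; the set abstraction, the remaining-time utility, and the toy log below are all illustrative assumptions): historical traces are grouped by the abstraction of each prefix, and a running case receives an ordered list of next-step alternatives ranked by the mean outcome observed in similar situations.

```python
from collections import defaultdict
from statistics import mean

def set_abstraction(prefix):
    # Set abstraction: keep which activities occurred, ignore order and repetition.
    return frozenset(prefix)

def build_estimator(log, abstraction=set_abstraction):
    """log: (trace, timestamps) pairs; timestamps[i] is the completion time
    of trace[i]. Collects remaining times per (abstraction, next step)."""
    stats = defaultdict(list)
    for trace, times in log:
        end = times[-1]
        for i in range(len(trace) - 1):
            key = (abstraction(trace[:i + 1]), trace[i + 1])
            stats[key].append(end - times[i])
    return stats

def recommend(stats, partial, abstraction=set_abstraction):
    """Ordered list of execution alternatives for a running case, ranked by
    estimated utility (here: lowest expected remaining time)."""
    a = abstraction(partial)
    estimates = {act: mean(rts) for (ab, act), rts in stats.items() if ab == a}
    return sorted(estimates.items(), key=lambda kv: kv[1])

log = [
    (["register", "check", "decide"], [1, 4, 9]),
    (["register", "check", "rework", "decide"], [1, 3, 7, 15]),
    (["register", "check", "decide"], [2, 5, 8]),
]
stats = build_estimator(log)
print(recommend(stats, ["register", "check"]))  # [('decide', 4.0), ('rework', 12)]
```

    Only traces whose prefixes abstract to the same value as the running case contribute to the estimate, which is exactly the "similar situations" averaging described above; swapping in a different abstraction (e.g., the last activity, or the full prefix) changes what counts as similar.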

    Decision enhancement and business process agility


    Workflow Provenance: from Modeling to Reporting

    Workflow provenance is a crucial part of a workflow system, as it enables data lineage analysis, error tracking, workflow monitoring, usage pattern discovery, and so on. Integrating provenance into a workflow system, or modifying a workflow system to capture or analyze different provenance information, is burdensome and requires extensive development, because provenance mechanisms rely heavily on the modelling, architecture, and design of the workflow system. Various tools and technologies exist for logging events in a software system. Unfortunately, these logging tools and technologies are not designed for capturing and analyzing provenance information: workflow provenance is not only about logging, but also about retrieving workflow-related information from logs. In this work, we propose a taxonomy of provenance questions and, guided by these questions, we create a workflow programming model, 'ProvMod', with a supporting run-time library that provides automated provenance and log analysis for any workflow system. The design and provenance mechanism of ProvMod are based on recommendations from prominent research, and ProvMod is easy to integrate into any workflow system. ProvMod offers Neo4j graph database support to manage semi-structured heterogeneous JSON logs, and the log structure is adaptable to any NoSQL technology. For each provenance question in our taxonomy, ProvMod provides the answer with data visualization using Neo4j and the ELK Stack. Besides analyzing performance from various angles, we demonstrate the ease of integration by integrating ProvMod with Apache Taverna, and we evaluate ProvMod's usability by engaging users. Finally, we present two software engineering research cases (clone detection and architecture extraction) where our proposed model ProvMod and the provenance questions taxonomy can be applied to discover meaningful insights.
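
    ProvMod's actual API is not shown in the abstract, so the following is only an illustrative sketch of the general mechanism it describes: wrapping workflow tasks so each run emits a semi-structured JSON log record (supporting lineage analysis and error tracking), which can later be loaded into a graph database such as Neo4j and queried. The decorator name, record fields, and Cypher query are all assumptions.

```python
import functools
import json
import time
import uuid

def provenance(task_name, log_path="prov.jsonl"):
    """Wrap a task so each execution appends one JSON provenance record."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            record = {"id": str(uuid.uuid4()), "task": task_name,
                      "inputs": repr((args, kwargs)), "start": time.time()}
            try:
                result = fn(*args, **kwargs)
                record.update(status="ok", output=repr(result))
                return result
            except Exception as exc:                 # error tracking
                record.update(status="error", error=str(exc))
                raise
            finally:
                record["end"] = time.time()
                with open(log_path, "a") as f:       # one JSON object per line
                    f.write(json.dumps(record) + "\n")
        return inner
    return wrap

@provenance("normalize")
def normalize(xs):
    s = sum(xs)
    return [x / s for x in xs]

normalize([1, 2, 3])

# Once such records are loaded into Neo4j as nodes, a provenance question
# like "which task runs failed, and when?" becomes a graph query, e.g.:
#   MATCH (r:TaskRun {status: 'error'}) RETURN r.task, r.start ORDER BY r.start
```

    Keeping one JSON object per line is also what would make such a log structure adaptable to other NoSQL stores, as the abstract notes for ProvMod's logs.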

    Toward understanding I/O behavior in HPC workflows

    Scientific discovery increasingly depends on complex workflows consisting of multiple phases and sometimes millions of parallelizable tasks or pipelines. These workflows access storage resources for a variety of purposes, including preprocessing, simulation output, and postprocessing steps. Unfortunately, most workflow models focus on the scheduling and allocation of computational resources for tasks, while the impact on storage systems remains a secondary objective and an open research question; I/O performance is not usually accounted for in the workflow telemetry reported to users. In this paper, we present an approach to augment the I/O efficiency of the individual tasks of workflows by combining workflow description frameworks with system I/O telemetry data. A conceptual architecture and a prototype implementation for HPC data center deployments are introduced. We also identify and discuss challenges that will need to be addressed by workflow management and monitoring systems for HPC in the future. We demonstrate how real-world applications and workflows could benefit from the approach, and we show how it helps communicate performance-tuning guidance to users.
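
    As a rough sketch of the combination proposed above (the job identifiers, counter names, and threshold are invented for illustration; real deployments would pull such counters from I/O monitoring tools like Darshan), the key step is joining per-task records from the workflow description with system I/O telemetry and reporting I/O efficiency next to the usual scheduling telemetry.

```python
workflow_tasks = [  # from the workflow description framework (illustrative)
    {"task": "preprocess",  "job_id": "a1"},
    {"task": "simulate",    "job_id": "a2"},
    {"task": "postprocess", "job_id": "a3"},
]
io_telemetry = {    # from data-center I/O monitoring, keyed by job id
    "a1": {"read_b": 2e9, "write_b": 5e8, "io_s": 40.0,  "wall_s": 60.0},
    "a2": {"read_b": 1e8, "write_b": 8e9, "io_s": 15.0,  "wall_s": 3600.0},
    "a3": {"read_b": 8e9, "write_b": 1e8, "io_s": 500.0, "wall_s": 550.0},
}

def io_report(tasks, telemetry, io_fraction_alert=0.5):
    """Join workflow tasks with I/O counters; flag I/O-dominated tasks."""
    rows = []
    for t in tasks:
        m = telemetry[t["job_id"]]
        bandwidth = (m["read_b"] + m["write_b"]) / m["io_s"] / 1e6  # MB/s
        io_fraction = m["io_s"] / m["wall_s"]  # share of runtime spent in I/O
        rows.append({"task": t["task"],
                     "MB_per_s": round(bandwidth, 1),
                     "io_fraction": round(io_fraction, 2),
                     "needs_tuning": io_fraction > io_fraction_alert})
    return rows

for row in io_report(workflow_tasks, io_telemetry):
    print(row)  # preprocess and postprocess are flagged, simulate is not
```

    Surfacing the per-task io_fraction is one plausible way to communicate the performance-tuning guidance mentioned above: tasks that spend most of their wall time in I/O are the ones worth restructuring or moving to faster storage tiers.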