21 research outputs found

    Studying Software Engineering Patterns for Designing Machine Learning Systems

    Machine-learning (ML) techniques have become popular in recent years. ML techniques rely on mathematics and on software engineering. Researchers and practitioners are studying best practices for designing ML application systems and software to address the software complexity and quality of ML techniques. Such design practices are often formalized as architecture patterns and design patterns by encapsulating reusable solutions to commonly occurring problems within given contexts. However, to the best of our knowledge, there has been no work collecting, classifying, and discussing these software-engineering (SE) design patterns for ML techniques systematically. Thus, we set out to collect good/bad SE design patterns for ML techniques to provide developers with a comprehensive and ordered classification of such patterns. We report here preliminary results of a systematic-literature review (SLR) of good/bad design patterns for ML

    Seamless Reconfiguration of Rule-Based IoT Applications

    The Internet of Things (IoT) consists of devices and software interacting together in order to build powerful, value-added services. One of the main challenges in this context is to support end-users with simple, user-friendly, and automated techniques to design such applications. Given the dynamicity of IoT applications, these techniques should consider that these applications are in most cases not built once and for all. They can evolve over time and objects may be added or removed for several reasons (replacement, loss of connectivity, upgrade, failure, etc.). In this paper, we propose new techniques for supporting the reconfiguration of running IoT applications. These techniques compare two versions of the application (before and after reconfiguration) to check if several properties of interest from a reconfiguration perspective are preserved. The analysis techniques have been implemented using the Maude framework and integrated into the WebThings platform

    On-Premise AIOps Infrastructure for a Software Editor SME: An Experience Report

    Information Technology has become a critical component in various industries, leading to an increased focus on software maintenance and monitoring. With the complexities of modern software systems, traditional maintenance approaches have become insufficient. The concept of AIOps has emerged to enhance predictive maintenance using Big Data and Machine Learning capabilities. However, exploiting AIOps requires addressing several challenges related to the complexity of data and incident management. Commercial solutions exist, but they may not be suitable for certain companies due to high costs, data governance issues, and limitations in covering private software. This paper investigates the feasibility of implementing on-premise AIOps solutions by leveraging open-source tools. We introduce a comprehensive AIOps infrastructure that we have successfully deployed in our company, and we provide the rationale behind different choices that we made to build its various components. Particularly, we provide insights into our approach and criteria for selecting a data management system and we explain its integration. Our experience can be beneficial for companies seeking to internally manage their software maintenance processes with a modern AIOps approach

    MuDelta: Delta-Oriented Mutation Testing at Commit Time

    To effectively test program changes using mutation testing, one needs to use mutants that are relevant to the altered program behaviours. In view of this, we introduce MuDelta, an approach that identifies commit-relevant mutants, that is, mutants that affect and are affected by the changed program behaviours. Our approach uses machine learning applied to a combined scheme of graph and vector-based representations of static code features. Our results, from 50 commits in 21 Coreutils programs, demonstrate a strong prediction ability of our approach, yielding 0.80 (ROC) and 0.50 (PR Curve) AUC values with 0.63 and 0.32 precision and recall values. These predictions are significantly higher than random guesses (0.20 PR-Curve AUC, 0.21 and 0.21 precision and recall), and subsequently lead to strong relevant tests that kill 45% more relevant mutants than randomly sampled mutants (either sampled from those residing on the changed component(s) or from the changed lines). Our results also show that MuDelta selects mutants with 27% higher fault revealing ability in fault introducing commits. Taken together, our results corroborate the conclusion that commit-based mutation testing is suitable and promising for evolving software

    Heterogeneous Anomaly Detection for Software Systems via Semi-supervised Cross-modal Attention

    Prompt and accurate detection of system anomalies is essential to ensure the reliability of software systems. Unlike manual efforts that exploit all available run-time information, existing approaches usually leverage only a single type of monitoring data (often logs or metrics) or fail to make effective use of the joint information among different types of data. Consequently, many false predictions occur. To better understand the manifestations of system anomalies, we conduct a systematic study on a large amount of heterogeneous data, i.e., logs and metrics. Our study demonstrates that logs and metrics can manifest system anomalies collaboratively and complementarily, and that neither alone is sufficient. Thus, integrating heterogeneous data can help recover the complete picture of a system's health status. In this context, we propose Hades, the first end-to-end semi-supervised approach to effectively identify system anomalies based on heterogeneous data. Our approach employs a hierarchical architecture to learn a global representation of the system status by fusing log semantics and metric patterns. It captures discriminative features and meaningful interactions from heterogeneous data via a cross-modal attention module, trained in a semi-supervised manner. We evaluate Hades extensively on large-scale simulated data and datasets from Huawei Cloud. The experimental results demonstrate the effectiveness of our model in detecting system anomalies. We also release the code and the annotated dataset for replication and future research. Comment: In Proceedings of the 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE). arXiv admin note: substantial text overlap with arXiv:2207.0291

    What constitutes a machine-learning-driven business model? A taxonomy of B2B start-ups with machine learning at their core

    Artificial intelligence, specifically machine learning (ML), technologies are powerfully driving business model innovation in organizations against the backdrop of increasing digitalization. The resulting novel business models are profoundly shaped by ML, a technology that brings about unique opportunities and challenges. However, to date, little research examines what exactly constitutes these business models that use ML at their core and how they can be distinguished. Therefore, this study aims to contribute to an increased understanding of the anatomy of ML-driven business models in the business-to-business segment. To this end, we develop a taxonomy that allows researchers and practitioners to differentiate these ML-driven business models according to their characteristics along ten dimensions. Additionally, we derive archetypes of ML-driven business models through a cluster analysis based on the characteristics of 102 start-ups from the database Crunchbase. Our results are cross-industry, providing fertile soil for expansion through future investigations

    A Domain-Specific Language for Monitoring ML Model Performance

    As machine learning (ML) starts to offer competitive advantages for an increasing number of application domains, many organisations invest in developing ML-enabled products. The development of these products poses unique challenges compared to traditional software engineering projects and requires the collaboration of people from different disciplines. This work focuses on alleviating some of these challenges related to implementing monitoring systems for deployed ML models. To this end, a domain-specific language (DSL) is developed that data scientists can use to declaratively define monitoring workflows. Complementary to the DSL, a runtime component is developed that implements the specified behaviour. This component is designed to be easily integrated with the rest of an organisation's ML platform and extended by software engineers that do not necessarily have experience with model-driven engineering. An evaluation of the proposed system that supports the validity of the approach is also presented

    Reducing the Length of Field-replay Based Load Testing

    With the development of software, load testing has become more and more important. Load testing can ensure that a software system provides quality service under a certain load. Therefore, one of the common challenges of load testing is to design realistic workloads that can represent the actual workload in the field. In particular, one of the most widely adopted and intuitive approaches is to directly replay the field workloads in the load testing environment, which is resource- and time-consuming. In this work, we propose an automated approach to reduce the length of load testing that is driven by replaying the field workloads. The intuition of our approach is: if the measured performance associated with a particular system behaviour is already stable, we can skip subsequent testing of this system behaviour to reduce the length of the field workloads. In particular, our approach first clusters execution logs that are generated during the system runtime to identify similar system behaviours during the field workloads. Then, we use statistical methods to determine whether the measured performance associated with a system behaviour has been stable. We evaluate our approach on three open-source projects (i.e., OpenMRS, TeaStore, and Apache James). The results show that our approach can significantly reduce the length of field workloads while the workloads-after-reduction produced by our approach are representative of the original set of workloads. More importantly, the load testing results obtained by replaying the workloads after the reduction are highly correlated with, and follow a similar trend to, those of the original set of workloads. Practitioners can leverage our approach to perform realistic field-replay based load testing while saving the needed resources and time
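The skip-when-stable intuition in the abstract above can be illustrated with a small sketch. It is not the authors' implementation. The abstract does not say which statistical methods are used, so the stability check here (comparing the means of the two most recent windows of measurements) is a stand-in chosen for simplicity. The names `is_stable` and `reduce_workload` and the `window` and `tolerance` parameters are hypothetical, and the behaviour labels are assumed to come from an earlier log-clustering step that is not shown.

```python
from statistics import mean


def is_stable(samples, window=5, tolerance=0.05):
    """Assumed stability rule: the means of the last two windows of
    measurements differ by at most `tolerance` (relative difference)."""
    if len(samples) < 2 * window:
        return False  # not enough measurements to compare two windows
    prev = mean(samples[-2 * window:-window])
    curr = mean(samples[-window:])
    if prev == 0:
        return curr == 0
    return abs(curr - prev) / abs(prev) <= tolerance


def reduce_workload(events, window=5, tolerance=0.05):
    """Keep only the events whose behaviour cluster is not yet stable.

    `events` is an iterable of (behaviour_cluster, measured_performance)
    pairs; the cluster labels are assumed to be precomputed.
    """
    history = {}  # cluster label -> measurements kept so far
    kept = []
    for cluster, perf in events:
        seen = history.setdefault(cluster, [])
        if is_stable(seen, window, tolerance):
            continue  # performance already stable: skip this replay step
        seen.append(perf)
        kept.append((cluster, perf))
    return kept


if __name__ == "__main__":
    events = [("login", 100.0)] * 20 + [("search", 50.0)] * 3
    print(len(reduce_workload(events)), "of", len(events), "events kept")
```

In this toy run, the repeated "login" behaviour stops being replayed once two consecutive windows agree, while the rarely seen "search" behaviour is kept in full. A real setup would likely replace the window comparison with a proper statistical test.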