2,949 research outputs found

    Detecting feature influences to quality attributes in large and partially measured spaces using smart sampling and dynamic learning

    Get PDF
    Emergent application domains (e.g., Edge Computing/Cloud/B5G systems) are too complex to be built manually. They are characterised by high variability and are modelled by large Variability Models (VMs), leading to large configuration spaces. Due to the high number of variants in such systems, it is challenging to find the best-ranked product regarding particular Quality Attributes (QAs) in a short time. Moreover, measuring QAs is sometimes not trivial and requires a lot of time and resources, as is the case for the energy footprint of software systems, which is the focus of this paper. Hence, we need a mechanism to analyse how features and their interactions influence the energy footprint without measuring all configurations. Sampling and predictive techniques are practical, but their accuracy rests on uniform spaces or some initial domain knowledge, neither of which is always available. Indeed, analysing the energy footprint of products in large configuration spaces raises specific requirements that we explore in this work. This paper presents SAVRUS (Smart Analyser of Variability Requirements in Unknown Spaces), an approach for sampling and dynamic statistical learning over large and partially QA-measured spaces that does not rely on initial domain knowledge. SAVRUS reports the degree to which features and pairwise interactions influence a particular QA, such as energy efficiency. We validate and evaluate SAVRUS on a selection of comparable systems that define large search spaces containing scattered measurements.
    Funding for open access charge: Universidad de Málaga / CBUA. This work is supported by the European Union's H2020 research and innovation programme under grant agreement DAEMON H2020-101017109, by the projects IRIS PID2021-122812OB-I00 (co-financed by FEDER funds), Rhea P18-FR-1081 (MCI/AEI/FEDER, UE), and LEIA UMA18-FEDERIA-157, and by the PRE2019-087496 grant from the Ministerio de Ciencia e Innovación, Spain.
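    The abstract does not detail SAVRUS's pipeline, so the sketch below only illustrates the general idea it names: estimating how features and pairwise interactions influence a QA from a sparse sample of measured configurations. It fits a sparse linear model over interaction terms; the feature count, sample size, and toy energy function are all invented for illustration and are not SAVRUS itself.

```python
# Minimal sketch (not SAVRUS itself): estimate how features and pairwise
# interactions influence a quality attribute from a sparse sample of
# measured configurations, using a sparse linear model over interaction terms.
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)

n_features, n_samples = 6, 80  # tiny toy space; real VMs are far larger
X = rng.integers(0, 2, size=(n_samples, n_features))  # sampled configurations

# Hypothetical ground truth: feature 0 and the (1,2) interaction drive energy.
energy = 5.0 + 3.0 * X[:, 0] + 2.0 * X[:, 1] * X[:, 2] \
         + rng.normal(0, 0.1, n_samples)

# Expand each configuration with all pairwise interaction columns.
poly = PolynomialFeatures(degree=2, interaction_only=True, include_bias=False)
X_int = poly.fit_transform(X)

# A sparse (L1-regularised) fit keeps only the terms that actually matter.
model = Lasso(alpha=0.05).fit(X_int, energy)

# Report the surviving coefficients as influence estimates.
for name, coef in zip(poly.get_feature_names_out(), model.coef_):
    if abs(coef) > 0.1:
        print(f"{name}: influence ~ {coef:.2f}")
```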

    Training Knowledge Graph Embedding Models

    Get PDF
    Knowledge graph embedding (KGE) models have become a popular means for making discoveries in knowledge graphs (e.g., RDF graphs) in an efficient and scalable manner. The key to the success of these models is their ability to learn low-rank vector representations for knowledge graph entities and relations. Despite the rapid development of KGE models, state-of-the-art approaches have mostly focused on new ways to represent embedding interaction functions (i.e., scoring functions). In this paper, we argue that the choice of other training components, such as the loss function, hyperparameters, and negative sampling strategies, can also have a substantial impact on model efficiency. This area has been largely neglected by previous work; our contribution is towards closing this gap through a thorough analysis of possible choices of training loss functions, hyperparameters, and negative sampling techniques. We finally investigate the effects of specific choices on the scalability and accuracy of knowledge graph embedding models.
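    To make two of the training components discussed above concrete, here is a minimal sketch of a margin ranking loss combined with uniform negative sampling. The TransE scoring function is assumed for concreteness (the paper surveys several choices), and all sizes and triples are illustrative.

```python
# Minimal sketch of two KGE training components: a margin ranking loss over
# TransE-style scores, with uniform negative sampling (corrupting the tail).
import numpy as np

rng = np.random.default_rng(0)
n_entities, n_relations, dim = 100, 10, 32

E = rng.normal(0, 0.1, (n_entities, dim))   # entity embeddings
R = rng.normal(0, 0.1, (n_relations, dim))  # relation embeddings

def score(h, r, t):
    """TransE score: smaller distance means a more plausible triple."""
    return np.linalg.norm(E[h] + R[r] - E[t])

def margin_loss(pos, margin=1.0):
    """Hinge loss pushing the positive triple to score better (lower)
    than a uniformly sampled negative by at least the margin.
    (Real implementations also filter out accidental true triples.)"""
    h, r, t = pos
    t_neg = rng.integers(n_entities)  # uniform negative sampling
    return max(0.0, margin + score(h, r, t) - score(h, r, t_neg))

# One toy batch of positive triples (head, relation, tail).
batch = [(0, 1, 2), (3, 1, 4), (5, 2, 6)]
print("batch loss:", sum(margin_loss(p) for p in batch) / len(batch))
```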

    Predicting Software Performance with Divide-and-Learn

    Full text link
    Predicting the performance of highly configurable software systems is the foundation for performance testing and quality assurance. To that end, recent work has relied on machine/deep learning to model software performance. However, a crucial yet unaddressed challenge is how to cater for the sparsity inherited from the configuration landscape: the influence of configuration options (features) and the distribution of data samples are highly sparse. In this paper, we propose an approach based on the concept of 'divide-and-learn', dubbed DaL. The basic idea is that, to handle sample sparsity, we divide the samples from the configuration landscape into distant divisions, for each of which we build a regularized Deep Neural Network as the local model to deal with the feature sparsity. A newly given configuration is then assigned to the right division, whose model makes the final prediction. Experimental results from eight real-world systems and five sets of training data reveal that, compared with state-of-the-art approaches, DaL performs no worse than the best counterpart in 33 out of 40 cases (and significantly better in 26 of them), with up to 1.94× improvement in accuracy; it requires fewer samples to reach the same or better accuracy; and it incurs acceptable training overhead. Practically, DaL also considerably improves different global models when they are used as the underlying local models, which further strengthens its flexibility. To promote open science, all the data, code, and supplementary figures of this work can be accessed at our repository: https://github.com/ideas-labo/DaL.
    Comment: This paper has been accepted by the ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), 2023.
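    The routing idea behind divide-and-learn can be sketched briefly. Note this is not the authors' DaL code: it swaps in KMeans for the division step and decision trees for the local models, whereas the abstract describes distant divisions with regularized Deep Neural Networks as locals; the configuration data is invented.

```python
# Hedged sketch of the divide-and-learn idea (not the authors' DaL code):
# cluster sampled configurations into divisions, train one local model per
# division, and route a new configuration to its division for prediction.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

X = rng.integers(0, 2, size=(200, 8)).astype(float)  # toy configurations
y = X[:, 0] * 10 + X[:, 1] * X[:, 2] * 5 + rng.normal(0, 0.2, 200)  # toy perf

k = 4
divider = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)

# One local regressor per division, trained only on that division's samples.
local_models = []
for d in range(k):
    mask = divider.labels_ == d
    local_models.append(
        DecisionTreeRegressor(random_state=0).fit(X[mask], y[mask]))

def predict(config):
    """Assign the configuration to a division, then use its local model."""
    d = divider.predict(config.reshape(1, -1))[0]
    return local_models[d].predict(config.reshape(1, -1))[0]

new_config = rng.integers(0, 2, size=8).astype(float)
print("predicted performance:", predict(new_config))
```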

    Iterative robust search (iRoSe) : a framework for coevolutionary hyperparameter optimisation

    Get PDF
    Finding an optimal hyperparameter configuration for machine learning algorithms is challenging, both because hyperparameter effects can vary with the algorithm, dataset, and distribution, and because the large combinatorial search space of hyperparameter values demands expensive trials. Furthermore, extant optimisation procedures that search for optima randomly, in a manner non-specific to the optimisation problem, could be considered a priori unjustifiable when viewed through the "No Free Lunch" theorems. In seeking a coevolutionary, adaptive strategy that robustifies the search for optimal hyperparameter values, we investigate the specifics of the optimisation problem through 'macro-modelling', which abstracts away the complexity of the algorithm in terms of signal, control factors, noise factors, and response. We design and run a budgeted number of 'proportionally balanced' trials using a predetermined mix of candidate control factors. Based on the responses from these proportional trials, we conduct a 'main effects analysis' of the algorithm's individual hyperparameters, in terms of the signal-to-noise ratio, to derive hyperparameter configurations that enhance targeted performance characteristics through additivity. We formulate the iterative Robust Search (iRoSe) hyperparameter optimisation framework, which leverages these problem-specific insights. Initialised with a valid hyperparameter configuration, iRoSe demonstrates the ability to converge adaptively to a configuration that yields an effective gain in the performance characteristic, through designed search trials that are justifiable under extant theory. We demonstrate the iRoSe optimisation framework on a Deep Neural Network and the CIFAR-10 dataset, comparing it to a Bayesian optimisation procedure to highlight the transformation achieved.
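    The 'main effects analysis' step admits a compact sketch: compute a signal-to-noise ratio per trial and average it per level of each factor, picking the best level. The trial matrix, factor names, and responses below are made up; iRoSe layers proportional trial design and iteration on top of this single step.

```python
# Minimal sketch of the 'main effects analysis' step: given responses from
# balanced trials, compute a larger-is-better signal-to-noise (S/N) ratio
# per trial and select the best level of each hyperparameter (factor).
import numpy as np

# Each row is one trial; columns are the levels chosen for (lr, batch, dropout).
trials = np.array([
    [0, 0, 0], [0, 1, 1], [1, 0, 1], [1, 1, 0],
])
# Repeated accuracy measurements per trial (noise across repeats).
responses = np.array([
    [0.71, 0.70], [0.74, 0.75], [0.81, 0.80], [0.78, 0.77],
])

def sn_larger_is_better(y):
    """Taguchi S/N ratio for responses where larger values are better."""
    return -10 * np.log10(np.mean(1.0 / y**2))

sn = np.array([sn_larger_is_better(r) for r in responses])

for j, name in enumerate(["lr", "batch", "dropout"]):
    # Main effect of this factor: mean S/N over trials at each level.
    effects = [sn[trials[:, j] == lvl].mean() for lvl in (0, 1)]
    print(f"{name}: best level = {int(np.argmax(effects))}, "
          f"S/N per level = {np.round(effects, 2)}")
```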

    Automated analysis of feature models: Quo vadis?

    Get PDF
    Feature models have been used since the 1990s to describe software product lines as a way of reusing common parts in a family of software systems. In 2010, a systematic literature review was published summarizing the advances and settling the basis of the area of Automated Analysis of Feature Models (AAFM). Since then, different studies have applied AAFM in different domains. In this paper, we provide an overview of the evolution of this field since 2010 by performing a systematic mapping study over 423 primary sources. We found six variability facets where AAFM is being applied and that define the current tendencies: product configuration and derivation; testing and evolution; reverse engineering; multi-model variability analysis; variability modelling; and variability-intensive systems. We also confirmed that there is a lack of industrial evidence in most cases. Finally, we report where and when the papers have been published and which authors and institutions are contributing to the field. We observed that the field's maturity is evidenced by the growing number of journal publications over the years, as well as by the diversity of conferences and workshops where papers appear. We also suggest synergies with other areas, such as cloud or mobile computing, that can motivate further research in the future.
    Ministerio de Economía y Competitividad TIN2015-70560-R; Junta de Andalucía TIC-186
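    For readers unfamiliar with AAFM, one of its core operations is enumerating or counting the valid products a feature model permits. The toy model below is invented for illustration and uses brute force; real AAFM tooling compiles feature models to SAT, BDD, or CSP solvers instead.

```python
# Toy illustration of one core AAFM operation: enumerating the valid
# products of a small, invented feature model by brute force.
from itertools import product

features = ["base", "gui", "cli", "logging"]

def valid(cfg):
    """Constraints of the toy model: the root ('base') is mandatory;
    'gui' and 'cli' are alternatives (exactly one is selected);
    'logging' requires 'gui'."""
    return (cfg["base"]
            and (cfg["gui"] ^ cfg["cli"])
            and (not cfg["logging"] or cfg["gui"]))

products = []
for values in product([False, True], repeat=len(features)):
    cfg = dict(zip(features, values))
    if valid(cfg):
        products.append(cfg)

print(f"{len(products)} valid products")  # 3 for this toy model
for p in products:
    print(sorted(name for name, on in p.items() if on))
```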