
    Automated analysis of feature models: Quo vadis?

    Feature models have been used since the 1990s to describe software product lines as a way of reusing common parts in a family of software systems. In 2010, a systematic literature review was published summarizing the advances and establishing the foundations of the area of Automated Analysis of Feature Models (AAFM). Since then, different studies have applied AAFM techniques in different domains. In this paper, we provide an overview of the evolution of this field since 2010 by performing a systematic mapping study considering 423 primary sources. We found six variability facets where AAFM is being applied that define the current tendencies: product configuration and derivation; testing and evolution; reverse engineering; multi-model variability analysis; variability modelling; and variability-intensive systems. We also confirmed that there is a lack of industrial evidence in most cases. Finally, we present where and when these papers have been published and which authors and institutions are contributing to the field. We observed that the field's maturity is reflected in the growing number of journal publications over the years, as well as in the diversity of conferences and workshops where papers are published. We also suggest synergies with other areas, such as cloud or mobile computing, that can motivate further research in the future.
    Ministerio de Economía y Competitividad TIN2015-70560-R
    Junta de Andalucía TIC-186

    On Using Active Learning and Self-Training when Mining Performance Discussions on Stack Overflow

    Abundant data is the key to successful machine learning. However, supervised learning requires annotated data that are often hard to obtain. In a classification task with limited resources, Active Learning (AL) promises to guide annotators to the examples that bring the most value for a classifier. AL can be successfully combined with self-training, i.e., extending a training set with the unlabelled examples for which a classifier is the most certain. We report our experiences of using AL in a systematic manner to train an SVM classifier for Stack Overflow posts discussing the performance of software components. We show that the training examples deemed most valuable to the classifier are also the most difficult for humans to annotate. Despite carefully evolved annotation criteria, we report low inter-rater agreement, but we also propose mitigation strategies. Finally, based on one annotator's work, we show that self-training can improve the classification accuracy. We conclude the paper by discussing implications for future text miners aspiring to use AL and self-training.
    Comment: Preprint of a paper accepted for the Proc. of the 21st International Conference on Evaluation and Assessment in Software Engineering, 2017
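
    The loop described above can be sketched compactly. The snippet below is a minimal illustration, assuming a linear SVM and margin distance as the certainty measure; the synthetic data and the function name al_self_training_round are invented for the example, not taken from the paper.

```python
import numpy as np
from sklearn.svm import LinearSVC

def al_self_training_round(clf, X_lab, y_lab, X_pool, k_query=10, k_self=50):
    """One round: send the k least certain pool items to the annotator,
    self-label the k most certain ones with the classifier's own prediction."""
    clf.fit(X_lab, y_lab)
    margin = np.abs(clf.decision_function(X_pool))  # distance to the hyperplane
    query_idx = np.argsort(margin)[:k_query]        # least certain -> human annotation
    self_idx = np.argsort(margin)[-k_self:]         # most certain -> self-training
    return query_idx, self_idx, clf.predict(X_pool[self_idx])

# Tiny synthetic demo standing in for vectorised Stack Overflow posts.
rng = np.random.default_rng(0)
X_lab = rng.normal(size=(40, 5))
y_lab = (X_lab[:, 0] > 0).astype(int)
X_pool = rng.normal(size=(200, 5))
query, _, self_labels = al_self_training_round(LinearSVC(), X_lab, y_lab, X_pool)
print("ask the annotator about pool items:", query)
```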

    Shuffled Complex-Self Adaptive Hybrid EvoLution (SC-SAHEL) Optimization Framework

    The simplicity and flexibility of meta-heuristic optimization algorithms have attracted considerable attention in the field of optimization. Different optimization methods, however, have algorithm-specific strengths and limitations, and selecting the best-performing algorithm for a specific problem is a tedious task. We introduce a new hybrid optimization framework, entitled Shuffled Complex-Self Adaptive Hybrid EvoLution (SC-SAHEL), which combines the strengths of different evolutionary algorithms (EAs) in a parallel computing scheme. SC-SAHEL evaluates the performance of the different EAs during population evolution, e.g., their capability to escape local attractors, their speed, and their convergence, since each individual EA suits different response surfaces. The SC-SAHEL algorithm is benchmarked over 29 conceptual test functions and a real-world hydropower reservoir model case study. Results show that the hybrid SC-SAHEL algorithm is rigorous and effective in finding the global optimum for a majority of the test cases, and that it is computationally efficient in comparison to the individual EAs.
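
    As a rough illustration of the shuffled-complex, self-adaptive idea, the toy sketch below splits a population into complexes, lets one of two candidate "EAs" evolve each complex, and gives more weight to whichever operator improved its complex. The two step functions and the credit rule are invented stand-ins, not the paper's actual operators or selection scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

def sphere(x):                                  # toy objective, minimum at the origin
    return float(np.sum(x ** 2))

def de_step(cx, f):                             # crude differential-evolution-style move
    a, b, c = cx[rng.choice(len(cx), 3, replace=False)]
    trial = a + 0.5 * (b - c)
    worst = int(np.argmax([f(x) for x in cx]))
    if f(trial) < f(cx[worst]):
        cx[worst] = trial
    return cx

def mutate_step(cx, f):                         # crude Gaussian-mutation move
    i = rng.integers(len(cx))
    trial = cx[i] + rng.normal(0.0, 0.1, cx[i].shape)
    if f(trial) < f(cx[i]):
        cx[i] = trial
    return cx

eas, credit = [de_step, mutate_step], [1.0, 1.0]
pop = rng.uniform(-5, 5, (20, 4))
for _ in range(300):
    ranked = pop[np.argsort([sphere(x) for x in pop])]
    evolved = []
    for cx in np.array_split(ranked, 4):        # partition the population into complexes
        k = rng.choice(len(eas), p=np.array(credit) / sum(credit))  # favour winning EAs
        before = min(sphere(x) for x in cx)
        cx = eas[k](cx.copy(), sphere)
        credit[k] += float(before - min(sphere(x) for x in cx) > 0)  # reward improvement
        evolved.append(cx)
    pop = np.vstack(evolved)                    # "shuffle" the complexes back together
print("best objective:", min(sphere(x) for x in pop))
```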

    Social learning in a multi-agent system

    In a persistent multi-agent system, it should be possible for new agents to benefit from the accumulated learning of more experienced agents. Analogous reasoning applies to newborn animals, and thus the biological literature on social learning may aid in the construction of effective multi-agent systems. Biologists have looked at both the functions of social learning and the mechanisms that enable it. Many researchers have focused on the cognitively complex mechanism of imitation; we also consider a range of simpler mechanisms that could more easily be implemented in robotic or software agents. Research in artificial life shows that complex global phenomena can arise from simple local rules. Similarly, complex information sharing at the system level may result from quite simple individual learning rules. We demonstrate in simulation that simple mechanisms can outperform imitation in a multi-agent system, and that the effectiveness of any social learning strategy depends on the agents' environment. Our simple mechanisms have obvious advantages in terms of robustness and design costs.
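
    A toy simulation in the spirit of the comparison reported above might look like the sketch below: agents pick among three behaviours either by individual trial or by copying the majority, a simple social cue. The environment, payoffs, and parameters are invented for illustration and are not the authors' model.

```python
import random

ARMS = [0.2, 0.5, 0.8]        # payoff probability of three candidate behaviours

def run(strategy, n_agents=20, steps=300, seed=1):
    random.seed(seed)
    choices = [random.randrange(len(ARMS)) for _ in range(n_agents)]
    payoff = 0
    for _ in range(steps):
        for i in range(n_agents):
            if strategy == "social" and random.random() < 0.5:
                choices[i] = max(set(choices), key=choices.count)  # copy the majority
            else:
                trial = random.randrange(len(ARMS))                # individual trial
                if random.random() < ARMS[trial]:                  # keep what paid off
                    choices[i] = trial
            payoff += random.random() < ARMS[choices[i]]
    return payoff

print("individual:", run("individual"), "social:", run("social"))
```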

    What kind of questions do developers ask on Stack Overflow? A comparison of automated approaches to classify posts into question categories

    On question and answer sites such as Stack Overflow (SO), developers use tags to label the content of a post and to support question searching and browsing. However, these tags mainly refer to technological aspects rather than the purpose of the question. Tagging questions with their purpose can add a new dimension to the identification of discussed topics in posts on SO. In this paper, we aim at automating the classification of SO question posts into seven question categories. As a first step, we harmonized existing taxonomies of question categories and then manually classified 1,000 SO questions according to our new taxonomy. In addition to the question category, we marked, for each of the posts, the phrases that indicate a question category. We then use this data set to automate the classification of posts with two approaches. For the first approach, we manually analyzed the phrases to find patterns and derived regular expressions from them; based on these regular expressions, we implemented, for each category, a classifier that determines whether a post belongs to that category. In the second approach, we use the curated data set to train classification models with supervised machine learning algorithms (Random Forest and Support Vector Machines). For the machine learning algorithms, we experimented with 1,312 different configurations regarding the preprocessing of the text and the representation of the input data. We then compared the performance of the regex approach with the performance of the best machine learning configuration on a validation set of 110 posts. The results show that, using the regular expression approach, we can classify posts into the correct question category with an average precision and recall of 0.90 and an MCC of 0.68. Additionally, we applied the regex approach to all questions of SO that deal with Android app development and investigated the co-occurrence of question categories in posts. We found that the categories API usage, Conceptual, and Discrepancy are the most frequently assigned question categories and that they also frequently occur together. Our approach can be used to support developers in browsing SO discussions or researchers in building recommender systems based on SO.
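
    The first approach lends itself to a compact sketch: one pattern list per question category, with a post assigned every category whose patterns match. The patterns below are illustrative guesses, not the curated ones derived from the paper's data set.

```python
import re

# Hypothetical patterns per question category; the real ones come from
# manual analysis of annotated phrases.
CATEGORY_PATTERNS = {
    "API usage":   [r"how (do|can) i use", r"is there a (method|api|function)"],
    "Discrepancy": [r"doesn'?t work", r"expected .* but (got|was)"],
    "Conceptual":  [r"what is the difference between", r"why (does|is)"],
}

def classify(post: str) -> list[str]:
    """Return every category whose regexes match the post text."""
    text = post.lower()
    return [cat for cat, pats in CATEGORY_PATTERNS.items()
            if any(re.search(p, text) for p in pats)]

print(classify("My loop doesn't work: expected 3 but got 5"))
```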

    Many-objective test suite generation for software product lines

    A Software Product Line (SPL) is a set of products built from a number of features, the set of valid products being defined by a feature model. Typically, it does not make sense to test all products defined by an SPL, and one instead chooses a set of products to test (test selection) and, ideally, derives a good order in which to test them (test prioritisation). Since one cannot know in advance which products will reveal faults, test selection and prioritisation are normally based on objective functions that are known to relate to likely effectiveness or cost. This paper introduces a new technique, the grid-based evolution strategy (GrES), which considers several objective functions that assess a selection or prioritisation and aims to optimise on all of them. The problem is thus a many-objective optimisation problem. We use a new approach in which all of the objective functions are considered but one (pairwise coverage) is seen as the most important. We also derive a novel evolution strategy based on domain knowledge. The results of the evaluation, on randomly generated and realistic feature models, were promising, with GrES outperforming previously proposed techniques and a range of many-objective optimisation algorithms.
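
    As a point of reference for the dominant objective, the sketch below orders products greedily by how many new feature pairs each covers, treating pairwise coverage as most important in the way the abstract describes. This is a plain greedy baseline, not the GrES evolution strategy itself.

```python
from itertools import combinations

def pairwise_prioritise(products):
    """products: list of feature sets; return them ordered by new-pair gain."""
    covered, order, remaining = set(), [], list(products)
    while remaining:
        # Pick the product covering the most feature pairs not yet seen.
        best = max(remaining,
                   key=lambda p: len(set(combinations(sorted(p), 2)) - covered))
        covered |= set(combinations(sorted(best), 2))
        order.append(best)
        remaining.remove(best)
    return order

print(pairwise_prioritise([{"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]))
```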

    Evaluation of Neuro-Evolution Algorithms for Tactic Volatility Aware Processes

    Our society increasingly relies on computer systems that perform a variety of tasks. From a self-driving car to a satellite in space relaying data from Mars rovers, we need these systems to perform optimally and without failure. One such point of failure these systems can encounter is the tactic volatility of an adaptation tactic. Adaptation tactics are defined workflows that allow systems to navigate their environment. Tactic volatility is the variance in the behavior of a tactic's attributes, such as cost and latency, or a combination of the two. Current systems consider these tactic attributes to be static. Studies have shown that not accounting for tactic volatility can adversely affect a system's ability to operate effectively and resiliently. To support self-adaptive systems and address this limitation, this paper proposes a Tactic Volatility Aware solution that utilizes an evolving Recurrent Neural Network (TVA-E). For this research, we used real-world data that has been made available to researchers and academics. This data contains real-world volatility and helps us demonstrate the positive impact of TVA-E when used in self-adaptive systems. We also employ uncertainty reduction tactics and show how they can assist in accounting for tactic volatility. This work serves as an evaluation and comparison of different machine learning methods for predicting and accounting for tactic volatility. We study four predictive mechanisms: Auto-Regressive Integrated Moving Average (ARIMA), evolving Recurrent Neural Network (eRNN), Multi-Layer Perceptron (MLP), and Support Vector Regression (SVR). These methods are studied within our TVA-E process, and we analyze how they can enhance a self-adaptive system's performance when it accounts for tactic volatility.
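
    The evaluation idea can be sketched with two of the listed baselines. The snippet below fits ARIMA and SVR on a synthetic tactic-latency history and compares one-step-ahead forecasts; the series is invented, and eRNN and MLP are omitted since eRNN is not available off the shelf.

```python
import numpy as np
from sklearn.svm import SVR
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(1)
latency = 100 + np.cumsum(rng.normal(0, 2, 200))   # drifting tactic latency, invented

# ARIMA: classical time-series forecast of the next latency value.
arima_pred = ARIMA(latency, order=(1, 1, 1)).fit().forecast(1)[0]

# SVR: regress the next value from a sliding window of recent values.
lag = 5
X = np.array([latency[i:i + lag] for i in range(len(latency) - lag)])
y = latency[lag:]
svr_pred = SVR().fit(X, y).predict(latency[-lag:].reshape(1, -1))[0]

print(f"ARIMA: {arima_pred:.1f}  SVR: {svr_pred:.1f}  last observed: {latency[-1]:.1f}")
```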

    Why did you clone these identifiers? Using Grounded Theory to understand Identifier Clones

    Developers spend most of their time comprehending source code, with some studies estimating that this activity takes between 58% and 70% of a developer's time. To improve the readability of source code, and therefore the productivity of developers, it is important to understand what aspects of static code analysis and syntactic code structure hinder the understandability of code. Identifiers are a main source of code comprehension due to their large volume and their role as implicit documentation of a developer's intent when writing code. Despite the critical role that identifiers play during program comprehension, there are no regulated naming standards for developers to follow when picking identifier names. Our research supports previous work aimed at understanding what makes a good identifier name, and practices to follow when picking names, by exploring a phenomenon that occurs during identifier naming: identifier clones. Identifier clones are two or more identifiers that are declared using the same name. This is an important yet unexplored phenomenon in identifier naming in which developers intentionally give the same name to two or more identifiers in separate parts of a system. We must study identifier clones to understand their impact on program comprehension and to better understand the nature of identifier naming. To accomplish this, we conducted an empirical study on identifier clones detected in open-source engineered software systems and propose a taxonomy of identifier clones containing categories that can explain why they are introduced into systems and whether they represent naming antipatterns.
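
    A minimal detector for identifier clones as defined above, two or more identifiers declared with the same name in separate parts of a system, could look like the sketch below. It scans only Python function and class definitions, which is a simplification of the study's setting.

```python
import ast
from collections import defaultdict
from pathlib import Path

def find_identifier_clones(root="."):
    decls = defaultdict(list)                     # name -> [(file, line), ...]
    for path in Path(root).rglob("*.py"):
        try:
            tree = ast.parse(path.read_text(errors="ignore"))
        except SyntaxError:
            continue                              # skip files that do not parse
        for node in ast.walk(tree):
            if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
                decls[node.name].append((path.name, node.lineno))
    # A clone is any name declared at two or more distinct sites.
    return {name: sites for name, sites in decls.items() if len(sites) > 1}

for name, sites in find_identifier_clones().items():
    print(name, sites)
```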

    Can feature requests reveal the refactoring types?

    Software refactoring is the process of improving the design of a software system while preserving its external behavior. In recent years, refactoring research has been growing as a response to the degradation of software quality. Recent studies have performed in-depth investigations of (1) how refactoring practices take place during software evolution, (2) how to recommend refactoring to improve the design of software, and (3) what types of refactoring operations can be implemented. However, there is a lack of support when it comes to developers' typical programming tasks, including feature updates and bug fixes. The goal of this thesis is to investigate whether it is possible to support developers by recommending appropriate refactoring types to be performed when a developer is assigned a given issue to handle. Our proposed solution takes as input the text of the issue along with the source code and tries to predict the appropriate refactoring type that would help in efficiently adapting the existing source code to the given feature request. To do so, we rely on supervised learning. We start by collecting various issues that were handled using refactoring. This data is used to train a model that predicts the appropriate refactoring, given an open issue description as input. We design a classification model that inputs a feature request and suggests a method-level refactoring. The classification model was trained with a total of 4,008 feature request examples covering four refactoring types. Our initial results show that this solution suffers from several challenges, including class imbalance: not all refactoring types are equally used to handle issues. Another challenge we detected is related to the description of the issue itself, which typically does not explicitly mention any potential refactoring. Therefore, a large set of issues is needed to appropriately learn any patterns among them that would discriminate towards a given refactoring type.
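
    A stripped-down version of such a pipeline might look like the sketch below: TF-IDF features over issue text, a linear classifier predicting a method-level refactoring type, and class weighting to soften the reported imbalance. The toy issues and labels are invented, and the thesis does not necessarily use these exact components.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented examples standing in for the 4,008 labelled feature requests.
issues = [
    "split the export routine, it does too many things",
    "move the validation helper next to the parser it serves",
    "this parameter list keeps growing, wrap it up",
    "duplicate save logic in both controllers",
]
labels = ["Extract Method", "Move Method", "Extract Method", "Extract Method"]

# class_weight="balanced" counteracts the class imbalance the thesis reports.
model = make_pipeline(TfidfVectorizer(),
                      LogisticRegression(class_weight="balanced", max_iter=1000))
model.fit(issues, labels)
print(model.predict(["pull the duplicated parsing code into one method"]))
```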

    Advancing Climate Change Research and Hydrocarbon Leak Detection by Combining Dissolved Carbon Dioxide and Methane Measurements with ADCP Data

    With the emergence of large-scale, comprehensive environmental monitoring projects, there is an increased need to combine state-of-the-art technologies to address complicated problems such as ocean acidification and hydrocarbon leak detection.