1,955 research outputs found
MEG: Multi-objective Ensemble Generation for Software Defect Prediction
Background: Defect Prediction research aims at assisting software
engineers in the early identification of software defects during the
development process. A variety of automated approaches, ranging from traditional classification models to more sophisticated
learning approaches, have been explored to this end. Among these,
recent studies have proposed the use of ensemble prediction models
(i.e., aggregation of multiple base classifiers) to build more robust
defect prediction models.
Aims: In this paper, we introduce a novel
approach based on multi-objective evolutionary search to automatically generate defect prediction ensembles. Our proposal is not
only novel with respect to the more general area of evolutionary
generation of ensembles, but it also advances the state-of-the-art
in the use of ensembles in defect prediction.
Method: We assess
the effectiveness of our approach, dubbed Multi-objective
Ensemble Generation (MEG), by empirically benchmarking it
with respect to the most related proposals we found in the literature
on defect prediction ensembles and on multi-objective evolutionary
ensembles (which, to the best of our knowledge, had never been
previously applied to tackle defect prediction).
Result: Our results
show that MEG is able to generate ensembles which produce similar
or more accurate predictions than those achieved by all the other
approaches considered in 73% of the cases (with favourable large
effect sizes in 80% of them).
Conclusions: MEG is not only able
to generate ensembles that yield more accurate defect predictions
with respect to the benchmarks considered, but it also does so automatically, thus relieving engineers of the burden of manual
design and experimentation.
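The abstract above describes an ensemble as an aggregation of multiple base classifiers. The following is a minimal sketch of that aggregation step only, using simple majority voting. It is not MEG itself, which searches for ensembles with a multi-objective evolutionary algorithm. The base classifiers, metric order, and thresholds are hypothetical and chosen purely for illustration.

```python
from collections import Counter
from typing import Callable, Sequence

# A base classifier maps a module's metrics to 1 (defective) or 0 (clean).
Classifier = Callable[[Sequence[float]], int]


def majority_vote(classifiers: Sequence[Classifier], metrics: Sequence[float]) -> int:
    """Aggregate base-classifier predictions by majority vote.

    Ties are resolved in favour of "defective" (an assumption of this sketch).
    """
    votes = Counter(clf(metrics) for clf in classifiers)
    return 1 if votes[1] >= votes[0] else 0


# Hypothetical base classifiers over (lines_of_code, complexity, churn).
def by_size(m: Sequence[float]) -> int:
    return 1 if m[0] > 300 else 0


def by_complexity(m: Sequence[float]) -> int:
    return 1 if m[1] > 10 else 0


def by_churn(m: Sequence[float]) -> int:
    return 1 if m[2] > 15 else 0


ensemble = [by_size, by_complexity, by_churn]
large_module = (500, 3, 20)   # flagged by size and churn
small_module = (100, 3, 20)   # flagged by churn only
```

Here `majority_vote(ensemble, large_module)` yields 1 because two of the three classifiers flag the module, while `small_module` is classified as clean.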
Genetic algorithm-based multi-objective optimization model for software bugs prediction
The accuracy and reliability of software are critical factors in the operation of any electronic or computing device. Several conventional methods of software bug prediction exist, but they depend solely on static code metrics and ignore the syntactic structures and semantic information of programs, which are more appropriate for developing accurate predictive models. In this paper, software bugs are predicted using a Genetic Algorithm (GA)-based multi-objective optimization model implemented in MATLAB on the National Aeronautics and Space Administration (NASA) dataset. The dataset comprises thirty-eight distinct factors, which were reduced to six major factors using the Principal Component Analysis (PCA) algorithm in SPSS, after which a linear regression equation was derived. The developed GA-based multi-objective optimization model was then tested, and its accuracy and sensitivity were analyzed for successful bug detection. Optimal values ranging from 95% to 97% were recorded, with an average accuracy of 96.4%, derived through MATLAB-implemented measures of critical similarities. The research findings reveal that the proposed model will provide an effective solution to the problem of predicting buggy software in general circulation.
Is One Hyperparameter Optimizer Enough?
Hyperparameter tuning is the black art of automatically finding a good
combination of control parameters for a data miner. While widely applied in
empirical Software Engineering, there has not been much discussion on which
hyperparameter tuner is best for software analytics. To address this gap in the
literature, this paper applied a range of hyperparameter optimizers (grid
search, random search, differential evolution, and Bayesian optimization) to
the defect prediction problem. Surprisingly, no hyperparameter optimizer was
observed to be 'best' and, for one of the two evaluation measures studied here
(F-measure), hyperparameter optimization, in 50% of cases, was no better than
using default configurations.
We conclude that hyperparameter optimization is more nuanced than previously
believed. While such optimization can certainly lead to large improvements in
the performance of classifiers used in software analytics, it remains to be
seen which specific optimizers should be applied to a new dataset.
Comment: 7 pages, 2 columns, accepted for SWAN1
Intelligent Web Services Architecture Evolution Via An Automated Learning-Based Refactoring Framework
Architecture degradation can have a fundamental impact on software quality and productivity, resulting in an inability to support new features, increasing technical debt and leading to significant losses. While code-level refactoring is widely studied and well supported by tools, architecture-level refactorings, such as repackaging to group related features into one component, or retrofitting files into patterns, remain expensive and risky. Several domains, such as Web services, heavily depend on complex architectures to design and implement interface-level operations, provided by several companies such as FedEx, eBay, Google, Yahoo and PayPal, to the end-users. The objectives of this work are: (1) to advance our ability to support complex architecture refactoring by explicitly defining Web service anti-patterns at various levels of abstraction, (2) to enable complex refactorings by learning from user feedback and creating reusable/personalized refactoring strategies to augment intelligent designers' interaction that will guide low-level refactoring automation with high-level abstractions, and (3) to enable intelligent architecture evolution by detecting, quantifying, prioritizing, fixing and predicting design technical debts. We proposed various approaches and tools based on intelligent computational search techniques for (a) predicting and detecting multi-level Web services antipatterns, (b) creating an interactive refactoring framework that integrates refactoring path recommendation, design-level human abstraction, and code-level refactoring automation with user feedback using interactive multi-objective search, and (c) automatically learning reusable and personalized refactoring strategies for Web services by abstracting recurring refactoring patterns from Web service releases.
Based on empirical validations performed on both large open source and industrial services from multiple providers (eBay, Amazon, FedEx and Yahoo), we found that the proposed approaches advance our understanding of the correlation and mutual impact between service antipatterns at different levels, revealing when, where and how architecture-level anti-patterns affect the quality of services. Based on several controlled experiments, the interactive refactoring framework enables human-based, domain-specific abstraction and high-level design to guide automated code-level atomic refactoring steps for service decomposition. The reusable refactoring strategy packages recurring refactoring activities into automatable units, improving refactoring path recommendation and further reducing time-consuming and error-prone human intervention.
Ph.D., College of Engineering & Computer Science, University of Michigan-Dearborn
https://deepblue.lib.umich.edu/bitstream/2027.42/142810/1/Wang Final Dissertation.pdf
How to Evaluate Solutions in Pareto-based Search-Based Software Engineering? A Critical Review and Methodological Guidance
With modern requirements, there is an increasing tendency of considering
multiple objectives/criteria simultaneously in many Software Engineering (SE)
scenarios. Such a multi-objective optimization scenario comes with an important
issue -- how to evaluate the outcome of optimization algorithms, which
typically is a set of incomparable solutions (i.e., being Pareto non-dominated
to each other). This issue can be challenging for the SE community,
particularly for practitioners of Search-Based SE (SBSE). On one hand,
multi-objective optimization could still be relatively new to SE/SBSE
researchers, who may not be able to identify the right evaluation methods for
their problems. On the other hand, simply following the evaluation methods for
general multi-objective optimization problems may not be appropriate for
specific SE problems, especially when the problem nature or decision maker's
preferences are explicitly/implicitly available. This has been well echoed in
the literature by various inappropriate/inadequate selection and
inaccurate/misleading use of evaluation methods. In this paper, we first carry
out a systematic and critical review of quality evaluation for multi-objective
optimization in SBSE. We survey 717 papers published between 2009 and 2019 from
36 venues in seven repositories, and select 95 prominent studies, through which
we identify five important but overlooked issues in the area. We then conduct
an in-depth analysis of quality evaluation indicators/methods and general
situations in SBSE, which, together with the identified issues, enables us to
codify a methodological guidance for selecting and using evaluation methods in
different SBSE scenarios.
Comment: This paper has been accepted by IEEE Transactions on Software
Engineering, available as full OA:
https://ieeexplore.ieee.org/document/925218
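The abstract above notes that a multi-objective optimizer typically returns a set of mutually non-dominated solutions. As a minimal sketch of what that means, the following filters a list of objective vectors down to its non-dominated subset. It assumes every objective is to be minimised, and the sample points are made up for illustration. It does not implement any of the quality indicators the paper reviews.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and strictly
    better on at least one (all objectives minimised)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points: Sequence[Point]) -> List[Point]:
    """Keep the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (cost, defects_missed) pairs for five candidate solutions.
candidates = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)]
front = non_dominated(candidates)
```

In this example `(3, 4)` is dominated by `(2, 3)` and `(5, 5)` is dominated by `(1, 5)`, so `front` holds the three remaining points, none of which is better than another on both objectives.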
Search-Based Software Maintenance and Testing
2012 - 2013
In software engineering there are many expensive tasks that are performed during development
and maintenance activities. Therefore, there has been a lot of effort to try to automate these
tasks in order to significantly reduce the development and maintenance cost of software, since
the automation would require fewer human resources. One of the most used ways to achieve such
automation is Search-Based Software Engineering (SBSE), which reformulates traditional
software engineering tasks as search problems. In SBSE the set of all candidate solutions to the
problem defines the search space while a fitness function differentiates between candidate solutions,
providing guidance to the optimization process. After the reformulation of software engineering
tasks as optimization problems, search algorithms are used to solve them. Several search algorithms
have been used in the literature, such as genetic algorithms, genetic programming, simulated annealing,
hill climbing (gradient descent), greedy algorithms, particle swarm and ant colony.
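As a minimal sketch of the reformulation described above, the following treats a toy test-selection task as a search problem: the search space is every bit vector over the tests, and a fitness function scores each candidate. A simple steepest-ascent hill climber then moves between neighbouring candidates. The coverage data, the cost weight of 0.5, and all names are hypothetical and are not taken from the thesis.

```python
from typing import Callable, List

# Hypothetical data: the requirements each of four tests covers.
COVERAGE = [{0, 1}, {1, 2}, {2}, {0, 1, 2}]


def fitness(selection: List[int]) -> float:
    """Requirements covered by the selected tests, minus 0.5 per selected test."""
    covered = set()
    for chosen, reqs in zip(selection, COVERAGE):
        if chosen:
            covered |= reqs
    return len(covered) - 0.5 * sum(selection)


def hill_climb(start: List[int], score: Callable[[List[int]], float]) -> List[int]:
    """Steepest-ascent hill climbing over non-empty bit vectors.

    Neighbours differ from the current candidate by one flipped bit;
    the search stops when no neighbour scores strictly higher.
    """
    current = list(start)
    while True:
        neighbours = [
            current[:i] + [1 - current[i]] + current[i + 1:]
            for i in range(len(current))
        ]
        best = max(neighbours, key=score)
        if score(best) <= score(current):
            return current
        current = best


result = hill_climb([0, 0, 0, 0], fitness)
```

Starting from the empty selection, the climber picks the single test that covers all three requirements and then stops, since adding any other test only increases cost. Like any hill climber, it can stop at a local optimum on less favourable data.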
This thesis investigates and proposes the usage of search-based approaches to reduce the effort
of software maintenance and software testing with particular attention to four main activities: (i)
program comprehension; (ii) defect prediction; (iii) test data generation and (iv) test suite
optimization for regression testing. For program comprehension and defect prediction, this thesis provided
their first formulations as optimization problems and then proposed the usage of genetic algorithms
to solve them. More precisely, this thesis investigates the peculiarity of source code against textual
documents written in natural language and proposes the usage of Genetic Algorithms (GAs) in
order to calibrate and assemble IR-techniques for different software engineering tasks. This thesis
also investigates and proposes the usage of Multi-Objective Genetic Algorithms (MOGAs) in order
to build multi-objective defect prediction models that identify defect-prone software
components by taking into account multiple and practical software engineering criteria.
Test data generation and test suite optimization have been extensively investigated as search-based
problems in the literature. However, despite the huge body of work on search algorithms
applied to software testing, both (i) automatic test data generation and (ii) test suite optimization
present several limitations and do not always produce satisfying results. The success of evolutionary
software testing techniques in general, and GAs in particular, depends on several factors. One of
these factors is the level of diversity among the individuals in the population, which directly affects
the exploration ability of the search. For example, evolutionary test case generation techniques that
employ GAs could be severely affected by genetic drift, i.e., a loss of diversity between solutions,
which leads to a premature convergence of GAs towards some local optima. For these reasons,
this thesis investigates the role played by diversity preserving mechanisms on the performance of
GAs and proposes a novel diversity mechanism based on Singular Value Decomposition and linear
algebra. Then, this mechanism has been integrated within the standard GAs and evaluated for
evolutionary test data generation. It has also been integrated within MOGAs and empirically
evaluated for regression testing. [edited by author]
XII n.s
Search based software engineering: Trends, techniques and applications
© ACM, 2012. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version is available from the link below.
In the past five years there has been a dramatic increase in work on Search-Based Software Engineering (SBSE), an approach to Software Engineering (SE) in which Search-Based Optimization (SBO) algorithms are used to address problems in SE. SBSE has been applied to problems throughout the SE lifecycle, from requirements and project planning to maintenance and reengineering. The approach is attractive because it offers a suite of adaptive automated and semiautomated solutions in situations typified by large complex problem spaces with multiple competing and conflicting objectives.
This article provides a review and classification of literature on SBSE. The work identifies research trends and relationships between the techniques applied and the applications to which they have been applied and highlights gaps in the literature and avenues for further research.
EPSRC and E