81 research outputs found

    An Empirical Study of Cohesion and Coupling: Balancing Optimisation and Disruption

    Search based software engineering has been extensively applied to the problem of finding improved modular structures that maximise cohesion and minimise coupling. However, there has, hitherto, been no longitudinal study of developers’ implementations, over a series of sequential releases. Moreover, results validating whether developers respect the fitness functions are scarce, and the potentially disruptive effect of search-based remodularisation is usually overlooked. We present an empirical study of 233 sequential releases of 10 different systems; the largest empirical study reported in the literature so far, and the first longitudinal study. Our results provide evidence that developers do, indeed, respect the fitness functions used to optimise cohesion/coupling (they are statistically significantly better than arbitrary choices with p << 0.01), yet they also leave considerable room for further improvement (cohesion/coupling can be improved by 25% on average). However, we also report that optimising the structure is highly disruptive (on average more than 57% of the structure must change), while our results reveal that developers tend to avoid such disruption. Therefore, we introduce and evaluate a multi-objective evolutionary approach that minimises disruption while maximising cohesion/coupling improvement. This allows developers to balance reticence to disrupt existing modular structure, against their competing need to improve cohesion and coupling. The multi-objective approach is able to find modular structures that improve the cohesion of developers’ implementations by 22.52%, while causing an acceptably low level of disruption (within that already tolerated by developers)

    A multiple hill climbing approach to software module clustering

    Automated software module clustering is important for maintenance of legacy systems written in a 'monolithic format' with inadequate module boundaries. Even where systems were originally designed with suitable module boundaries, structure tends to degrade as the system evolves, making re-modularization worthwhile. This paper focuses upon search-based approaches to the automated module clustering problem, where hitherto, the local search approach of hill climbing has been found to be most successful. In the paper we show that results from a set of multiple hill climbs can be combined to locate good 'building blocks' for subsequent searches. Building blocks are formed by identifying the common features in a selection of best hill climbs. This process reduces the search space, while simultaneously 'hard wiring' parts of the solution. The paper reports the results of an empirical study that show that the multiple hill climbing approach does indeed guide the search to higher peaks in subsequent executions. The paper also investigates the relationship between the improved results and the system size

    Software restructuring: understanding longitudinal architectural changes and refactoring

    The complexity of software systems increases as the systems evolve. As the degradation of the system's structure accumulates, maintenance effort and defect-proneness tend to increase. In addition, developers often opt to employ sub-optimal solutions in order to achieve short-term goals, a phenomenon that has recently been called technical debt. In this context, software restructuring serves as a way to alleviate and/or prevent structural degradation. Restructuring of software is usually performed at either higher or lower levels of granularity, where the former indicates broader changes in the system's structural architecture and the latter indicates refactorings performed on fewer, localised code elements. Although tools to assist architectural changes and refactoring are available, there is still no evidence that these approaches are widely adopted by practitioners. Hence, an understanding of how developers perform architectural changes and refactoring on a daily basis and in the context of the software development processes they adopt is necessary. Current software development is iterative and incremental with short cycles of development and release. Thus, tools and processes that enable this development model, such as continuous integration and code review, are widespread among software engineering practitioners. Hence, this thesis investigates how developers perform longitudinal and incremental architectural changes and refactoring during code review through a wide range of empirical studies that consider different moments of the development lifecycle, different approaches, different automated tools and different analysis mechanisms. Finally, the observations and conclusions drawn from these empirical investigations extend the existing knowledge on how developers restructure software systems, in a way that future studies can leverage this knowledge to propose new tools and approaches that better fit developers' working routines and development processes

    Search based software engineering: Trends, techniques and applications

    © ACM, 2012. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version is available from the link below. In the past five years there has been a dramatic increase in work on Search-Based Software Engineering (SBSE), an approach to Software Engineering (SE) in which Search-Based Optimization (SBO) algorithms are used to address problems in SE. SBSE has been applied to problems throughout the SE lifecycle, from requirements and project planning to maintenance and reengineering. The approach is attractive because it offers a suite of adaptive automated and semiautomated solutions in situations typified by large complex problem spaces with multiple competing and conflicting objectives. This article provides a review and classification of literature on SBSE. The work identifies research trends and relationships between the techniques applied and the applications to which they have been applied and highlights gaps in the literature and avenues for further research. EPSRC and E

    Product modularity : a multi-objective configuration approach

    Product modularity is often seen as a means by which a product system can be decomposed into smaller, more manageable chunks in order to better manage design, manufacturing and after-sales complexity. The most common approach is to decompose the product down to component level and then group the components to form modules. The rationale for module grouping can vary, from the more technical physical and functional component interactions, to any number of strategic objectives such as variety, maintenance and recycling. The problem lies with the complexity of product modularity under these multiple (often conflicting) objectives. The research in this thesis presents a holistic multi-objective computer aided modularity optimisation (CAMO) framework. The framework consists of four main steps: 1) product decomposition; 2) interaction analysis; 3) formation of modular architectures; and 4) scenario analysis. In summary of these steps: the product is first decomposed into a number of basic components by analysis of both the physical and functional product domains. The various dependencies and strategic similarities that occur between the product's components are then analysed and entered into a number of interaction matrices. A specially developed multi-objective grouping genetic algorithm (MOGGA) then searches the matrices and provides a whole set of alternative (yet optimal) modular product configurations. The solution set is then evaluated and explored (scenario analysis) using the principles of Analytic Hierarchy Process. A software prototype has been created for the CAMO framework using Visual Basic to create a multi-objective genetic algorithm (GA) based optimiser within an Excel environment. A case study has been followed to demonstrate the various steps of the framework and make comparisons with previous works. Unlike previous works, which have used simplistic optimisation algorithms and have in general only considered a limited number of modularisation objectives, the developed framework provides a true multi-objective approach to the product modularisation problem. EThOS - Electronic Theses Online Service

    Cooperative Based Software Clustering on Dependency Graphs

    The organization of software systems into subsystems is usually based on the constructs of packages or modules and has a major impact on the maintainability of the software. However, during software evolution, the organization of the system is subject to continual modification, which can cause it to drift away from the original design, often with the effect of reducing its quality. A number of techniques for evaluating a system's maintainability and for controlling the effort required to conduct maintenance activities involve software clustering. Software clustering refers to the partitioning of software system components into clusters in order to obtain both exterior and interior connectivity between these components. It helps maintainers enhance the quality of software modularization and improve its maintainability. Research in this area has produced numerous algorithms with a variety of methodologies and parameters. This thesis presents a novel ensemble approach that synthesizes a new solution from the outcomes of multiple constituent clustering algorithms. The main principle behind this approach is derived from machine learning, as applied to document clustering, but it has been modified, both conceptually and empirically, for use in software clustering. The conceptual modifications include working with a variable number of clusters produced by the input algorithms and employing graph structures rather than feature vectors. The empirical modifications include experiments directed at the selection of the optimal cluster merging criteria. Case studies based on open source software systems show that establishing cooperation between leading state-of-the-art algorithms produces better clustering results than those achieved using any one of the algorithms considered

    An Exploratory Study of the Inputs for Ensemble Clustering Technique as a Subset Selection Problem

    Ensemble and Consensus Clustering address the problem of unifying multiple clustering results into a single output to best reflect the agreement of input methods. They can be used to obtain more stable and robust clustering results in comparison with a single clustering approach. In this study, we propose a novel subset selection method that looks at controlling the number of clustering inputs and datasets in an efficient way. The authors propose a number of manual selection and heuristic search techniques to perform the selection. Our investigation and experiments demonstrate very promising results. Using these techniques can ensure better selection methods and datasets for Ensemble and Consensus Clustering and thus more efficient clustering results

    A Systematic Literature Review on Software Refactoring

    Due to the growing complexity of software systems, there has been a dramatic increase in research and industry demand on refactoring. Refactoring research nowadays addresses challenges beyond code transformation to include, but not limited to, scheduling the opportune time to carry out refactoring, recommending specific refactoring activities, detecting refactoring opportunities and testing the correctness of applied refactoring. Very few studies focused on the challenges that practitioners face when refactoring software systems and what should be the current refactoring research focus from the developers' perspective and based on the current literature. Without such knowledge, tool builders invest in the wrong direction, and researchers miss many opportunities for improving the practice of refactoring. In this thesis, we collected papers from several publication sources and analyzed them to identify what developers ask about refactoring and the relevant topics in the field. We found that developers and researchers are asking about design patterns, design and user interface refactoring, web services, parallel programming, and mobile apps. We also identified which popular refactoring challenges are the most difficult and the current important topics and questions related to refactoring. Moreover, we discovered gaps between existing research on refactoring and the challenges developers face. Master of Science, Software Engineering, College of Engineering & Computer Science, University of Michigan-Dearborn. https://deepblue.lib.umich.edu/bitstream/2027.42/154827/1/Jallal Elhazzat Final Thesis.pdf

    The role of Artificial Intelligence in Software Engineering

    There has been a recent surge in interest in the application of Artificial Intelligence (AI) techniques to Software Engineering (SE) problems. The work is typified by recent advances in Search Based Software Engineering, but also by long established work in Probabilistic reasoning and machine learning for Software Engineering. This paper explores some of the relationships between these strands of closely related work, arguing that they have much in common and sets out some future challenges in the area of AI for SE. © 2012 IEEE

    Data classification using genetic programming.

    Master of Science in Computer Science. Genetic programming (GP), a field of artificial intelligence, is an evolutionary algorithm which evolves a population of trees which represent programs. These programs are used to solve problems. This dissertation investigates the use of genetic programming for data classification. In machine learning, data classification is the process of allocating a class label to an instance of data. A classifier is created in order to perform these allocations. Several studies have investigated the use of GP to solve data classification problems. These studies have shown that GP is able to create classifiers with high classification accuracies. However, there are certain aspects which have not previously been investigated. Five areas were investigated in this dissertation. The first was an investigation into how discretisation could be incorporated into a GP algorithm. An adaptive discretisation algorithm was proposed, and outperformed certain existing methods. The second was a comparison of GP representations for binary data classification. The findings indicated that from the representations examined (arithmetic trees, decision trees, and logical trees), the decision trees performed the best. The third was to investigate the use of the encapsulation genetic operator and its effect on data classification. The findings revealed that an improvement in both training and test results was achieved when encapsulation was incorporated. The fourth was an investigative analysis of several hybridisations of a GP algorithm with a genetic algorithm in order to evolve a population of ensembles. Four methods were proposed and these methods outperformed certain existing GP and ensemble methods. Finally, the fifth area was to investigate an ensemble construction method for classification. In this approach GP evolved a single ensemble. The proposed method resulted in an improvement in training and test accuracy when compared to the standard GP algorithm. The methods proposed in this dissertation were tested on publicly available data sets, and the results were statistically tested in order to determine the effectiveness of the proposed approaches