1,310 research outputs found

    Management Aspects of Software Clone Detection and Analysis

    Get PDF
    Copying a code fragment and reusing it by pasting with or without minor modifications is a common practice in software development for improved productivity. As a result, software systems often have similar segments of code, called software clones or code clones. Due to many reasons, unintentional clones may also appear in the source code without awareness of the developer. Studies report that significant fractions (5% to 50%) of the code in typical software systems are cloned. Although code cloning may increase initial productivity, it may cause fault propagation, inflate the code base and increase maintenance overhead. Thus, it is believed that code clones should be identified and carefully managed. This Ph.D. thesis contributes in clone management with techniques realized into tools and large-scale in-depth analyses of clones to inform clone management in devising effective techniques and strategies. To support proactive clone management, we have developed a clone detector as a plug-in to the Eclipse IDE. For clone detection, we used a hybrid approach that combines the strength of both parser-based and text-based techniques. To capture clones that are similar but not exact duplicates, we adopted a novel approach that applies a suffix-tree-based k-difference hybrid algorithm, borrowed from the area of computational biology. Instead of targeting all clones from the entire code base, our tool aids clone-aware development by allowing focused search for clones of any code fragment of the developer's interest. A good understanding on the code cloning phenomenon is a prerequisite to devise efficient clone management strategies. The second phase of the thesis includes large-scale empirical studies on the characteristics (e.g., proportion, types of similarity, change patterns) of code clones in evolving software systems. Applying statistical techniques, we also made fairly accurate forecast on the proportion of code clones in the future versions of software projects. The outcome of these studies expose useful insights into the characteristics of evolving clones and their management implications. Upon identification of the code clones, their management often necessitates careful refactoring, which is dealt with at the third phase of the thesis. Given a large number of clones, it is difficult to optimally decide what to refactor and what not, especially when there are dependencies among clones and the objective remains the minimization of refactoring efforts and risks while maximizing benefits. In this regard, we developed a novel clone refactoring scheduler that applies a constraint programming approach. We also introduced a novel effort model for the estimation of efforts needed to refactor clones in source code. We evaluated our clone detector, scheduler and effort model through comparative empirical studies and user studies. Finally, based on our experience and in-depth analysis of the present state of the art, we expose avenues for further research and development towards a versatile clone management system that we envision

    Reducing thread divergence in a GPU-accelerated branch-and-bound algorithm

    Get PDF
    International audienceIn this paper, we address the design and implementation of GPU-accelerated Branch-and-Bound algorithms (B&B) for solving Flow-shop scheduling optimization problems (FSP). Such applications are CPU-time consuming and highly irregular. On the other hand, GPUs are massively multi-threaded accelerators using the SIMD model at execution. A major issue which arises when executing on GPU a B&B applied to FSP is thread or branch divergence. Such divergence is caused by the lower bound function of FSP which contains many irregular loops and conditional instructions. Our challenge is therefore to revisit the design and implementation of B&B applied to FSP dealing with thread divergence. Extensive experiments of the proposed approach have been carried out on well-known FSP benchmarks using an Nvidia Tesla C2050 GPU card. Compared to a CPU-based execution, accelerations up to ×77.46 are achieved for large problem instances

    Pattern-based refactoring in model-driven engineering

    Full text link
    L’ingĂ©nierie dirigĂ©e par les modĂšles (IDM) est un paradigme du gĂ©nie logiciel qui utilise les modĂšles comme concepts de premier ordre Ă  partir desquels la validation, le code, les tests et la documentation sont dĂ©rivĂ©s. Ce paradigme met en jeu divers artefacts tels que les modĂšles, les mĂ©ta-modĂšles ou les programmes de transformation des modĂšles. Dans un contexte industriel, ces artefacts sont de plus en plus complexes. En particulier, leur maintenance demande beaucoup de temps et de ressources. Afin de rĂ©duire la complexitĂ© des artefacts et le coĂ»t de leur maintenance, de nombreux chercheurs se sont intĂ©ressĂ©s au refactoring de ces artefacts pour amĂ©liorer leur qualitĂ©. Dans cette thĂšse, nous proposons d’étudier le refactoring dans l’IDM dans sa globalitĂ©, par son application Ă  ces diffĂ©rents artefacts. Dans un premier temps, nous utilisons des patrons de conception spĂ©cifiques, comme une connaissance a priori, appliquĂ©s aux transformations de modĂšles comme un vĂ©hicule pour le refactoring. Nous procĂ©dons d’abord par une phase de dĂ©tection des patrons de conception avec diffĂ©rentes formes et diffĂ©rents niveaux de complĂ©tude. Les occurrences dĂ©tectĂ©es forment ainsi des opportunitĂ©s de refactoring qui seront exploitĂ©es pour aboutir Ă  des formes plus souhaitables et/ou plus complĂštes de ces patrons de conceptions. Dans le cas d’absence de connaissance a priori, comme les patrons de conception, nous proposons une approche basĂ©e sur la programmation gĂ©nĂ©tique, pour apprendre des rĂšgles de transformations, capables de dĂ©tecter des opportunitĂ©s de refactoring et de les corriger. Comme alternative Ă  la connaissance disponible a priori, l’approche utilise des exemples de paires d’artefacts d’avant et d’aprĂšs le refactoring, pour ainsi apprendre les rĂšgles de refactoring. Nous illustrons cette approche sur le refactoring de modĂšles.Model-Driven Engineering (MDE) is a software engineering paradigm that uses models as first-class concepts from which validation, code, testing, and documentation are derived. This paradigm involves various artifacts such as models, meta-models, or model transformation programs. In an industrial context, these artifacts are increasingly complex. In particular, their maintenance is time and resources consuming. In order to reduce the complexity of artifacts and the cost of their maintenance, many researchers have been interested in refactoring these artifacts to improve their quality. In this thesis, we propose to study refactoring in MDE holistically, by its application to these different artifacts. First, we use specific design patterns, as an example of prior knowledge, applied to model transformations to enable refactoring. We first proceed with a detecting phase of design patterns, with different forms and levels of completeness. The detected occurrences thus form refactoring opportunities that will be exploited to implement more desirable and/or more complete forms of these design patterns. In the absence of prior knowledge, such as design patterns, we propose an approach based on genetic programming, to learn transformation rules, capable of detecting refactoring opportunities and correcting them. As an alternative to prior knowledge, our approach uses examples of pairs of artifacts before and after refactoring, in order to learn refactoring rules. We illustrate this approach on model refactoring

    Programming Heterogeneous Parallel Machines Using Refactoring and Monte-Carlo Tree Search

    Get PDF
    Funding: This work was supported by the EU Horizon 2020 project, TeamPlay, Grant Number 779882, and UK EPSRC Discovery, Grant Number EP/P020631/1.This paper presents a new technique for introducing and tuning parallelism for heterogeneous shared-memory systems (comprising a mixture of CPUs and GPUs), using a combination of algorithmic skeletons (such as farms and pipelines), Monte–Carlo tree search for deriving mappings of tasks to available hardware resources, and refactoring tool support for applying the patterns and mappings in an easy and effective way. Using our approach, we demonstrate easily obtainable, significant and scalable speedups on a number of case studies showing speedups of up to 41 over the sequential code on a 24-core machine with one GPU. We also demonstrate that the speedups obtained by mappings derived by the MCTS algorithm are within 5–15% of the best-obtained manual parallelisation.Publisher PDFPeer reviewe

    RePOR: Mimicking humans on refactoring tasks. Are we there yet?

    Full text link
    Refactoring is a maintenance activity that aims to improve design quality while preserving the behavior of a system. Several (semi)automated approaches have been proposed to support developers in this maintenance activity, based on the correction of anti-patterns, which are `poor' solutions to recurring design problems. However, little quantitative evidence exists about the impact of automatically refactored code on program comprehension, and in which context automated refactoring can be as effective as manual refactoring. Leveraging RePOR, an automated refactoring approach based on partial order reduction techniques, we performed an empirical study to investigate whether automated refactoring code structure affects the understandability of systems during comprehension tasks. (1) We surveyed 80 developers, asking them to identify from a set of 20 refactoring changes if they were generated by developers or by a tool, and to rate the refactoring changes according to their design quality; (2) we asked 30 developers to complete code comprehension tasks on 10 systems that were refactored by either a freelancer or an automated refactoring tool. To make comparison fair, for a subset of refactoring actions that introduce new code entities, only synthetic identifiers were presented to practitioners. We measured developers' performance using the NASA task load index for their effort, the time that they spent performing the tasks, and their percentages of correct answers. Our findings, despite current technology limitations, show that it is reasonable to expect a refactoring tools to match developer code

    Search‐based model transformations

    Get PDF
    Model transformations are an important cornerstone of model‐driven engineering, a discipline which facilitates the abstraction of relevant information of a system as models. The success of the final system mainly depends on the optimization of these models through model transformations. Currently, the application of transformations is realized either by following the apply‐as‐long‐as‐possible strategy or by the provision of explicit rule orchestrations. This implies two main limitations. First, the optimization objectives are implicitly hidden in the transformation rules and their orchestration. Second, manually finding the best orchestration for a particular scenario is a major challenge due to the high number of possible combinations. To overcome these limitations, we present a novel framework that builds on the non‐intrusive integration of optimization and model transformation technologies. In particular, we formulate the transformation orchestration task as an optimization problem, which allows for the efficient exploration of the transformation space and explication of the transformation objectives. Our generic framework provides several search algorithms and guides the user in providing a proper search configuration. We present different instantiations of our framework to demonstrate its feasibility, applicability, and benefits using several case studiesEuropean Commission ICT Policy Support Programme 317859Ministerio de Economia y Competitividad TIN2015-70560-RJunta de Andalucía P10-TIC-5960Junta de Andalucía P12-TIC-186

    Premise Selection for Mathematics by Corpus Analysis and Kernel Methods

    Get PDF
    Smart premise selection is essential when using automated reasoning as a tool for large-theory formal proof development. A good method for premise selection in complex mathematical libraries is the application of machine learning to large corpora of proofs. This work develops learning-based premise selection in two ways. First, a newly available minimal dependency analysis of existing high-level formal mathematical proofs is used to build a large knowledge base of proof dependencies, providing precise data for ATP-based re-verification and for training premise selection algorithms. Second, a new machine learning algorithm for premise selection based on kernel methods is proposed and implemented. To evaluate the impact of both techniques, a benchmark consisting of 2078 large-theory mathematical problems is constructed,extending the older MPTP Challenge benchmark. The combined effect of the techniques results in a 50% improvement on the benchmark over the Vampire/SInE state-of-the-art system for automated reasoning in large theories.Comment: 26 page

    The 6th Conference of PhD Students in Computer Science

    Get PDF
    • 

    corecore