215 research outputs found

    Automated Refactoring of Nested-IF Formulae in Spreadsheets

    Full text link
    Spreadsheets are the most popular end-user programming software, where formulae act like programs and also have smells. One well recognized common smell of spreadsheet formulae is nest-IF expressions, which have low readability and high cognitive cost for users, and are error-prone during reuse or maintenance. However, end users usually lack essential programming language knowledge and skills to tackle or even realize the problem. The previous research work has made very initial attempts in this aspect, while no effective and automated approach is currently available. This paper firstly proposes an AST-based automated approach to systematically refactoring nest-IF formulae. The general idea is two-fold. First, we detect and remove logic redundancy on the AST. Second, we identify higher-level semantics that have been fragmented and scattered, and reassemble the syntax using concise built-in functions. A comprehensive evaluation has been conducted against a real-world spreadsheet corpus, which is collected in a leading IT company for research purpose. The results with over 68,000 spreadsheets with 27 million nest-IF formulae reveal that our approach is able to relieve the smell of over 99\% of nest-IF formulae. Over 50% of the refactorings have reduced nesting levels of the nest-IFs by more than a half. In addition, a survey involving 49 participants indicates that for most cases the participants prefer the refactored formulae, and agree on that such automated refactoring approach is necessary and helpful

    Refactoring meets model-driven spreadsheet evolution

    Get PDF
    Software refactoring is a well-known technique that provides transformations on software artifacts with the aim of improving their overall quality. In this paper we present a set of refactorings for ClassSheets, a modeling language that allows to specify the business logic of a spreadsheet in an object-oriented fashion. The set of refactorings that we propose allows us to improve the quality of these spreadsheet models. Moreover, it is implemented in a setting that guarantees that all model refactorings are automatically carried to all the corresponding (spreadsheet) instances, thus providing an automatic evolution of the data so it is always synchronized with the model

    Refactoring smelly spreadsheet models

    Get PDF
    Identifying bad design patterns in software is a successful and inspiring research trend. While these patterns do not necessarily correspond to software errors, the fact is that they raise potential problematic issues, often referred to as code smells, and that can for example compromise maintainability or evolution. The identification of code smells in spreadsheets, which can be viewed as software development environments for non-professional programmers, has already been the subject of confluent researches by different groups. While these research groups have focused on detecting smells on concrete spreadsheets, or spreadsheet instances, in this paper we propose a comprehensive set of smells for abstract representations of spreadsheets, or spreadsheet models. We also propose a set of refactorings suggesting how spreadsheet models can become simpler to understand, manipulate and evolve. Finally we present the integration of both smells and refactorings under the MDSheet framework.Part funded by ERDF - European Regional Development Fund through the COMPETE Programme (operational programme for competitiveness) and by National Funds through the FCT - Fundação para a Ciência e a Tecnologia within projects FCOMP-01-0124-FEDER-022701 and Network Sensing for Critical Systems Monitoring (NORTE-01-0124-FEDER-000058), ref. BIM-2013 BestCase RL3.2 UMINHO

    SpreadCluster: Recovering Versioned Spreadsheets through Similarity-Based Clustering

    Full text link
    Version information plays an important role in spreadsheet understanding, maintaining and quality improving. However, end users rarely use version control tools to document spreadsheet version information. Thus, the spreadsheet version information is missing, and different versions of a spreadsheet coexist as individual and similar spreadsheets. Existing approaches try to recover spreadsheet version information through clustering these similar spreadsheets based on spreadsheet filenames or related email conversation. However, the applicability and accuracy of existing clustering approaches are limited due to the necessary information (e.g., filenames and email conversation) is usually missing. We inspected the versioned spreadsheets in VEnron, which is extracted from the Enron Corporation. In VEnron, the different versions of a spreadsheet are clustered into an evolution group. We observed that the versioned spreadsheets in each evolution group exhibit certain common features (e.g., similar table headers and worksheet names). Based on this observation, we proposed an automatic clustering algorithm, SpreadCluster. SpreadCluster learns the criteria of features from the versioned spreadsheets in VEnron, and then automatically clusters spreadsheets with the similar features into the same evolution group. We applied SpreadCluster on all spreadsheets in the Enron corpus. The evaluation result shows that SpreadCluster could cluster spreadsheets with higher precision and recall rate than the filename-based approach used by VEnron. Based on the clustering result by SpreadCluster, we further created a new versioned spreadsheet corpus VEnron2, which is much bigger than VEnron. We also applied SpreadCluster on the other two spreadsheet corpora FUSE and EUSES. The results show that SpreadCluster can cluster the versioned spreadsheets in these two corpora with high precision.Comment: 12 pages, MSR 201

    Gradual structuring: Evolving the spreadsheet paradigm for expressiveness and learnability

    Full text link
    © 2016 IEEE. Spreadsheets are arguably the most used form of programming and are frequently used in higher education to teach fundamental concepts about computation. Their success has shown that they are simple enough for a huge number of end users to learn and use. This is in contrast to traditional programming languages and the high dropout rate from introductory programming and computer science. However in comparison to traditional programming languages and structured modelling, spreadsheets are not expressive, placing a limit on the levels of computational thinking that can be taught using the spreadsheet paradigm. This limitation is imposed by the lack of programming language features and abstractions in the paradigm. Furthermore, more advanced spreadsheet features (e.g. array formulae, lookup formulae, R1C1 syntax) can be difficult to learn and use. This paper discusses the idea of adding language features to spreadsheets, enabling the gradual structuring of free-form spreadsheets to more structured models. We propose that this concept is termed Gradual Structuring, and is analogous to the programming language concept of gradual typing. In this analogy, spreadsheets take the place of dynamic programming and structured modelling of static programming. In programming languages, gradual typing allows dynamic programming to be mixed with static programming. It is our contention that dynamic programming is more learnable while static programming is more expressive and abstract. Gradual typing could be used to mitigate the issues in the teaching of traditional programming. Likewise Gradual Structuring can mitigate the conceptual limits that can be taught using current spreadsheets. The key language feature required to enable Gradual Structuring is the ability to logically group cells together so that a single formula can be applied to the grouped cells. This concept, termed cell grouping diminishes and can even eliminate the need for the ubiquitous and error-prone use of copy-pasted in spreadsheets. Moreover, it makes the structure present in spreadsheet models explicit. Cell grouping requires a cascade of other new languages features. Namely a more expressive referencing style, which in turned requires enabling labels to be moved to the row and column headers, and the hierarchical structuring of these headers. Respectively these language features are termed enhanced referencing and semantic axes. The ongoing research focusses on the usability and learnability of these language features. Spreadsheet applications exist that contain aspects of the features mentioned. However these applications do not enable Gradual Structuring and have taken a mainly technical, not human behavioural, approach to evolving the spreadsheet

    FaultySheet detective: when smells meet fault localization

    Get PDF
    This paper presents a tool, dubbed FaultySheet Detective, for aiding in spreadsheet fault localization, which combines the detection of bad smells with a generic spectrum-based fault localization algorithm

    Towards the design and implementation of aspect-oriented programming for spreadsheets

    Get PDF
    A spreadsheet usually starts as a simple and singleuser software artifact, but, as frequent as in other software systems, quickly evolves into a complex system developed by many actors. Often, different users work on different aspects of the same spreadsheet: while a secretary may be only involved in adding plain data to the spreadsheet, an accountant may define new business rules, while an engineer may need to adapt the spreadsheet content so it can be used by other software systems.Unfortunately,spreadsheetsystemsdonotoffermodular mechanisms, and as a consequence, some of the previous tasks may be defined by adding intrusive “code” to the spreadsheet. In this paper we go through the design and implementation of an aspect-oriented language for spreadsheets so that users can work on different aspects of a spreadsheet in a modular way. For example, aspects can be defined in order to introduce new business rules to an existing spreadsheet, or to manipulate the spreadsheet data to be ported to another system. Aspects are defined as aspect-oriented program specifications that are dynamically woven into the underlying spreadsheet by an aspect weaver. In this aspect-oriented style of spreadsheet development, differentusers develop,orreuse,aspects withoutaddingintrusive code to the original spreadsheet. Such code is added/executed by the spreadsheet weaving mechanism proposed in this paper.This work is financed by the FCT – Fundação para a Ciência e a Tecnologia (Portuguese Foundation for Science and Technology) within project UID/EEA/50014/2013. This work has also been partially funded by FLAD/NSF through a project grant (ref. 233/2014). The last author is supported by CAPES through a Programa Professor Visitante do Exterior (PVE) grant (ref. 15075133)
    corecore