53 research outputs found

Automated Refactoring of Nested-IF Formulae in Spreadsheets

Author: Han Shi
Hao Dan
Zhang Dongmei
Zhang Jie
Zhang Lu
Publication date: 28/12/2017

Spreadsheets are the most popular end-user programming software, where formulae act like programs and also have smells. One well-recognized, common smell of spreadsheet formulae is nest-IF expressions, which have low readability and high cognitive cost for users, and are error-prone during reuse or maintenance. However, end users usually lack the essential programming language knowledge and skills to tackle or even recognize the problem. Previous research has made only preliminary attempts in this direction, and no effective automated approach is currently available. This paper proposes, for the first time, an AST-based automated approach to systematically refactoring nest-IF formulae. The general idea is two-fold. First, we detect and remove logic redundancy on the AST. Second, we identify higher-level semantics that have been fragmented and scattered, and reassemble the syntax using concise built-in functions. A comprehensive evaluation has been conducted against a real-world spreadsheet corpus, which was collected in a leading IT company for research purposes. The results on over 68,000 spreadsheets containing 27 million nest-IF formulae reveal that our approach is able to relieve the smell of over 99% of nest-IF formulae. Over 50% of the refactorings have reduced the nesting levels of the nest-IFs by more than half. In addition, a survey involving 49 participants indicates that in most cases the participants prefer the refactored formulae, and agree that such an automated refactoring approach is necessary and helpful.
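
The two-step idea in this abstract (remove redundancy on the AST, then reassemble with a concise built-in) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `If` node type, the `flatten_else_chain` and `render` helpers, and the choice of the spreadsheet `IFS` function as the target built-in are assumptions made for illustration. Conditions are compared as plain strings, so only textually identical tests are treated as redundant, and conditions are assumed to be free of volatile functions.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class If:
    """Hypothetical AST node for IF(cond, then, orelse)."""
    cond: str
    then: "Node"
    orelse: "Node"


# A node is either a nested IF or a leaf expression kept as text.
Node = Union[If, str]


def flatten_else_chain(node: If) -> Tuple[List[Tuple[str, Node]], Node]:
    """Walk the else-branches and collect (condition, value) pairs.

    A condition that already appeared earlier in the chain is known to be
    false at this point, so its then-branch is unreachable and is dropped.
    """
    pairs: List[Tuple[str, Node]] = []
    seen = set()
    current: Node = node
    while isinstance(current, If):
        if current.cond not in seen:
            seen.add(current.cond)
            pairs.append((current.cond, current.then))
        # Either way, continue down the else-branch.
        current = current.orelse
    return pairs, current


def render(node: Node) -> str:
    """Render a node as formula text, using IFS for chains of two or more tests."""
    if isinstance(node, str):
        return node
    pairs, default = flatten_else_chain(node)
    if len(pairs) == 1:
        cond, value = pairs[0]
        return f"IF({cond},{render(value)},{render(default)})"
    args: List[str] = []
    for cond, value in pairs:
        args += [cond, render(value)]
    # TRUE acts as the catch-all final test for the default value.
    args += ["TRUE", render(default)]
    return "IFS(" + ",".join(args) + ")"


nested = If("A1>90", '"A"', If("A1>80", '"B"', If("A1>90", '"A"', '"C"')))
refactored = render(nested)
```

Here the innermost `A1>90` test repeats a condition that is already known to be false, so the sketch drops it and emits a flat `IFS` call with two tests and a default.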

arXiv.org e-Print Archive

Crossref

Gradual structuring: Evolving the spreadsheet paradigm for expressiveness and learnability

Author: Braun R
Hermans F
Miller G
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 28/11/2016

© 2016 IEEE. Spreadsheets are arguably the most used form of programming and are frequently used in higher education to teach fundamental concepts about computation. Their success has shown that they are simple enough for a huge number of end users to learn and use. This is in contrast to traditional programming languages and the high dropout rate from introductory programming and computer science. However in comparison to traditional programming languages and structured modelling, spreadsheets are not expressive, placing a limit on the levels of computational thinking that can be taught using the spreadsheet paradigm. This limitation is imposed by the lack of programming language features and abstractions in the paradigm. Furthermore, more advanced spreadsheet features (e.g. array formulae, lookup formulae, R1C1 syntax) can be difficult to learn and use. This paper discusses the idea of adding language features to spreadsheets, enabling the gradual structuring of free-form spreadsheets to more structured models. We propose that this concept is termed Gradual Structuring, and is analogous to the programming language concept of gradual typing. In this analogy, spreadsheets take the place of dynamic programming and structured modelling of static programming. In programming languages, gradual typing allows dynamic programming to be mixed with static programming. It is our contention that dynamic programming is more learnable while static programming is more expressive and abstract. Gradual typing could be used to mitigate the issues in the teaching of traditional programming. Likewise Gradual Structuring can mitigate the conceptual limits that can be taught using current spreadsheets. The key language feature required to enable Gradual Structuring is the ability to logically group cells together so that a single formula can be applied to the grouped cells. 
This concept, termed cell grouping, diminishes and can even eliminate the need for the ubiquitous and error-prone use of copy-paste in spreadsheets. Moreover, it makes the structure present in spreadsheet models explicit. Cell grouping requires a cascade of other new language features, namely a more expressive referencing style, which in turn requires enabling labels to be moved to the row and column headers, and the hierarchical structuring of these headers. Respectively, these language features are termed enhanced referencing and semantic axes. The ongoing research focusses on the usability and learnability of these language features. Spreadsheet applications exist that contain aspects of the features mentioned. However, these applications do not enable Gradual Structuring and have taken a mainly technical, not human behavioural, approach to evolving the spreadsheet.

Crossref

OPUS - University of Technology Sydney

Visualising formula structures to support exploratory modelling

Author: Gunning Michael
Leitao Roxanne
Roast Chris
Publication date: 01/01/2016

Visualisation is often presented as a means of simplifying information and helping people understand complex data. In this paper we describe a project designing interactive visualisations to support core learner competencies in the broad area of numeracy. The work builds upon: (i) the observation that while spreadsheets are traditional ICT tools, their widespread use means that they are often introduced as a means of exploring basic mathematical modelling; (ii) a research theme examining the human factors that influence the ease with which formal notations can be understood and applied appropriately. Our paper describes the iterative design and evaluation of a tool to visualise spreadsheets, with the aim of supporting mid-teen learners based on the premise that spreadsheets serve as a gateway tool for supporting learner experimentation and confidence within numerate subjects. This iterative process is informed by background research into notational design, graphic design as well as learner and tutor feedback

Sheffield Hallam University Research Archive

The Lish: A Data Model for Grid Free Spreadsheets

Author: Hall Alan Geoffrey
Publication date: 28/11/2019

Throughout the history of the spreadsheet, and throughout the majority of research into improving it, the grid of cells has remained a constant as the underlying data model. An idea that has received recent interest is to provide users with a spreadsheet-like environment based on something other than a grid. The attraction is that if salient features of the data structure can be made more explicit, the machine will be able to provide certain types of error checking and automation. In this project I consider one such grid replacement, a new data model which I call the “lish”. It is based on nested lists of cells, composed according to rules that allow repeating structures to be described. It allows columns, tables, groups of tables and other structures to be treated as coherent objects. This supports a novel form of cell range selection, and allows the machine to ensure that related structures are kept consistent. The model is also more accommodating than the grid of dynamic space allocation, where the number of cells occupied by a result is not known in advance. Then, I develop a “lish calculus”, an extension to vector arithmetic for hierarchical structures that provides a concise notation for calculations with lishes. This simplifies the usual spreadsheet formula expressions, and enables the machine to interpret them consistently with the context in which they are located. I evaluate the lish in the framework of the cognitive dimensions of notations, with the help of example use cases and a user study based on a prototype lish editor. These verify many of the hypothesised advantages, but also reveal some difficulties for users. I close with an analysis of how the lish might be revised to address these shortcomings, while continuing to capitalise on the essential benefits
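
As a rough illustration of what extending vector arithmetic to hierarchical structures can look like, the following Python sketch applies a binary operation element-wise over nested lists and broadcasts scalars to every leaf. It is not the lish calculus defined in the thesis: the `lift` helper, its broadcasting rule, and the sample data are assumptions chosen only to make the idea concrete.

```python
import operator
from typing import Any, Callable


def lift(op: Callable[[Any, Any], Any], a: Any, b: Any) -> Any:
    """Apply op leaf by leaf over nested lists (hypothetical helper).

    Two lists are combined position by position and must have equal length;
    a scalar paired with a list is broadcast to every element of that list.
    """
    if isinstance(a, list) and isinstance(b, list):
        if len(a) != len(b):
            raise ValueError("nested structures do not line up")
        return [lift(op, x, y) for x, y in zip(a, b)]
    if isinstance(a, list):
        return [lift(op, x, b) for x in a]
    if isinstance(b, list):
        return [lift(op, a, y) for y in b]
    return op(a, b)


# Two "tables" of different lengths held in one nested structure.
prices = [[10, 20], [30]]
quantities = [[1, 2], [3]]

doubled = lift(operator.mul, prices, 2)
totals = lift(operator.mul, prices, quantities)
```

A single expression such as `lift(operator.mul, prices, quantities)` then covers every cell in every table, which is the kind of formula simplification the abstract describes.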

Open Research Online (The Open University)

30 Years of Software Refactoring Research: A Systematic Literature Review

Author: Abid Chaima
Alizadeh Vahid
Ferreira Thiago
Kessentini Marouane
Publication date: 25/06/2020

Peer Reviewed https://deepblue.lib.umich.edu/bitstream/2027.42/155872/4/30YRefactoring.pd

Deep Blue Documents at the University of Michigan

30 Years of Software Refactoring Research: A Systematic Literature Review

Author: Abid Chaima
Alizadeh Vahid
Dig Danny
Ferreira Thiago do Nascimento
Kessentini Marouane
Publication date: 25/06/2020

Due to the growing complexity of software systems, there has been a dramatic increase in industry demand for tools and techniques for software refactoring in the last ten years, where refactoring is traditionally defined as a set of program transformations intended to improve the system design while preserving the behavior. Refactoring studies have expanded beyond code-level restructuring to be applied at different levels (architecture, model, requirements, etc.), adopted in many domains beyond the object-oriented paradigm (cloud computing, mobile, web, etc.), used in industrial settings, and directed at objectives beyond improving the design to include other non-functional requirements (e.g., improving performance, security, etc.). Thus, the challenges to be addressed by refactoring work nowadays go beyond code transformation to include, but are not limited to, scheduling the opportune time to carry out refactoring, recommending specific refactoring activities, detecting refactoring opportunities, and testing the correctness of applied refactorings. Therefore, refactoring research efforts are fragmented over several research communities, various domains, and objectives. To structure the field and existing research results, this paper provides a systematic literature review and analyzes the results of 3183 research papers on refactoring covering the last three decades to offer the most scalable and comprehensive literature review of existing refactoring research studies. Based on this survey, we created a taxonomy to classify the existing research, identified research trends, and highlighted gaps in the literature and avenues for further research. Comment: 23 pages

arXiv.org e-Print Archive

Deep Blue Documents at the University of Michigan

Calculation View: multiple-representation editing in spreadsheets

Author: Gordon Andrew D.
Jones Simon Peyton
Sarkar Advait
Toronto Neil
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 25/10/2018

Crossref

Edinburgh Research Explorer

The Larch Environment - Python programs as visual, interactive literature

Author: French Geoffrey
Publication date: 01/01/2013

The Larch Environment is designed for the creation of programs that take the form of interactive technical literature. We introduce a novel approach to combined textual and visual programming by allowing visual, interactive objects to be embedded within textual source code, and segments of source code to be further embedded within those objects. We retain the strengths of text-based source code, while enabling visual programming where it is beneficial. Additionally, embedded objects and code provide a simple object-oriented approach to extending the syntax of a language, in a similar fashion to LISP macros. We provide a rapid prototyping and experimentation environment in the form of an active document system which mixes rich text with executable source code. Larch is supported by a simple type coercion based presentation protocol that displays normal Java and Python objects in a visual, interactive form. The ability to freely combine objects and source code within one another allows for the construction of rich interactive documents and experimentation with novel programming language extensions.

University of East Anglia digital repository

User driven modelling: Visualisation and systematic interaction for end-user programming with tree-based structures

Author: Hale Peter

This thesis addresses certain problems encountered by teams of engineers when modelling complex structures and processes subject to cost and other resource constraints. The cost of a structure or process may be ‘read off’ its specifying model, but the language in which the model is expressed (e.g. CAD) and the language in which resources may be modelled (e.g. spreadsheets) are not naturally compatible. This thesis demonstrates that a number of intermediate steps may be introduced which enable both meaningful translation from one conceptual view to another as well as meaningful collaboration between team members. The work adopts a diagrammatic modelling approach as a natural one in an engineering context when seeking to establish a shared understanding of problems. Thus, the research question to be answered in this thesis is: ‘To what extent is it possible to improve user-driven software development through interaction with diagrams and without requiring users to learn particular computer languages?’ The goal of the research is to improve collaborative software development through interaction with diagrams, thereby minimising the need for end-users to code directly. To achieve this aim, a combination of the paradigms of End-User Programming, Process and Product Modelling and Decision Support, and Semantic Web is exploited, and a methodology of User Driven Modelling and Programming (UDM/P) is developed, implemented, and tested as a means of demonstrating the efficacy of diagrammatic modelling. In greater detail, the research seeks to show that diagrammatic modelling eases problems of maintenance, extensibility, ease of use, and sharing of information. The methodology presented here to achieve this involves a three-step translation from a visualised ontology, through a modelling tool, to output to interactive visualisations. An analysis of users groups them into categories of system creator, model builder, and model user.
This categorisation corresponds well with the three-step translation process where users develop the ontology, modelling tool, and visualisations for their problem. This research establishes and exemplifies a novel paradigm of collaborative end-user programming by domain experts. The end-user programmers can use a visual interface where the visualisation of the software exactly matches the structure of the software itself, making translation between user and computer, and vice versa, much more direct and practical. The visualisation is based on an ontology that provides a representation of the software as a tree. The solution is based on translation from a source tree to a result tree, and visualisation of both. The result tree shows a structured representation of the model with a full visualisation of all parts that leads to the computed result. In conclusion, it is claimed that this direct representation of the structure enables an understanding of the program as an ontology and model that is then visualised, resulting in a more transparent shared understanding by all users. It is further argued that our diagrammatic modelling paradigm consequently eases problems of maintenance, extensibility, ease of use, and sharing of information. This method is applicable to any problem that lends itself to representation as a tree. This is considered a limitation of the method to be addressed in a future project.

UWE Bristol Research Repository