Automated Refactoring of Nested-IF Formulae in Spreadsheets
Spreadsheets are the most popular end-user programming software, where
formulae act like programs and also have smells. One well recognized common
smell of spreadsheet formulae is nest-IF expressions, which have low
readability and high cognitive cost for users, and are error-prone during reuse
or maintenance. However, end users usually lack essential programming language
knowledge and skills to tackle or even realize the problem. Previous research
has made only initial attempts in this direction, and no effective automated
approach is currently available.
This paper is the first to propose an AST-based automated approach to
systematically refactor nest-IF formulae. The general idea is two-fold. First, we detect
and remove logic redundancy on the AST. Second, we identify higher-level
semantics that have been fragmented and scattered, and reassemble the syntax
using concise built-in functions. A comprehensive evaluation has been conducted
against a real-world spreadsheet corpus, which was collected at a leading IT
company for research purposes. The results, covering over 68,000 spreadsheets
containing 27 million nest-IF formulae, show that our approach is able to
relieve the smell of over 99% of nest-IF formulae. Over 50% of the refactorings
reduced the nesting levels of the nest-IFs by more than half. In addition, a
survey involving 49 participants indicates that in most cases the participants
prefer the refactored formulae, and agree that such an automated refactoring
approach is necessary and helpful.
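The first step described above, removing logic redundancy on the AST, can be illustrated with a minimal sketch. This is not the paper's algorithm: the `If` node type, the treatment of conditions as opaque strings, and the two rewrite rules (a condition already decided by an enclosing IF, and an IF whose branches are identical) are assumptions chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Union


@dataclass(frozen=True)
class If:
    """Hypothetical AST node for IF(cond, then, other)."""
    cond: str          # condition kept as an opaque string, e.g. "A1>0"
    then: "Node"
    other: "Node"


Node = Union[If, str]  # a leaf is any non-IF expression, kept as text


def simplify(node: Node, known: Optional[Dict[str, bool]] = None) -> Node:
    """Remove two kinds of redundancy from a nested-IF tree."""
    known = known or {}
    if isinstance(node, str):
        return node
    # Rule 1: the condition was already decided by an enclosing IF.
    if node.cond in known:
        branch = node.then if known[node.cond] else node.other
        return simplify(branch, known)
    then = simplify(node.then, {**known, node.cond: True})
    other = simplify(node.other, {**known, node.cond: False})
    # Rule 2: both branches are identical, so the test is unnecessary.
    if then == other:
        return then
    return If(node.cond, then, other)


def depth(node: Node) -> int:
    """Nesting level of IFs in the tree."""
    if isinstance(node, str):
        return 0
    return 1 + max(depth(node.then), depth(node.other))


def render(node: Node) -> str:
    """Print the tree back as formula text."""
    if isinstance(node, str):
        return node
    return f"IF({node.cond},{render(node.then)},{render(node.other)})"


# IF(A1>0, IF(A1>0,"pos","neg"), IF(B1=1,"x","x"))
formula = If("A1>0", If("A1>0", '"pos"', '"neg"'), If("B1=1", '"x"', '"x"'))
print(render(simplify(formula)))  # IF(A1>0,"pos","x")
```

In this toy case the nesting level drops from 2 to 1. A real tool would also need to parse formula text and reason about conditions that are logically related rather than textually identical, which this sketch does not attempt.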
Gradual structuring: Evolving the spreadsheet paradigm for expressiveness and learnability
© 2016 IEEE. Spreadsheets are arguably the most used form of programming and are frequently used in higher education to teach fundamental concepts about computation. Their success has shown that they are simple enough for a huge number of end users to learn and use. This is in contrast to traditional programming languages and the high dropout rate from introductory programming and computer science. However in comparison to traditional programming languages and structured modelling, spreadsheets are not expressive, placing a limit on the levels of computational thinking that can be taught using the spreadsheet paradigm. This limitation is imposed by the lack of programming language features and abstractions in the paradigm. Furthermore, more advanced spreadsheet features (e.g. array formulae, lookup formulae, R1C1 syntax) can be difficult to learn and use. This paper discusses the idea of adding language features to spreadsheets, enabling the gradual structuring of free-form spreadsheets to more structured models. We propose that this concept is termed Gradual Structuring, and is analogous to the programming language concept of gradual typing. In this analogy, spreadsheets take the place of dynamic programming and structured modelling of static programming. In programming languages, gradual typing allows dynamic programming to be mixed with static programming. It is our contention that dynamic programming is more learnable while static programming is more expressive and abstract. Gradual typing could be used to mitigate the issues in the teaching of traditional programming. Likewise Gradual Structuring can mitigate the conceptual limits that can be taught using current spreadsheets. The key language feature required to enable Gradual Structuring is the ability to logically group cells together so that a single formula can be applied to the grouped cells. 
This concept, termed cell grouping, diminishes and can even eliminate the need for the ubiquitous and error-prone use of copy-paste in spreadsheets. Moreover, it makes the structure present in spreadsheet models explicit. Cell grouping requires a cascade of other new language features, namely a more expressive referencing style, which in turn requires enabling labels to be moved to the row and column headers, and the hierarchical structuring of these headers. Respectively, these language features are termed enhanced referencing and semantic axes. The ongoing research focusses on the usability and learnability of these language features. Spreadsheet applications exist that contain aspects of the features mentioned. However, these applications do not enable Gradual Structuring and have taken a mainly technical, not human behavioural, approach to evolving the spreadsheet.
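The idea of cell grouping, one formula attached to a group of cells instead of a copy of the formula in every cell, can be sketched as follows. The `CellGroup` class, the labelled-row representation, and the field names are hypothetical and are not taken from the paper; they only illustrate why grouping removes the need for copy-paste.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, float]  # one row of labelled cells, e.g. {"price": 2.0, "qty": 3}


@dataclass
class CellGroup:
    """Hypothetical group of rows that share a single formula."""
    rows: List[Row]
    formula: Callable[[Row], float]

    def values(self) -> List[float]:
        # The formula is stored once and evaluated for every grouped row.
        return [self.formula(row) for row in self.rows]


totals = CellGroup(
    rows=[{"price": 2.0, "qty": 3}, {"price": 1.5, "qty": 4}],
    formula=lambda row: row["price"] * row["qty"],
)
print(totals.values())  # [6.0, 6.0]

# A new row is covered by the same formula without any copying.
totals.rows.append({"price": 10.0, "qty": 1})
print(totals.values())  # [6.0, 6.0, 10.0]
```

Referring to cells by label (`row["price"]`) rather than by grid position loosely mirrors the enhanced referencing the abstract mentions, although the paper's actual referencing style is not specified here.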
Visualising formula structures to support exploratory modelling
Visualisation is often presented as a means of simplifying information and helping people understand complex data. In this paper we describe a project designing interactive visualisations to support core learner competencies in the broad area of numeracy. The work builds upon: (i) the observation that while spreadsheets are traditional ICT tools, their widespread use means that they are often introduced as a means of exploring basic mathematical modelling; (ii) a research theme examining the human factors that influence the ease with which formal notations can be understood and applied appropriately. Our paper describes the iterative design and evaluation of a tool to visualise spreadsheets, with the aim of supporting mid-teen learners based on the premise that spreadsheets serve as a gateway tool for supporting learner experimentation and confidence within numerate subjects. This iterative process is informed by background research into notational design and graphic design, as well as learner and tutor feedback.
The Lish: A Data Model for Grid Free Spreadsheets
Throughout the history of the spreadsheet, and throughout the majority of research into improving it, the grid of cells has remained a constant as the underlying data model. An idea that has received recent interest is to provide users with a spreadsheet-like environment based on something other than a grid. The attraction is that if salient features of the data structure can be made more explicit, the machine will be able to provide certain types of error checking and automation.
In this project I consider one such grid replacement, a new data model which I call the "lish". It is based on nested lists of cells, composed according to rules that allow repeating structures to be described. It allows columns, tables, groups of tables and other structures to be treated as coherent objects. This supports a novel form of cell range selection, and allows the machine to ensure that related structures are kept consistent. The model is also more accommodating than the grid of dynamic space allocation, where the number of cells occupied by a result is not known in advance.
Then, I develop a "lish calculus", an extension to vector arithmetic for hierarchical structures that provides a concise notation for calculations with lishes. This simplifies the usual spreadsheet formula expressions, and enables the machine to interpret them consistently with the context in which they are located.
I evaluate the lish in the framework of the cognitive dimensions of notations, with the help of example use cases and a user study based on a prototype lish editor. These verify many of the hypothesised advantages, but also reveal some difficulties for users. I close with an analysis of how the lish might be revised to address these shortcomings, while continuing to capitalise on the essential benefits.
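To give a feel for what extending vector arithmetic to hierarchical structures can mean, here is a minimal sketch that applies a binary operation element-wise over nested Python lists, broadcasting scalars into sub-lists. The `lift` function and its rules are assumptions for illustration only; the actual composition rules and calculus of the lish are defined in the thesis and are not reproduced here.

```python
import operator
from typing import Any, Callable


def lift(op: Callable[[Any, Any], Any], a: Any, b: Any) -> Any:
    """Apply op element-wise over nested lists, broadcasting scalars."""
    a_is_list, b_is_list = isinstance(a, list), isinstance(b, list)
    if not a_is_list and not b_is_list:
        return op(a, b)                      # two plain cells
    if a_is_list and b_is_list:
        if len(a) != len(b):
            raise ValueError("structures do not match")
        return [lift(op, x, y) for x, y in zip(a, b)]
    if a_is_list:
        return [lift(op, x, b) for x in a]   # scalar b spreads over a
    return [lift(op, a, y) for y in b]       # scalar a spreads over b


# Two "tables" of different lengths, grouped in one nested structure.
prices = [[1.0, 2.0], [3.0]]
quantities = [[2, 3], [4]]

print(lift(operator.mul, prices, quantities))  # [[2.0, 6.0], [12.0]]
print(lift(operator.mul, prices, 2))           # [[2.0, 4.0], [6.0]]
```

A single expression covers every cell in both tables regardless of their sizes, which is the kind of conciseness the abstract attributes to the lish calculus.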
30 Years of Software Refactoring Research: A Systematic Literature Review
Peer Reviewed. https://deepblue.lib.umich.edu/bitstream/2027.42/155872/4/30YRefactoring.pd
30 Years of Software Refactoring Research: A Systematic Literature Review
Due to the growing complexity of software systems, industry demand for tools
and techniques for software refactoring has increased dramatically in the last
ten years. Refactoring is traditionally defined as a set of program
transformations intended to improve the system design while preserving the
behavior. Refactoring studies have expanded beyond code-level restructuring to
be applied at different levels (architecture, model, requirements, etc.),
adopted in many domains beyond the object-oriented paradigm (cloud computing,
mobile, web, etc.), used in industrial settings and considered objectives
beyond improving the design to include other non-functional requirements (e.g.,
improve performance, security, etc.). Thus, the challenges to be addressed by
refactoring work now extend beyond code transformation to include, but are not
limited to, scheduling the opportune time to carry out refactoring, recommending
specific refactoring activities, detecting refactoring opportunities, and
testing the correctness of applied refactorings. Therefore, the refactoring
research efforts are fragmented over several research communities, various
domains, and objectives. To structure the field and existing research results,
this paper provides a systematic literature review and analyzes the results of
3183 research papers on refactoring covering the last three decades to offer
the most scalable and comprehensive literature review of existing refactoring
research studies. Based on this survey, we created a taxonomy to classify the
existing research, identified research trends, and highlighted gaps in the
literature and avenues for further research.
Comment: 23 pages
The Larch Environment - Python programs as visual, interactive literature
'The Larch Environment' is designed for the creation of programs that take the
form of interactive technical literature. We introduce a novel approach to combined
textual and visual programming by allowing visual, interactive objects
to be embedded within textual source code, and segments of source code to be
further embedded within those objects. We retain the strengths of text-based
source code, while enabling visual programming where it is beneficial. Additionally,
embedded objects and code provide a simple object-oriented approach
to extending the syntax of a language, in a similar fashion to LISP macros. We
provide a rapid prototyping and experimentation environment in the form of
an active document system which mixes rich text with executable source code.
Larch is supported by a simple type coercion based presentation protocol that
displays normal Java and Python objects in a visual, interactive form. The
ability to freely combine objects and source code within one another allows for
the construction of rich interactive documents and experimentation with novel
programming language extensions.
User driven modelling: Visualisation and systematic interaction for end-user programming with tree-based structures
This thesis addresses certain problems encountered by teams of engineers when modelling complex structures and processes subject to cost and other resource constraints. The cost of a structure or process may be "read off" its specifying model, but the language in which the model is expressed (e.g. CAD) and the language in which resources may be modelled (e.g. spreadsheets) are not naturally compatible. This thesis demonstrates that a number of intermediate steps may be introduced which enable both meaningful translation from one conceptual view to another as well as meaningful collaboration between team members. The work adopts a diagrammatic modelling approach as a natural one in an engineering context when seeking to establish a shared understanding of problems. Thus, the research question to be answered in this thesis is: "To what extent is it possible to improve user-driven software development through interaction with diagrams and without requiring users to learn particular computer languages?" The goal of the research is to improve collaborative software development through interaction with diagrams, thereby minimising the need for end-users to code directly. To achieve this aim a combination of the paradigms of End-User Programming, Process and Product Modelling and Decision Support, and Semantic Web are exploited and a methodology of User Driven Modelling and Programming (UDM/P) is developed, implemented, and tested as a means of demonstrating the efficacy of diagrammatic modelling. In greater detail, the research seeks to show that diagrammatic modelling eases problems of maintenance, extensibility, ease of use, and sharing of information. The methodology presented here to achieve this involves a three step translation from a visualised ontology, through a modelling tool, to output to interactive visualisations. An analysis of users groups them into categories of system creator, model builder, and model user.
This categorisation corresponds well with the three-step translation process where users develop the ontology, modelling tool, and visualisations for their problem. This research establishes and exemplifies a novel paradigm of collaborative end-user programming by domain experts. The end-user programmers can use a visual interface where the visualisation of the software exactly matches the structure of the software itself, making translation between user and computer, and vice versa, much more direct and practical. The visualisation is based on an ontology that provides a representation of the software as a tree. The solution is based on translation from a source tree to a result tree, and visualisation of both. The result tree shows a structured representation of the model with a full visualisation of all parts that leads to the computed result. In conclusion, it is claimed that this direct representation of the structure enables an understanding of the program as an ontology and model that is then visualised, resulting in a more transparent shared understanding by all users. It is further argued that our diagrammatic modelling paradigm consequently eases problems of maintenance, extensibility, ease of use, and sharing of information. This method is applicable to any problem that lends itself to representation as a tree. This is considered a limitation of the method to be addressed in a future project.