SpreadCluster: Recovering Versioned Spreadsheets through Similarity-Based Clustering
Version information plays an important role in spreadsheet understanding,
maintenance, and quality improvement. However, end users rarely use version
control tools to document spreadsheet version information. Thus, the
spreadsheet version information is missing, and different versions of a
spreadsheet coexist as individual and similar spreadsheets. Existing approaches
try to recover spreadsheet version information through clustering these similar
spreadsheets based on spreadsheet filenames or related email conversations.
However, the applicability and accuracy of existing clustering approaches are
limited because the necessary information (e.g., filenames and email
conversations) is usually missing. We inspected the versioned spreadsheets in
VEnron, which is extracted from the Enron Corporation. In VEnron, the different
versions of a spreadsheet are clustered into an evolution group. We observed
that the versioned spreadsheets in each evolution group exhibit certain common
features (e.g., similar table headers and worksheet names). Based on this
observation, we proposed an automatic clustering algorithm, SpreadCluster.
SpreadCluster learns the criteria of features from the versioned spreadsheets
in VEnron, and then automatically clusters spreadsheets with similar
features into the same evolution group. We applied SpreadCluster to all
spreadsheets in the Enron corpus. The evaluation result shows that
SpreadCluster could cluster spreadsheets with higher precision and recall rate
than the filename-based approach used by VEnron. Based on the clustering result
by SpreadCluster, we further created a new versioned spreadsheet corpus,
VEnron2, which is much bigger than VEnron. We also applied SpreadCluster to
two other spreadsheet corpora, FUSE and EUSES. The results show that
SpreadCluster can cluster the versioned spreadsheets in these two corpora with
high precision.
Comment: 12 pages, MSR 201
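The feature-based grouping described in the abstract above can be illustrated with a minimal sketch. It is not the SpreadCluster algorithm itself: the Jaccard measure, the single-link grouping, the fixed `threshold`, and the sample filenames and features are all assumptions made for illustration, whereas the paper learns its criteria from VEnron.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as dissimilar."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_by_features(sheets, threshold=0.5):
    """Group spreadsheets whose feature sets overlap at or above threshold.

    sheets maps a spreadsheet id to a set of feature strings, for example
    worksheet names and table headers. The threshold is a hypothetical
    fixed value, not one learned from VEnron.
    """
    parent = {name: name for name in sheets}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Single-link grouping: any sufficiently similar pair joins two groups.
    for a, b in combinations(sheets, 2):
        if jaccard(sheets[a], sheets[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in sheets:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical spreadsheets with made-up features.
example = {
    "budget_v1.xls": {"sheet:Summary", "header:Month", "header:Cost"},
    "budget_final.xls": {"sheet:Summary", "header:Month", "header:Cost", "header:Total"},
    "trades.xls": {"sheet:Deals", "header:Counterparty", "header:Volume"},
}
groups = cluster_by_features(example)
print(groups)
```

In this toy input the two budget files share three of four features and land in one group, while the trades file stays on its own.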
Automated Refactoring of Nested-IF Formulae in Spreadsheets
Spreadsheets are the most popular end-user programming software, where
formulae act like programs and also have smells. One well-recognized common
smell of spreadsheet formulae is the nest-IF expression, which has low
readability and high cognitive cost for users, and is error-prone during reuse
or maintenance. However, end users usually lack the essential programming language
knowledge and skills to tackle or even recognize the problem. Previous
research has made only initial attempts in this direction, and no effective
automated approach is currently available.
This paper is the first to propose an AST-based automated approach for systematically
refactoring nest-IF formulae. The general idea is two-fold. First, we detect
and remove logic redundancy on the AST. Second, we identify higher-level
semantics that have been fragmented and scattered, and reassemble the syntax
using concise built-in functions. A comprehensive evaluation has been conducted
against a real-world spreadsheet corpus, collected at a leading IT
company for research purposes. Results on over 68,000 spreadsheets containing 27
million nest-IF formulae show that our approach is able to relieve the smell
of over 99% of nest-IF formulae. Over 50% of the refactorings have reduced
the nesting levels of the nest-IFs by more than half. In addition, a survey
involving 49 participants indicates that in most cases the participants prefer
the refactored formulae, and agree that such an automated refactoring approach
is necessary and helpful.
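The first step named in the abstract above, removing logic redundancy on the AST, can be sketched roughly as follows. This is an illustrative toy and not the authors' implementation: the tuple-based AST, the `simplify` and `depth` helpers, and the sample formula are assumptions, and conditions are compared purely as text.

```python
def simplify(node, known=None):
    """Remove redundant IFs from a tiny formula AST.

    An IF node is the tuple ("IF", cond, then_branch, else_branch); any
    other value is a leaf. `known` maps conditions already decided on the
    path from the root to True or False. Conditions are matched as plain
    strings, which assumes they are deterministic (no volatile functions).
    """
    known = known or {}
    if not (isinstance(node, tuple) and node and node[0] == "IF"):
        return node
    _, cond, then_branch, else_branch = node
    if cond in known:
        # The same condition was tested higher up, so only one branch is reachable.
        return simplify(then_branch if known[cond] else else_branch, known)
    then_s = simplify(then_branch, {**known, cond: True})
    else_s = simplify(else_branch, {**known, cond: False})
    if then_s == else_s:
        # Both branches agree, so the test is unnecessary.
        return then_s
    return ("IF", cond, then_s, else_s)


def depth(node):
    """Nesting level of IF nodes in the AST."""
    if not (isinstance(node, tuple) and node and node[0] == "IF"):
        return 0
    return 1 + max(depth(node[2]), depth(node[3]))


# Hypothetical formula: IF(A1>0, IF(A1>0, "pos", "never"), IF(B1=1, "x", "x"))
formula = ("IF", "A1>0", ("IF", "A1>0", "pos", "never"), ("IF", "B1=1", "x", "x"))
refactored = simplify(formula)
print(refactored, depth(formula), depth(refactored))
```

Here the repeated test on A1>0 and the IF whose branches are identical both disappear, which halves the nesting level of this particular formula.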
Business Rule Mining from Spreadsheets
Business rules represent the knowledge that guides the operations of a
business organization. They are implemented in software applications used by
organizations, and the activity of extracting them from software is known as
business rule mining. It serves various purposes, among which migration and
documentation generation are the most common. However, apart from conventional
software, organizations also use spreadsheets for a large part of their
operations and decision-making activities. Therefore, we believe that
spreadsheets are also rich in business rules. We thus propose to develop an
automated system for extracting business rules from spreadsheets in a
human-comprehensible natural language format. This position paper describes our
motivation, the problem description, related work, and challenges we foresee.
Comment: In Proceedings of the 2nd Workshop on Software Engineering Methods in
Spreadsheets (http://spreadsheetlab.org/sems15/)
Copy-paste Tracking: Fixing Spreadsheets Without Breaking Them
Spreadsheets are the most popular live programming environments, but they are also notoriously fault-prone. One reason for this is that users actively rely on copy-paste to make up for the lack of abstraction mechanisms. Adding abstraction, however, introduces indirection and thus cognitive distance. In this paper we propose an alternative: copy-paste tracking. Tracking the copies that spreadsheet users make allows them to directly edit copy-pasted formulas, but instead of changing only a single instance, the changes will be propagated to all formulas copied from the same source. As a result, spreadsheet users will enjoy the benefits of abstraction without its drawbacks.
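The propagation idea in the abstract above can be sketched with a toy data structure. This is only an illustration under stated assumptions, not the authors' tool: the `TrackedSheet` class and its methods are hypothetical, and formulas are written in R1C1 style so that every copy has identical text.

```python
class TrackedSheet:
    """Toy sheet that remembers which cells were copied from the same source."""

    def __init__(self):
        self.formulas = {}  # cell address -> formula text (R1C1 style)
        self.origin = {}    # cell address -> address of the original source cell

    def set_formula(self, cell, formula):
        self.formulas[cell] = formula
        self.origin[cell] = cell  # a freshly typed formula is its own source

    def copy(self, src, dst):
        self.formulas[dst] = self.formulas[src]
        self.origin[dst] = self.origin[src]  # copies of copies share the first source

    def edit(self, cell, formula):
        """Edit one cell and propagate the change to every tracked copy."""
        root = self.origin[cell]
        for other, source in self.origin.items():
            if source == root:
                self.formulas[other] = formula


sheet = TrackedSheet()
sheet.set_formula("C1", "=RC[-2]+RC[-1]")
sheet.copy("C1", "C2")
sheet.copy("C2", "C3")
sheet.set_formula("D1", "=RC[-1]*2")

# Editing any one copy updates all cells that came from C1; D1 is unrelated.
sheet.edit("C3", "=RC[-2]-RC[-1]")
print(sheet.formulas)
```

The user edits a formula directly in C3, as the abstract describes, and the shared origin plays the role that a named abstraction would otherwise play.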