Structured Spreadsheet Modeling and Implementation

Author: Mireault Paul
Publication date: 05/06/2015

Developing an error-free spreadsheet has been a problem since the beginning of end-user computing. In this paper, we present a methodology that separates the modeling from the implementation. Using proven techniques from Information Systems and Software Engineering, we present strict, but simple, rules governing the implementation from the model. The resulting spreadsheet should be easier to understand, audit and maintain. Comment: In Proceedings of the 2nd Workshop on Software Engineering Methods in Spreadsheet

arXiv.org e-Print Archive

CiteSeerX

Static Partitioning of Spreadsheets for Parallel Execution

Author: D Cann
D Leijen
F Biermann
P Sestoft
V Sarkar
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 19/12/2018

Crossref

The IT University of Copenhagen's Repository

Enron versus EUSES: A Comparison of Two Spreadsheet Corpora

Author: Jansen Bas
Publication date: 13/03/2015

Spreadsheets are widely used within companies and often form the basis for business decisions. Numerous cases are known where incorrect information in spreadsheets has led to incorrect decisions. Such cases underline the relevance of research on the professional use of spreadsheets. Recently a new dataset became available for research, containing over 15,000 business spreadsheets that were extracted from the Enron E-mail Archive. With this dataset, we 1) aim to obtain a thorough understanding of the characteristics of spreadsheets used within companies, and 2) compare the characteristics of the Enron spreadsheets with the EUSES corpus, which is the existing state-of-the-art set of spreadsheets that is frequently used in spreadsheet studies. Our analysis shows that 1) the majority of spreadsheets are not large in terms of worksheets and formulas, do not have a high degree of coupling, and their formulas are relatively simple; 2) the spreadsheets from the EUSES corpus are, with respect to the measured characteristics, quite similar to the Enron spreadsheets. Comment: In Proceedings of the 2nd Workshop on Software Engineering Methods in Spreadsheet

arXiv.org e-Print Archive

CiteSeerX

TU Delft Repository

FigShare

Rewriting High-Level Spreadsheet Structures into Higher-Order Functional Programs

Author: Biermann Florian
Dou Wensheng
Sestoft Peter
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 22/12/2017

Crossref

The IT University of Copenhagen's Repository

SpreadCluster: Recovering Versioned Spreadsheets through Similarity-Based Clustering

Author: Dou Wensheng
Gao Chushu
Huang Tao
Wang Jie
Wei Jun
Xu Liang
Zhong Hua
Publication date: 27/04/2017

Version information plays an important role in spreadsheet understanding, maintenance, and quality improvement. However, end users rarely use version control tools to document spreadsheet version information. Thus, the spreadsheet version information is missing, and different versions of a spreadsheet coexist as individual and similar spreadsheets. Existing approaches try to recover spreadsheet version information by clustering these similar spreadsheets based on spreadsheet filenames or related email conversations. However, the applicability and accuracy of existing clustering approaches are limited because the necessary information (e.g., filenames and email conversations) is usually missing. We inspected the versioned spreadsheets in VEnron, which is extracted from the Enron Corporation. In VEnron, the different versions of a spreadsheet are clustered into an evolution group. We observed that the versioned spreadsheets in each evolution group exhibit certain common features (e.g., similar table headers and worksheet names). Based on this observation, we proposed an automatic clustering algorithm, SpreadCluster. SpreadCluster learns the criteria of features from the versioned spreadsheets in VEnron, and then automatically clusters spreadsheets with similar features into the same evolution group. We applied SpreadCluster to all spreadsheets in the Enron corpus. The evaluation result shows that SpreadCluster could cluster spreadsheets with higher precision and recall than the filename-based approach used by VEnron. Based on the clustering result by SpreadCluster, we further created a new versioned spreadsheet corpus, VEnron2, which is much bigger than VEnron. We also applied SpreadCluster to the other two spreadsheet corpora, FUSE and EUSES. The results show that SpreadCluster can cluster the versioned spreadsheets in these two corpora with high precision. Comment: 12 pages, MSR 201

arXiv.org e-Print Archive

Crossref

Spreadsheet Explanation Through Table Abstraction

Author: Dan Mihai
Publication venue: 'Oregon State University'

Spreadsheets are a pervasive technology in both personal and industrial use. Often, the user is not the author, which contributes to a lack of understanding of the purpose and functionality of a spreadsheet. This lack of understanding is, in turn, a major reason for mistakes in the use and maintenance of spreadsheets. I present an approach, called explanation sheets, which eases the understanding and maintenance of spreadsheets. I identify the notion of explanation soundness and show that explanation sheets which conform to simple rules of formula convergence provide sound explanations. I also present a practical evaluation of explanation sheets based on samples drawn from widely used spreadsheet corpora and based on a small user study. In addition to facilitating the understanding of spreadsheets, I describe the process of inferring explanation sheets from a spreadsheet. By means of assessing example spreadsheets, I present a set of inference rules to describe the relationship between a spreadsheet and its explanation

ScholarsArchive@OSU