Spreadsheets are widely used within companies and often form the basis for
business decisions. Numerous cases are known where incorrect information in
spreadsheets has lead to incorrect decisions. Such cases underline the
relevance of research on the professional use of spreadsheets.
Recently a new dataset became available for research, containing over 15.000
business spreadsheets that were extracted from the Enron E-mail Archive. With
this dataset, we 1) aim to obtain a thorough understanding of the
characteristics of spreadsheets used within companies, and 2) compare the
characteristics of the Enron spreadsheets with the EUSES corpus which is the
existing state of the art set of spreadsheets that is frequently used in
spreadsheet studies.
Our analysis shows that 1) the majority of spreadsheets are not large in
terms of worksheets and formulas, do not have a high degree of coupling, and
their formulas are relatively simple; 2) the spreadsheets from the EUSES corpus
are, with respect to the measured characteristics, quite similar to the Enron
spreadsheets.Comment: In Proceedings of the 2nd Workshop on Software Engineering Methods in
Spreadsheet