3 research outputs found
SpreadCluster: Recovering Versioned Spreadsheets through Similarity-Based Clustering
Version information plays an important role in understanding and maintaining
spreadsheets and in improving their quality. However, end users rarely use version
control tools to document spreadsheet version information. Thus, the
spreadsheet version information is missing, and different versions of a
spreadsheet coexist as individual and similar spreadsheets. Existing approaches
try to recover spreadsheet version information through clustering these similar
spreadsheets based on spreadsheet filenames or related email conversations.
However, the applicability and accuracy of existing clustering approaches are
limited because the necessary information (e.g., filenames and email
conversations) is usually missing. We inspected the versioned spreadsheets in
VEnron, which is extracted from the Enron Corporation. In VEnron, the different
versions of a spreadsheet are clustered into an evolution group. We observed
that the versioned spreadsheets in each evolution group exhibit certain common
features (e.g., similar table headers and worksheet names). Based on this
observation, we proposed an automatic clustering algorithm, SpreadCluster.
SpreadCluster learns the criteria of features from the versioned spreadsheets
in VEnron, and then automatically clusters spreadsheets with similar
features into the same evolution group. We applied SpreadCluster to all
spreadsheets in the Enron corpus. The evaluation result shows that
SpreadCluster could cluster spreadsheets with higher precision and recall rate
than the filename-based approach used by VEnron. Based on the clustering result
by SpreadCluster, we further created a new versioned spreadsheet corpus,
VEnron2, which is much bigger than VEnron. We also applied SpreadCluster to
two other spreadsheet corpora, FUSE and EUSES. The results show that
SpreadCluster can cluster the versioned spreadsheets in these two corpora with
high precision.
Comment: 12 pages, MSR 201
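The clustering idea in this abstract (grouping spreadsheets that share features such as worksheet names and table headers) can be illustrated with a minimal sketch. This is not the SpreadCluster implementation: the Jaccard measure, the fixed `threshold` of 0.5, the single-link grouping, and all function and file names are assumptions made for illustration, whereas the abstract states that SpreadCluster learns its feature criteria from VEnron.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two collections; two empty sets score 0.0."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(spreadsheets, threshold=0.5):
    """Group spreadsheets whose sheet names AND headers both overlap enough.

    spreadsheets: dict mapping a name to {"sheets": [...], "headers": [...]}.
    threshold: hypothetical fixed cutoff (an assumption, not a learned value).
    Returns a sorted list of sorted name lists (single-link via union-find).
    """
    parent = {name: name for name in spreadsheets}

    def find(x):
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(spreadsheets, 2):
        fa, fb = spreadsheets[a], spreadsheets[b]
        if (jaccard(fa["sheets"], fb["sheets"]) >= threshold
                and jaccard(fa["headers"], fb["headers"]) >= threshold):
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for name in spreadsheets:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical example data, invented for this sketch.
corpus = {
    "budget_v1.xls": {"sheets": ["Summary", "Q1"],
                      "headers": ["Region", "Cost", "Total"]},
    "budget_final.xls": {"sheets": ["Summary", "Q1", "Q2"],
                         "headers": ["Region", "Cost", "Total"]},
    "trades.xls": {"sheets": ["Deals"],
                   "headers": ["Counterparty", "Volume"]},
}
evolution_groups = cluster(corpus)
```

Here the two budget files end up in one group even though their filenames differ, which is the situation where a filename-based approach would struggle.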
Improving Feedback from Automated Reviews of Student Spreadsheets
Spreadsheets are one of the most widely used tools for end users. As a
result, spreadsheet applications such as Excel are now included in many curricula. However,
digital solutions for assessing spreadsheet assignments are still scarce in the
teaching context. Therefore, we have developed an Intelligent Tutoring System
(ITS) to review students' Excel submissions and provide individualized feedback
automatically. Although the lecturer only needs to provide one reference
solution, the students' submissions are analyzed automatically in several ways:
value matching, detailed analysis of the formulas, and quality assessment of
the solution. To take the students' learning level into account, we have
developed feedback levels for an ITS that provide gradually more information
about the error by using one of the different analyses. Feedback at a higher
level has been shown to lead to a higher percentage of correct submissions and
was also perceived by the students as easy to understand and helpful.
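The notion of feedback levels that reveal gradually more about an error can be sketched as follows. This is only an illustration under stated assumptions, not the authors' ITS: the three levels, the message wording, the dictionary layout for a cell, and the helper `functions_used` are all hypothetical. The quality assessment mentioned in the abstract is left out.

```python
import re


def functions_used(formula):
    """Return the set of function names (e.g. SUM, IF) called in a formula."""
    return set(re.findall(r"([A-Z][A-Z0-9.]*)\(", formula.upper()))


def feedback(reference, submission, level):
    """Compare one student cell with the reference cell.

    reference, submission: dicts with a 'value' and a 'formula' string.
    level: 1 = value matching only, 2 = adds a formula comparison,
           3 = adds a hint naming functions missing from the student formula.
    """
    if submission["value"] == reference["value"]:
        return "Correct."
    message = "The value is incorrect."
    if level >= 2:
        if not submission["formula"].startswith("="):
            message += " The cell contains a constant instead of a formula."
        elif submission["formula"] != reference["formula"]:
            message += " The formula differs from the reference solution."
    if level >= 3:
        missing = functions_used(reference["formula"]) - functions_used(
            submission["formula"])
        if missing:
            message += " Consider using: " + ", ".join(sorted(missing)) + "."
    return message


# Hypothetical reference solution and student submission for one cell.
reference_cell = {"value": 60, "formula": "=SUM(A1:A3)"}
student_cell = {"value": 30, "formula": "=A1+A2"}
```

Calling `feedback(reference_cell, student_cell, level)` with a higher `level` returns a longer message built on the same comparison, so a lecturer still supplies only one reference solution.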