Organizing the Technical Debt Landscape

Author: Antonio Vetrò
Carolyn Seaman
Clemente Izurieta
Forrest Shull
Nico Zazworka
Yuanfang Cai
Publication venue: IEEE COMPUTER SOC
Publication date: 01/01/2012

To date, several methods and tools for detecting source code and design anomalies have been developed. While each method focuses on identifying certain classes of source code anomalies that potentially relate to technical debt (TD), the overlaps and gaps among these classes and TD have not been rigorously demonstrated. We propose to construct a seminal technical debt landscape as a way to visualize and organize research on the subject.

CiteSeerX

Crossref

Fraunhofer-ePrints

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Combining Spreadsheet Smells for Improved Fault Prediction

Author: Hofer Birgit
Jannach Dietmar
Koch Patrick
Schekotihin Konstantin
Wotawa Franz
Publication venue: Association for Computing Machinery (ACM)
Publication date: 26/05/2018

Spreadsheets are commonly used in organizations as a programming tool for business-related calculations and decision making. Since faults in spreadsheets can have severe business impacts, a number of approaches from general software engineering have been applied to spreadsheets in recent years, among them the concept of code smells. Smells can in particular be used for the task of fault prediction. An analysis of existing spreadsheet smells, however, revealed that the predictive power of individual smells can be limited. In this work we therefore propose a machine-learning-based approach which combines the predictions of individual smells by using an AdaBoost ensemble classifier. Experiments on two public datasets containing real-world spreadsheet faults show significant improvements in terms of fault prediction accuracy. Comment: 4 pages, 1 figure, to be published in 40th International Conference on Software Engineering: New Ideas and Emerging Results Track
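The idea of combining weak smell-based predictors can be illustrated with a minimal AdaBoost sketch. This is not the authors' implementation: the three binary "smell" features, the synthetic labels (a cell counts as faulty when at least two smells fire), and the function names `stump`, `train_adaboost`, and `predict` are all assumptions made for illustration. Each individual smell misclassifies 2 of the 8 toy cells, while the boosted combination of three stumps classifies all of them correctly.

```python
import math

def stump(x, j, polarity):
    """Weak learner: predict `polarity` when smell j fires, else the opposite."""
    return polarity if x[j] == 1 else -polarity

def train_adaboost(X, y, rounds=3):
    """X: binary smell vectors; y: labels in {-1, +1} (+1 = faulty)."""
    n = len(X)
    weights = [1.0 / n] * n
    ensemble = []  # list of (alpha, feature_index, polarity)
    for _ in range(rounds):
        # Pick the stump with the lowest weighted error.
        best = None
        for j in range(len(X[0])):
            for polarity in (1, -1):
                err = sum(w for w, x, t in zip(weights, X, y)
                          if stump(x, j, polarity) != t)
                if best is None or err < best[0]:
                    best = (err, j, polarity)
        err, j, polarity = best
        if err >= 0.5:
            break  # no stump is better than chance
        err = max(err, 1e-10)  # avoid division by zero for a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, j, polarity))
        # Increase the weight of misclassified samples, then renormalize.
        weights = [w * math.exp(-alpha * t * stump(x, j, polarity))
                   for w, x, t in zip(weights, X, y)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Weighted vote of all stumps."""
    score = sum(alpha * stump(x, j, p) for alpha, j, p in ensemble)
    return 1 if score >= 0 else -1

# Synthetic toy data (assumption): every combination of three smells,
# labeled faulty when at least two smells fire.
SMELL_VECTORS = [[a, b, c] for a in (0, 1) for b in (0, 1) for c in (0, 1)]
LABELS = [1 if sum(x) >= 2 else -1 for x in SMELL_VECTORS]

model = train_adaboost(SMELL_VECTORS, LABELS, rounds=3)
predictions = [predict(model, x) for x in SMELL_VECTORS]
```

In practice a library ensemble would normally replace this hand-written loop; the sketch only shows why a weighted vote can outperform any single smell.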

arXiv.org e-Print Archive

Crossref

SpreadCluster: Recovering Versioned Spreadsheets through Similarity-Based Clustering

Author: Dou Wensheng
Gao Chushu
Huang Tao
Wang Jie
Wei Jun
Xu Liang
Zhong Hua
Publication date: 27/04/2017

Version information plays an important role in spreadsheet understanding, maintenance, and quality improvement. However, end users rarely use version control tools to document spreadsheet version information. Thus, the spreadsheet version information is missing, and different versions of a spreadsheet coexist as individual and similar spreadsheets. Existing approaches try to recover spreadsheet version information by clustering these similar spreadsheets based on spreadsheet filenames or related email conversations. However, the applicability and accuracy of existing clustering approaches are limited because the necessary information (e.g., filenames and email conversations) is usually missing. We inspected the versioned spreadsheets in VEnron, which is extracted from the Enron Corporation. In VEnron, the different versions of a spreadsheet are clustered into an evolution group. We observed that the versioned spreadsheets in each evolution group exhibit certain common features (e.g., similar table headers and worksheet names). Based on this observation, we proposed an automatic clustering algorithm, SpreadCluster. SpreadCluster learns the criteria of features from the versioned spreadsheets in VEnron, and then automatically clusters spreadsheets with similar features into the same evolution group. We applied SpreadCluster to all spreadsheets in the Enron corpus. The evaluation result shows that SpreadCluster could cluster spreadsheets with higher precision and recall than the filename-based approach used by VEnron. Based on the clustering result by SpreadCluster, we further created a new versioned spreadsheet corpus, VEnron2, which is much bigger than VEnron. We also applied SpreadCluster to two other spreadsheet corpora, FUSE and EUSES. The results show that SpreadCluster can cluster the versioned spreadsheets in these two corpora with high precision. Comment: 12 pages, MSR 201
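Grouping spreadsheets by shared features such as worksheet names and table headers can be sketched as follows. This is a simplified stand-in, not SpreadCluster itself: the abstract says SpreadCluster learns its criteria from VEnron, whereas this sketch uses a fixed Jaccard-similarity threshold with single-link merging. The file names, the feature strings, the threshold of 0.5, and the function names are hypothetical.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two feature sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def cluster_by_features(sheets, threshold=0.5):
    """Group spreadsheets whose feature sets are similar enough.

    sheets: dict mapping a spreadsheet name to a set of feature strings.
    Returns a sorted list of groups, each a sorted list of names.
    """
    parent = {name: name for name in sheets}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    # Merge every pair whose similarity reaches the threshold (single link).
    for a, b in combinations(sheets, 2):
        if jaccard(sheets[a], sheets[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in sheets:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())

# Hypothetical example data: two versions of a budget and an unrelated file.
example = {
    "budget_v1.xls": {"sheet:Summary", "header:Month", "header:Cost"},
    "budget_final.xls": {"sheet:Summary", "header:Month", "header:Cost",
                         "header:Total"},
    "trades.xls": {"sheet:Trades", "header:Counterparty", "header:Volume"},
}
evolution_groups = cluster_by_features(example, threshold=0.5)
```

Here the two budget files share three of four features (similarity 0.75) and land in one group, while the unrelated file stays alone.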

arXiv.org e-Print Archive

Crossref

Enron versus EUSES: A Comparison of Two Spreadsheet Corpora

Author: Jansen Bas
Publication date: 13/03/2015

Spreadsheets are widely used within companies and often form the basis for business decisions. Numerous cases are known where incorrect information in spreadsheets has led to incorrect decisions. Such cases underline the relevance of research on the professional use of spreadsheets. Recently a new dataset became available for research, containing over 15,000 business spreadsheets that were extracted from the Enron E-mail Archive. With this dataset, we 1) aim to obtain a thorough understanding of the characteristics of spreadsheets used within companies, and 2) compare the characteristics of the Enron spreadsheets with the EUSES corpus, which is the existing state-of-the-art set of spreadsheets that is frequently used in spreadsheet studies. Our analysis shows that 1) the majority of spreadsheets are not large in terms of worksheets and formulas, do not have a high degree of coupling, and their formulas are relatively simple; 2) the spreadsheets from the EUSES corpus are, with respect to the measured characteristics, quite similar to the Enron spreadsheets. Comment: In Proceedings of the 2nd Workshop on Software Engineering Methods in Spreadsheet

arXiv.org e-Print Archive

CiteSeerX

TU Delft Repository

FigShare

An analysis of techniques and methods for technical debt management: a reflection from the architecture perspective

Author: Fernández-Sánchez Carlos
Garbajosa Sopeña Juan
Vidal Varo Carlos
Yagüe Panadero Agustín
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2015

Technical debt is a metaphor referring to the consequences of weak software development. Managing technical debt is necessary in order to keep it under control, and several techniques have been developed with the goal of accomplishing this. However, the available techniques have become dispersed and managers lack guidance. This paper addresses this gap by providing a systematic mapping of available techniques and methods for technical debt management, covering architectural debt, and by identifying existing gaps that prevent technical debt from being managed efficiently.

Crossref

Archivo Digital UPM