2,851 research outputs found

    Business Rule Mining from Spreadsheets

    Full text link
    Business rules represent the knowledge that guides the operations of a business organization. They are implemented in software applications used by organizations, and the activity of extracting them from software is known as business rule mining. It has various purposes amongst which migration and generating documentation are the most common. However, apart from conventional software, organizations also use spreadsheets for a large part of their operations and decision-making activities. Therefore we believe that spreadsheets are also rich in business rules. We thus propose to develop an automated system for extracting business rules from spreadsheets in a human comprehensible natural language format. This position paper describes our motivation, the problem description, related work, and challenges we foresee.Comment: In Proceedings of the 2nd Workshop on Software Engineering Methods in Spreadsheets (http://spreadsheetlab.org/sems15/

    Toward Reverse Engineering of VBA Based Excel Spreadsheet Applications

    Get PDF
    Modern spreadsheet systems can be used to implement complex spreadsheet applications including data sheets, customized user forms and executable procedures written in a scripting language. These applications are often developed by practitioners that do not follow any software engineering practice and do not produce any design documentation. Thus, spreadsheet applications may be very difficult to be maintained or restructured. In this position paper we present in a nutshell two reverse engineering techniques and a tool that we are currently realizing for the abstraction of conceptual data models and business logic models.Comment: In Proceedings of the 2nd Workshop on Software Engineering Methods in Spreadsheets (http://spreadsheetlab.org/sems15/

    Model inference for spreadsheets

    Get PDF
    Many errors in spreadsheet formulas can be avoided if spreadsheets are built automati- cally from higher-level models that can encode and enforce consistency constraints in the generated spreadsheets. Employing this strategy for legacy spreadsheets is dificult, because the model has to be reverse engineered from an existing spreadsheet and existing data must be transferred into the new model-generated spreadsheet. We have developed and implemented a technique that automatically infers relational schemas from spreadsheets. This technique uses particularities from the spreadsheet realm to create better schemas. We have evaluated this technique in two ways: First, we have demonstrated its appli- cability by using it on a set of real-world spreadsheets. Second, we have run an empirical study with users. The study has shown that the results produced by our technique are comparable to the ones developed by experts starting from the same (legacy) spreadsheet data. Although relational schemas are very useful to model data, they do not t well spreadsheets as they do not allow to express layout. Thus, we have also introduced a mapping between relational schemas and ClassSheets. A ClassSheet controls further changes to the spreadsheet and safeguards it against a large class of formula errors. The developed tool is a contribution to spreadsheet (reverse) engineering, because it lls an important gap and allows a promising design method (ClassSheets) to be applied to a huge collection of legacy spreadsheets with minimal effort.We would like to thank Orlando Belo for his help on running and analyzing the empirical study. We would also like to thank Paulo Azevedo for his help in conducting the statistical analysis of our empirical study. We would also like to thank the anonymous reviewers for their suggestions which helped us to improve the paper. This work is funded by ERDF - European Regional Development Fund through the COMPETE Programme (operational programme for competitiveness) and by National Funds through the FCT - Fundacao para a Ciencia e a Tecnologia (Portuguese Foundation for Science and Technology) within project FCOMP-01-0124-FEDER-010048. The first author was also supported by FCT grant SFRH/BPD/73358/2010

    Automatically inferring ClassSheet models from spreadsheets

    Get PDF
    Many errors in spreadsheet formulas can be avoided if spreadsheets are built automatically from higher-level models that can encode and enforce consistency constraints. However, designing such models is time consuming and requires expertise beyond the knowledge to work with spreadsheets. Legacy spreadsheets pose a particular challenge to the approach of controlling spreadsheet evolution through higher-level models, because the need for a model might be overshadowed by two problems: (A) The benefit of creating a spreadsheet is lacking since the legacy spreadsheet already exists, and (B) existing data must be transferred into the new model-generated spreadsheet.To address these problems and to support the modeldriven spreadsheet engineering approach, we have developed a tool that can automatically infer ClassSheet models from spreadsheets. To this end, we have adapted a method to infer entity/relationship models from relational database to the spreadsheets/ClassSheets realm. We have implemented our techniques in the HAEXCEL framework and integrated it with the ViTSL/Gencel spreadsheet generator, which allows the automatic generation of refactored spreadsheets from the inferred ClassSheet model. The resulting spreadsheet guides further changes and provably safeguards the spreadsheet against a large class of formula errors. The developed tool is a significant contribution to spreadsheet (reverse) engineering, because it fills an important gap and allows a promising design method (ClassSheets) to be applied to a huge collection of legacy spreadsheets with minimal effort.(undefined

    Information Extraction on Para-Relational Data.

    Full text link
    Para-relational data (such as spreadsheets and diagrams) refers to a type of nearly relational data that shares the important qualities of relational data but does not present itself in a relational format. Para-relational data often conveys highly valuable information and is widely used in many different areas. If we can convert para-relational data into the relational format, many existing tools can be leveraged for a variety of interesting applications, such as data analysis with relational query systems and data integration applications. This dissertation aims to convert para-relational data into a high-quality relational form with little user assistance. We have developed four standalone systems, each addressing a specific type of para-relational data. Senbazuru is a prototype spreadsheet database management system that extracts relational information from a large number of spreadsheets. Anthias is an extension of the Senbazuru system to convert a broader range of spreadsheets into a relational format. Lyretail is an extraction system to detect long-tail dictionary entities on webpages. Finally, DiagramFlyer is a web-based search system that obtains a large number of diagrams automatically extracted from web-crawled PDFs. Together, these four systems demonstrate that converting para-relational data into the relational format is possible today, and also suggest directions for future systems.PhDComputer Science and EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/120853/1/chenzhe_1.pd
    corecore