Business Rule Mining from Spreadsheets
Business rules represent the knowledge that guides the operations of a
business organization. They are implemented in software applications used by
organizations, and the activity of extracting them from software is known as
business rule mining. It serves various purposes, of which migration and
documentation generation are the most common. However, apart from conventional
software, organizations also use spreadsheets for a large part of their
operations and decision-making activities. Therefore, we believe that
spreadsheets are also rich in business rules. We thus propose to develop an
automated system for extracting business rules from spreadsheets in a
human-comprehensible natural language format. This position paper describes our
motivation, the problem description, related work, and challenges we foresee.
Comment: In Proceedings of the 2nd Workshop on Software Engineering Methods in
Spreadsheets (http://spreadsheetlab.org/sems15/
Toward Reverse Engineering of VBA Based Excel Spreadsheet Applications
Modern spreadsheet systems can be used to implement complex spreadsheet
applications including data sheets, customized user forms and executable
procedures written in a scripting language. These applications are often
developed by practitioners who do not follow any software engineering practice
and do not produce any design documentation. Thus, spreadsheet applications may
be very difficult to maintain or restructure. In this position paper we
briefly present two reverse engineering techniques and a tool that we are
currently developing for the abstraction of conceptual data models and business
logic models.
Comment: In Proceedings of the 2nd Workshop on Software Engineering Methods in
Spreadsheets (http://spreadsheetlab.org/sems15/
Model inference for spreadsheets
Many errors in spreadsheet formulas can be avoided if spreadsheets are built automatically
from higher-level models that can encode and enforce consistency constraints in the generated
spreadsheets. Employing this strategy for legacy spreadsheets is difficult, because the model has
to be reverse engineered from an existing spreadsheet and existing data must be transferred into
the new model-generated spreadsheet.
We have developed and implemented a technique that automatically infers relational schemas
from spreadsheets. This technique uses particularities from the spreadsheet realm to create better
schemas. We have evaluated this technique in two ways: First, we have demonstrated its
applicability by using it on a set of real-world spreadsheets. Second, we have run an empirical study
with users. The study has shown that the results produced by our technique are comparable to
the ones developed by experts starting from the same (legacy) spreadsheet data.
Although relational schemas are very useful to model data, they do not fit spreadsheets well, as
they cannot express layout. Thus, we have also introduced a mapping between relational
schemas and ClassSheets. A ClassSheet controls further changes to the spreadsheet and safeguards
it against a large class of formula errors. The developed tool is a contribution to spreadsheet
(reverse) engineering, because it fills an important gap and allows a promising design method
(ClassSheets) to be applied to a huge collection of legacy spreadsheets with minimal effort.
We would like to thank Orlando Belo for his help on running and analyzing the empirical study. We would also like to thank Paulo Azevedo for his help in conducting the statistical analysis of our empirical study. We would also like to thank the anonymous reviewers for their suggestions which helped us to improve the paper. This work is funded by ERDF - European Regional Development Fund through the COMPETE Programme (operational programme for competitiveness) and by National Funds through the FCT - Fundacao para a Ciencia e a Tecnologia (Portuguese Foundation for Science and Technology) within project FCOMP-01-0124-FEDER-010048. The first author was also supported by FCT grant SFRH/BPD/73358/2010.
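As a rough illustration of what inferring a relational schema from spreadsheet data can involve, the sketch below checks which single-column functional dependencies hold in a small table. This is an assumption made for illustration only: the abstract above does not spell out the inference algorithm, and the function name `functional_dependencies` and the sample data are hypothetical.

```python
from itertools import permutations


def functional_dependencies(header, rows):
    """Return (lhs, rhs) column-name pairs where every lhs value maps to one rhs value.

    Hypothetical helper: a naive check over all ordered column pairs,
    not the technique evaluated in the paper.
    """
    found = []
    for i, j in permutations(range(len(header)), 2):
        seen = {}
        holds = True
        for row in rows:
            # setdefault stores the first rhs value seen for this lhs value
            if seen.setdefault(row[i], row[j]) != row[j]:
                holds = False
                break
        if holds:
            found.append((header[i], header[j]))
    return found


# Hypothetical spreadsheet contents: one header row and three data rows.
header = ["owner_id", "owner_name", "property", "rent"]
rows = [
    (1, "Ana", "Flat A", 500),
    (1, "Ana", "Flat B", 650),
    (2, "Rui", "Flat C", 500),
]
deps = functional_dependencies(header, rows)
```

Dependencies such as `owner_id -> owner_name` hint that owner data could be split into its own table, which is the kind of evidence a schema-inference step might use.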
Automatically inferring ClassSheet models from spreadsheets
Many errors in spreadsheet formulas can be avoided if spreadsheets are built automatically from higher-level models that can encode and enforce consistency constraints.
However, designing such models is time-consuming and requires expertise beyond the knowledge needed to work with spreadsheets. Legacy spreadsheets pose a particular challenge to the approach of controlling spreadsheet evolution through higher-level models, because the need for a model might be overshadowed by two problems: (A) the benefit of creating a spreadsheet is lacking since the legacy spreadsheet already exists, and (B) existing data must be transferred into the new model-generated spreadsheet.
To address these problems and to support the model-driven spreadsheet engineering approach, we have developed a tool that can automatically infer ClassSheet models from spreadsheets. To this end, we have adapted a method to infer entity/relationship models from relational databases to the spreadsheets/ClassSheets realm. We have implemented our techniques in the HAEXCEL framework and integrated them with the ViTSL/Gencel spreadsheet generator, which allows the automatic generation of refactored spreadsheets from the inferred ClassSheet model. The resulting spreadsheet guides further changes and provably safeguards the spreadsheet against a large class of formula errors. The developed tool is a significant contribution to spreadsheet (reverse) engineering, because it fills an important gap and allows a promising design method (ClassSheets) to be applied to a huge collection of legacy spreadsheets with minimal effort.
Information Extraction on Para-Relational Data.
Para-relational data (such as spreadsheets and diagrams) refers to a type of nearly
relational data that shares the important qualities of relational data but does not
present itself in a relational format. Para-relational data often conveys highly valuable
information and is widely used in many different areas. If we can convert para-relational
data into the relational format, many existing tools can be leveraged for a
variety of interesting applications, such as data analysis with relational query systems
and data integration applications.
This dissertation aims to convert para-relational data into a high-quality relational
form with little user assistance. We have developed four standalone systems, each
addressing a specific type of para-relational data. Senbazuru is a prototype spreadsheet
database management system that extracts relational information from a large
number of spreadsheets. Anthias is an extension of the Senbazuru system to convert
a broader range of spreadsheets into a relational format. Lyretail is an extraction
system to detect long-tail dictionary entities on webpages. Finally, DiagramFlyer is
a web-based search system that obtains a large number of diagrams automatically
extracted from web-crawled PDFs. Together, these four systems demonstrate that
converting para-relational data into the relational format is possible today, and also
suggest directions for future systems.
PhD
Computer Science and Engineering
University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/120853/1/chenzhe_1.pd
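To make the idea of converting para-relational data into a relational format concrete, the following minimal sketch flattens a spreadsheet with indented, hierarchical row labels into flat tuples. It is an illustration under stated assumptions, not the method used by Senbazuru or Anthias: the input encoding `(indent_level, label, value)`, the function name `flatten_hierarchy`, and the sample rows are all hypothetical.

```python
def flatten_hierarchy(rows):
    """Turn (indent_level, label, value) spreadsheet rows into relational tuples.

    Rows whose value is None are treated as group headers; each data row is
    emitted together with the labels of the headers that enclose it.
    """
    path = []    # labels of the currently open header levels
    tuples = []
    for level, label, value in rows:
        del path[level:]            # close headers at this depth or deeper
        if value is None:
            path.append(label)      # a header row opens a new group
        else:
            tuples.append((*path, label, value))
    return tuples


# Hypothetical sheet: nested headers followed by data rows.
sheet = [
    (0, "Michigan", None),
    (1, "Male", None),
    (2, "Age 20-29", 120),
    (2, "Age 30-39", 95),
    (1, "Female", None),
    (2, "Age 20-29", 130),
]
relation = flatten_hierarchy(sheet)
```

Each resulting tuple carries its full header path, so the output can be loaded into a relational table and queried with standard tools.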