9,185 research outputs found

Business Rule Mining from Spreadsheets

Author: Roy Sohon
Publication date: 19/03/2015

Business rules represent the knowledge that guides the operations of a business organization. They are implemented in software applications used by organizations, and the activity of extracting them from software is known as business rule mining. It has various purposes, amongst which migration and generating documentation are the most common. However, apart from conventional software, organizations also use spreadsheets for a large part of their operations and decision-making activities. Therefore, we believe that spreadsheets are also rich in business rules. We thus propose to develop an automated system for extracting business rules from spreadsheets in a human-comprehensible natural language format. This position paper describes our motivation, the problem description, related work, and challenges we foresee. Comment: In Proceedings of the 2nd Workshop on Software Engineering Methods in Spreadsheets (http://spreadsheetlab.org/sems15/)

arXiv.org e-Print Archive

TU Delft Repository

XLIndy: interactive recognition and information extraction in spreadsheets

Author: Gonsior Julius
Koci Elvis
Kuban Dana
Lehner Wolfgang
Luetting Nico
Olwig Dominik
Romero Moral Óscar
Thiele Maik
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2019

Over the years, spreadsheets have established their presence in many domains, including business, government, and science. However, challenges arise due to spreadsheets being partially-structured and carrying implicit (visual and textual) information. This translates into a bottleneck when it comes to automatic analysis and extraction of information. Therefore, we present XLIndy, a Microsoft Excel add-in with a machine learning back-end, written in Python. It showcases our novel methods for layout inference and table recognition in spreadsheets. For a selected task and method, users can visually inspect the results, change configurations, and compare different runs. This enables iterative fine-tuning. Additionally, users can manually revise the predicted layout and tables, and subsequently save them as annotations. The latter is used to measure performance and (re-)train classifiers. Finally, data in the recognized tables can be extracted for further processing. XLIndy supports several standard formats, such as CSV and JSON. Peer Reviewed. Postprint (author's final draft)

Crossref

UPCommons. Portal del coneixement obert de la UPC

THE DECISION SUPPORT SYSTEMS FOR THE INFORMATION SOCIETY (i-Society)

Author: Virgil Chichernea

The globalization process needs exact information flows that are collected in due time. The Information Society ensures communication between people with different expertise, from various geographical areas, who have similar interests. The growth of companies’ activities implicitly increases the volume and complexity of databases. It also drives the continuous modernization of integrated information systems, so that the information requested by decision makers can be collected in due time, and leads to the frequent use of DSS. The paper presents the DSS structure, the main facilities offered by the associated software products, the evolution of database technologies, and a list of the software products used for statistical data processing and data mining in order to obtain the main sources of information needed to make decisions. Keywords: Information Society (i-Society); Data Base; Information Systems; Decision Support Systems (DSS); Statistical Package; Portal technology

Research Papers in Economics

Model inference for spreadsheets

Author: D Maier
E Visser
EF Codd
J Cunha
JD Ullman
Jorge Mendes
João Saraiva
Jácome Cunha
M Erwig
M Höst
Martin Erwig
R Alhajj
SG Powell
T Cheng
T Connolly
T Isakowitz
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Many errors in spreadsheet formulas can be avoided if spreadsheets are built automatically from higher-level models that can encode and enforce consistency constraints in the generated spreadsheets. Employing this strategy for legacy spreadsheets is difficult, because the model has to be reverse engineered from an existing spreadsheet and existing data must be transferred into the new model-generated spreadsheet. We have developed and implemented a technique that automatically infers relational schemas from spreadsheets. This technique uses particularities from the spreadsheet realm to create better schemas. We have evaluated this technique in two ways: First, we have demonstrated its applicability by using it on a set of real-world spreadsheets. Second, we have run an empirical study with users. The study has shown that the results produced by our technique are comparable to the ones developed by experts starting from the same (legacy) spreadsheet data. Although relational schemas are very useful to model data, they do not fit spreadsheets well, as they cannot express layout. Thus, we have also introduced a mapping between relational schemas and ClassSheets. A ClassSheet controls further changes to the spreadsheet and safeguards it against a large class of formula errors. The developed tool is a contribution to spreadsheet (reverse) engineering, because it fills an important gap and allows a promising design method (ClassSheets) to be applied to a huge collection of legacy spreadsheets with minimal effort.
We would like to thank Orlando Belo for his help on running and analyzing the empirical study. We would also like to thank Paulo Azevedo for his help in conducting the statistical analysis of our empirical study. We would also like to thank the anonymous reviewers for their suggestions which helped us to improve the paper.
This work is funded by ERDF - European Regional Development Fund through the COMPETE Programme (operational programme for competitiveness) and by National Funds through the FCT - Fundacao para a Ciencia e a Tecnologia (Portuguese Foundation for Science and Technology) within project FCOMP-01-0124-FEDER-010048. The first author was also supported by FCT grant SFRH/BPD/73358/2010.

Universidade do Minho: RepositoriUM

Crossref

Analyzing the solutions of DEA through information visualization and data mining techniques: SmartDEA framework

Author: Akçay Alp Eren
Büyüközkan Gülçin
Ertek Gürdal
Publication venue: 'Elsevier BV'
Publication date: 21/02/2011

Data envelopment analysis (DEA) has proven to be a useful tool for assessing efficiency or productivity of organizations, which is of vital practical importance in managerial decision making. DEA provides a significant amount of information from which analysts and managers derive insights and guidelines to improve their existing performance. Given this, effective and methodical analysis and interpretation of DEA solutions is critical. The main objective of this study is therefore to develop a general decision support system (DSS) framework to analyze the solutions of basic DEA models. The paper formally shows how the solutions of DEA models should be structured so that analysts can examine and interpret them effectively through information visualization and data mining techniques. An innovative and convenient DEA solver, SmartDEA, is designed and developed in accordance with the proposed analysis framework. The developed software provides a DEA solution which is consistent with the framework and is ready-to-analyze with data mining tools, through a table-based structure. The developed framework is tested and applied in a real world project for benchmarking the vendors of a leading Turkish automotive company. The results show the effectiveness and the efficacy of the proposed framework.

Repository TU/e

Sabanci University Research Database

Analytical Challenges in Modern Tax Administration: A Brief History of Analytics at the IRS

Author: Butler Jeff
Publication venue: Ohio State University. Moritz College of Law
Publication date: 01/01/2020

KnowledgeBank at OSU

Spreadsheet engineering

Author: Cunha Jácome Miguel Costa
Fernandes João Paulo Soares
Mendes Jorge
Saraiva João Alexandre
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

These tutorial notes present a methodology for spreadsheet engineering. First, we present data mining and database techniques to reason about spreadsheet data. These techniques are used to compute relationships between spreadsheet elements (cells/columns/rows). These relations are then used to infer a model defining the business logic of the spreadsheet. Such a model of spreadsheet data is a visual domain specific language that we embed in a well-known spreadsheet system. The embedded model is the building block to define techniques for model-driven spreadsheet development, where advanced techniques are used to guarantee the model-instance synchronization. In this model-driven environment, any user data update has to follow the model-instance conformance relation, thus guiding spreadsheet users to introduce correct data. Data refinement techniques are used to synchronize models and instances after users update/evolve the model. These notes briefly describe our model-driven spreadsheet environment, the MDSheet environment, that implements the presented methodology. To evaluate both proposed techniques and the MDSheet tool, we have conducted, in laboratory sessions, an empirical study with the summer school participants. The results of this study are presented in these notes.

Universidade do Minho: RepositoriUM