9,185 research outputs found

    Business Rule Mining from Spreadsheets

    Full text link
    Business rules represent the knowledge that guides the operations of a business organization. They are implemented in software applications used by organizations, and the activity of extracting them from software is known as business rule mining. It has various purposes amongst which migration and generating documentation are the most common. However, apart from conventional software, organizations also use spreadsheets for a large part of their operations and decision-making activities. Therefore we believe that spreadsheets are also rich in business rules. We thus propose to develop an automated system for extracting business rules from spreadsheets in a human comprehensible natural language format. This position paper describes our motivation, the problem description, related work, and challenges we foresee.Comment: In Proceedings of the 2nd Workshop on Software Engineering Methods in Spreadsheets (http://spreadsheetlab.org/sems15/

    XLIndy: interactive recognition and information extraction in spreadsheets

    Get PDF
    Over the years, spreadsheets have established their presence in many domains, including business, government, and science. However, challenges arise due to spreadsheets being partially-structured and carrying implicit (visual and textual) information. This translates into a bottleneck, when it comes to automatic analysis and extraction of information. Therefore, we present XLIndy, a Microsoft Excel add-in with a machine learning back-end, written in Python. It showcases our novel methods for layout inference and table recognition in spreadsheets. For a selected task and method, users can visually inspect the results, change configurations, and compare different runs. This enables iterative fine-tuning. Additionally, users can manually revise the predicted layout and tables, and subsequently save them as annotations. The latter is used to measure performance and (re-)train classifiers. Finally, data in the recognized tables can be extracted for further processing. XLIndy supports several standard formats, such as CSV and JSON.Peer ReviewedPostprint (author's final draft

    THE DECISION SUPPORT SYSTEMS FOR THE INFORMATION SOCIETY (i-Society)

    Get PDF
    The globalization process needs exact information flows that should be collected in due time. The Information Society ensures the communication between people with different expertise from various geographical areas that have similar interests. The increase of the companies’ activities leads implicitly to the increase of the volume and the complexities of databases, as well as the continuous modernization of the integrated information systems in order to collect the information in due time, that is requested by the decision takers and the frequent use of DSS. The paper presents the DSS structure, the main facilities offered by the associated software products, an evolution of the databases technologies, as well as a list of the program products used to process the statistical data and data mining in order to obtain the main sources of information that is necessary to take decisions.Information Society (i-Society); Data Base; Information Systems; Decision Support Systems (DSS); Statistical Package, Portal technology

    Model inference for spreadsheets

    Get PDF
    Many errors in spreadsheet formulas can be avoided if spreadsheets are built automati- cally from higher-level models that can encode and enforce consistency constraints in the generated spreadsheets. Employing this strategy for legacy spreadsheets is dificult, because the model has to be reverse engineered from an existing spreadsheet and existing data must be transferred into the new model-generated spreadsheet. We have developed and implemented a technique that automatically infers relational schemas from spreadsheets. This technique uses particularities from the spreadsheet realm to create better schemas. We have evaluated this technique in two ways: First, we have demonstrated its appli- cability by using it on a set of real-world spreadsheets. Second, we have run an empirical study with users. The study has shown that the results produced by our technique are comparable to the ones developed by experts starting from the same (legacy) spreadsheet data. Although relational schemas are very useful to model data, they do not t well spreadsheets as they do not allow to express layout. Thus, we have also introduced a mapping between relational schemas and ClassSheets. A ClassSheet controls further changes to the spreadsheet and safeguards it against a large class of formula errors. The developed tool is a contribution to spreadsheet (reverse) engineering, because it lls an important gap and allows a promising design method (ClassSheets) to be applied to a huge collection of legacy spreadsheets with minimal effort.We would like to thank Orlando Belo for his help on running and analyzing the empirical study. We would also like to thank Paulo Azevedo for his help in conducting the statistical analysis of our empirical study. We would also like to thank the anonymous reviewers for their suggestions which helped us to improve the paper. This work is funded by ERDF - European Regional Development Fund through the COMPETE Programme (operational programme for competitiveness) and by National Funds through the FCT - Fundacao para a Ciencia e a Tecnologia (Portuguese Foundation for Science and Technology) within project FCOMP-01-0124-FEDER-010048. The first author was also supported by FCT grant SFRH/BPD/73358/2010

    Analyzing the solutions of DEA through information visualization and data mining techniques: SmartDEA framework

    Get PDF
    Data envelopment analysis (DEA) has proven to be a useful tool for assessing efficiency or productivity of organizations, which is of vital practical importance in managerial decision making. DEA provides a significant amount of information from which analysts and managers derive insights and guidelines to promote their existing performances. Regarding to this fact, effective and methodologic analysis and interpretation of DEA solutions are very critical. The main objective of this study is then to develop a general decision support system (DSS) framework to analyze the solutions of basic DEA models. The paper formally shows how the solutions of DEA models should be structured so that these solutions can be examined and interpreted by analysts through information visualization and data mining techniques effectively. An innovative and convenient DEA solver, SmartDEA, is designed and developed in accordance with the proposed analysis framework. The developed software provides a DEA solution which is consistent with the framework and is ready-to-analyze with data mining tools, through a table-based structure. The developed framework is tested and applied in a real world project for benchmarking the vendors of a leading Turkish automotive company. The results show the effectiveness and the efficacy of the proposed framework

    Analytical Challenges in Modern Tax Administration: A Brief History of Analytics at the IRS

    Get PDF

    Spreadsheet engineering

    Get PDF
    These tutorial notes present a methodology for spreadsheet engineering. First, we present data mining and database techniques to reason about spreadsheet data. These techniques are used to compute relationships between spreadsheet elements (cells/columns/rows). These relations are then used to infer a model defining the business logic of the spreadsheet. Such a model of a spreadsheet data is a visual domain specific language that we embed in a well-known spreadsheet system. The embedded model is the building block to define techniques for modeldriven spreadsheet development, where advanced techniques are used to guarantee the model-instance synchronization. In this model-driven environment, any user data update as to follow the the model-instance conformance relation, thus, guiding spreadsheet users to introduce correct data. Data refinement techniques are used to synchronize models and instances after users update/evolve the model. These notes brie y describe our model-driven spreadsheet environment, the MDSheet environment, that implements the presented methodology. To evaluate both proposed techniques and the MDSheet tool, we have conducted, in laboratory sessions, an empirical study with the summer school participants. The results of this study are presented in these notes
    corecore