
    Spreadsheet-based complex data transformation

    Spreadsheets are used by millions of knowledge workers as a routine, all-purpose tool for the storage, analysis, and manipulation of data. Given the ubiquity and utility of spreadsheets, it has become essential to allow data stored in spreadsheets to interact with external applications and Web services. In this dissertation, we study the problem of spreadsheet-based data transformation, which transforms spreadsheet data into the structured formats required by external applications and Web services. We propose a novel framework, TranSheet, comprising methods and tools that help both professional programmers and knowledge workers without a programming background transform spreadsheet data into structured formats effectively and easily.

    Unlike prior methods, our approach embeds the transformation logic in a familiar and expressive spreadsheet-like formula mapping language. The language supports the popular transformation patterns provided by transformation languages and mapping tools through spreadsheet formulas and functions. Furthermore, the language supports generalizing a mapping from instance-level to template-level elements, which allows the mapping to be applied to multiple spreadsheets with a similar structure.

    To enable users to reuse mappings previously specified in this language, we formulate the spreadsheet-based data transformation reuse problem and propose a solution that relies on the notions of spreadsheet templates, mapping generalization, and similarity join. Given a spreadsheet instance that is being mapped to the target schema, we efficiently and effectively recommend a list of previously specified mapping formulas that can potentially be reused for the instance.

    To make the language available to knowledge workers without a programming background, and to boost the productivity of professional programmers, we propose a number of novel end-user-oriented transformation techniques. We redesign the mapping interface of TranSheet around nested tables to make it more intuitive and easier to use. We develop a collection of form-based transformation operators that let users specify mappings graphically instead of remembering and writing complex mapping formulas.

    The approaches proposed in this dissertation have been implemented in prototypes and validated via experiments in real applications. The usability of TranSheet is also evaluated with real users.