    Generalizing spreadsheet computation for evolving spreadsheets at scale

    Spreadsheets are one of the most ubiquitous tools for ad-hoc data analysis and manipulation. Their strength over traditional relational database management systems lies in letting users manipulate data interactively through an intuitive interface. However, current spreadsheet systems are limited in several dimensions when handling datasets that evolve over time: (a) limited power: it is difficult to perform relational-style queries, which are often needed for large data analysis, while keeping the convenience of formula-like automatic recalculation; (b) limited introspection: reasoning at a higher level about the source of changes between versions is often unsupported; (c) limited interactivity: computation in spreadsheets at scale can make the system unresponsive, rendering the strength of spreadsheets moot; and (d) limited structure utilization: computation in spreadsheets often fails to utilize the semi-structured nature of real-world spreadsheets. The dissertation discusses developments that overcome these hurdles. First, we discuss an extension to spreadsheet formulae that allows for relational-style queries in a manner consistent with typical formula computation engines. Second, we develop the theory of "diffing", which represents data updates in a concise manner. Third, we introduce Asynchronous Formula Computation, a technique that improves spreadsheet interactivity during formula computation while guaranteeing consistency of the results. Finally, we improve formula computation by utilizing structures of real-world spreadsheets and building a more concise representation