54 research outputs found

    Copy-paste Tracking: Fixing Spreadsheets Without Breaking Them

    Spreadsheets are the most popular live programming environments, but they are also notoriously fault-prone. One reason for this is that users actively rely on copy-paste to make up for the lack of abstraction mechanisms. Adding abstraction, however, introduces indirection and thus cognitive distance. In this paper we propose an alternative: copy-paste tracking. Tracking the copies that spreadsheet users make allows them to directly edit copy-pasted formulas, but instead of changing only a single instance, the changes are propagated to all formulas copied from the same source. As a result, spreadsheet users enjoy the benefits of abstraction without its drawbacks.

    Auditing spreadsheets: With or without a tool?

    Spreadsheets are known to be error-prone. Over the last decade, research has been done to determine the causes of the high rate of errors in spreadsheets. This paper examines the added value of a spreadsheet tool (PerfectXL) that visualizes spreadsheet dependencies and determines possible errors in spreadsheets by defining risk areas based on previous work. The paper first discusses the most common mistakes in spreadsheets. It then summarizes research on spreadsheet tools, focussing on the PerfectXL tool. To determine the perceptions of the usefulness of a spreadsheet tool in general and the PerfectXL tool in particular, we have shown the functionality of PerfectXL to several auditors and have also interviewed them. The results of these interviews indicate that spreadsheet tools support a more effective and efficient audit of spreadsheets; the visualization feature in particular is mentioned by the auditors as being highly supportive for their audit task, whereas the risk feature was deemed of lesser value. Comment: 15 Pages, 2 Tables, 8 Colour Figures

    Gradual Grammars: Syntax in Levels and Locales

    Programming language implementations are often one-size-fits-all. Irrespective of the ethnographic background or proficiency of their users, they offer a single, canonical syntax for all language users. Whereas professional software developers might be willing to learn a programming language all in one go, this might be a significant barrier for non-technical users, such as children who learn to program, or domain experts using domain-specific languages (DSLs). Parser tools, however, do not offer sufficient support for graduality or internationalization, leading, in the worst case, to maintaining multiple parsers, one for each target class of users. In this paper we present Fabric, a grammar formalism that supports: 1) the gradual extension with (and deprecation of) syntactic constructs in consecutive levels (“vertical”), and, orthogonally, 2) the internationalization of syntax by translating keywords and shuffling sentence order (“horizontal”). This is done in such a way that downstream language processors (compilers, interpreters, type checkers, etc.) are affected as little as possible. We discuss the design of Fabric and its implementation on top of the LARK parser generator, and how Fabric can be embedded in the Rascal language workbench. A case study on the gradual programming language Hedy shows that language levels can be represented and internationalized concisely, with hardly any duplication. We evaluate the Fabric library using the Rebel2 DSL, by translating it to Dutch, and “untranslating” its concrete syntax trees, to reuse its existing compiler. Fabric thus provides a principled approach to gradual syntax definition in levels and locales.

    Building a needs-based curriculum in data science and artificial intelligence: case studies in Indonesia, Sri Lanka, and Thailand

    Indonesia and Thailand are middle-income countries within the South-East Asia region. They have well-established and growing higher education systems, increasingly focused on quality improvement. However, they fall behind regional leaders in educating people who design, develop, deploy and train data science and artificial intelligence (DS&AI) based technology, as is evident from the technological market, which is regionally dominated by Singapore and Malaysia, while the region as a whole is far behind China. A similar situation also holds for Sri Lanka, in the South Asia region, which is technologically dominated by India. In this paper, we describe the design of a master's level curriculum in data science and artificial intelligence using European experience on building such curricula. The design of such a curriculum is a nontrivial exercise because there is a constant trade-off between having a sufficiently broad academic curriculum and adequately meeting regional needs, including those of industrial stakeholders. In fact, findings from a gap analysis and assessment of needs from three case studies in Indonesia, Sri Lanka, and Thailand comprise the most significant component of our curriculum development process. The authors would like to thank the European Union Erasmus+ programme, which provided funding through the Capacity Building Higher Education project on Curriculum Development in Data Science and Artificial Intelligence, registered under the reference number 599600-EPP-1-2018-1-TH-EPPKA2-CBHE-JP.

    Enron

    Spreadsheets are used extensively in business processes around the world and are, as such, a topic of research interest. Over the past few years, many spreadsheet studies have been performed on the EUSES spreadsheet corpus. While this corpus has served the spreadsheet community well, the spreadsheets it contains are mainly gathered with search engines and as such do not represent spreadsheets used in companies. This paper presents a new dataset, extracted from the Enron Email Archive, containing over 15,000 spreadsheets used within the Enron Corporation. In addition to the spreadsheets, we also present an analysis of the associated emails, where we look into spreadsheet-specific email behavior. Our analysis shows that 1) 24% of Enron spreadsheets with at least one formula contain an Excel error, 2) there is little diversity in the functions used in spreadsheets: 76% of spreadsheets in the presented corpus only use the same 15 functions, and 3) the spreadsheets are substantially more smelly than those in the EUSES corpus, especially in terms of long calculation chains. Regarding the emails, we observe that 1) spreadsheets are a frequent topic of email conversation, with 10% of emails either sending or referring to spreadsheets, and 2) the emails frequently discuss errors in and updates to spreadsheets.

    Peter Hilton on Naming


    Data Clone Detection and Visualization in Spreadsheets

    Spreadsheets are widely used in industry: it is estimated that end-user programmers outnumber programmers by a factor of 5. However, spreadsheets are error-prone; numerous companies have lost money because of spreadsheet errors. One of the causes of spreadsheet problems is the prevalence of copy-pasting. In this paper, we study this cloning in spreadsheets. Based on existing text-based clone detection algorithms, we have developed an algorithm to detect data clones in spreadsheets: formulas whose values are copied as plain text in a different location. To evaluate the usefulness of the proposed approach, we conducted two evaluations: a quantitative evaluation in which we analyzed the EUSES corpus, and a qualitative evaluation consisting of two case studies. The results of the evaluation clearly indicate that 1) data clones are common, 2) data clones pose threats similar to those code clones pose, and 3) our approach supports users in finding and resolving data clones.