896 research outputs found

    Bricklayer: An Authentic Introduction to the Functional Programming Language SML

    Functional programming languages are seen by many as instrumental to effectively utilizing the computational power of multi-core platforms. As a result, there is growing interest in introducing functional programming and functional thinking as early as possible within the computer science curriculum. Bricklayer is an API, written in SML, that provides a set of abstractions for creating LEGO artifacts, which can be viewed using LEGO Digital Designer. The goal of Bricklayer is to create a problem space (i.e., a set of LEGO artifacts) that is accessible and engaging to programmers (especially novice programmers) while providing an authentic introduction to the functional programming language SML. (Comment: In Proceedings TFPIE 2014, arXiv:1412.473)
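
    To give a flavour of the kind of abstraction described, here is a minimal sketch in Haskell; the real Bricklayer API is SML, and all names and types below (Brick, row, and so on) are invented for illustration, not Bricklayer's actual interface.

        data Color = Red | Blue | Yellow deriving Show

        -- A brick placement is plain data: a position in the build space,
        -- a size, and a colour. An artifact is simply a list of placements.
        data Brick = Brick { position :: (Int, Int, Int)
                           , size     :: (Int, Int, Int)
                           , color    :: Color
                           } deriving Show

        -- A flat row of 1x1x1 bricks: the kind of small exercise a novice
        -- might begin with before composing rows into walls and artifacts.
        row :: Int -> Color -> [Brick]
        row n c = [ Brick (x, 0, 0) (1, 1, 1) c | x <- [0 .. n - 1] ]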

    RDF Querying

    Reactive Web systems, Web services, and Web-based publish/subscribe systems communicate events as XML messages, and in many cases require composite event detection: it is not sufficient to react to single event messages; rather, events have to be considered in relation to other events received over time. Emphasizing language design and formal semantics, we describe the rule-based query language XChangeEQ for detecting composite events. XChangeEQ is designed to completely cover and integrate the four complementary querying dimensions: event data, event composition, temporal relationships, and event accumulation. Semantics are provided as model and fixpoint theories; while this is an established approach for rule languages, it has not previously been applied to event queries.
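
    The four querying dimensions can be pictured with a toy datatype. XChangeEQ is a rule-based language with its own syntax and formal semantics, so the Haskell fragment below is illustration only, with all names invented.

        type Time = Integer

        data Event = Event { kind :: String, time :: Time } deriving Show

        data Query
          = Match String          -- event data: select events by content
          | Both Query Query      -- event composition: relate several events
          | Before Query Query    -- temporal relationship: ordering constraint
          | AtLeast Int Query     -- event accumulation: count events over time

        -- A naive evaluator over a finite, fully received history; real
        -- composite event detection must instead work incrementally over
        -- unbounded event streams.
        holds :: Query -> [Event] -> Bool
        holds (Match k)     es = any ((== k) . kind) es
        holds (Both p q)    es = holds p es && holds q es
        holds (Before p q)  es = or [ holds p [a] && holds q [b]
                                    | a <- es, b <- es, time a < time b ]
        holds (AtLeast n p) es = length (filter (\e -> holds p [e]) es) >= n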

    Web and Semantic Web Query Languages

    A number of techniques have been developed to facilitate powerful data retrieval on the Web and Semantic Web. Three categories of Web query languages can be distinguished according to the format of the data they can retrieve: XML, RDF and Topic Maps. This article introduces the spectrum of languages falling into these categories and summarises their salient aspects. The languages are introduced using common sample data and query types. Key aspects of the query languages considered are stressed in a conclusion.

    Declarative visitors to ease fine-grained source code mining with full history on billions of AST nodes

    Software repositories contain a vast wealth of information about software development. Mining these repositories has proven useful for detecting patterns in software development, testing hypotheses for new software engineering approaches, etc. Specifically, mining source code has yielded significant insights into software development artifacts and processes. Unfortunately, mining source code at a large scale remains a difficult task. Previous approaches had to either limit the scope of the projects studied, limit the scope of the mining task to be more coarse-grained, or sacrifice studying the history of the code due to both human and computational scalability issues. In this paper we address the substantial challenges of mining source code: a) at a very large scale; b) at a fine-grained level of detail; and c) with full history information. To address these challenges, we present domain-specific language features for source code mining. Our language features are inspired by object-oriented visitors and provide a default depth-first traversal strategy along with two expressions for defining custom traversals. We provide an implementation of these features in the Boa infrastructure for software repository mining and describe a strategy for generating Java code from them. To show the usability of our domain-specific language features, we reproduced over 40 source code mining tasks from two large-scale previous studies in just 2 person-weeks. The resulting code for these tasks shows a 2.0x--4.8x reduction in code size. Finally, we perform a small controlled experiment to gain insights into how easily mining tasks written using our language features can be understood by participants with no prior training. We show that a substantial number of the tasks (77%) were understood by study participants, in about 3 minutes per task.
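
    The core idea, a visitor with a default depth-first traversal that a mining task overrides only where it cares, can be rendered in miniature. Boa tasks are written in Boa's own DSL (compiled to Java), so the Haskell sketch below, with an invented AST, is only an analogy for the mechanism.

        data Ast = Method String [Ast]   -- a method and its body
                 | Call String           -- a call site
                 | Block [Ast]           -- nested statements
          deriving Show

        -- Default traversal: visit a node, then its children, depth-first.
        visit :: (Ast -> acc -> acc) -> Ast -> acc -> acc
        visit f node acc = foldr (visit f) (f node acc) (children node)
          where
            children (Method _ body) = body
            children (Block stmts)   = stmts
            children (Call _)        = []

        -- A mining task supplies only the case it needs, e.g. counting calls.
        countCalls :: Ast -> Int
        countCalls ast = visit step ast 0
          where
            step (Call _) n = n + 1
            step _        n = n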

    LEESA: Embedding Strategic and XPath-Like Object Structure Traversals in C++

    Traversals of heterogeneous object structures are the most common operations in schema-first applications, where the three key issues are (1) separation of traversal specifications from type-specific actions, (2) expressiveness and reusability of traversal specifications, and (3) supporting structure-shy traversal specifications that require minimal adaptation in the face of schema evolution. This paper presents the Language for Embedded quEry and traverSAl (LEESA), which provides a generative programming approach to address the above issues. LEESA is an object structure traversal language embedded in C++. Using C++ templates, LEESA combines the expressiveness of XPath's axes-oriented traversal notation with the genericity and programmability of Strategic Programming. LEESA uses object structure meta-information to statically optimize traversals and check their compatibility against the schema. Moreover, the key usability issue of domain-specific error reporting in embedded DSLs has been addressed in LEESA through a novel application of Concepts, an upcoming C++ standard (C++0x) feature. We present a quantitative evaluation of LEESA illustrating how it can significantly reduce the development effort of schema-first applications.
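
    LEESA itself is a C++ template library; the Haskell fragment below only sketches the flavour of the axis-oriented traversal it embeds: composable "child" and "descendant" axes over an object structure. The Node type and the example query are invented for illustration.

        data Node = Node { tag :: String, kids :: [Node] } deriving Show

        -- XPath-like axes as list-returning functions, composable with (>>=).
        child :: String -> Node -> [Node]
        child t n = [ k | k <- kids n, tag k == t ]

        descendant :: String -> Node -> [Node]
        descendant t n = [ d | k <- kids n, d <- self k ]
          where self k = [ k | tag k == t ] ++ descendant t k

        -- In the spirit of XPath's "book//author": all authors anywhere
        -- beneath each book child of the root.
        query :: Node -> [Node]
        query root = child "book" root >>= descendant "author"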

    Parallel evaluation strategies for lazy data structures in Haskell

    Conventional parallel programming is complex and error prone. To improve programmer productivity, we need to raise the level of abstraction with a higher-level programming model that hides many parallel coordination aspects. Evaluation strategies use non-strictness to separate the coordination and computation aspects of a Glasgow parallel Haskell (GpH) program. This allows the specification of high-level parallel programs, eliminating the low-level complexity of synchronisation and communication associated with parallel programming. This thesis employs a data-structure-driven approach to parallelism, derived through generic parallel traversal and evaluation of sub-components of data structures. We focus on evaluation strategies over list, tree and graph data structures, allowing re-use across applications with minimal changes to the sequential algorithm. In particular, we develop novel evaluation strategies for tree data structures, using core functional programming techniques for coordination control and using non-strictness to control parallelism more flexibly. We apply the notion of fuel as a resource that dictates parallelism generation; in particular, the bi-directional flow of fuel in a tree structure, implemented using a circular program definition, is a novel way of controlling parallel evaluation. This is the first use of circular programming in evaluation strategies, and it is complemented by a lazy function for bounding the size of sub-trees. We extend these control mechanisms to graph structures and demonstrate performance improvements on several parallel graph traversals. We combine circularity for control, used to improve the performance of strategies, with circularity for computation, using circular data structures. In particular, we develop a hybrid traversal strategy for graphs, exploiting breadth-first order to expose parallelism initially and then proceeding in depth-first order to minimise the overhead of a full parallel breadth-first traversal. The efficiency of the tree strategies is evaluated on a benchmark program and two non-trivial case studies: a Barnes-Hut algorithm for the n-body problem and sparse matrix multiplication, both using quad-trees. We also evaluate a graph search algorithm implemented using the various traversal strategies. We demonstrate improved performance on a server-class multicore machine with up to 48 cores, with the advanced fuel-splitting mechanisms proving more flexible in throttling parallelism. To guide the behaviour of the strategies, we develop heuristics-based parameter selection to choose their specific control parameters.
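
    A minimal sketch of the fuel idea, assuming GHC's Control.Parallel.Strategies: each unit of fuel permits one spark, and fuel is split top-down between subtrees. The thesis's actual mechanisms (bi-directional fuel flow via circular programs, lazy size bounding) are considerably more refined than this.

        import Control.Parallel.Strategies (Strategy, rparWith, rseq, using)

        data Tree a = Leaf a | Branch (Tree a) (Tree a)

        -- Fuel throttles spark creation: parallelism is generated near the
        -- root and dries up further down once the fuel is spent.
        parTreeFuel :: Int -> Strategy a -> Strategy (Tree a)
        parTreeFuel _    s (Leaf x) = Leaf <$> s x
        parTreeFuel fuel s (Branch l r)
          | fuel <= 0 = Branch <$> parTreeFuel 0 s l <*> parTreeFuel 0 s r
          | otherwise = do
              l' <- rparWith (parTreeFuel half s) l     -- spark left subtree
              r' <- parTreeFuel (fuel - half - 1) s r   -- stay in this thread
              return (Branch l' r')
          where half = fuel `div` 2

        -- Usage: evaluate a tree's elements with at most 8 sparks, e.g.
        --   results = someTree `using` parTreeFuel 8 rseq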

    A Type Analysis of Rewrite Strategies

    Rewrite strategies provide an algorithmic rewriting of terms using strategic compositions of rewrite rules. Because rewrites are programmable, errors often arise from incorrect compositions of rewrites or incorrect application of a rewrite to a term within a strategic rewriting program. In practical applications of strategic rewriting, testing and debugging become substantially time-intensive for large programs applied to large inputs derived from large term grammars. In essence, determining which rewrite in what position in a term did or did not fire comes down to logging, tracing, and/or diff-like comparison of inputs to outputs. In this thesis, we explore type-enabled analysis of strategic rewriting programs to detect errors statically. In particular, we introduce high-precision types to closely approximate the dynamic behavior of rewriting. We also use union types to track sets of types arising from strategic compositions. In this framework of high-precision strategic typing, we develop and implement an expressive type system for a representative strategic rewriting language, TL. The results of this research are sufficiently broad to be adapted to other strategic rewriting languages. In particular, the type-inferencing algorithm does not require explicit type annotations, for minimal impact on an existing language. Based on our experience with the implementation, the type system significantly reduces the time and effort needed to program correct rewrite strategies while performing its analysis on the order of thousands of source lines of code per second.
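
    The failure mode such a type analysis targets can be seen in a miniature strategy language; the Haskell fragment below stands in for TL, whose actual combinators and type system are richer, and all names are invented.

        data Term = Num Int | Add Term Term | Mul Term Term deriving Show

        -- A strategy either rewrites a term or fails; failures propagate
        -- silently through compositions, which is why mis-composed rewrites
        -- are only discovered at runtime without a static type analysis.
        type Strategy = Term -> Maybe Term

        -- One rewrite rule: constant-fold addition. Applied to any other
        -- term shape it simply fails -- exactly the mismatch a
        -- high-precision type system could flag statically.
        foldAdd :: Strategy
        foldAdd (Add (Num a) (Num b)) = Just (Num (a + b))
        foldAdd _                     = Nothing

        -- Strategic composition: sequencing (both must succeed) and
        -- left-biased choice (try the first, fall back to the second).
        seqS, choiceS :: Strategy -> Strategy -> Strategy
        seqS s1 s2 t    = s1 t >>= s2
        choiceS s1 s2 t = maybe (s2 t) Just (s1 t)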

    Security analyses for detecting deserialisation vulnerabilities : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Computer Science at Massey University, Palmerston North, New Zealand

    An important task in software security is to identify potential vulnerabilities. Attackers exploit security vulnerabilities in systems to obtain confidential information, to breach system integrity, and to make systems unavailable to legitimate users. In recent years, particularly 2012, there has been a rise in reported Java vulnerabilities. One type of vulnerability involves (de)serialisation, a commonly used feature for storing objects or data structures in an external format and restoring them. In 2015, a deserialisation vulnerability was reported involving Apache Commons Collections, a popular Java library, which affected numerous Java applications. Another major deserialisation-related vulnerability, affecting 55% of Android devices, was also reported in 2015. Both of these vulnerabilities allowed arbitrary code execution on vulnerable systems by malicious users, a serious risk, and prompted the Java community to issue patches fixing serialisation-related vulnerabilities in both the Java Development Kit and libraries. Despite attention to coding guidelines and defensive strategies, deserialisation remains a risky feature and a potential weakness in object-oriented applications. In fact, deserialisation-related vulnerabilities (both denial-of-service and remote code execution) continue to be reported for Java applications. Further, deserialisation is a case of parsing, where external data is parsed from its external representation into a program's internal data structures; hence, similar vulnerabilities can be present in parsers for file formats and serialisation languages. The problem is, given a software package, to detect either injection or denial-of-service vulnerabilities and to propose strategies to prevent attacks that exploit them. The research reported in this thesis casts detecting deserialisation-related vulnerabilities as a program analysis task. The goal is to automatically discover this class of vulnerabilities using program analysis techniques, and to experimentally evaluate the efficiency and effectiveness of the proposed methods on real-world software. We use multiple techniques to detect reachability to sensitive methods, and taint analysis to detect whether untrusted user input can result in security violations. Challenges in using program analysis for detecting deserialisation vulnerabilities include addressing soundness issues in analysing dynamic features in Java (e.g., native code). Another hurdle is that available techniques mostly target the analysis of applications rather than library code. In this thesis, we develop techniques to address soundness issues related to analysing Java code that uses serialisation, and we adapt dynamic techniques such as fuzzing to address precision issues in the results of our analysis. We also use the results from our analysis to study libraries in other languages, and check whether they are vulnerable to deserialisation-type attacks. We then discuss mitigation measures engineers can use to protect their software against such vulnerabilities. In our experiments, we show that we can find unreported vulnerabilities in Java code, and that these vulnerabilities are also present in widely used serialisers for popular languages such as JavaScript, PHP and Rust. In our study, we discovered previously unknown denial-of-service security bugs in applications and libraries that parse external data formats such as YAML, PDF and SVG.
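
    The thesis works on Java and other real serialisers; as a language-neutral illustration of one mitigation it discusses, the Haskell sketch below (with an invented toy format) bounds recursion depth when parsing untrusted nested input, so adversarially deep documents are rejected rather than exhausting the stack.

        data Value = Atom Char | List [Value] deriving Show

        -- Parse a toy bracketed format, e.g. "[[x][y]]", refusing any input
        -- nested deeper than maxDepth.
        parseBounded :: Int -> String -> Maybe Value
        parseBounded maxDepth input = case go 0 input of
            Just (v, "") -> Just v
            _            -> Nothing
          where
            go d ('[':rest)
              | d >= maxDepth = Nothing       -- too deep: fail fast, no recursion
              | otherwise     = list (d + 1) rest []
            go _ (c:rest)
              | c `notElem` "[]" = Just (Atom c, rest)
            go _ _ = Nothing
            list d (']':rest) acc = Just (List (reverse acc), rest)
            list d s acc = do (v, rest) <- go d s
                              list d rest (v : acc)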