Bricklayer: An Authentic Introduction to the Functional Programming Language SML
Functional programming languages are seen by many as instrumental to
effectively utilizing the computational power of multi-core platforms. As a
result, there is growing interest in introducing functional programming and
functional thinking as early as possible within the computer science
curriculum. Bricklayer is an API, written in SML, that provides a set of
abstractions for creating LEGO artifacts which can be viewed using LEGO Digital
Designer. The goal of Bricklayer is to create a problem space (i.e., a set of
LEGO artifacts) that is accessible and engaging to programmers (especially
novice programmers) while providing an authentic introduction to the functional
programming language SML. Comment: In Proceedings TFPIE 2014, arXiv:1412.473
RDF Querying
Reactive Web systems, Web services, and Web-based publish/subscribe
systems communicate events as XML messages, and in
many cases require composite event detection: it is not sufficient to react
to single event messages, but events have to be considered in relation to
other events that are received over time.
Emphasizing language design and formal semantics, we describe the
rule-based query language XChangeEQ for detecting composite events.
XChangeEQ is designed to completely cover and integrate the four complementary
querying dimensions: event data, event composition, temporal
relationships, and event accumulation. Semantics are provided as
model and fixpoint theories; while this is an established approach for rule
languages, it has not been applied to event queries before.
Web and Semantic Web Query Languages
A number of techniques have been developed to facilitate
powerful data retrieval on the Web and Semantic Web. Three categories
of Web query languages can be distinguished, according to the format
of the data they can retrieve: XML, RDF and Topic Maps. This article
introduces the spectrum of languages falling into these categories
and summarises their salient aspects. The languages are introduced using
common sample data and query types. Key aspects of the query
languages considered are stressed in a conclusion.
Declarative visitors to ease fine-grained source code mining with full history on billions of AST nodes
Software repositories contain a vast wealth of information about software development. Mining these repositories has proven useful for detecting patterns in software development, testing hypotheses for new software engineering approaches, etc. Specifically, mining source code has yielded significant insights into software development artifacts and processes. Unfortunately, mining source code at a large scale remains a difficult task. Previous approaches had to either limit the scope of the projects studied, limit the scope of the mining task to be more coarse-grained, or sacrifice studying the history of the code due to both human and computational scalability issues. In this paper we address the substantial challenges of mining source code: a) at a very large scale; b) at a fine-grained level of detail; and c) with full history information.
To address these challenges, we present domain-specific language features for source code mining. Our language features are inspired by object-oriented visitors and provide a default depth-first traversal strategy along with two expressions for defining custom traversals. We provide an implementation of these features in the Boa infrastructure for software repository mining and describe a code generation strategy into Java code. To show the usability of our domain-specific language features, we reproduced over 40 source code mining tasks from two large-scale previous studies in just 2 person-weeks. The resulting code for these tasks shows a 2.0x to 4.8x reduction in code size. Finally, we perform a small controlled experiment to gain insights into how easily mining tasks written using our language features can be understood, with no prior training. We show that a substantial number of tasks (77%) were understood by study participants, in about 3 minutes per task.
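The visitor idea in the abstract above (a default depth-first traversal that a handler can override) can be illustrated with a minimal Python sketch. This is not Boa syntax and not the authors' implementation; the `Node`, `Visitor`, and `count_methods` names and the node kinds are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """A toy AST node; the kinds used here are hypothetical, not Boa's types."""
    kind: str
    children: List["Node"] = field(default_factory=list)


class Visitor:
    """Visits a tree depth-first unless a handler takes over the traversal."""

    def __init__(self) -> None:
        # Maps a node kind to a handler. A handler returns False to stop
        # the default descent into that node's children.
        self.before: Dict[str, Callable[[Node], bool]] = {}

    def visit(self, node: Node) -> None:
        handler = self.before.get(node.kind)
        descend = True if handler is None else handler(node)
        if descend:
            for child in node.children:
                self.visit(child)


def count_methods(root: Node) -> int:
    """Count method nodes, skipping anything nested inside a method."""
    total = 0
    visitor = Visitor()

    def on_method(node: Node) -> bool:
        nonlocal total
        total += 1
        return False  # custom traversal: do not descend into the method

    visitor.before["method"] = on_method
    visitor.visit(root)
    return total
```

Node kinds without a handler fall through to the default depth-first descent, so a mining task only needs to mention the node kinds it cares about.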
LEESA: Embedding Strategic and XPath-Like Object Structure Traversals in C++
Traversals of heterogeneous object structures are the most common operations in schema-first applications where the three key issues are (1) separation of traversal specifications from type-specific actions, (2) expressiveness and reusability of traversal specifications, and (3) supporting structure-shy traversal specifications that require minimal adaptation in the face of schema evolution. This paper presents Language for Embedded quEry and traverSAl (LEESA), which provides a generative programming approach to address the above issues. LEESA is an object structure traversal language embedded in C++. Using C++ templates, LEESA combines the expressiveness of XPath’s axes-oriented traversal notation with the genericity and programmability of Strategic Programming. LEESA uses the object structure meta-information to statically optimize the traversals and check their compatibility against the schema. Moreover, a key usability issue of domain-specific error reporting in embedded DSL languages has been addressed in LEESA through a novel application of Concepts, which is an upcoming C++ standard (C++0x) feature. We present a quantitative evaluation of LEESA illustrating how it can significantly reduce the development efforts of schema-first applications.
Parallel evaluation strategies for lazy data structures in Haskell
Conventional parallel programming is complex and error prone. To improve programmer
productivity, we need to raise the level of abstraction with a higher-level
programming model that hides many parallel coordination aspects. Evaluation
strategies use non-strictness to separate the coordination and computation aspects
of a Glasgow parallel Haskell (GpH) program. This allows the specification of high
level parallel programs, eliminating the low-level complexity of synchronisation and
communication associated with parallel programming.
This thesis employs a data-structure-driven approach for parallelism derived through
generic parallel traversal and evaluation of sub-components of data structures. We
focus on evaluation strategies over list, tree and graph data structures, allowing
re-use across applications with minimal changes to the sequential algorithm.
In particular, we develop novel evaluation strategies for tree data structures, using
core functional programming techniques for coordination control, achieving more
flexible parallelism. We use non-strictness to control parallelism more flexibly. We
apply the notion of fuel as a resource that dictates parallelism generation, in particular,
the bi-directional flow of fuel, implemented using a circular program definition,
in a tree structure as a novel way of controlling parallel evaluation. This is the first
use of circular programming in evaluation strategies and is complemented by a lazy
function for bounding the size of sub-trees.
We extend these control mechanisms to graph structures and demonstrate performance
improvements on several parallel graph traversals. We combine circularity
for control for improved performance of strategies with circularity for computation
using circular data structures. In particular, we develop a hybrid traversal strategy
for graphs, exploiting breadth-first order for exposing parallelism initially, and
then proceeding with a depth-first order to minimise overhead associated with a full
parallel breadth-first traversal.
The efficiency of the tree strategies is evaluated on a benchmark program, and
two non-trivial case studies: a Barnes-Hut algorithm for the n-body problem and
sparse matrix multiplication, both using quad-trees. We also evaluate a graph search
algorithm implemented using the various traversal strategies.
We demonstrate improved performance on a server-class multicore machine with
up to 48 cores, with the advanced fuel splitting mechanisms proving to be more
flexible in throttling parallelism. To guide the behaviour of the strategies, we develop
heuristics-based parameter selection to select their specific control parameters.
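The notion of fuel as a resource that limits how much parallelism is generated can be sketched in Python as follows. This is a sequential simulation that only counts where a parallel runtime would create sparks; it is not GpH code and does not reproduce the thesis's bi-directional or circular fuel flow. The even-split policy and the `Tree` and `sum_with_fuel` names are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Tree:
    value: int
    children: List["Tree"] = field(default_factory=list)


def sum_with_fuel(tree: Tree, fuel: int) -> Tuple[int, int]:
    """Return (sum of values, number of sparks that would be created).

    Assumed policy: one unit of fuel is spent per sparked child, and the
    leftover fuel is divided evenly among the sparked children (integer
    division, remainder discarded). Children without fuel are evaluated
    sequentially and receive no fuel of their own.
    """
    total = tree.value
    sparks = 0
    n = len(tree.children)
    if n == 0:
        return total, sparks
    sparked = min(fuel, n)  # how many children we can afford to spark
    share = (fuel - sparked) // sparked if sparked else 0
    for i, child in enumerate(tree.children):
        is_sparked = i < sparked
        sub_total, sub_sparks = sum_with_fuel(child, share if is_sparked else 0)
        total += sub_total
        sparks += sub_sparks + (1 if is_sparked else 0)
    return total, sparks
```

With zero fuel the traversal is purely sequential; with ample fuel every child is sparked, so the fuel value acts as a throttle between the two extremes.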
A TYPE ANALYSIS OF REWRITE STRATEGIES
Rewrite strategies provide an algorithmic rewriting of terms using strategic compositions of rewrite rules. Due to the programmability of rewrites, errors are often made due to incorrect compositions of rewrites or incorrect application of rewrites to a term within a strategic rewriting program. In practical applications of strategic rewriting, testing and debugging become substantially time-intensive for large programs applied to large inputs derived from large term grammars. In essence, determining which rewrite in what position in a term did or did not fire comes down to logging, tracing and/or diff-like comparison of inputs to outputs. In this thesis, we explore type-enabled analysis of strategic rewriting programs to detect errors statically. In particular, we introduce high-precision types to closely approximate the dynamic behavior of rewriting. We also use union types to track sets of types due to presence of strategic compositions. In this framework of high-precision strategic typing, we develop and implement an expressive type system for a representative strategic rewriting language TL. The results of this research are sufficiently broad to be adapted to other strategic rewriting languages. In particular, the type-inferencing algorithm does not require explicit type annotations for minimal impact on an existing language. Based on our experience with the implementation, the type system significantly reduces the time and effort to program correct rewrite strategies while performing the analysis on the order of thousands of source lines of code per second.
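To make "strategic compositions of rewrite rules" concrete, here is a minimal Python sketch of a few common strategy combinators. It is not the TL language from the thesis; the term encoding (nested tuples with a string head), the combinator names, and the `add_zero` rule are all assumptions chosen for illustration.

```python
from typing import Callable, Optional, Tuple, Union

# A term is a leaf string or a tuple (head, arg1, arg2, ...).
Term = Union[str, Tuple["Term", ...]]
# A strategy returns the rewritten term, or None to signal failure.
Strategy = Callable[[Term], Optional[Term]]


def seq(first: Strategy, second: Strategy) -> Strategy:
    """Apply first, then second; fail if either fails."""
    def run(t: Term) -> Optional[Term]:
        mid = first(t)
        return None if mid is None else second(mid)
    return run


def choice(first: Strategy, second: Strategy) -> Strategy:
    """Left-biased choice: use second only when first fails."""
    def run(t: Term) -> Optional[Term]:
        out = first(t)
        return second(t) if out is None else out
    return run


def identity(t: Term) -> Optional[Term]:
    return t


def try_(s: Strategy) -> Strategy:
    """Apply s if it succeeds, otherwise leave the term unchanged."""
    return choice(s, identity)


def all_children(s: Strategy) -> Strategy:
    """Apply s to every immediate subterm; fail if any application fails."""
    def run(t: Term) -> Optional[Term]:
        if isinstance(t, str):
            return t
        head, *args = t
        new_args = []
        for arg in args:
            result = s(arg)
            if result is None:
                return None
            new_args.append(result)
        return (head, *new_args)
    return run


def bottomup(s: Strategy) -> Strategy:
    """Rewrite all subterms first, then the term itself."""
    def run(t: Term) -> Optional[Term]:
        return seq(all_children(bottomup(s)), s)(t)
    return run


def add_zero(t: Term) -> Optional[Term]:
    """Rule: add(zero, x) -> x. Fails on any other shape."""
    if isinstance(t, tuple) and len(t) == 3 and t[0] == "add" and t[1] == "zero":
        return t[2]
    return None
```

Note that `bottomup(add_zero)` without `try_` fails on the very first leaf and returns `None` for the whole term, with no indication of where the failure occurred. Mis-compositions of this kind are what the abstract proposes to catch statically.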
Security analyses for detecting deserialisation vulnerabilities : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Computer Science at Massey University, Palmerston North, New Zealand
An important task in software security is to identify potential vulnerabilities. Attackers exploit security vulnerabilities in systems to obtain confidential information, to breach system integrity, and to make systems unavailable to legitimate users. In recent years, particularly 2012, there has been a rise in reported Java vulnerabilities. One type of vulnerability involves (de)serialisation, a commonly used feature to store objects or data structures to an external format and restore them. In 2015, a deserialisation vulnerability was reported involving Apache Commons Collections, a popular Java library, which affected numerous Java applications. Another major deserialisation-related vulnerability that affected 55% of Android devices was reported in 2015. Both of these vulnerabilities allowed arbitrary code execution on vulnerable systems by malicious users, a serious risk, and this came as a call for the Java community to issue patches to fix serialisation related vulnerabilities in both the Java Development Kit and libraries.
Despite attention to coding guidelines and defensive strategies, deserialisation remains a risky feature and a potential weakness in object-oriented applications. In fact, deserialisation related vulnerabilities (both denial-of-service and remote code execution) continue to be reported for Java applications. Further, deserialisation is a case of parsing where external data is parsed from their external representation to a program's internal data structures and hence, potentially similar vulnerabilities can be present in parsers for file formats and serialisation languages.
The problem is, given a software package, to detect either injection or denial-of-service vulnerabilities and propose strategies to prevent attacks that exploit them. The research reported in this thesis casts detecting deserialisation related vulnerabilities as a program analysis task. The goal is to automatically discover this class of vulnerabilities using program analysis techniques, and to experimentally evaluate the efficiency and effectiveness of the proposed methods on real-world software. We use multiple techniques to detect reachability to sensitive methods and taint analysis to detect if untrusted user-input can result in security violations.
Challenges in using program analysis for detecting deserialisation vulnerabilities include addressing soundness issues in analysing dynamic features in Java (e.g., native code). Another hurdle is that available techniques mostly target the analysis of applications rather than library code.
In this thesis, we develop techniques to address soundness issues related to analysing Java code that uses serialisation, and we adapt dynamic techniques such as fuzzing to address precision issues in the results of our analysis. We also use the results from our analysis to study libraries in other languages, and check if they are vulnerable to deserialisation-type attacks. We then provide a discussion on mitigation measures for engineers to protect their software against such vulnerabilities.
In our experiments, we show that we can find unreported vulnerabilities in Java code, and that these vulnerabilities are also present in widely-used serialisers for popular languages such as JavaScript, PHP and Rust. In our study, we discovered previously unknown denial-of-service security bugs in applications/libraries that parse external data formats such as YAML, PDF and SVG.