303 research outputs found

    Evil Pickles: DoS Attacks Based on Object-Graph Engineering

    Get PDF
    In recent years, multiple vulnerabilities exploiting the serialisation APIs of various programming languages, including Java, have been discovered. These vulnerabilities can be used to devise in- jection attacks, exploiting the presence of dynamic programming language features like reflection or dynamic proxies. In this paper, we investigate a new type of serialisation-related vulnerabilit- ies for Java that exploit the topology of object graphs constructed from classes of the standard library in a way that deserialisation leads to resource exhaustion, facilitating denial of service attacks. We analyse three such vulnerabilities that can be exploited to exhaust stack memory, heap memory and CPU time. We discuss the language and library design features that enable these vulnerabilities, and investigate whether these vulnerabilities can be ported to C#, Java- Script and Ruby. We present two case studies that demonstrate how the vulnerabilities can be used in attacks on two widely used servers, Jenkins deployed on Tomcat and JBoss. Finally, we propose a mitigation strategy based on contract injection

    Instant Pickles: Generating Object-Oriented Pickler Combinators for Fast and Extensible Serialization

    Get PDF
    As more applications migrate to the cloud, and as “big data” edges into even more production environments, the performance and simplicity of exchanging data between compute nodes/devices is increasing in importance. An issue central to distributed programming, yet often under-considered, is serialization or pickling, i.e., persisting runtime objects by converting them into a binary or text representation. Pickler combinators are a popular approach from functional programming; their composability alleviates some of the tedium of writing pickling code by hand, but they don’t translate well to object-oriented programming due to qualities like open class hierarchies and subtyping polymorphism. Furthermore, both functional pickler combinators and popular, Java-based serialization frameworks tend to be tied to a specific pickle format, leaving programmers with no choice of how their data is persisted. In this paper, we present object-oriented pickler combinators and a framework for generating them at compile-time, called scala/pickling, designed to be the default serialization mechanism of the Scala programming language. The static generation of OO picklers enables significant performance improvements, outperforming Java and Kryo in most of our benchmarks. In addition to high performance and the need for little to no boilerplate, our framework is extensible: using the type class pattern, users can provide both (1) custom, easily interchangeable pickle formats and (2) custom picklers, to override the default behavior of the pickling framework. In benchmarks, we compare scala/pickling with other popular industrial frameworks, and present results on time, memory usage, and size when pickling/unpickling a number of data types used in real-world, large-scale distributed applications and frameworks

    Typed open programming : a higher-order, typed approach to dynamic modularity and distribution

    Get PDF
    In this dissertation we develop an approach for reconciling open programming the development of programs that support dynamic exchange of higher-order values with other processes with strong static typing in programming languages. We present the design of a concrete programming language, Alice ML, that consists of a conventional functional language extended with a set of orthogonal features like higher-order modules, dynamic type checking, higher-order serialisation, and concurrency. On top of these a flexible system of dynamic components and a simple but expressive notion of distribution is realised. The central concept in this design is the package, a first-class value embedding a module along with its interface type, which is dynamically checked whenever the module is extracted. Furthermore, we develop a formal model for abstract types that is not invalidated by the presence of primitives for dynamic type inspection, as is the case for the standard model based on existential quantification. For that purpose, we present an idealised language in form of an extended -calculus, which can express dynamic generation of types. This calculus is the first to combine and explore the interference of sealing and type inspection with higher-order singleton kinds, a feature for expressing sharing constraints on abstract types. A novel notion of abstracton kinds classifies abstract types. Higher-order type and kind coercions allow for modular translucent encapsulation of values at arbitrary type.In dieser Dissertation entwickeln wir einen programmiersprachlichen Ansatz zur Verbindung offener Programmierung der Entwicklung von Programmen, die das dynamische Laden und Austauschen höherstufiger Werte mit anderen Prozessen erlauben mit starker statischer Typisierung. Wir stellen das Design einer konkreten Programmiersprache namens Alice ML vor. Sie besteht aus einer konventionellen funktionalen Sprache, die um einen Satz orthogonaler Konzepte wie höherstufige Modularisierung, dynamische Typüberprüfung, höherstufige Serialisierung und Nebenläufigkeit erweitert wurde. Darauf aufbauend ist ein flexibles System dynamischer Komponenten sowie ein einfacher aber expressiver Ansatz fur Verteilung verwirklicht. Zentral ist dabei das Konzept eines Pakets (package), welches ein Modul in Kombination mit seinem Schnittstellentyp in einen Wert einbettet, und bei der Extraktion des Moduls eine dynamische Typüberprüfung vornimmt. Weiterhin entwickeln wir einen theoretischen Ansatz zur Modellierung von abstrakten Typen, welcher im Gegensatz zum herkömmlichen formalen Modell existentieller Quantifizierung auch in Gegenwart dynamischer Typinspektion gültig ist. Zu diesem Zweck definieren wir eine idealisierte Sprache in Form eines erweiterten λ-Kalküls, der dynamische Typgenerierung ausdrucken kann. Der Kalkül kombiniert diese erstmals mit höherstufigen Singleton Kinds, einem Sprachkonstrukt, welches Gleichheit von Typen ausdrücken kann. Zur Klassifizierung abstrakter Typen werden Abstraktions-Kinds als verwandtes Konzept entwickelt. Höherstufige Konversionen auf Term- und Typebene erlauben zudem die nachträgliche modulare Enkapsulierung von Werten beliebigen Typs

    A grid middleware for distributed Java computing with MPI binding and process migration supports

    Get PDF
    "Grid" computing has emerged as an important new research field. With years of efforts, grid researchers have successfully developed grid technologies including security solutions, resource management protocols, information query protocols, and data management services. However, as the ultimate goal of grid computing is to design an infrastructure which supports dynamic, cross-organizational resource sharing, there is a need of solutions for efficient and transparent task re-scheduling in the grid. In this research, a new grid middleware is proposed, called G-JavaMPI. This middleware adds the parallel computing capability of Java to the grid with the support of a Grid-enabled message passing interface (MPI) for inter-process communication between Java processes executed at different grid points. A special feature of the proposed G-JavaMPI is the support of Java process migration with post-migration message redirection. With these supports, it is possible to migrate executing Java process from site to site for continuous computation, if some site is scheduled to be turned down for system reconfiguration. Moreover, the proposed G-JavaMPI middleware is very portable since it requires no modification of underlying OS, Java virtual machine, and MPI package. Preliminary performance tests have been conducted. The proposed mechanisms have shown good migration efficiency in a simulated grid environment.postprin

    Language Support for Distributed Functional Programming

    Get PDF
    Software development has taken a fundamental turn. Software today has gone from simple, closed programs running on a single machine, to massively open programs, patching together user experiences byway of responses received via hundreds of network requests spanning multiple machines. At the same time, as data continues to stockpile, systems for big data analytics are on the rise. Yet despite this trend towards distributing computation, issues at the level of the language and runtime abound. Serialization is still a costly runtime affair, crashing running systems and confounding developers. Function closures are being added to APIs for big data processing for use by end-users without reliably being able to transmit them over the network. And much of the frameworks developed for handling multiple concurrent requests byway of asynchronous programming facilities rely on blocking threads, causing serious scalability issues. This thesis describes a number of extensions and libraries for the Scala programming language that aim to address these issues and to provide a more reliable foundation on which to build distributed systems. This thesis presents a new approach to serialization called pickling based on the idea of generating and composing functional pickler combinators statically. The approach shifts the burden of serialization to compile time as much as possible, enabling users to catch serialization errors at compile time rather than at runtime. Further, by virtue of serialization code being generated at compile time, our framework is shown to be significantly more performant than other state-of-the-art serialization frameworks. We also generalize our technique for generating serialization code to generic functions other than pickling. Second, in light of the trend of distributed data-parallel frameworks being designed around functional patterns where closures are transmitted across cluster nodes to large-scale persistent datasets, this thesis introduces a new closure-like abstraction and type system, called spores, that can guarantee closures to be serializable, thread-safe, or even have custom user-defined properties. Crucially, our system is based on the principle of encoding type information corresponding to captured variables in the type of a spore. We prove our type system sound, implement our approach for Scala, evaluate its practicality through a small empirical study, and show the power of these guarantees through a case analysis of real-world distributed and concurrent frameworks that this safe foundation for closures facilitates. Finally, we bring together the above building blocks, pickling and spores, to form the basis of a new programming model called function-passing. Function-passing is based on the idea of a distributed persistent data structure which stores in its nodes transformations to data rather than the distributed data itself, simplifying fault recovery by design. Lazy evaluation is also central to our model; by incorporating laziness into our design only at the point of initiating network communication, our model remains easy to reason about while remaining efficient in time and memory. We formalize our programming model in the form of a small-step operational semantics which includes a precise specification of the semantics of functional fault recovery, and we provide an open-source implementation of our model in and for Scala

    CROCHET: Checkpoint and Rollback via Lightweight Heap Traversal on Stock JVMs

    Get PDF
    Checkpoint/rollback (CR) mechanisms create snapshots of the state of a running application, allowing it to later be restored to that checkpointed snapshot. Support for checkpoint/rollback enables many program analyses and software engineering techniques, including test generation, fault tolerance, and speculative execution. Fully automatic CR support is built into some modern operating systems. However, such systems perform checkpoints at the coarse granularity of whole pages of virtual memory, which imposes relatively high overhead to incrementally capture the changing state of a process, and makes it difficult for applications to checkpoint only some logical portions of their state. CR systems implemented at the application level and with a finer granularity typically require complex developer support to identify: (1) where checkpoints can take place, and (2) which program state needs to be copied. A popular compromise is to implement CR support in managed runtime environments, e.g. the Java Virtual Machine (JVM), but this typically requires specialized, non-standard runtime environments, limiting portability and adoption of this approach. In this paper, we present a novel approach for Checkpoint ROllbaCk via lightweight HEap Traversal (Crochet), which enables fully automatic fine-grained lightweight checkpoints within unmodified commodity JVMs (specifically Oracle\u27s HotSpot and OpenJDK). Leveraging key insights about the internal design common to modern JVMs, Crochet works entirely through bytecode rewriting and standard debug APIs, utilizing special proxy objects to perform a lazy heap traversal that starts at the root references and traverses the heap as objects are accessed, copying or restoring state as needed and removing each proxy immediately after it is used. We evaluated Crochet on the DaCapo benchmark suite, finding it to have very low runtime overhead in steady state (ranging from no overhead to 1.29x slowdown), and that it often outperforms a state-of-the-art system-level checkpoint tool when creating large checkpoints

    HPC-GAP: engineering a 21st-century high-performance computer algebra system

    Get PDF
    Symbolic computation has underpinned a number of key advances in Mathematics and Computer Science. Applications are typically large and potentially highly parallel, making them good candidates for parallel execution at a variety of scales from multi-core to high-performance computing systems. However, much existing work on parallel computing is based around numeric rather than symbolic computations. In particular, symbolic computing presents particular problems in terms of varying granularity and irregular task sizes thatdo not match conventional approaches to parallelisation. It also presents problems in terms of the structure of the algorithms and data. This paper describes a new implementation of the free open-source GAP computational algebra system that places parallelism at the heart of the design, dealing with the key scalability and cross-platform portability problems. We provide three system layers that deal with the three most important classes of hardware: individual shared memory multi-core nodes, mid-scale distributed clusters of (multi-core) nodes, and full-blown HPC systems, comprising large-scale tightly-connected networks of multi-core nodes. This requires us to develop new cross-layer programming abstractions in the form of new domain-specific skeletons that allow us to seamlessly target different hardware levels. Our results show that, using our approach, we can achieve good scalability and speedups for two realistic exemplars, on high-performance systems comprising up to 32,000 cores, as well as on ubiquitous multi-core systems and distributed clusters. The work reported here paves the way towards full scale exploitation of symbolic computation by high-performance computing systems, and we demonstrate the potential with two major case studies

    Security analyses for detecting deserialisation vulnerabilities : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Computer Science at Massey University, Palmerston North, New Zealand

    Get PDF
    An important task in software security is to identify potential vulnerabilities. Attackers exploit security vulnerabilities in systems to obtain confidential information, to breach system integrity, and to make systems unavailable to legitimate users. In recent years, particularly 2012, there has been a rise in reported Java vulnerabilities. One type of vulnerability involves (de)serialisation, a commonly used feature to store objects or data structures to an external format and restore them. In 2015, a deserialisation vulnerability was reported involving Apache Commons Collections, a popular Java library, which affected numerous Java applications. Another major deserialisation-related vulnerability that affected 55\% of Android devices was reported in 2015. Both of these vulnerabilities allowed arbitrary code execution on vulnerable systems by malicious users, a serious risk, and this came as a call for the Java community to issue patches to fix serialisation related vulnerabilities in both the Java Development Kit and libraries. Despite attention to coding guidelines and defensive strategies, deserialisation remains a risky feature and a potential weakness in object-oriented applications. In fact, deserialisation related vulnerabilities (both denial-of-service and remote code execution) continue to be reported for Java applications. Further, deserialisation is a case of parsing where external data is parsed from their external representation to a program's internal data structures and hence, potentially similar vulnerabilities can be present in parsers for file formats and serialisation languages. The problem is, given a software package, to detect either injection or denial-of-service vulnerabilities and propose strategies to prevent attacks that exploit them. The research reported in this thesis casts detecting deserialisation related vulnerabilities as a program analysis task. The goal is to automatically discover this class of vulnerabilities using program analysis techniques, and to experimentally evaluate the efficiency and effectiveness of the proposed methods on real-world software. We use multiple techniques to detect reachability to sensitive methods and taint analysis to detect if untrusted user-input can result in security violations. Challenges in using program analysis for detecting deserialisation vulnerabilities include addressing soundness issues in analysing dynamic features in Java (e.g., native code). Another hurdle is that available techniques mostly target the analysis of applications rather than library code. In this thesis, we develop techniques to address soundness issues related to analysing Java code that uses serialisation, and we adapt dynamic techniques such as fuzzing to address precision issues in the results of our analysis. We also use the results from our analysis to study libraries in other languages, and check if they are vulnerable to deserialisation-type attacks. We then provide a discussion on mitigation measures for engineers to protect their software against such vulnerabilities. In our experiments, we show that we can find unreported vulnerabilities in Java code; and how these vulnerabilities are also present in widely-used serialisers for popular languages such as JavaScript, PHP and Rust. In our study, we discovered previously unknown denial-of-service security bugs in applications/libraries that parse external data formats such as YAML, PDF and SVG
    corecore