20 research outputs found

    Proceedings of the 4th International Conference on Principles and Practices of Programming in Java

    This book contains the proceedings of the 4th International Conference on Principles and Practices of Programming in Java. The conference covers different aspects of the Java programming language and its applications.

    Java Virtual Machine Optimizations for Java and Dynamic Languages

    Doctoral thesis (Ph.D.), Seoul National University Graduate School, Department of Electrical and Computer Engineering, February 2017. 문수묵. The Java virtual machine (JVM) was introduced as a machine-independent runtime environment for running Java programs. As a 32-bit stack machine, the JVM can execute bytecode instructions generated by compiling a Java program on any machine to which the JVM runtime has been correctly ported. This machine independence brought about the huge success of both the Java programming language and the JVM itself on systems ranging from cloud servers to embedded systems, including handsets and smart cards. Because a bytecode instruction must be interpreted by the JVM runtime to execute on a specific underlying system, a Java program inherently runs slower, due to interpretation overhead, than a C/C++ program compiled directly for that system. Java just-in-time (JIT) compilers, the de facto performance add-on modules, are employed to improve JVM performance by translating Java bytecode into native machine code on demand. One important problem in Java JIT compilation is how to map stack entries and local variables of the JVM runtime to physical registers efficiently and quickly, since register-based computations are much faster than memory-based ones, while JIT compilation overhead is part of the total running time. This paper introduces LaTTe, an open-source Java JIT compiler that quickly generates efficiently register-mapped RISC code. LaTTe first maps all local variables and stack entries to pseudo registers, then performs real register allocation, which also aggressively coalesces the copies corresponding to pushes and pops between local variables and stack entries. In addition to efficient register allocation, LaTTe is equipped with various traditional and object-oriented optimizations such as common subexpression elimination (CSE), dynamic method inlining, and specialization. 
We also devised new mechanisms for Java exception handling and monitor handling in LaTTe, named on-demand exception handling and lightweight monitor, respectively, to further boost JVM performance. Our experimental results indicate that LaTTe's sophisticated register mapping and allocation really pay off, achieving twice the performance of a naive JIT compiler that maps all local variables and stack entries to memory. They also show that LaTTe makes a reasonable trade-off between the quality and the speed of register mapping and allocation for the bytecode. We expect these results will also benefit parallel and distributed Java computing (1) by enhancing single-thread Java performance and (2) by significantly reducing the number of memory accesses that the rest of the system must properly order to maintain coherence and keep threads synchronized. Furthermore, the JVM has recently evolved into a general-purpose language runtime environment that executes popular programming languages such as JavaScript, Ruby, Python, and Scala. These languages have complex non-Java features, including dynamic typing and first-class functions, so additional language runtimes (engines) are provided on top of the JVM to support them with bytecode extensions. Although there are high-performance JVMs with powerful JIT compilers, running these languages efficiently on the JVM is still a challenge. This paper introduces a simple and novel technique for the JVM JIT compiler, called exceptionization, to improve the performance of JVM-based language runtimes. We observed that the JVM executing some non-Java languages encounters at least 2 times more branch bytecodes than it does for Java, most of which are highly biased toward a single target. Exceptionization treats such a highly biased branch as an implicit exception-throwing instruction. 
This allows the JVM JIT compiler to prune the infrequent target of the branch from the frequent control flow, and thus to compile the frequent control flow more aggressively with better optimization. If a pruned path is taken, it runs like a Java exception handler, i.e., a catch block. We also devised de-exceptionization, a mechanism to cope with the case where a pruned path is actually executed more often than expected. Since exceptionization is a generic JVM optimization, independent of any specific language runtime, it should be generally applicable to any language runtime on the JVM. Our experimental results show that exceptionization improves the performance of several non-Java languages. JavaScript on the JVM runs faster by as much as 60%, and by 6% on average, when running the Octane benchmark suite on Oracle's latest Nashorn JavaScript engine and the HotSpot 1.9 JVM. Additionally, Ruby on the JVM improves by as much as 60% and by 6% on average, while Python on the JVM improves by as much as 6%. We found that exceptionization applies most effectively to the branch bytecode of the language runtime itself, rather than to the bytecode corresponding to the application code or to the Java class libraries. This implies that the performance benefit of exceptionization comes from better JIT compilation of the non-Java language runtime. Contents: 1. Introduction; 2. Java Virtual Machine Optimization for Java; 3. Java Virtual Machine Optimization for Dynamic Languages; 4. Summary and Conclusion; Abstract (in Korean).
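
    The exceptionization policy summarized in this abstract can be loosely modeled in a few lines of Python. This is only a toy model of the control policy (prune the rare branch target, handle it like a catch block, and fall back when it turns out to be common); it does not perform any JIT compilation. The names `make_exceptionized` and `PrunedPathTaken` and the `threshold` parameter are assumptions introduced here for illustration and do not come from the thesis.

```python
class PrunedPathTaken(Exception):
    """Signals that the rarely taken (pruned) branch target was reached."""


def make_exceptionized(fast, slow, is_common, threshold=3):
    """Wrap a biased branch so the rare target is handled like a catch block.

    fast/slow are the frequent and infrequent branch targets, is_common is
    the branch condition. After `threshold` visits to the pruned path, the
    wrapper "de-exceptionizes" and goes back to an ordinary two-way branch.
    All of this is a hypothetical model, not the thesis implementation.
    """
    state = {"misses": 0, "exceptionized": True}

    def run(x):
        if state["exceptionized"]:
            try:
                # Frequent control flow: the rare target is not inlined here.
                if not is_common(x):
                    raise PrunedPathTaken
                return fast(x)
            except PrunedPathTaken:
                # The pruned path runs like an exception handler.
                state["misses"] += 1
                if state["misses"] >= threshold:
                    state["exceptionized"] = False  # de-exceptionization
                return slow(x)
        # Ordinary branch once the bias assumption has proven wrong.
        return fast(x) if is_common(x) else slow(x)

    return run, state
```

    For example, wrapping a type check that almost always succeeds keeps the common case on the straight-line path, while repeated failures switch the wrapper back to a plain branch.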

    Protecting Systems From Exploits Using Language-Theoretic Security

    Any computer program that processes input from the user or the network must validate that input. Input-handling vulnerabilities occur when the software component responsible for filtering malicious input (the parser) does not validate it adequately. Consequently, parsers are among the most targeted components, since they defend the rest of the program from malicious input. This thesis adopts the Language-Theoretic Security (LangSec) principle to understand what tools and research are needed to prevent exploits that target parsers. LangSec proposes specifying the syntactic structure of the input format as a formal grammar. We then build a recognizer for this formal grammar to validate any input before the rest of the program acts on it. To ensure that these recognizers represent the data format, programmers often rely on parser generators or parser-combinator tools to build the parsers. This thesis advances several sub-fields of LangSec by proposing new techniques to find bugs in implementations, novel categorizations of vulnerabilities, and new parsing algorithms and tools to handle practical data formats. To this end, this thesis comprises five parts that tackle various tenets of LangSec. First, I categorize various input-handling vulnerabilities and exploits using two frameworks. I use the mismorphisms framework to reason about the root causes that lead to various vulnerabilities. Next, we built a categorization framework using various LangSec anti-patterns, such as parser differentials and insufficient input validation. Finally, we built a catalog of more than 30 popular vulnerabilities to demonstrate the categorization frameworks. Second, I built parsers for various Internet of Things and power grid network protocols and for the iccMAX file format using parser combinator libraries. 
The parsers I built for power grid protocols were deployed and tested on power grid substation networks as an intrusion detection tool. The parser I built for the iccMAX file format led to several corrections and modifications to the iccMAX specifications and reference implementations. Third, I present SPARTA, a novel tool I built that generates Rust code that type checks Portable Document Format (PDF) files. The type checker I helped build strictly enforces the constraints in the PDF specification to find deviations. Our checker has contributed to at least four significant clarifications and corrections to the PDF 2.0 specification and various open-source PDF tools. In addition to our checker, we also built a practical tool, PDFFixer, to dynamically patch type errors in PDF files. Fourth, I present ParseSmith, a tool to build verified parsers for real-world data formats. Most parsing tools available for data formats are insufficient to handle practical formats or have not been verified for correctness. I built a verified parsing tool in Dafny that builds on ideas from attribute grammars, data-dependent grammars, and parsing expression grammars to tackle various constructs commonly seen in network formats. I prove that our parsers run in linear time and always terminate for well-formed grammars. Finally, I provide the earliest systematic comparison of various data description languages (DDLs) and their parser generation tools. DDLs are used to describe and parse commonly used data formats, such as image formats. I conducted an expert elicitation qualitative study to derive various metrics that I use to compare the DDLs. I also systematically compare these DDLs based on the sample data descriptions available with them, checking for correctness and resilience
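
    The LangSec idea stated in this abstract (specify the input format as a grammar, then run a recognizer over the whole input before any other code acts on it) can be sketched as follows. The toy grammar, the `recognize` and `handle` functions, and the key/value format are all hypothetical and chosen here for brevity; none of them come from the thesis or from a real protocol.

```python
# Hypothetical toy grammar (an assumption for illustration):
#   message := pair (";" pair)*
#   pair    := key "=" value
#   key     := [a-z]+
#   value   := [0-9]+


def recognize(text: str) -> bool:
    """Return True only if the entire input matches the toy grammar."""
    pos, n = 0, len(text)

    def take_while(pred, start):
        # Advance past the longest run of characters satisfying pred.
        end = start
        while end < n and pred(text[end]):
            end += 1
        return end

    while True:
        key_end = take_while(lambda c: "a" <= c <= "z", pos)
        if key_end == pos or key_end >= n or text[key_end] != "=":
            return False
        val_start = key_end + 1
        val_end = take_while(lambda c: "0" <= c <= "9", val_start)
        if val_end == val_start:
            return False
        pos = val_end
        if pos == n:
            return True  # consumed everything: input is in the language
        if text[pos] != ";":
            return False
        pos += 1  # a separator must be followed by another pair


def handle(text: str) -> dict:
    """Process input only after the recognizer has accepted all of it."""
    if not recognize(text):
        raise ValueError("input rejected by recognizer")
    return {k: int(v) for k, v in (p.split("=") for p in text.split(";"))}
```

    The point of the design is that `handle` never sees input outside the grammar, so its processing code can assume well-formed data instead of mixing validation with processing.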

    Automated Security Analysis of Web Application Technologies

    The Web today is a complex universe of pages and applications teeming with interactive content that we use for commercial and social purposes. Accordingly, the security of Web applications has become a concern of utmost importance. Devising automated methods that help developers spot security flaws, and thereby make the Web safer, is a challenging but vital area of research. In this thesis, we leverage static analysis methods to automatically discover vulnerabilities in programs written in JavaScript or PHP. While JavaScript is the number one language fueling the client-side logic of virtually every Web application, PHP is the most widespread language on the server side. In the first part, we use a series of program transformations and information flow analysis to examine the JavaScript Helios voting client. Helios is a state-of-the-art voting system that has been exhaustively analyzed by the security community on a conceptual level and whose implementation is claimed to be highly secure. We expose two severe and so far undiscovered vulnerabilities. In the second part, we present a framework that allows developers to analyze PHP code for vulnerabilities that can be freely modeled. To do so, we build so-called code property graphs for PHP and import them into a graph database. Vulnerabilities can then be modeled as appropriate database queries. We show how to model common vulnerabilities and evaluate our framework in a large-scale study, spotting hundreds of vulnerabilities.
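
    To make the idea of modeling a vulnerability as a graph query more concrete, here is a minimal sketch in Python. It uses a small in-memory graph and a breadth-first traversal as a stand-in for the graph database and query language used in the thesis; the node ids, the `kind` labels, the `flows_to` edges, and the `unsanitized_flows` function are hypothetical and hand-built for this example.

```python
from collections import deque

# Hypothetical data-flow fragment of a code property graph for two PHP
# statements: one echoes user input directly, one sanitizes it first.
nodes = {
    1: {"kind": "source", "code": '$_GET["name"]'},
    2: {"kind": "assign", "code": '$a = $_GET["name"]'},
    3: {"kind": "sink", "code": "echo $a"},
    4: {"kind": "source", "code": '$_GET["id"]'},
    5: {"kind": "sanitizer", "code": "htmlspecialchars(...)"},
    6: {"kind": "sink", "code": "echo $b"},
}
flows_to = {1: [2], 2: [3], 4: [5], 5: [6]}


def unsanitized_flows(nodes, edges):
    """Find paths from a source to a sink that never pass a sanitizer."""
    findings = []
    for start, attrs in nodes.items():
        if attrs["kind"] != "source":
            continue
        queue = deque([[start]])
        seen = {start}
        while queue:
            path = queue.popleft()
            for nxt in edges.get(path[-1], []):
                # Stop at sanitizers: data beyond them is treated as clean.
                if nxt in seen or nodes[nxt]["kind"] == "sanitizer":
                    continue
                seen.add(nxt)
                if nodes[nxt]["kind"] == "sink":
                    findings.append(path + [nxt])
                else:
                    queue.append(path + [nxt])
    return findings
```

    On this hand-built graph the query reports the path through nodes 1, 2, and 3 and ignores the sanitized flow. A real code property graph also carries syntax and control-flow edges, which this sketch leaves out.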

    A microservice architecture for the processing of large geospatial data in the Cloud

    With the growing number of devices that can collect spatiotemporal information, as well as the improving quality of sensors, the geospatial data volume increases constantly. Before the raw collected data can be used, it has to be processed. Currently, expert users still rely on desktop-based Geographic Information Systems to perform processing workflows. However, the volume of geospatial data and the complexity of processing algorithms exceed the capacities of their workstations. There is a paradigm shift from desktop solutions towards the Cloud, which offers virtually unlimited storage space and computational power, but developers of processing algorithms often have no background in computer science and hence no expertise in Cloud Computing. Our research hypothesis is that a microservice architecture and Domain-Specific Languages can be used to orchestrate existing geospatial processing algorithms, and to compose and execute geospatial workflows in a Cloud environment for efficient application development and enhanced stakeholder experience. We present a software architecture that contains extension points for processing algorithms (or microservices), a workflow management component for distributed service orchestration, and a workflow editor based on a Domain-Specific Language. The main aim is to provide both users and developers with the means to leverage the possibilities of the Cloud, without requiring them to have deep knowledge of distributed computing. In order to conduct our research, we follow the Design Science Research Methodology. We perform an analysis of the problem domain and collect requirements as well as quality attributes for our architecture. To meet our research objectives, we design the architecture and develop approaches to workflow management and workflow modelling. We demonstrate the utility of our solution by applying it to two real-world use cases and evaluate the quality of our architecture based on defined scenarios. 
Finally, we critically discuss our results. Our contributions to the scientific community can be classified into three pillars. We present a scalable and modifiable microservice architecture for geospatial processing that supports distributed development and has a high availability. Further, we present novel approaches to service integration and orchestration in the Cloud as well as rule-based and dynamic workflow management without a priori design-time knowledge. For the workflow modelling we create a Domain-Specific Language that is based on a novel language design method

    Pattern-Based Vulnerability Discovery
