20 research outputs found

    A comprehensive toolchain for workload characterization across JVM languages

    The Java Virtual Machine (JVM) today hosts implementations of numerous languages. To achieve high performance, JVM implementations rely on heuristics in choosing compiler optimizations and in adapting garbage collection behavior. Historically, these heuristics have been tuned to suit the dynamics of Java programs only. This leads to unnecessarily poor performance in the case of non-Java languages, which often exhibit systematic differences in workload behavior. Dynamic metrics characterizing the workload help to identify and quantify useful optimizations, but so far no cohesive suite of metrics has adequately covered properties that vary systematically between Java and non-Java workloads. We present a suite of such metrics, justifying our choice with reference to a range of guest languages. These metrics are implemented on a common portable infrastructure, which ensures ease of deployment and customization.
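
    As a rough sketch of the kind of dynamic metric such an infrastructure might collect, the following Java class (hypothetical, not part of the published toolchain) counts method invocations per call site; an instrumentation layer, for example a bytecode-rewriting agent, would call the record() hook on every observed method entry.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.atomic.LongAdder;

    // Hypothetical metric collector: counts how often each observed call site
    // is executed during a workload run. An instrumentation layer (e.g. a
    // bytecode-rewriting agent) would insert calls to record() at method entries.
    public final class CallSiteMetric {

        private static final Map<String, LongAdder> COUNTS = new ConcurrentHashMap<>();

        // Hook invoked by instrumented code on every observed method entry.
        public static void record(String callSiteId) {
            COUNTS.computeIfAbsent(callSiteId, k -> new LongAdder()).increment();
        }

        // Dump the metric at the end of the run, e.g. from a shutdown hook.
        public static void dump() {
            COUNTS.forEach((site, count) -> System.out.println(site + "\t" + count.sum()));
        }

        public static void main(String[] args) {
            // Tiny demonstration: pretend two call sites were instrumented.
            for (int i = 0; i < 1_000; i++) {
                record("Fib.compute:17");
                if (i % 10 == 0) record("Main.run:42");
            }
            dump();
        }
    }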

    Workload characterization of JVM languages

    Although it was developed with a single language in mind, namely Java, the Java Virtual Machine (JVM) is nowadays targeted by numerous programming languages. Automatic memory management, Just-In-Time (JIT) compilation, and adaptive optimizations provided by the JVM make it an attractive target for different language implementations. Even though it is targeted by so many languages, the JVM has been tuned with respect to the characteristics of Java programs only: heuristics for the garbage collector and compiler optimizations are focused on Java programs. In this dissertation, we aim at contributing to the understanding of the workloads imposed on the JVM by both dynamically-typed and statically-typed JVM languages. We introduce a new set of dynamic metrics and an easy-to-use toolchain for collecting them. We apply our toolchain to applications written in six JVM languages: Java, Scala, Clojure, Jython, JRuby, and JavaScript. We identify differences and commonalities between the examined languages and discuss their implications. Moreover, we take a close look at one of the most effective compiler optimizations, method inlining. We present the inlining decision tree of the HotSpot JVM's JIT compiler and analyze how well the JVM performs at inlining the workloads written in different JVM languages.
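
    HotSpot's inlining decisions can be observed from the outside with its diagnostic flags; the small probe below is an illustration (not taken from the dissertation) that makes a tiny callee hot so the JIT compilation log reports whether and why it was inlined.

    // A minimal probe to observe HotSpot's inlining decisions. Run with the
    // HotSpot diagnostic flags:
    //   java -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining InlineProbe
    // The JIT compilation log then reports, per call site, whether the callee
    // was inlined or why it was rejected (e.g. bytecode size vs. the inlining
    // thresholds such as -XX:FreqInlineSize).
    public final class InlineProbe {

        private static int square(int x) {   // tiny callee, expected to be inlined
            return x * x;
        }

        public static void main(String[] args) {
            long sum = 0;
            for (int i = 0; i < 5_000_000; i++) {   // hot loop so the JIT compiles it
                sum += square(i % 1_000);
            }
            System.out.println(sum);
        }
    }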

    Advancing Operating Systems via Aspect-Oriented Programming

    Operating system kernels are among the most complex pieces of software in existence today. Maintaining the kernel code and developing new functionality is increasingly complicated, since the number of required features has risen significantly, leading to side effects that can be introduced inadvertently by changing a piece of code that belongs to a completely different context. Software developers try to modularize their code base into separate functional units. Some of the functionality or “concerns” required in a kernel, however, does not fit into the given modularization structure; this code may then be spread over the code base and its implementation tangled with code implementing different concerns. These so-called “crosscutting concerns” are especially difficult to handle, since a change in a crosscutting concern implies that all relevant locations spread throughout the code base have to be modified. Aspect-Oriented Software Development (AOSD) is an approach to handling crosscutting concerns by factoring them out into separate modules. The “advice” code contained in these modules is woven into the original code base according to a pointcut description, a set of interaction points (joinpoints) with the code base. To be used in operating systems, AOSD requires tool support for the prevalent procedural programming style as well as support for weaving aspects. Many interactions in kernel code are dynamic, so in order to implement non-static behavior and improve performance, a dynamic weaver that deploys and undeploys aspects at system runtime is required. This thesis presents an extension of the “C” programming language to support AOSD. Based on this, two dynamic weaving toolkits – TOSKANA and TOSKANA-VM – are presented that permit dynamic aspect weaving in the monolithic NetBSD kernel as well as in a virtual-machine- and microkernel-based Linux kernel running on top of L4. Based on TOSKANA, applications of this dynamic aspect technology are discussed and evaluated. The thesis closes with a view of an aspect-oriented kernel structure that maintains coherency and handles crosscutting concerns using dynamic aspects while enhancing development methods through the use of domain-specific programming languages.
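
    The thesis weaves aspects into procedural C kernel code with TOSKANA; purely as a conceptual illustration of the pointcut/joinpoint/advice vocabulary, the sketch below uses AspectJ's annotation syntax in Java, with the kernel.fs package being a made-up placeholder.

    import org.aspectj.lang.JoinPoint;
    import org.aspectj.lang.annotation.Aspect;
    import org.aspectj.lang.annotation.Before;
    import org.aspectj.lang.annotation.Pointcut;

    // Conceptual illustration only: TOSKANA weaves advice into C kernel code,
    // but the pointcut/joinpoint/advice vocabulary is easiest to show with
    // AspectJ annotations. The kernel.fs package is a made-up placeholder.
    @Aspect
    public class TracingAspect {

        // Pointcut: the set of joinpoints this aspect applies to, here the
        // execution of any method of any type in the kernel.fs package.
        @Pointcut("execution(* kernel.fs..*.*(..))")
        public void fileSystemOps() {}

        // Advice: crosscutting code woven in before each matched joinpoint,
        // keeping the tracing concern out of the file-system modules themselves.
        @Before("fileSystemOps()")
        public void trace(JoinPoint jp) {
            System.out.println("entering " + jp.getSignature());
        }
    }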

    XML Messaging for Mobile Devices

    In recent years, XML has been widely adopted as a universal format for structured data. A variety of XML-based systems have emerged, most prominently SOAP for Web services, XMPP for instant messaging, and RSS and Atom for content syndication. This popularity is helped by the excellent support for XML processing in many programming languages and by the variety of XML-based technologies for more complex needs of applications. Concurrently with this rise of XML, there has also been a qualitative expansion of the Internet's scope. Namely, mobile devices are becoming capable enough to be full-fledged members of various distributed systems. Such devices are battery-powered, their network connections are based on wireless technologies, and their processing capabilities are typically much lower than those of stationary computers. This dissertation presents work that aims to reconcile these two developments. XML, as a highly redundant text-based format, is not obviously suitable for mobile devices that need to avoid extraneous processing and communication. Furthermore, the protocols and systems commonly used in XML messaging are often designed for fixed networks and may make assumptions that do not hold in wireless environments. This work identifies four areas of improvement in XML messaging systems: the programming interfaces to the system itself and to XML processing, the serialization format used for the messages, and the protocol used to transmit the messages. We show a complete system that improves the overall performance of XML messaging through consideration of these areas. The work is centered on actually implementing the proposals in a form usable on real mobile devices. The experimentation is performed on actual devices and real networks using the messaging system implemented as part of this work. The experimentation is extensive and, because it uses several different devices, also provides a glimpse of what the performance of these systems may look like in the future.

    The number of mobile phones and other mobile devices has grown very rapidly in recent years. The small size of the devices, the programming capabilities they offer, and wireless network connections make it possible to use the Internet and other network applications everywhere. However, the limited battery life, low processing power, and the power and time required for network use clearly constrain what mobile devices can do, and if the mobile world is not to be left entirely outside the future Internet, its special characteristics must be taken into account in the design of systems and applications. In future network applications, direct communication between end devices is expected to be a central part of application functionality. On today's Internet such communication increasingly uses XML, whose extensibility and ease of use reduce the burden on the application developer. The problems with XML, however, are the large transfer and processing times it requires, which have been an obstacle to its wide use in mobile environments. This dissertation investigates the basic prerequisites of XML-based device-to-device messaging on mobile devices in wireless networks. The central research topics are a compact and efficiently processable XML representation format, programming interfaces better suited to XML processing, and messaging protocols for the mobile environment. The work has produced an XML-based messaging system designed for mobile devices that can be used as-is as a foundation for network applications. The system has been measured extensively, and the measurements show that it is suitable for its intended purpose. The analysis of the results also considers how the different features of the system fit each environment supported by mobile devices, and examines what the performance of future mobile devices might look like.
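
    As one example of a processing interface that fits constrained devices, the sketch below uses Java's standard StAX streaming API to serialize a message incrementally instead of materializing a document tree; the dissertation's own interfaces and compact serialization format are separate contributions and differ from this.

    import java.io.StringWriter;
    import javax.xml.stream.XMLOutputFactory;
    import javax.xml.stream.XMLStreamException;
    import javax.xml.stream.XMLStreamWriter;

    // Streaming serialization: events are written incrementally, so only a
    // constant amount of state is held in memory regardless of message size.
    public final class StreamingMessageWriter {

        public static String writeMessage(String to, String body) throws XMLStreamException {
            StringWriter out = new StringWriter();
            XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);

            w.writeStartDocument();
            w.writeStartElement("message");
            w.writeAttribute("to", to);
            w.writeStartElement("body");
            w.writeCharacters(body);
            w.writeEndElement();   // body
            w.writeEndElement();   // message
            w.writeEndDocument();
            w.close();
            return out.toString();
        }

        public static void main(String[] args) throws XMLStreamException {
            System.out.println(writeMessage("alice@example.org", "hello"));
        }
    }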

    Self-Adaptive Performance Monitoring for Component-Based Software Systems

    Effective monitoring of a software system’s runtime behavior is necessary to evaluate compliance with performance objectives. This thesis emerged in the context of the Kieker framework for application performance monitoring. Its contribution is a self-adaptive performance monitoring approach that allows the monitoring coverage to be adapted dynamically at runtime. The monitoring data includes performance measures such as throughput and response time statistics, the utilization of system resources, and the inter- and intra-component control flow. Based on this data, performance anomaly scores are computed using time series analysis and clustering methods. The self-adaptive performance monitoring approach reduces the business-critical failure diagnosis time, as it saves time-consuming manual debugging activities. The approach and its underlying anomaly scores are extensively evaluated in lab experiments.
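
    The following is a deliberately simplified sketch, not Kieker's actual algorithm, of how an anomaly score over response times could drive self-adaptive monitoring: a z-score against a sliding window of recent measurements decides whether coverage for an operation should be increased.

    import java.util.ArrayDeque;
    import java.util.Deque;

    // Simplified sketch (not Kieker's actual algorithm): a z-score of each
    // response time against a sliding window of recent measurements serves as
    // an anomaly score; a high score signals that monitoring coverage for the
    // affected operation should be increased.
    public final class AdaptiveMonitor {

        private final Deque<Double> window = new ArrayDeque<>();
        private final int windowSize;
        private final double threshold;

        public AdaptiveMonitor(int windowSize, double threshold) {
            this.windowSize = windowSize;
            this.threshold = threshold;
        }

        // Returns true if the score suggests switching to finer-grained monitoring.
        public boolean onResponseTime(double millis) {
            double score = anomalyScore(millis);
            if (window.size() == windowSize) {
                window.removeFirst();           // keep only the most recent samples
            }
            window.addLast(millis);
            return score > threshold;
        }

        private double anomalyScore(double value) {
            if (window.size() < 2) {
                return 0.0;                     // not enough history yet
            }
            double mean = window.stream().mapToDouble(Double::doubleValue).average().orElse(0.0);
            double variance = window.stream()
                    .mapToDouble(v -> (v - mean) * (v - mean))
                    .sum() / (window.size() - 1);
            double std = Math.sqrt(variance);
            return std == 0.0 ? 0.0 : Math.abs(value - mean) / std;
        }

        public static void main(String[] args) {
            AdaptiveMonitor monitor = new AdaptiveMonitor(50, 3.0);
            for (int i = 0; i < 100; i++) {
                monitor.onResponseTime(10.0 + Math.random());   // normal behaviour
            }
            System.out.println("anomalous? " + monitor.onResponseTime(250.0));
        }
    }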

    Engineering Enterprise Software Systems with Interactive UML Models and Aspect-Oriented Middleware

    Large-scale enterprise software systems are inherently complex and hard to maintain. To deal with this complexity, current mainstream software engineering practices aim at raising the level of abstraction to visual models described in OMG’s UML modeling language. Current UML tools, however, produce static design diagrams for documentation, which quickly become out of sync with the software and thus obsolete. To address this issue, current model-driven software development approaches aim at software automation using generators that translate models into code. However, these solutions do not offer a good answer for dealing with legacy source code and the evolution of existing enterprise software systems. This research investigates an alternative solution: making the process of modeling more interactive with a simulator and integrating the simulation with the live software system. Such an approach supports model-driven development at a higher level of abstraction with models, without sacrificing the ability to drop down to a lower level with code. Simulation also supports evolution, since the impact of a change to a particular area of existing software can be better understood using simulated “what-if” scenarios. This project proposes such a solution by developing a web-based UML simulator for modeling use cases and sequence diagrams and integrating the simulator with existing applications using aspect-oriented middleware technology.
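
    As a rough illustration of the interception idea (the project itself uses aspect-oriented middleware rather than JDK proxies), the sketch below wraps a hypothetical service interface in a dynamic proxy and reports each call as a sequence-diagram-style message event before forwarding it to the real implementation.

    import java.lang.reflect.InvocationHandler;
    import java.lang.reflect.Proxy;

    // Rough illustration of integrating a live application with a UML
    // simulator: every call on a service interface is reported as a message
    // event (as it would appear on a sequence diagram) before being forwarded
    // to the real implementation. The OrderService interface is hypothetical.
    public final class SimulatorBridge {

        interface OrderService {
            String placeOrder(String item);
        }

        static <T> T instrument(T target, Class<T> iface) {
            InvocationHandler handler = (proxy, method, args) -> {
                // Event that would be sent to the simulator in a real integration.
                System.out.println("message: caller -> " + iface.getSimpleName()
                        + "." + method.getName());
                return method.invoke(target, args);
            };
            return iface.cast(Proxy.newProxyInstance(
                    iface.getClassLoader(), new Class<?>[] {iface}, handler));
        }

        public static void main(String[] args) {
            OrderService real = item -> "order placed: " + item;
            OrderService traced = instrument(real, OrderService.class);
            System.out.println(traced.placeOrder("book"));
        }
    }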

    Software Engineering 2021: Conference held 22-26 February 2021, Braunschweig/virtual


    Proceedings of the 4th International Conference on Principles and Practices of Programming in Java

    This book contains the proceedings of the 4th International Conference on Principles and Practices of Programming in Java. The conference focuses on different aspects of the Java programming language and its applications.

    Data-Efficient Learned Database Components

    While databases are the backbone of many software systems, database components such as query optimizers often have to be redesigned to meet the increasing variety in workloads, data, and hardware, which incurs significant engineering effort. Recently, it was thus proposed to replace DBMS components such as optimizers or cardinality estimators by ML models, which not only eliminates this engineering effort but also provides superior performance for many components. The predominant approach to derive such learned components is workload-driven learning, where tens of thousands of queries have to be executed first to derive the necessary training data. Unfortunately, the training data collection, which can take days even for medium-sized datasets, has to be repeated for every new database (i.e., the combination of dataset, schema, and workload) a component should be deployed for. This is especially problematic for cloud databases such as Snowflake or Redshift, since this effort has to be incurred for every customer. This dissertation thus proposes data-efficient learned database components, which either reduce or fully eliminate the high cost of training data collection. In particular, three directions are proposed: (i) reducing the number of training queries needed for workload-driven components, (ii) data-driven learning, which uses the data stored in the database as training data instead of queries, and (iii) zero-shot learned components, which generalize to new databases out-of-the-box, such that no training data collection is required.

    First, we strive to reduce the number of training queries required for workload-driven components by using simulation models that convey the basic trade-offs of the underlying problem, e.g., that in database partitioning the cost of shuffling tuples over the network for joins is the dominating factor. This substantially reduces the number of training queries, since the basic principles are already covered by the simulation model and only subtleties not covered by it have to be learned by observing query executions, which we demonstrate for the problem of database partitioning. An alternative direction is to incorporate domain knowledge into components by designing them using differentiable programming; for example, a cost model could encode that scan costs increase linearly with the number of tuples. This significantly reduces the number of learnable parameters and thus also the number of required training queries. We demonstrate the feasibility of this approach for the problem of cost estimation in databases.

    While both approaches reduce the number of training queries, a significant number is still required for unseen databases. This motivates our second approach, data-driven learning. In particular, we propose to train the database component by learning the data distribution present in a database instead of observing query executions. This not only completely eliminates the need to collect training queries but can even improve the state of the art in problems such as cardinality estimation or approximate query processing (AQP). While we demonstrate the applicability to a wide range of additional database tasks, such as the completion of incomplete relational datasets, data-driven learning is only useful for problems where the data distribution provides sufficient information for the underlying task. For tasks where observations of query executions are indispensable, such as cost estimation, data-driven learning cannot be leveraged.

    In a third direction, we thus propose zero-shot learned database components, which are applicable to a broader set of tasks, including those that require observations of queries. In particular, motivated by recent advances in transfer learning, we propose to pretrain a model once on a variety of databases and workloads, allowing the component to generalize to unseen databases out-of-the-box. Hence, similar to data-driven learning, no training queries have to be collected. In this dissertation, we demonstrate that zero-shot learning can indeed yield learned cost models that predict query latencies on entirely unseen databases more accurately than state-of-the-art workload-driven approaches, which require tens of thousands of query executions on every unseen database.

    Overall, the proposed techniques yield state-of-the-art performance for many database tasks while significantly reducing or completely eliminating the expensive training data collection for unseen databases. Nevertheless, there are still many opportunities to improve learned components in the future. First, the robustness and debuggability of learned components should be improved, since as of today they do not offer the same transparency as standard code in databases, which can make them less attractive for deployment in production systems. Moreover, to increase the applicability of data-driven models, it is desirable to extend the coverage of supported queries, e.g., queries involving wildcard predicates on string columns, which are currently not supported by data-driven learning. Finally, we envision that a broader set of tasks should be supported by zero-shot models in the future (e.g., query optimization), potentially converging towards complete zero-shot learned systems.
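
    To make the differentiable-programming idea concrete, here is a toy model (not the dissertation's actual cost model) that encodes the domain knowledge that scan cost grows linearly with the number of tuples, and fits its two parameters by gradient descent on a handful of hypothetical observed latencies.

    // Toy illustration of encoding domain knowledge in a differentiable cost
    // model: scan cost is assumed linear in the number of tuples,
    // cost = a * tuples + b, and only these two parameters are fitted from
    // observed latencies instead of learning the whole relationship from
    // tens of thousands of training queries.
    public final class LinearScanCostModel {

        private double a = 0.0;   // cost per tuple
        private double b = 0.0;   // fixed overhead

        public double predict(double tuples) {
            return a * tuples + b;
        }

        // Gradient descent on the mean squared prediction error.
        public void fit(double[] tuples, double[] observedMillis, double lr, int epochs) {
            int n = tuples.length;
            for (int e = 0; e < epochs; e++) {
                double gradA = 0.0, gradB = 0.0;
                for (int i = 0; i < n; i++) {
                    double err = predict(tuples[i]) - observedMillis[i];
                    gradA += 2 * err * tuples[i] / n;
                    gradB += 2 * err / n;
                }
                a -= lr * gradA;
                b -= lr * gradB;
            }
        }

        public static void main(String[] args) {
            LinearScanCostModel model = new LinearScanCostModel();
            double[] tuples = {1_000, 10_000, 100_000};   // hypothetical training observations
            double[] millis = {1.2, 10.5, 101.0};         // hypothetical observed latencies
            model.fit(tuples, millis, 1e-11, 10_000);
            System.out.printf("predicted cost for 50k tuples: %.1f ms%n",
                    model.predict(50_000));
        }
    }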