
    Language Design for Reactive Systems: On Modal Models, Time, and Object Orientation in Lingua Franca and SCCharts

    Reactive systems play a crucial role in the embedded domain. They continuously interact with their environment, handle concurrent operations, and are commonly expected to provide deterministic behavior to enable application in safety-critical systems. In this context, language design is a key aspect, since carefully tailored language constructs can aid in addressing the challenges faced in this domain, as illustrated by the various concurrency models that prevent the known pitfalls of regular threads. Today, many languages exist in this domain and often provide unique characteristics that make them specifically fit for certain use cases. This thesis revolves around two distinctive languages: the actor-oriented polyglot coordination language Lingua Franca and the synchronous statecharts dialect SCCharts. While they take different approaches in providing reactive modeling capabilities, they share clear similarities in their semantics and complement each other in design principles. This thesis analyzes and compares key design aspects in the context of these two languages. For three particularly relevant concepts, it provides and evaluates lean and seamless language extensions that are carefully aligned with the fundamental principles of the underlying language. Specifically, Lingua Franca is extended toward coordinating modal behavior, while SCCharts receives a timed automaton notation with an efficient execution model using dynamic ticks and an extension toward the object-oriented modeling paradigm.
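
    To make the dynamic-tick idea concrete outside of SCCharts: instead of driving a timed model with a fixed-period tick, the runtime asks the model for the earliest instant at which a timed transition can fire and sleeps exactly until then. The following is a minimal Python sketch of that execution scheme; the class and method names are invented for illustration and are not SCCharts or Lingua Franca API.

```python
import time

class TrafficLight:
    """Toy timed automaton with two states and per-state dwell times."""

    DWELL = {"Red": 30.0, "Green": 45.0}  # seconds until the timed transition fires

    def __init__(self):
        self.state = "Red"
        self.entered = time.monotonic()

    def next_wakeup(self):
        """Dynamic tick: the earliest future instant at which anything can happen."""
        return self.entered + self.DWELL[self.state]

    def tick(self, now):
        """Take the timed transition if its deadline has passed."""
        if now >= self.next_wakeup():
            self.state = "Green" if self.state == "Red" else "Red"
            self.entered = now

def run(automaton, horizon=120.0):
    """Sleep exactly until the next relevant instant instead of polling periodically."""
    end = time.monotonic() + horizon
    while time.monotonic() < end:
        time.sleep(max(0.0, automaton.next_wakeup() - time.monotonic()))
        automaton.tick(time.monotonic())
```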

    Analysing and Reducing Costs of Deep Learning Compiler Auto-tuning

    Deep Learning (DL) is significantly impacting many industries, including automotive, retail and medicine, enabling autonomous driving, recommender systems and genomics modelling, amongst other applications. At the same time, demand for complex and fast DL models is continually growing. The most capable models tend to exhibit the highest operational costs, primarily due to their large computational resource footprint and inefficient utilisation of the computational resources employed by DL systems. In an attempt to tackle these problems, DL compilers and auto-tuners emerged, automating the traditionally manual task of DL model performance optimisation. While auto-tuning improves model inference speed, it is a costly process, which limits its wider adoption within DL deployment pipelines. The high operational costs associated with DL auto-tuning have multiple causes. During operation, DL auto-tuners explore large search spaces consisting of billions of tensor programs to propose potential candidates that improve DL model inference latency. Subsequently, DL auto-tuners measure candidate performance in isolation on the target device, which constitutes the majority of auto-tuning compute time. Suboptimal candidate proposals, combined with their serial measurement on an isolated target device, lead to prolonged optimisation time and reduced resource availability, ultimately reducing the cost-efficiency of the process. In this thesis, we investigate the reasons behind prolonged DL auto-tuning and quantify their impact on the optimisation costs, revealing directions for improved DL auto-tuner design. Based on these insights, we propose two complementary systems: Trimmer and DOPpler. Trimmer improves tensor program search efficacy by filtering out poorly performing candidates, and controls end-to-end auto-tuning using cost objectives, monitoring optimisation cost. Simultaneously, DOPpler breaks long-held assumptions about serial candidate measurement by successfully parallelising measurements intra-device, with minimal penalty to optimisation quality. Through extensive experimental evaluation of both systems, we demonstrate that they significantly improve the cost-efficiency of auto-tuning (by up to 50.5%) across a plethora of tensor operators, DL models, auto-tuners and target devices.
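
    As a rough, hypothetical illustration of the two ideas (not the actual Trimmer or DOPpler implementations): a cheap cost model filters the candidate tensor programs before any expensive on-device measurement, and the surviving candidates are measured concurrently rather than strictly one after another. All function names below are invented.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def auto_tune(candidates, cost_model, measure, budget_s, keep_ratio=0.1):
    """Filter candidates with a cheap predictor, then measure survivors in parallel.

    cost_model(c): cheap predicted latency; measure(c): expensive on-device latency.
    budget_s caps the wall-clock time spent on hardware measurements.
    """
    # Trimmer-style filtering: discard candidates predicted to perform poorly.
    ranked = sorted(candidates, key=cost_model)
    survivors = ranked[: max(1, int(len(ranked) * keep_ratio))]

    best = None
    start = time.monotonic()
    # DOPpler-style idea: overlap candidate measurements instead of serializing them.
    with ThreadPoolExecutor(max_workers=4) as pool:
        for cand, latency in zip(survivors, pool.map(measure, survivors)):
            if best is None or latency < best[1]:
                best = (cand, latency)
            if time.monotonic() - start > budget_s:
                break  # cost objective hit: stop consuming further results
    return best
```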

    HyFM: Function Merging for Free

    Function merging is an important optimization for reducing code size. It merges multiple functions into a single one, eliminating duplicate code among them. The existing state of the art relies on a well-known sequence alignment algorithm to identify duplicate code across whole functions. However, this algorithm is quadratic in time and space in the number of instructions. This leads to very high time overheads and prohibitive levels of memory usage even for medium-sized benchmarks. For larger programs, it becomes impractical. This is made worse by an overly eager merging approach: all selected pairs of functions are merged, and only then does the approach estimate the potential benefit from merging and decide whether to replace the original functions with the merged one. Given that most pairs are unprofitable, a significant amount of time is wasted producing merged functions that are simply thrown away. In this paper, we propose HyFM, a novel function merging technique that delivers similar levels of code size reduction for significantly lower time overhead and memory usage. Unlike the state of the art, our alignment strategy works at the block level. Since basic blocks are usually much shorter than functions, even a quadratic alignment is acceptable. However, we also propose a linear algorithm for aligning blocks of the same size at a much lower cost. We extend this strategy with a multi-tier profitability analysis that bails out early from unprofitable merging attempts. By aligning individual pairs of blocks, we are able to decide their alignment's profitability separately and before actually generating code. Experimental results on SPEC 2006 and 2017 show that HyFM needs orders of magnitude less memory, using up to 48 MB or 5.6 MB, depending on the variant used, while the state of the art requires 32 GB in the worst case. HyFM also runs over 4.5× faster, while still achieving comparable code size reduction. Combined with the speedup of later compilation stages due to the reduced number of functions, HyFM contributes to a reduced end-to-end compilation time.
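
    The block-level strategy can be sketched as follows; this is an illustrative reconstruction with invented helper names, not HyFM's actual implementation. Equal-sized blocks are aligned in linear time by a position-by-position comparison, unequal blocks fall back to a quadratic alignment whose cost is bounded by block size rather than function size, and a profitability check rejects the pair before any merged code is generated.

```python
def needleman_wunsch(a, b):
    """Classic quadratic sequence alignment; unmatched positions pair with None."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        score[i][0] = -i  # gap penalties along the borders
    for j in range(m + 1):
        score[0][j] = -j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = score[i - 1][j - 1] + (1 if a[i - 1] == b[j - 1] else -1)
            score[i][j] = max(match, score[i - 1][j] - 1, score[i][j - 1] - 1)
    pairs, i, j = [], n, m
    while i > 0 and j > 0:  # trace back through the score matrix
        if score[i][j] == score[i - 1][j - 1] + (1 if a[i - 1] == b[j - 1] else -1):
            pairs.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] - 1:
            pairs.append((a[i - 1], None))
            i -= 1
        else:
            pairs.append((None, b[j - 1]))
            j -= 1
    pairs.extend((a[k], None) for k in range(i - 1, -1, -1))
    pairs.extend((None, b[k]) for k in range(j - 1, -1, -1))
    return pairs[::-1]

def align_blocks(bb1, bb2):
    """Align two basic blocks (lists of opcodes); None means 'unprofitable pair'."""
    if len(bb1) == len(bb2):
        pairs = list(zip(bb1, bb2))         # linear alignment for equal sizes
    else:
        pairs = needleman_wunsch(bb1, bb2)  # quadratic, but only block-sized
    matched = sum(1 for x, y in pairs if x == y and x is not None)
    # Profitability bail-out: reject the pair before generating any merged code.
    return pairs if matched >= 0.5 * max(len(bb1), len(bb2)) else None
```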

    Interactive Model-Based Compilation: A Modeller-Driven Development Approach

    There is a growing tendency to use domain-specific languages, which help domain experts stay focussed on abstract problem solutions. It is important to carefully design these languages and their tools, which fundamentally perform model-to-model transformations. The quality of both usually decides the effectiveness of the subsequent development and therefore the quality of the final applications. However, as the complexity and safety requirements of modern systems grow, it becomes increasingly burdensome to create highly customized languages and difficult to provide reasonable overviews within these tools. This thesis introduces a new interactive model-based compilation methodology. Compilations for arbitrary model-to-model transformations are themselves described as models. They can be instantiated for particular inputs, e.g. a program, to create concrete compilation runs, which return the result of that compilation. The compilation instance is interactively observable. Intermediate results serve as new inputs and as documentation. They can be used to create highly customized views and facilitate understandability. This methodology guides modellers from the start of the compilation to the final result so that they can interactively refine their models. The methodology has been implemented and validated as the KIELER Compiler (KiCo) and is available as part of the KIELER open-source project. It is used to implement the current reference compiler for the SCCharts language, a statecharts dialect designed for specifying safety-critical reactive systems based on a synchronous model of computation. The interactive model-based compilation approach was key to the rapid prototyping of three different compilation strategies, as well as new language extensions, variations and closely related languages. The results are verified with benchmarks, which are again modelled using the same approach and technology. The usability of the SCCharts language and the KiCo tooling is documented with long-term surveys and real-life industrial, academic and teaching examples.
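
    The core idea, compilation runs whose intermediate models remain observable, can be sketched generically (hypothetical names, not KiCo's actual API):

```python
class CompilationRun:
    """A compilation instance: a chain of model-to-model transformations
    that records every intermediate model for inspection."""

    def __init__(self, transformations):
        self.transformations = transformations  # ordered list of callables
        self.snapshots = []  # (name, model) pairs, observable at any time

    def compile(self, model):
        self.snapshots = [("input", model)]
        for t in self.transformations:
            model = t(model)  # each step is itself a model-to-model function
            self.snapshots.append((t.__name__, model))
        return model

# Usage: each intermediate result can be rendered as a customized view
# or serve as the input of a refined, re-run compilation.
run = CompilationRun([str.strip, str.upper])
print(run.compile("  abstract model  "))    # -> "ABSTRACT MODEL"
print([name for name, _ in run.snapshots])  # -> ['input', 'strip', 'upper']
```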

    Cautiously Optimistic Program Analyses for Secure and Reliable Software

    Modern computer systems still have various security and reliability vulnerabilities. Well-known dynamic analysis solutions can mitigate them using runtime monitors that serve as lifeguards. But the additional work of enforcing these security and safety properties incurs exorbitant performance costs, and such tools are rarely used in practice. Our work addresses this problem by constructing a novel technique: Cautiously Optimistic Program Analysis (COPA). COPA is optimistic: it infers likely program invariants from dynamic observations, and assumes them in its static reasoning to precisely identify and elide wasteful runtime monitors. The resulting system is fast, but also ensures soundness by recovering to a conservatively optimized analysis when a likely invariant rarely fails at runtime. COPA is also cautious: by carefully restricting optimizations to only safe elisions, the recovery is greatly simplified. It avoids unbounded rollbacks upon recovery, thereby enabling analysis for live production software. We demonstrate the effectiveness of Cautiously Optimistic Program Analyses in three areas. Information-Flow Tracking (IFT) can help prevent security breaches and information leaks, but it is rarely used in practice due to its high performance overhead (>500% for web/email servers). COPA dramatically reduces this cost by eliding wasteful IFT monitors to make it practical (9% overhead, 4x speedup). Automatic Garbage Collection (GC) in managed languages (e.g. Java) simplifies programming tasks while ensuring memory safety. However, there is no correct GC for weakly-typed languages (e.g. C/C++), and manual memory management is prone to errors that have been exploited in high-profile attacks. We develop the first sound GC for C/C++, and use COPA to optimize its performance (16% overhead). Sequential Consistency (SC) provides intuitive semantics to concurrent programs that simplifies reasoning about their correctness. However, ensuring SC behavior on commodity hardware remains expensive. We use COPA to ensure SC for Java at the language level efficiently, and significantly reduce its cost (from 24% down to 5% on x86). COPA provides a way to realize strong software security, reliability and semantic guarantees at practical costs.
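
    A schematic of the cautiously optimistic pattern (a generic sketch under assumed names, not the actual COPA system): the monitor-free fast path is guarded by a likely invariant, and a failed guard falls back to the conservatively instrumented path instead of rolling back.

```python
def copa_guarded(likely_invariant, fast_path, slow_path):
    """Run the monitor-elided path only while the inferred invariant holds.

    fast_path: monitors elided (cheap). slow_path: fully monitored (sound).
    Because only 'safe' monitors are elided, a failed invariant check needs
    no rollback: the operation simply executes under full instrumentation.
    """
    def run(*args):
        if likely_invariant(*args):
            return fast_path(*args)  # common case: no runtime monitor
        return slow_path(*args)      # rare case: conservative recovery
    return run

# Hypothetical IFT example: tracking elided for data from a trusted source.
def from_trusted_source(src, data):
    return src == "config"           # likely invariant

def read_fast(src, data):
    return data                      # no information-flow monitor

def read_slow(src, data):
    assert "<script" not in data, "tainted data reached a sink"
    return data                      # full IFT check

read = copa_guarded(from_trusted_source, read_fast, read_slow)
assert read("config", "timeout=30") == "timeout=30"
```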

    Principles of Security and Trust

    This open access book constitutes the proceedings of the 8th International Conference on Principles of Security and Trust, POST 2019, which took place in Prague, Czech Republic, in April 2019, held as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2019. The 10 papers presented in this volume were carefully reviewed and selected from 27 submissions. They deal with theoretical and foundational aspects of security and trust, including new theoretical results, practical applications of existing foundational ideas, and innovative approaches stimulated by pressing practical problems.

    Interaction-aware analysis and optimization of real-time application and operating system

    Mechanical and electronic automation was a key component of the technological advances of the last two hundred years. With the use of special-purpose machines, manual labor was replaced by mechanical motion, leaving workers with the operation of these machines, before this task, too, was conquered by embedded control systems. With the advances of general-purpose computing, the development of these control systems shifted more and more from a problem-specific mentality to a one-size-fits-all one, as the trade-off between per-instance overheads and development costs was in favor of flexible and reusable implementations. However, with a scaling factor of thousands, if not millions, of deployed devices, overheads and inefficiencies accumulate, calling for a higher degree of specialization. In the area of real-time operating systems (RTOSs), which form the base layer for many of these computerized control systems, we deploy far more flexibility than is actually required by the applications that run on top of them. Since only the solution, but not the problem, became less specific to the control problem at hand, we have the chance to cut away inefficiencies, improve on system-analysis results, and optimize resource consumption. However, such tailoring will only be favorable if it can be performed without much developer interaction and in an automated fashion. Here, real-time systems are a good starting point, since we already have to have a large degree of static knowledge in order to guarantee their timeliness. Until now, this static nature has not been exploited to its full extent, and optimization potential is left unused. The requirements of a system, with regard to the RTOS, manifest in the interactions between the application and the kernel. Threads request resources from the RTOS, which in return determines and enforces a scheduling order that ensures the timely completion of all necessary computations. Since the RTOS runs only on exceptions, its reaction to requests from the application (or from the environment) is its defining feature. In this thesis, I grasp these interactions, and thereby the required RTOS semantics, in a control-flow-sensitive fashion. Extracted automatically, this knowledge about the reciprocal influence allows me to fit the implementation of a system closer to its actual requirements. The result is a system that is a special-purpose system not only in its usage, but also in its implementation and in the guarantees it provides. In the development of my approach, it became clear that the focus on these interactions is not only highly fruitful for the optimization of a system, but also for its end-to-end analysis. Therefore, this thesis not only provides methods to reduce the kernel-execution overhead and a system's memory consumption, but also includes methods to calculate tighter response-time bounds and to give guarantees about the correct behavior of the kernel. All these contributions are enabled by my proposed interaction-aware methodology that takes the whole system, RTOS and application, into account. With this thesis, I show that a control-flow-sensitive whole-system view on these interactions is feasible and highly rewarding. With this approach, we can overcome many inefficiencies that arise from analyses that have an isolating focus on individual system components. Furthermore, the interaction-aware methods stay close to the actual implementation, and are therefore able to consider the behavioral patterns of the finally deployed real-time computing system.
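
    One way to picture the interaction-aware tailoring (an illustrative sketch with invented names, not the thesis' actual toolchain): if every kernel-interaction site of the application is known statically, the generic scheduler decision at each site can be replaced by a precomputed successor, shrinking both kernel execution time and analysis pessimism.

```python
# Statically extracted interaction knowledge: at each kernel-interaction site
# of this concrete application, the scheduling outcome is already known.
NEXT_THREAD = {
    ("sensor_task", "sem_release"): "control_task",  # always unblocks control_task
    ("control_task", "sem_wait"):   "idle_task",     # always blocks; idle runs next
}

def specialized_dispatch(thread, site):
    """Tailored kernel path: an O(1) precomputed decision replaces the generic
    priority-queue scan a one-size-fits-all RTOS would perform here."""
    return NEXT_THREAD[(thread, site)]

# Usage: the system-call stub generated for each site inlines its decision.
assert specialized_dispatch("sensor_task", "sem_release") == "control_task"
```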

    Defining interfaces between hardware and software: Quality and performance

    One of the most important interfaces in a computer system is the interface between hardware and software. This interface is the contract between the hardware designer and the programmer that defines the functional behaviour of the hardware. This thesis examines two critical aspects of defining the hardware-software interface: quality and performance. The first aspect is creating a high-quality specification of the interface as conventionally defined in an instruction set architecture. The majority of this thesis is concerned with creating a specification that covers the full scope of the interface; that is applicable to all current implementations of the architecture; and that can be trusted to accurately describe the behaviour of implementations of the architecture. We describe the development of a formal specification of the two major types of Arm processors: A-class (for mobile devices such as phones and tablets) and M-class (for micro-controllers). These specifications are unparalleled in their scope, applicability and trustworthiness. This thesis identifies and illustrates what we consider the key ingredient in achieving this goal: creating a specification that is used by many different user groups. Supporting many different groups leads to improved quality, as each group finds different problems in the specification; and, by providing value to each group, it helps justify the considerable effort required to create a high-quality specification of a major processor architecture. The work described in this thesis led to a step change in Arm's ability to use formal verification techniques to detect errors in their processors; enabled extensive testing of the specification against Arm's official architecture conformance suite; improved the quality of Arm's architecture conformance suite based on measuring the architectural coverage of the tests; supported earlier, faster development of architecture extensions by enabling animation of changes as they are being made; and enabled early detection of problems created by architecture extensions through formal validation of the specification against semi-structured natural language specifications. As far as we are aware, no other mainstream processor architecture has this capability. The formal specifications are included in Arm's publicly released architecture reference manuals, and the A-class specification is also released in machine-readable form. The second aspect is creating a high-performance interface by defining the hardware-software interface of a software-defined radio subsystem using a programming language; that is, an interface that allows software to exploit the potential performance of the underlying hardware. While the hardware-software interface is normally defined in terms of machine code, peripheral control registers and memory maps, we define it using a programming language instead. This higher-level interface provides the opportunity for compilers to hide some of the low-level differences between systems from the programmer: a potentially very efficient way of providing a stable, portable interface without having to add hardware for portability between different hardware platforms. We describe the design and implementation of a set of extensions to the C programming language to support programming high-performance, energy-efficient software-defined radio systems. The language extensions enable the programmer to exploit the pipeline parallelism typically present in digital signal processing applications and to make efficient use of the asymmetric multiprocessor systems designed to support such applications. The extensions consist primarily of annotations that can be checked for consistency and that support annotation inference in order to reduce the number of annotations required. Reducing the number of annotations does not just save programmer effort; it also improves portability by reducing the number of annotations that need to be changed when porting an application from one platform to another. This work formed part of a project that developed a high-performance, energy-efficient software-defined radio capable of implementing the physical layers of the 4G cellphone standard (LTE), 802.11a WiFi and Digital Video Broadcast (DVB) with a power and silicon area budget competitive with a conventional custom ASIC solution. The Arm architecture is the largest computer architecture by volume in the world; it behooves us to ensure that the interface it describes is appropriately defined.
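
    The pipeline-parallel style these extensions target can be approximated in plain code. The actual extensions are C annotations checked by the compiler; the sketch below merely mimics the resulting structure with invented stage functions: DSP stages run concurrently and communicate through bounded FIFOs.

```python
import threading, queue

def stage(fn, inq, outq):
    """One pipeline stage: repeatedly transform items until the poison pill."""
    while (item := inq.get()) is not None:
        outq.put(fn(item))
    outq.put(None)  # propagate shutdown downstream

# A toy 3-stage receive chain: filter -> demodulate -> decode.
q1, q2, q3, q4 = (queue.Queue(maxsize=8) for _ in range(4))  # bounded FIFOs
stages = [
    threading.Thread(target=stage, args=(lambda s: s * 0.5, q1, q2)),   # filter
    threading.Thread(target=stage, args=(lambda s: round(s), q2, q3)),  # demod
    threading.Thread(target=stage, args=(lambda b: b & 1, q3, q4)),     # decode
]
for t in stages: t.start()
for sample in [1.0, 2.0, 3.0]: q1.put(sample)
q1.put(None)  # end of stream
while (bit := q4.get()) is not None:
    print(bit)
```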

    Dynamic contracts for verification and enforcement of real-time systems properties

    Runtime verification is an emerging discipline that investigates methods and tools to enable the verification of program properties during the execution of the application. The goal is to complement static analysis approaches, in particular when static verification leads to an explosion of states. Non-functional properties, such as those present in real-time systems, are an ideal target for this kind of verification methodology, as they are usually beyond the power and expressiveness of classic static analyses. Current real-time embedded systems development frameworks lack support for the verification of properties using explicit time, where counting time (i.e., durations) may play an important role in the development process. Temporal logics targeting real-time systems are traditionally undecidable. Based on a restricted fragment of metric temporal logic with durations (MTL-R), we present synthesis mechanisms 1) for target systems, as runtime monitors, and 2) for SMT solvers, as a way to obtain, respectively, a verdict at runtime and a schedulability problem to be solved before execution. The latter is able to partially solve the schedulability analysis for periodic resource models and fixed-priority scheduling algorithms. A domain-specific language is also proposed in order to describe such schedulability analysis problems in a more high-level way. Finally, we validate both approaches, the first using empirical scheduling scenarios for uni- and multi-processor settings, and the second using the use case of the lightweight autopilot system Px4/Ardupilot, widely used for industrial and entertainment purposes. The former also shows that certain classes of real-time scheduling problems can be solved, even though without scaling well. The latter shows that, for the cases where the former cannot be used, the proposed synthesis technique for monitors is well applicable in a real-world scenario such as an embedded autopilot flight stack.
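
    A runtime monitor for a simple duration property gives a flavor of what such synthesized monitors check (a minimal, invented illustration, not the thesis' MTL-R synthesis): the monitor integrates the time a predicate holds inside a sliding window and reports a violation once the accumulated duration exceeds its budget.

```python
class DurationMonitor:
    """Checks a duration bound: within any window of `window` seconds,
    the monitored predicate may hold for at most `budget` seconds."""

    def __init__(self, window, budget):
        self.window, self.budget = window, budget
        self.intervals = []  # [start, end) spans where the predicate held

    def observe(self, t_start, t_end):
        """Record that the predicate held on [t_start, t_end)."""
        self.intervals.append((t_start, t_end))

    def verdict(self, now):
        """True = property satisfied so far; False = duration bound violated."""
        lo = now - self.window
        # Accumulated duration of the predicate inside the sliding window.
        held = sum(max(0.0, min(e, now) - max(s, lo)) for s, e in self.intervals)
        return held <= self.budget

# Example: a task may run at most 20 ms in any 100 ms window.
m = DurationMonitor(window=0.100, budget=0.020)
m.observe(0.000, 0.015)      # task ran 15 ms
m.observe(0.050, 0.062)      # task ran 12 ms
print(m.verdict(now=0.100))  # False: 27 ms exceeds the 20 ms budget
```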

    Acceleration and semantic foundations of embedded Java platforms

    Honour roll (Tableau d'honneur) of the Faculté des études supérieures et postdoctorales, 2006-200