Language Design for Reactive Systems: On Modal Models, Time, and Object Orientation in Lingua Franca and SCCharts
Reactive systems play a crucial role in the embedded domain. They continuously interact with their environment, handle concurrent operations, and are commonly expected to provide deterministic behavior to enable application in safety-critical systems. In this context, language design is a key aspect, since carefully tailored language constructs can aid in addressing the challenges faced in this domain, as illustrated by the various concurrency models that prevent the known pitfalls of regular threads. Today, many languages exist in this domain and often provide unique characteristics that make them specifically fit for certain use cases. This thesis revolves around two distinctive languages: the actor-oriented polyglot coordination language Lingua Franca and the synchronous statecharts dialect SCCharts. While they take different approaches in providing reactive modeling capabilities, they share clear similarities in their semantics and complement each other in design principles. This thesis analyzes and compares key design aspects in the context of these two languages. For three particularly relevant concepts, it provides and evaluates lean and seamless language extensions that are carefully aligned with the fundamental principles of the underlying language. Specifically, Lingua Franca is extended toward coordinating modal behavior, while SCCharts receives a timed automaton notation with an efficient execution model using dynamic ticks and an extension toward the object-oriented modeling paradigm.
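The "dynamic ticks" idea, reacting at computed deadlines rather than polling at a fixed rate, can be illustrated with a small mode-switching loop. This is plain Python, not SCCharts or Lingua Franca syntax, and all names and timing constants are illustrative:

```python
class ModalTimedSystem:
    """Toy mode-switching controller. Instead of polling at a fixed
    rate, the runtime jumps to the next deadline the current mode
    requests: a "dynamic tick"."""

    def __init__(self):
        self.mode = "IDLE"
        self.log = []

    def next_deadline(self, now):
        # Each mode declares when it next needs to react: IDLE is
        # checked every 5 time units, ACTIVE every 1.
        return now + (5 if self.mode == "IDLE" else 1)

    def react(self, now):
        self.log.append((now, self.mode))
        # Transitions are evaluated only at tick instants.
        if self.mode == "IDLE" and now >= 10:
            self.mode = "ACTIVE"
        elif self.mode == "ACTIVE" and now >= 14:
            self.mode = "IDLE"

def run(system, horizon):
    now = 0
    while now <= horizon:
        system.react(now)
        now = system.next_deadline(now)  # jump, don't poll
    return system.log

log = run(ModalTimedSystem(), horizon=15)
```

The trace jumps in steps of 5 while idle and steps of 1 while active, so no reactions are wasted on instants where nothing can happen.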
Learned interpreters: structural and learned systematicity in neural networks for program execution
General purpose deep neural network architectures have made startling advances in machine learning for code, advancing code completion, enabling natural language programming, detecting and repairing bugs, and even solving competitive programming problems at a human level of performance. Nevertheless, these methods struggle to understand the execution behavior of code, even when it is code they write themselves. To this end, we explore interpreter-inspired neural network architectures, introducing a novel architecture family called instruction pointer attention graph neural networks (IPA-GNN). We apply this family of approaches to several tasks that require reasoning about the execution behavior of programs: learning to execute full and partial programs, code coverage prediction for hardware verification, and predicting runtime errors in competition programs.
Through this series of works we make several contributions and encounter multiple surprising and promising results. We introduce a Python library for constructing graph representations of programs for use in machine learning research, which serves as a bedrock for the research in this thesis and in the broader research community. We also introduce rich large-scale datasets of programs annotated with program behavior, like outputs and errors raised, to facilitate research in this domain. We find that IPA-GNN methods exhibit improved strong generalization over general purpose methods, performing well when trained to execute only short programs and tested on significantly longer programs. In fact, we find that IPA-GNN methods outperform generic methods on each of the behavior modeling tasks we consider across both hardware and software domains. We even find that interpreter-inspired methods that model exception handling explicitly have a desirable interpretability property, enabling the prediction of error locations even when only trained on error presence and kind. In total, interpreter-inspired architectures like the IPA-GNN represent a promising path forward for imbuing neural networks with novel capabilities for learning to reason about program executions.
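The core mechanism can be illustrated numerically: an IPA-GNN maintains a soft instruction pointer, a probability distribution over program locations, and propagates it along control-flow edges. In the sketch below the control-flow graph and branch probabilities are fixed toy constants; in the actual architecture the branch weights are produced by a learned model of per-line hidden states:

```python
import numpy as np

# Toy control-flow graph: line -> [(successor, branch_probability)].
cfg = {
    0: [(1, 1.0)],            # straight-line statement
    1: [(2, 0.7), (3, 0.3)],  # conditional branch
    2: [(3, 1.0)],
    3: [(3, 1.0)],            # exit line absorbs all probability mass
}

def step(p):
    """One soft-instruction-pointer update: push probability mass
    along control-flow edges."""
    q = np.zeros_like(p)
    for line, mass in enumerate(p):
        for succ, w in cfg[line]:
            q[succ] += mass * w
    return q

p = np.zeros(4)
p[0] = 1.0  # execution starts at line 0 with certainty
for _ in range(3):
    p = step(p)
```

After three steps all mass has reached the exit line; because the update is a convex combination over edges, the distribution stays normalized throughout, which is what makes per-line error localization readable off the model.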
The modern landscape of managing effects for the working programmer
The management of side effects is a crucial aspect of modern programming, especially in concurrent and distributed systems. This thesis analyses different approaches for managing side effects in programming languages, specifically focusing on unrestricted side effects, monads, and algebraic effects and handlers. Unrestricted side effects, used in mainstream imperative programming languages, can make programs difficult to reason about. Monads offer a solution to this problem by describing side effects in a composable and referentially transparent way, but many find them cumbersome to use. Algebraic effects and handlers can address some of the shortcomings of monads by providing a way to model effects in a more modular and flexible way. The thesis discusses the advantages and disadvantages of each of these approaches and compares them based on factors such as expressiveness, safety, and the constraints they place on how programs must be implemented. The thesis focuses on ZIO, a Scala library for concurrent and asynchronous programming, which revolves around a ZIO monad with three type parameters. With those three parameters, ZIO can encode the majority of practically useful effects in a single monad. ZIO takes inspiration from algebraic effects, combining them with monadic effects. The library provides a range of features, such as declarative concurrency, error handling, and resource management. The thesis presents examples of using ZIO to manage side effects in practical scenarios, highlighting its strengths over other approaches. The applicability of ZIO is evaluated by implementing a server-side application using ZIO and analyzing observations from the development process.
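The "three type parameters" idea can be illustrated outside Scala. The following Python sketch mimics the shape of ZIO[R, E, A] as a lazy description of a computation from an environment R to either an error E or a value A; it is an analogy for exposition, not ZIO's actual API:

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar, Union

R = TypeVar("R")  # the environment the effect needs
E = TypeVar("E")  # how the effect can fail
A = TypeVar("A")  # what the effect produces on success

@dataclass
class Fail(Generic[E]):
    error: E

@dataclass
class Succeed(Generic[A]):
    value: A

@dataclass
class Effect(Generic[R, E, A]):
    """Analogy of ZIO[R, E, A]: a lazy R -> Either[E, A]."""
    run: Callable[[R], Union[Fail, Succeed]]

    def flat_map(self, f: "Callable[[A], Effect]") -> "Effect":
        # Sequencing: short-circuits on the first failure, like a monad.
        def go(env):
            result = self.run(env)
            return f(result.value).run(env) if isinstance(result, Succeed) else result
        return Effect(go)

# A tiny program: read a config value from the environment, then validate it.
read_port = Effect(lambda env: Succeed(env["port"]))
validated = read_port.flat_map(
    lambda p: Effect(lambda _: Succeed(p) if 0 < p < 65536 else Fail("bad port"))
)
```

Nothing executes until `run` is given an environment, which is what makes the description composable and referentially transparent.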
Automated cache optimisations of stencil computations for partial differential equations
This thesis focuses on numerical methods that solve partial differential equations.
Our focal point is the finite difference method, which solves partial
differential equations by approximating derivatives with explicit finite differences.
These partial differential equation solvers consist of stencil computations on structured grids.
Stencils for computing real-world practical applications are patterns often
characterised by many memory accesses and non-trivial arithmetic expressions
that lead to high computational costs compared to simple stencils used in much prior
proof-of-concept work.
In addition, the loop nests to express stencils on structured grids may often be complicated.
This work is highly motivated by a specific domain of stencil computations where one of the challenges is non-aligned to the structured grid ("off-the-grid") operations.
These operations update neighbouring grid points through scatter and gather operations via non-affine memory accesses, such as {A[B[i]]}.
In addition to this challenge, these practical stencils often include many computation fields (requiring multiple grid copies to be stored), complex data dependencies, and imperfect loop nests.
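The scatter/gather pattern described above can be sketched with NumPy-style indirect indexing; the array names and sizes are illustrative, not taken from any particular solver:

```python
import numpy as np

n = 16
u = np.zeros(n)
u_next = np.zeros(n)
src_idx = np.array([3, 9])       # B: arbitrary ("off-the-grid") locations
src_val = np.array([1.0, 2.0])   # signal injected at those locations

# Affine part: a simple 3-point stencil on the structured grid.
u_next[1:-1] = 0.5 * u[1:-1] + 0.25 * (u[:-2] + u[2:])

# Non-affine part: scatter via indirect indexing, the A[B[i]] pattern.
# np.add.at accumulates correctly even with repeated indices.
np.add.at(u_next, src_idx, src_val)

# Gather: sample the field back at receiver locations.
rec = u_next[src_idx]
```

The indirection through `src_idx` is exactly what defeats the affine dependence analysis that standard loop optimisers rely on.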
In this work, we aim to increase the performance of stencil kernel execution.
We study automated cache-memory-dependent optimisations for stencil computations.
This work consists of two core parts with their respective contributions. The first part of our work tries to reduce the data movement in stencil computations of practical interest.
Data movement is a dominant factor affecting the performance of high-performance computing applications.
It has long been a target of optimisations due to its impact on execution time and energy consumption.
This thesis tries to relieve this cost by applying temporal blocking optimisations, also known as time-tiling, to stencil computations.
Temporal blocking is a well-known technique to enhance data reuse in stencil computations.
However, it is rarely used in practical applications but rather in theoretical examples to prove its efficacy.
Applying temporal blocking to scientific simulations is more complex.
More specifically, in this work, we focus on the application context of seismic and medical imaging.
In this area, we often encounter scatter and gather operations due to signal sources and receivers at arbitrary locations in the computational domain.
These operations make the application of temporal blocking challenging.
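Temporal blocking comes in several flavours. The following is a minimal sketch of one of them, overlapped tiling with redundant halo computation, for a 1D three-point stencil; it is not necessarily the scheme used in this work:

```python
import numpy as np

def sweep(u):
    """One global time step of a 3-point averaging stencil with
    fixed (Dirichlet-style) boundary values."""
    v = u.copy()
    v[1:-1] = (u[:-2] + u[1:-1] + u[2:]) / 3.0
    return v

def naive(u, T):
    """Reference: T full sweeps over the whole grid."""
    for _ in range(T):
        u = sweep(u)
    return u

def time_tiled(u, T, tile):
    """Overlapped temporal blocking: each spatial tile is advanced T
    steps at once on a copy widened by a halo of T cells, so the tile's
    data stays in cache across time steps. Halo cells are recomputed
    redundantly; the tile interior matches the reference exactly."""
    n = len(u)
    out = u.copy()
    for s in range(0, n, tile):
        e = min(s + tile, n)
        lo, hi = max(s - T, 0), min(e + T, n)
        local = u[lo:hi].copy()
        for _ in range(T):
            local = sweep(local)
        out[s:e] = local[s - lo:e - lo]
    return out

u0 = np.sin(np.arange(20.0))
```

A halo of width T suffices because a three-point stencil propagates information at most one cell per step; in real multi-dimensional simulations with wider stencils and data dependencies, choosing valid tile shapes is precisely what makes manual application error-prone.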
We present an approach to overcome this challenge and successfully apply temporal blocking. In the second part of our work, we extend the first part as an automated approach targeting a wide range of simulations modelled with partial differential equations.
Since temporal blocking is error-prone, tedious to apply by hand and highly complex to assimilate theoretically and practically, we are motivated to automate its application and automatically generate code that benefits from it.
We discuss algorithmic approaches and present a generalised compiler pipeline to automate the application of temporal blocking.
These passes are written in the Devito compiler. They are used to accelerate the computation of stencil kernels in areas such as seismic and medical imaging, computational fluid dynamics and machine learning.
\href{www.devitoproject.org}{Devito} is a Python package to implement optimised stencil computation (e.g., finite differences, image processing, machine learning) from high-level symbolic problem definitions.
Devito builds on \href{www.sympy.org}{SymPy} and employs automated code generation and just-in-time compilation to execute optimised computational kernels on several computer platforms, including CPUs, GPUs, and clusters thereof.
We show how we automate temporal blocking code generation without user intervention and often achieve better time-to-solution.
We enable domain-specific optimisation through compiler passes and offer temporal blocking gains from a high-level symbolic abstraction.
These automated optimisations benefit various computational kernels for solving real-world application problems.
One-sided differentiability: a challenge for computer algebra systems
Computer Algebra Systems (CASs) are extremely powerful and widely used digital tools. Focusing on differentiation, CASs include a command that computes the derivative of functions in one variable (and also the partial derivative of functions in several variables). We will focus in this article on real-valued functions of one real variable. Since CASs usually compute the derivative of real-valued functions as a whole, the value of the computed derivative at points where the left derivative and the right derivative differ (which we will call conflicting points) should be something like "undefined", although this isn't always the case: the output can differ strongly depending on the chosen CAS. We have analysed and compared in this article how some well-known CASs behave when addressing differentiation at the conflicting points of five different functions chosen by the authors. Finally, the ability of CASs to calculate one-sided limits allows one to directly compute the result in these cumbersome cases using the formal definition of the one-sided derivative, which we have also analysed and compared for the selected CASs. Regarding teaching, this is an important issue, as it is a topic in Secondary Education and nowadays the use of CASs as an auxiliary digital tool for teaching mathematics is very common.
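The formal definition mentioned at the end can also be applied numerically. A minimal sketch, with the function and step size chosen purely for illustration:

```python
def one_sided_derivative(f, x, side, h=1e-7):
    """Approximate a one-sided derivative of f at x from the formal
    definition: lim_{h -> 0+} (f(x + h) - f(x)) / h from the right,
    and the mirrored quotient from the left."""
    if side == "right":
        return (f(x + h) - f(x)) / h
    return (f(x - h) - f(x)) / (-h)

# The classic conflicting point: f(x) = |x| at x = 0, where the
# left and right derivatives disagree (-1 vs +1).
left = one_sided_derivative(abs, 0.0, "left")    # approximately -1
right = one_sided_derivative(abs, 0.0, "right")  # approximately +1
```

Since the two one-sided values disagree, a single "the derivative at 0" answer from a CAS is necessarily suspect at such points.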
Practical synthesis from real-world oracles
As software systems become increasingly heterogeneous, the ability of compilers to reason about an entire system has decreased. When components of a system are not implemented as traditional programs, but rather as specialised hardware, optimised architecture-specific libraries, or network services, the compiler is unable to cross these abstraction barriers and analyse the system as a whole.
If these components could be modelled or understood as programs, then the compiler would be able to reason about their behaviour without concern for their internal implementation details: a homogeneous view of the entire system would be afforded. However, it is often not the case that such components ever corresponded to an original program. This means that to facilitate this homogeneous analysis, programmatic models of component behaviour must be learned or constructed automatically.
Constructing these models is an inductive program synthesis problem, albeit a challenging one that is largely beyond the ability of existing implementations. In order for the problem to be made tractable, information provided by the underlying context (i.e. the real component behaviour to be matched) must be integrated.
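Inductive synthesis against an oracle can be sketched as a generate-and-test loop. The grammar and search below are toy stand-ins, far simpler than the approaches developed in this thesis:

```python
import itertools

def synthesise(oracle, inputs, max_rounds=2):
    """Enumerate arithmetic expressions over a tiny grammar and return
    the first whose behaviour matches the oracle on all observed inputs."""
    # Grammar: expr := x | 0 | 1 | 2 | (expr + expr) | (expr * expr),
    # represented as (source-text, evaluator) pairs.
    terminals = [("x", lambda x: x)] + [
        (str(c), lambda x, c=c: c) for c in (0, 1, 2)
    ]
    ops = [("+", lambda a, b: a + b), ("*", lambda a, b: a * b)]
    level = list(terminals)
    for _ in range(max_rounds):
        grown = []
        for (sa, fa), (sb, fb) in itertools.product(level, repeat=2):
            for so, fo in ops:
                grown.append((f"({sa} {so} {sb})",
                              lambda x, fa=fa, fb=fb, fo=fo: fo(fa(x), fb(x))))
        level += grown
        for src, fn in level:
            if all(fn(i) == oracle(i) for i in inputs):
                return src, fn
    return None

# The "oracle": an opaque component we can only query for outputs.
oracle = lambda x: 2 * x + 1
result = synthesise(oracle, inputs=[0, 1, 2, 3, 5])
```

Even this toy search explodes combinatorially after a few rounds, which is why the contextual information exploited by the approaches in this thesis is needed to make realistic problems tractable.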
This thesis presents three program synthesis approaches that integrate contextual information to synthesise programmatic models for real, existing components. The first, Annote, exploits informally-encoded information about a component's interface (e.g. from documentation) by weaving that information into an extended type-and-attribute system for component interfaces. The second, Presyn, learns a pair of cooperating probabilistic models from prior syntheses, that aim to predict likely program structure based on a component's interface. Finally, Haze uses observations of common side-effects of component executions to bias the search for programs. These approaches are each evaluated against comparable synthesisers from the literature, on a set of benchmark problems derived from real components.
Learning models for component behaviour is only a partial solution; the compiler must also have some mechanism to use those models for program analysis and transformation. This thesis additionally proposes a novel mechanism for context-sensitive automatic API migration based on synthesised programmatic models, and evaluates the effectiveness of doing so on real application code.
In summary, this thesis proposes a new framing for program synthesis problems that target the behaviour of real components, and demonstrates three different potential approaches to synthesis in this spirit. The success of these approaches is evaluated against implementations from the literature, and their results are used to drive a novel API migration technique.
Towards Porting Operating Systems with Program Synthesis
The end of Moore's Law has ushered in a diversity of hardware not seen in
decades. Operating system (and system software) portability is accordingly
becoming increasingly critical. Simultaneously, there has been tremendous
progress in program synthesis. We set out to explore the feasibility of using
modern program synthesis to generate the machine-dependent parts of an
operating system. Our ultimate goal is to generate new ports automatically from
descriptions of new machines. One of the issues involved is writing
specifications, both for machine-dependent operating system functionality and
for instruction set architectures. We designed two domain-specific languages:
Alewife for machine-independent specifications of machine-dependent operating
system functionality and Cassiopea for describing instruction set architecture
semantics. Automated porting also requires an implementation. We developed a
toolchain that, given an Alewife specification and a Cassiopea machine
description, specializes the machine-independent specification to the target
instruction set architecture and synthesizes an implementation in assembly
language with a customized symbolic execution engine. Using this approach, we
demonstrate successful synthesis of a total of 140 OS components from two
pre-existing OSes for four real hardware platforms. We also developed several
optimization methods for OS-related assembly synthesis to improve scalability.
The effectiveness of our languages and ability to synthesize code for all 140
specifications is evidence of the feasibility of program synthesis for
machine-dependent OS code. However, many research challenges remain; we also
discuss the benefits and limitations of our synthesis-based approach to
automated OS porting.
Comment: ACM Transactions on Programming Languages and Systems. Accepted on August 202
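The overall shape of spec-guided assembly synthesis can be sketched with a toy machine. The instruction set and specification below are invented for illustration and are far simpler than Cassiopea and Alewife, and the brute-force testing stands in for the symbolic execution used in the actual toolchain:

```python
import itertools

# A toy "machine description": three registers and a handful of
# instructions given as state transformers.
INSTRUCTIONS = {
    "mov r0, r1": lambda s: {**s, "r0": s["r1"]},
    "mov r0, r2": lambda s: {**s, "r0": s["r2"]},
    "add r0, r1": lambda s: {**s, "r0": s["r0"] + s["r1"]},
    "add r0, r2": lambda s: {**s, "r0": s["r0"] + s["r2"]},
    "xor r0, r0": lambda s: {**s, "r0": 0},
}

def synthesise(spec, tests, max_len=3):
    """Search instruction sequences (shortest first) until one satisfies
    the machine-independent spec (a postcondition) on all test states."""
    for length in range(1, max_len + 1):
        for seq in itertools.product(INSTRUCTIONS, repeat=length):
            def run(state):
                for insn in seq:
                    state = INSTRUCTIONS[insn](state)
                return state
            if all(spec(run(dict(s))) for s in tests):
                return list(seq)
    return None

# Spec for a toy "machine-dependent component": afterwards, r0 == r1 + r2.
spec = lambda s: s["r0"] == s["r1"] + s["r2"]
tests = [{"r0": 7, "r1": 2, "r2": 3}, {"r0": 0, "r1": 5, "r2": 11}]
prog = synthesise(spec, tests)
```

Shortest-first enumeration returns a two-instruction solution here; the scalability challenge the thesis addresses is that real instruction sets and specifications make this search space astronomically larger.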
Ontology-based transformation of natural language queries into SPARQL queries by evolutionary algorithms
In this thesis, an ontology-driven evolutionary learning system for natural language querying of RDF graphs is presented. The learning system itself does not answer the query, but generates a SPARQL query against the database.
For this purpose, the Evolutionary Dataflow Agents framework is introduced, a general learning framework that, based on evolutionary algorithms, creates agents that learn to solve a problem. The main idea of the framework is to support problems that combine a medium-sized search space (use case: analysis of natural language queries) of strictly, formally structured solutions (use case: synthesis of database queries) with rather local classical structural and algorithmic aspects. For this, the agents combine local algorithmic functionality of nodes with a flexible dataflow between the nodes into a global problem-solving process. Roughly, there are nodes that generate informational fragments by combining input data and/or earlier fragments, often using heuristics-based guessing. Other nodes combine, collect, and reduce such fragments towards possible solutions, narrowing these towards the unique final solution. For this, informational items flow through the agents. The configuration of these agents, which nodes they combine, and where exactly the data items flow, is subject to learning. The training starts with simple agents, which, as usual in learning frameworks, solve a set of tasks and are evaluated for it. Since the produced answers usually have complex structures, the framework employs a novel fine-grained energy-based evaluation and selection step. The selected agents then form the basis for the population of the next round. Evolution is provided as usual by mutations and agent fusion.
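The evolutionary loop described here, evaluate, select, mutate, can be sketched generically. The target string and character-level scoring below are toy stand-ins for the framework's agent representation and its fine-grained energy-based evaluation:

```python
import random

random.seed(0)  # deterministic for illustration

TARGET = "SELECT ?x WHERE { ?x a :Person }"  # a hypothetical "correct answer"
ALPHABET = sorted(set(TARGET))

def fitness(candidate):
    # Fine-grained scoring, one point per correct character: a crude
    # stand-in for energy-based evaluation of structured answers.
    return sum(a == b for a, b in zip(candidate, TARGET))

def mutate(candidate):
    i = random.randrange(len(candidate))
    return candidate[:i] + random.choice(ALPHABET) + candidate[i + 1:]

def evolve(pop_size=50, generations=1000):
    population = ["".join(random.choice(ALPHABET) for _ in TARGET)
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        if fitness(population[0]) == len(TARGET):
            break  # a perfect solution has been found
        survivors = population[: pop_size // 2]  # selection
        population = survivors + [mutate(random.choice(survivors))
                                  for _ in range(pop_size - len(survivors))]
    return max(population, key=fitness)

best = evolve()
```

The point of the fine-grained score is visible even in this toy: an all-or-nothing fitness ("is the query exactly right?") would give evolution no gradient to follow.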
As a use case, EvolNLQ has been implemented, a system for answering natural language queries against RDF databases. For this, the underlying ontology metadata is (externally) algorithmically preprocessed. For the agents, appropriate data item types and node types are defined that break down the processes of language analysis and query synthesis into more or less elementary operations. The "size" of operations is determined by the border between computations, i.e., purely algorithmic steps (implemented in individual powerful nodes) and simple heuristic steps (also realized by simple nodes), and free dataflow allowing for arbitrary chaining and branching configurations of the agents. EvolNLQ is compared with some other approaches, showing competitive results.
Lectures on Applied Mathematics Part 2: Numerical Analysis
This book is designed to be a continuation of the textbook, Lectures on Applied Mathematics Part I: Linear Algebra, which can also be downloaded at http://rbowen.engr.tamu.edu. This textbook evolved from my teaching an undergraduate Numerical Analysis course to Mechanical Engineering students at Texas A&M University. That course was one of the courses I was allowed to teach after my several years out of the classroom. It tries to utilize rigorous concepts in Linear Algebra in combination with the powerful computational tools of MATLAB to provide undergraduate students practical numerical analysis tools. It makes extensive use of MATLAB's graphics capabilities and, to a limited extent, its ability to animate the solutions of ordinary differential equations. It is not a textbook that tries to be comprehensive as a source of MATLAB information. It does contain a large number of links to MATLAB's extensive online resources. This information has been invaluable to me as this work was developed. The version of MATLAB used in the preparation of this textbook is MATLAB 2019b. Contents: Chapter 7: Elements of Numerical Linear Algebra; Chapter 8: Errors that Arise in Numerical Analysis; Chapter 9: Roots of Nonlinear Equations; Chapter 10: Regression; Chapter 11: Interpolation; Chapter 12: Ordinary Differential Equations; Appendix A: Introduction to MATLAB; Appendix B: Animation.
TRIZ Future Conference 2004
TRIZ, the Theory of Inventive Problem Solving, is a living science and a practical methodology: millions of patents have been examined to look for principles of innovation and patterns of excellence. Large and small companies are using TRIZ to solve problems and to develop strategies for future technologies. The TRIZ Future Conference is the annual meeting of the European TRIZ Association, with contributions from everywhere in the world. The aims of the 2004 edition are the integration of TRIZ with other methodologies and the dissemination of systematic innovation practices, even among SMEs: a broad spectrum of subjects in several fields debated with experts, practitioners and TRIZ newcomers.
- âŠ