234 research outputs found

    Algorithmic Verification of Asynchronous Programs

    Full text link
    Asynchronous programming is a ubiquitous systems programming idiom to manage concurrent interactions with the environment. In this style, instead of waiting for time-consuming operations to complete, the programmer makes a non-blocking call to the operation and posts a callback task to a task buffer that is executed later when the time-consuming operation completes. A co-operative scheduler mediates the interaction by picking and executing callback tasks from the task buffer to completion (and these callbacks can post further callbacks to be executed later). Writing correct asynchronous programs is hard because the use of callbacks, while efficient, obscures program control flow. We provide a formal model underlying asynchronous programs and study verification problems for this model. We show that the safety verification problem for finite-data asynchronous programs is expspace-complete. We show that liveness verification for finite-data asynchronous programs is decidable and polynomial-time equivalent to Petri Net reachability. Decidability is not obvious, since even if the data is finite-state, asynchronous programs constitute infinite-state transition systems: both the program stack and the task buffer of pending asynchronous calls can be potentially unbounded. Our main technical construction is a polynomial-time semantics-preserving reduction from asynchronous programs to Petri Nets and conversely. The reduction allows the use of algorithmic techniques on Petri Nets to the verification of asynchronous programs. We also study several extensions to the basic models of asynchronous programs that are inspired by additional capabilities provided by implementations of asynchronous libraries, and classify the decidability and undecidability of verification questions on these extensions.Comment: 46 pages, 9 figure

    Timing analysis of synchronous data flow graphs

    Get PDF
    Consumer electronic systems are getting more and more complex. Consequently, their design is getting more complicated. Typical systems built today are made of different subsystems that work in parallel in order to meet the functional re- quirements of the demanded applications. The types of applications running on such systems usually have inherent timing constraints which should be realized by the system. The analysis of timing guarantees for parallel systems is not a straightforward task. One important category of applications in consumer electronic devices are multimedia applications such as an MP3 player and an MPEG decoder/encoder. Predictable design is the prominent way of simultaneously managing the design complexity of these systems and providing timing guarantees. Timing guarantees cannot be obtained without using analyzable models of computation. Data flow models proved to be a suitable means for modeling and analysis of multimedia applications. Synchronous Data Flow Graphs (SDFGs) is a data flow model of computation that is traditionally used in the domain of Digital Signal Processing (DSP) platforms. Owing to the structural similarity between DSP and multimedia applications, SDFGs are suitable for modeling multimedia applications as well. Besides, various performance metrics can be analyzed using SDFGs. In fact, the combination of expressivity and analysis potential makes SDFGs very interesting in the domain of multimedia applications. This thesis contributes to SDFG analysis. We propose necessary and sufficient conditions to analyze the integrity of SDFGs and we provide techniques to capture prominent performance metrics, namely, throughput and latency. These perfor- mance metrics together with the mentioned sanity checks (conditions) build an appropriate basis for the analysis of the timing behavior of modeled applications. An SDFG is a graph with actors as vertices and channels as edges. Actors represent basic parts of an application which need to be executed. Channels represent data dependencies between actors. Streaming applications essentially continue their execution indefinitely. Therefore, one of the key properties of an SDFG which models such an application is liveness, i.e., whether all actors can run infinitely often. For example, one is usually not interested in a system which completely or partially deadlocks. Another elementary requirement known as boundedness, is whether an implementation of an SDFG is feasible using a lim- ited amount of memory. Necessary and sufficient conditions for liveness and the different types of boundedness are given, as well as algorithms for checking those conditions. Throughput analysis of SDFGs is an important step for verifying throughput requirements of concurrent real-time applications, for instance within design-space exploration activities. In fact, the main reason that SDFGs are used for mod- eling multimedia applications is analysis of the worst-case throughput, as it is essential for providing timing guarantees. Analysis of SDFGs can be hard, since the worst-case complexity of analysis algorithms is often high. This is also true for throughput analysis. In particular, many algorithms involve a conversion to another kind of data flow graph, namely, a homogenous data flow graph, whose size can be exponentially larger than the size of the original graph and in practice often is much larger. The thesis presents a method for throughput analysis of SD- FGs which is based on explicit state-space exploration, avoiding the mentioned conversion. The method, despite its worst-case complexity, works well in practice, while existing methods often fail. Since the state-space exploration method is akin to the simulation of the graph, the result can be easily obtained as a byproduct in existing simulation tools. In various contexts, such as design-space exploration or run-time reconfigu- ration, many throughput computations are required for varying actor execution times. The computations need to be fast because typically very limited resources or time can be dedicated to the analysis. In this thesis, we present methods to compute throughput of an SDFG where execution times of actors can be param- eters. As a result, the throughput of these graphs is obtained in the form of a function of these parameters. Calculation of throughput for different actor exe- cution times is then merely an evaluation of this function for specific parameter values, which is much faster than the standard throughput analysis. Although throughput is a very useful performance indicator for concurrent real-time applications, another important metric is latency. Especially for appli- cations such as video conferencing, telephony and games, latency beyond a certain limit cannot be tolerated. The final contribution of this thesis is an algorithm to determine the minimal achievable latency, providing an execution scheme for executing an SDFG with this latency. In addition, a heuristic is proposed for optimizing latency under a throughput constraint. This heuristic gives optimal latency and throughput results in most cases

    Normal Forms and Equivalence of K-periodically Routed Graphs

    Get PDF
    We introduce K-periodically Routed Graphs, which are extensions of Marked Graphs with routing nodes, governed by ultimately periodic binary sequences. We study data relations and dependencies, as well as equational transformations of the network topology. We show the existence of expanded normal forms. We prove that some transformations preserve external flow equivalence. Issues arising from internal flow interleavings and permutations are also tackled

    Worst-case throughput analysis for parametric rate and parametric actor execution time scenario-aware dataflow graphs

    Get PDF
    Scenario-aware dataflow (SADF) is a prominent tool for modeling and analysis of dynamic embedded dataflow applications. In SADF the application is represented as a finite collection of synchronous dataflow (SDF) graphs, each of which represents one possible application behaviour or scenario. A finite state machine (FSM) specifies the possible orders of scenario occurrences. The SADF model renders the tightest possible performance guarantees, but is limited by its finiteness. This means that from a practical point of view, it can only handle dynamic dataflow applications that are characterized by a reasonably sized set of possible behaviours or scenarios. In this paper we remove this limitation for a class of SADF graphs by means of SADF model parametrization in terms of graph port rates and actor execution times. First, we formally define the semantics of the model relevant for throughput analysis based on (max,+) linear system theory and (max,+) automata. Second, by generalizing some of the existing results, we give the algorithms for worst-case throughput analysis of parametric rate and parametric actor execution time acyclic SADF graphs with a fully connected, possibly infinite state transition system. Third, we demonstrate our approach on a few realistic applications from digital signal processing (DSP) domain mapped onto an embedded multi-processor architecture

    Генерация потоковых сетей акторов поиска кратчайших путей для параллельной многоядерной реализации

    Get PDF
    Objectives. The problem of parallelizing computations on multicore systems is considered. On the Floyd – Warshall blocked algorithm of shortest paths search in dense graphs of large size, two types of parallelism are compared: fork-join and network dataflow. Using the CAL programming language, a method of developing actors and an algorithm of generating parallel dataflow networks are proposed. The objective is to improve performance of parallel implementations of algorithms which have the property of partial order of computations on multicore processors.Methods. Methods of graph theory, algorithm theory, parallelization theory and formal language theory are used.Results. Claims about the possibility of reordering calculations in the blocked Floyd – Warshall algorithm are proved, which make it possible to achieve a greater load of cores during algorithm execution. Based on the claims, a method of constructing actors in the CAL language is developed and an algorithm for automatic generation of dataflow CAL networks for various configurations of block matrices describing the lengths of the shortest paths is proposed. It is proved that the networks have the properties of rate consistency, boundedness, and liveness. In actors running in parallel, the order of execution of actions with asynchronous behavior can change dynamically, resulting in efficient use of caches and increased core load. To implement the new features of actors, networks and the method of their generation, a tunable multi-threaded CAL engine has been developed that implements a static dataflow model of computation with bounded sizes of buffers. From the experimental results obtained on four types of multi-core processors it follows that there is an optimal size of the network matrix of actors for which the performance is maximum, and the size depends on the number of cores and the size of graph.Conclusion. It has been shown that dataflow networks of actors are an effective means to parallelize computationally intensive algorithms that describe a partial order of computations over decomposed data. The results obtained on the blocked algorithm of shortest paths search prove that the parallelism of dataflow networks gives higher performance of software implementations on multicore processors in comparison with the fork-join parallelism of OpenMP.Цели. Рассматривается задача распараллеливания вычислений на многоядерных системах. Посредством блочного алгоритма Флойда – Уоршалла поиска кратчайших путей на плотных графах большого размера сравниваются два вида параллелизма: разветвление/слияние и сетевой потоковый. С использованием языка программирования CAL разрабатываются метод построения акторов потока данных и алгоритм генерации параллельных сетей акторов. Целью работы является повышение производительности параллельных сетевых реализаций алгоритмов, обладающих свойством частичного порядка вычислений, на многоядерных процессорах.Методы. Используются методы теории графов, теории алгоритмов, теории распараллеливания, теории формальных языков.Результаты. Доказаны утверждения о возможности переупорядочивания вычислений в блочном алгоритме Флойда – Уоршалла, способствующие повышению загрузки ядер при реализации алгоритма. На основе утверждений разработан метод построения акторов на языке CAL и предложен алгоритм автоматической генерации CAL-сетей потока данных для различных конфигураций матриц блоков, описывающих длины кратчайших путей. Доказано, что сети обладают свойствами согласованности, ограниченности и живучести. В акторах, работающих параллельно, порядок выполнения действий с асинхронным поведением может динамически меняться, что приводит к эффективному использованию кэшей и увеличению загрузки ядер. Для реализации новых возможностей акторов, сетей и метода их генерации разработан настраиваемый многопоточный CAL-движок, реализующий статическую модель потоковых вычислений с ограниченными размерами буферов. Из экспериментальных результатов, полученных на четырех типах многоядерных процессоров, следует, что существует оптимальный размер сетевой матрицы акторов, для которого производительность максимальна, и этот размер зависит от размера графа и количества ядер.Заключение. Показано, что сети акторов потока данных являются эффективным средством распарал-леливания алгоритмов с высокой вычислительной нагрузкой, описывающих частичный порядок вычислений над данными, декомпозированными на части. Результаты, полученные на блочном алгоритме поиска кратчайших путей, показали, что параллелизм сетей потока данных дает более высокую производительность программных реализаций на многоядерных процессорах по сравнению с параллелизмом разветвления/слияния стандарта OpenMP

    RDF : un modèle flot de données reconfigurable(version étendue)

    Get PDF
    Dataflow Models of Computation (MoCs) are widely used in embedded systems, including multimedia processing, digital signal processing, telecommunications, and automatic control. In a dataflow MoC, an application is specified as a graph of actors connected by FIFO channels. One of the most popular dataflow MoCs, Synchronous Dataflow (SDF), provides static analyses to guarantee boundedness and liveness, which are key properties for embedded systems. However, SDF (and most of its variants) lacks the capability to express the dynamism needed by modern streaming applications. In particular, the applications mentioned above have a strong need for reconfigurability to accommodate changes in the input data, the control objectives, or the environment.We address this need by proposing a new MoC called Reconfigurable Dataflow (RDF). RDF extends SDF with transformation rules that specify how the topology and actors of the graph may be reconfigured. Starting from an initial RDF graph and a set of transformation rules, an arbitrary number of new RDF graphs can be generated at runtime. A key feature of RDF is that it can be statically analyzed to guarantee that all possible graphs generated at runtime will be consistent and live. We introduce the RDF MoC, describe its associated static analyses, and outline its implementation.Les modèles de calcul (MoCs) flot de données synchrones sont très utilisés dans les systèmes embarqués pour les applications multimédia, de traitement du signal, de télécommunication et de contrôle automatique. Dans ce style de modèle, une application est spécifiée par un graphe d’acteurs connectés par des liens FIFO de communication. Un des MoCs les plus connus, SDF (pour Synchronous Dataflow), permet des analyses statiques qui garantissent l’exécution enmémoire bornée et l’absence d’interblocage, propriétés clés pour les systèmes embarqués. Néanmoins, SDF (et la plupart de ses variantes) ne permet pas d’exprimer la dynamicité requise par les applications embarquées modernes. En particulier, ces applications ont souvent besoin de se reconfigurer pour s’adapter aux changements (par ex., de débit ou de qualité) du flot d’entrée, des objectifs de contrôle ou de l’environnement.Afin de répondre à ce besoin, nous proposons le MoC RDF (pour Reconfigurable DataFlow) qui étend SDF avec des règles de transformations spécifiant comment la topologie et les acteurs du graphe peuvent être reconfigurés dynamiquement. En considérant un graphe SDF initial et un ensemble de règles de transformation, un nombre arbitraire de nouveaux graphes peuvent être produits. La principale qualité de RDF est qu’il peut être analysé statiquement pour garantir que tous les graphes générés dynamiquement s’exécuteront en mémoire bornée et sans interblocage.Nous présentons le modèle RDF, décrivons les analyses statiques associées et décrivons brièvementson implémentation

    Modèles de calculs flot de données avec paramètres entiers et booléens. Modélisation - Analyses - Mise en oeuvre

    Get PDF
    Streaming applications are responsible for the majority of the computation load in many embedded systems (video conferencing, computer vision etc). Their high performance requirements make parallel implementations a necessity. Hence, more and more modern embedded systems include many-core processors that allow massive parallelism. Parallel implementation of streaming applications on many-core platforms is challenging because of their complexity, which tends to increase, and their strict requirements both qualitative (e.g., robustness, reliability) and quantitative (e.g., throughput, power consumption). This is observed in the evolution of video codecs that keep increasing in complexity, while their performance requirements remain the same or even increase. Data flow models of computation (MoCs) have been developed to facilitate the design process of such applications, which are typically composed of filters exchanging streams of data via communication links. Data flow MoCs provide an intuitive representation of streaming applications, while exposing the available parallelism of the application. Moreover, they provide static analyses for liveness and boundedness. However, modern streaming applications feature filters that exchange variable amounts of data, and communication links that are not always active. In this thesis, we present a new data flow MoC, the Boolean Parametric Data Flow (BPDF), that allows parametrization of the amount of data exchanged between the filters using integer parameters and the enabling and disabling of communication links using boolean parameters. In this way, BPDF is able to capture more complex streaming applications, like video decoders. Despite the increase in expressiveness, BPDF applications remain statically analyzable for liveness and boundedness. However, increased expressiveness greatly complicates implementation. Integer parameters result in parametric data dependencies and the boolean parameters disable communication links, effectively removing data dependencies. We propose a scheduling framework that facilitates the scheduling of BPDF applications. Our scheduling framework produces as soon as possible schedules for a given static mapping. It takes us input scheduling constraints that derive either from the application (data dependencies) or from the user (schedule optimizations). The constraints are analyzed for liveness and, if possible, simplified. In this way, our framework provides flexibility, while guaranteeing the liveness of the application. Finally, calculation of the throughput of an application is important both at compile-time and at run-time. It allows to verify at compile-time that the application meets its performance requirements and it allows to take scheduling decisions at run-time that can improve performance or power consumption. We approach this problem by finding parametric throughput expressions for the maximum throughput of a subset of BPDF graphs. Finally, we provide an algorithm that calculates sufficient buffer sizes for the BPDF graph to operate at maximum throughput.Les applications de gestion de flux sont responsables de la majorité des calculs des systèmes embarqués (vidéo conférence, vision par ordinateur). Leurs exigences de haute performance rendent leur mise en œuvre parallèle nécessaire. Par conséquent, il est de plus en plus courant que les systèmes embarqués modernes incluent des processeurs multi-cœurs qui permettent un parallélisme massif. La mise en œuvre des applications de gestion de flux sur des multi-cœurs est difficile à cause de leur complexité, qui tend à augmenter, et de leurs exigences strictes à la fois qualitatives (robustesse, fiabilité) et quantitatives (débit, consommation d'énergie). Ceci est observé dans l'évolution de codecs vidéo qui ne cessent d'augmenter en complexité, tandis que leurs exigences de performance demeurent les mêmes. Les modèles de calcul (MdC) flot de données ont été développés pour faciliter la conception de ces applications qui sont typiquement composées de filtres qui échangent des flux de données via des liens de communication. Ces modèles fournissent une représentation intuitive des applications de gestion de flux, tout en exposant le parallélisme de tâches de l'application. En outre, ils fournissent des analyses statiques pour la vivacité et l'exécution en mémoire bornée. Cependant, les applications de gestion de flux modernes comportent des filtres qui échangent des quantités de données variables, et des liens de communication qui peuvent être activés / désactivés. Dans cette thèse, nous présentons un nouveau MdC flot de données, le Boolean Parametric Data Flow (BPDF), qui permet le paramétrage de la quantité de données échangées entre les filtres en utilisant des paramètres entiers et l'activation et la désactivation de liens de communication en utilisant des paramètres booléens. De cette manière, BPDF est capable de exprimer des applications plus complexes, comme les décodeurs vidéo modernes. Malgré l'augmentation de l'expressivité, les applications BPDF restent statiquement analysables pour la vivacité et l'exécution en mémoire bornée. Cependant, l'expressivité accrue complique grandement la mise en œuvre. Les paramètres entiers entraînent des dépendances de données de type paramétrique et les paramètres booléens peuvent désactiver des liens de communication et ainsi éliminer des dépendances de données. Pour cette raison, nous proposons un cadre d'ordonnancement qui produit des ordonnancements de type ``aussi tôt que possible'' (ASAP) pour un placement statique donné. Il utilise des contraintes d'ordonnancement, soit issues de l'application (dépendance de données) ou de l'utilisateur (optimisations d'ordonnancement). Les contraintes sont analysées pour la vivacité et, si possible, simplifiées. De cette façon, notre cadre permet une grande variété de politiques d'ordonnancement, tout en garantissant la vivacité de l'application. Enfin, le calcul du débit d'une application est important tant avant que pendant l'exécution. Il permet de vérifier que l'application satisfait ses exigences de performance et il permet de prendre des décisions d'ordonnancement à l'exécution qui peuvent améliorer la performance ou la consommation d'énergie. Nous traitons ce problème en trouvant des expressions paramétriques pour le débit maximum d'un sous-ensemble de BPDF. Enfin, nous proposons un algorithme qui calcule une taille des buffers suffisante pour que l'application BPDF ait un débit maximum

    Scheduling Optimisations for SPIN to Minimise Buffer Requirements in Synchronous Data Flow

    Get PDF
    Synchronous Data flow (SDF) graphs have a simple and elegant semantics (essentially linear algebra) which makes SDF graphs eminently suitable as a vehicle for studying scheduling optimisations. We extend related work on using SPIN to experiment with scheduling optimisations aimed at minimising buffer requirements.We show that for a benchmark of commonly used case studies the performance of our SPIN based scheduler is comparable to that of state of the art research tools. The key to success is using the semantics of SDF to prove when using (even unsound and/or incomplete) optimisations are justified. The main benefit of our approach lies in gaining deep insight in the optimisations at relatively low cost

    SPDF: A Schedulable Parametric Data-Flow MoC

    Get PDF
    International audienceDataflow programming models are suitable to express multi-core streaming applications. The design of high- quality embedded systems in that context requires static analysis to ensure the liveness and bounded memory of the application. However, many streaming applications have a dynamic behavior. The previously proposed dataflow models for dynamic applications do not provide any static guarantees or only in exchange of significant restrictions in expressive power or automation. To overcome these restrictions, we propose the schedulable parametric dataflow (SPDF) model. We present static analyses and a quasi-static scheduling algorithm. We demonstrate our approach using a video decoder case study

    SPDF: A Schedulable Parametric Data-Flow MoC (Extended Version)

    Get PDF
    Dataflow programming models are suitable to express multi-core streaming applications. The design of high-quality embedded systems in that context requires static analysis to ensure the liveness and bounded memory of the application. However, many streaming applications have a dynamic behavior. The previously proposed dataflow models for dynamic applications do not provide any static guarantees or only in exchange of significant restrictions in expressive power or automation. To overcome these restrictions, we propose the schedulable parametric dataflow (SPDF) model of computation. We present static analyses and a quasi-static scheduling algorithm. We demonstrate our approach using a video decoder case study
    corecore