9 research outputs found

    Multi-processor system design with ESPAM

    Get PDF

    Programming Models and Tools for Intelligent Embedded Systems

    Get PDF

    Design of specialized processors for real-time video processing with local filters

    Get PDF
    ABSTRACT This master's thesis explores the possibilities offered by Application-Specific Instruction-set Processors (ASIPs) for digital video applications, specifically for a particular class of video processing algorithms: local neighbourhood functions. For this algorithm class, an architectural exploration led to the identification of a set of design techniques which, together, form a coherent and systematic approach for designing high-performance ASIPs suitable for real-time video processing. The proposed design approach aims at efficient use of the available memory bandwidth, which constitutes the main performance bottleneck of the application. It is possible to approach the processing speed limit imposed by this bottleneck through an appropriate data reuse strategy and by exploiting the data parallelism inherent to the target algorithm class.

    The design approach comprises four steps. First, a Single Instruction Multiple Data (SIMD) instruction that calculates several output pixels in parallel is created. Then, shift registers are added for intra-line reuse of input pixels. Next, a processing pipeline is created by splitting the parallel instruction and adding registers for intermediate results. Finally, custom load/store instructions are created. Some of these steps open the door to hardware simplifications for certain algorithms of the target class. The resulting hardware structure, combined with the instruction-level parallelism made possible by a Very Long Instruction Word (VLIW) architecture, behaves much like a pipelined systolic array.

    To demonstrate the validity of the proposed design approach experimentally, seven ASIPs were designed by extending the instruction set of a configurable and extensible processor. Three of the ASIPs implement intra-field deinterlacing algorithms, and four implement 2D convolution with different kernel sizes. The results show a significant improvement in performance. For the intra-field deinterlacing algorithms, speedup factors range from 95 to 1330, while the Area-Time (AT) product improves by factors of 29 to 243, all relative to a pure software implementation running on a general-purpose processor. For the two-dimensional convolution, speedup factors range from 36 to 80, with AT product improvements of 12 to 22. In all cases, real-time processing of high-definition video in the 1080i (deinterlacing) or 1080p (convolution) format is achievable in a 130 nm manufacturing process.
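    The data-reuse idea at the heart of the approach can be sketched in software. The following Python sketch is illustrative only (the thesis realizes this as custom instructions, shift registers and a pipeline in hardware, not as code): it computes a 3x3 local filter while loading only one new column of pixels per output pixel, mimicking the intra-line shift-register reuse described above.

```python
# Illustrative sketch, not the thesis's ASIP: a 3x3 local neighbourhood
# filter with intra-line data reuse. Each output pixel reloads only 3 of
# the 9 input pixels it needs; the other 6 are reused from the previous
# window position, as the shift registers in the design approach do.

def local_filter_3x3(image, kernel):
    """Apply a 3x3 kernel (correlation form); border pixels stay zero."""
    height, width = len(image), len(image[0])
    output = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        # The "shift register": three columns of the current 3x3 window.
        window = [[image[y - 1][x], image[y][x], image[y + 1][x]]
                  for x in (0, 1, 2)]
        for x in range(1, width - 1):
            acc = 0
            for kx in range(3):
                for ky in range(3):
                    acc += window[kx][ky] * kernel[ky][kx]
            output[y][x] = acc
            if x + 2 < width:
                # Shift right: drop the leftmost column, load one new one.
                window = window[1:] + [[image[y - 1][x + 2],
                                        image[y][x + 2],
                                        image[y + 1][x + 2]]]
    return output
```

    In the hardware version, this loop nest runs in parallel: the SIMD instruction computes several output pixels at once, and the pipeline overlaps the multiply-accumulate stages of consecutive windows.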

    An illumination of the template enigma: software code generation with templates

    Get PDF
    Creating software is a process of refining a concept into an implementation. This process consists of several stages, represented by documents, models and plans at several levels of abstraction. Mostly, the refinement process requires the creativity of the programmers, but sometimes the work is boring and repetitive. Such repetitive work is an indication that the program is not written at the most suitable level of abstraction: the level offered by the programming language in use may be too low to factor out the recurring code. Code generators can be used to raise the level of abstraction of program specifications and to automate the repetitive work.

    This thesis focuses on code generators based on templates, one of the techniques for implementing a code generator. Templates allow extension of the syntax of a programming language, enabling generative programming without modifying the underlying compiler. Four artifacts are involved in a template-based generator: templates, input data, a template evaluator and output code. The templates we consider are a concrete (incomplete) representation of the output document, i.e. the object code, containing holes, i.e. the meta code. These holes are filled by the template evaluator, using information from the input data, to obtain the output code. Templates are widely used to generate HTML in web applications, but they can generate any kind of text, such as e-mails or (source) code. In this thesis we limit the scope to the generation of source code.

    The central research question is how the quality of template-based code generators can be improved. Quality in general is a broad notion; our scope is limited to the technical quality of templates and generated code, focusing on the maintainability of template-based code generators and the correctness of the generated code. This is facilitated by the three main contributions of this thesis. First, the maintainability of template-based code generators is increased by a requirement on the metalanguage: it should not be rich enough to allow general-purpose programming in templates, yet it must not be so restrictive that useful code generators can no longer be expressed. We use the theory of formal languages to specify this metalanguage. Second, we ensure correctness of the templates and the generated code. Third, the presented theory and techniques are validated by case studies, which show the application of templates in real-world applications, increased maintainability, and syntactical correctness of the generated code.

    Since we only consider the generation of programming languages, it suffices to support the generation of languages defined by context-free grammars. From this assumption we derive a metalanguage that is rich enough to specify code generators able to instantiate all possible sentences of a context-free language. A specific code generator, the unparser, is a program that can instantiate all sentences of a context-free language. We proved that an unparser can be implemented using a linear deterministic top-down tree-to-string transducer; we call this property unparser-completeness. Our metalanguage is based on such a transducer.
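    To make the unparser notion concrete, here is a minimal Python sketch. The grammar, node labels and rule table are invented for illustration; the thesis formalizes unparsers as linear deterministic top-down tree-to-string transducers, not as Python dictionaries.

```python
# A toy unparser: a deterministic top-down tree-to-string mapping for a
# tiny invented expression grammar.

# One rule per tree label: a sequence of literal strings and child indices.
RULES = {
    "add": ["(", 0, " + ", 1, ")"],   # add(e1, e2) -> (e1 + e2)
    "mul": ["(", 0, " * ", 1, ")"],   # mul(e1, e2) -> (e1 * e2)
    "num": [0],                       # num(leaf)   -> leaf text
}

def unparse(tree):
    """Map a tree (label, child, ...) to a sentence of the object language."""
    if isinstance(tree, str):         # leaf: already text
        return tree
    label, *children = tree
    # Linear and deterministic: each child is consumed exactly once, and
    # the output is fixed by the tree alone.
    return "".join(unparse(children[part]) if isinstance(part, int) else part
                   for part in RULES[label])

print(unparse(("mul", ("num", "2"), ("add", ("num", "3"), ("num", "4")))))
# -> (2 * (3 + 4))
```

    With rules for every production of a context-free grammar, this scheme can emit any sentence of the language, which is the intuition behind unparser-completeness.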
    Recall that the goal of specifying requirements for the metalanguage is to increase the maintainability of template-based code generators without being too restrictive. To validate that our metalanguage is not too restrictive and leads to more maintainable templates, we compared it with four off-the-shelf text template systems by implementing an unparser in each. We observed that the industrial template evaluators provide a Turing-complete metalanguage but lack a block-scoping mechanism for meta-variables, which results in undesired boilerplate meta code in their templates.

    The second contribution is guaranteeing the correctness of the generated code, which divides into two concerns: syntactical correctness and semantical correctness. We start with syntactical correctness. The use of text templates implies that syntax errors in the generated code can only be detected at compilation time, and such errors are reported at the level of the generated code: the developer must manually trace them back to their origin in the template or the input data. To detect errors as early as possible, we believe that programs manipulating source code should not treat the object code as plain text. We present an approach in which the grammars of the object language and the metalanguage are combined in a modular way, so that both languages can be parsed simultaneously and syntax errors in either language of the template are found during parsing.

    Parsing a template is, however, not sufficient to ensure that the generated code is free of syntax errors; the template evaluator must also guarantee that its output is syntactically correct. In short, the mechanism works as follows. A parse tree is constructed while parsing the template; this tree contains subtrees for the object code and subtrees for the meta code. During evaluation, meta code subtrees are substituted by object code subtrees. The template evaluator checks whether the root nonterminal of the object code subtree equals the root nonterminal of the meta code subtree: if they are equal, the substitution is allowed; if they are distinct, an accurate error message is generated. The evaluator terminates when all meta code subtrees have been substituted, and the result is a parse tree of the object language, hence syntactically correct. We call this process syntax-safe code generation.

    To validate that the presented techniques increase maintainability and ensure syntactical correctness, we implemented our ideas in a syntax-safe template evaluator called Repleo, which has been applied in four case studies. The first case is a real-world situation requiring the generation of a three-tier web application from a data model; it showed that multiple layers of an application, defined in different programming languages, can be generated from a single model. The second and third cases show that our metalanguage results in a more maintainable code generator: it forces a two-layer code generator with a separation of concerns between the layers, where the original implementations are less modular. The last case study shows that ensuring syntactical correctness prevents cross-site scripting attacks in the dynamic generation of web pages.
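    The nonterminal check at the core of syntax-safe code generation can be sketched as follows. The Node and Hole classes are invented for illustration; Repleo operates on real parse trees produced by parsing the combined object/meta grammar.

```python
# Sketch of the nonterminal check behind syntax-safe code generation.

class Node:
    """Object-code parse tree node labelled with its root nonterminal."""
    def __init__(self, nonterminal, children=()):
        self.nonterminal = nonterminal
        self.children = list(children)

class Hole:
    """Meta-code placeholder that expects a subtree for one nonterminal."""
    def __init__(self, nonterminal, name):
        self.nonterminal = nonterminal
        self.name = name

def substitute(tree, bindings):
    """Fill every hole, refusing substitutions that would break syntax."""
    if isinstance(tree, Hole):
        subtree = bindings[tree.name]
        if subtree.nonterminal != tree.nonterminal:
            raise SyntaxError(f"hole {tree.name!r} expects "
                              f"<{tree.nonterminal}>, got <{subtree.nonterminal}>")
        return subtree
    tree.children = [substitute(c, bindings)
                     if isinstance(c, (Node, Hole)) else c
                     for c in tree.children]
    return tree

# A statement template "x = <Expr>;": binding an Expr subtree is accepted;
# binding, say, a Stmt subtree would raise an accurate SyntaxError instead.
stmt = Node("Stmt", ["x = ", Hole("Expr", "rhs"), ";"])
substitute(stmt, {"rhs": Node("Expr", ["1 + 2"])})
```

    Because every substitution preserves the root nonterminal, the final tree is a parse tree of the object language by construction, which is exactly the syntax-safety argument above.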
    Recall that one of our goals is ensuring the correctness of the generated code. We also showed that it is possible to check static semantic properties of templates. Static semantic checks are defined for the metalanguage, for the object language, and for situations where the object language depends on the metalanguage. We implemented a prototype of a static semantic checker for PicoJava templates using attribute grammars; the use of attribute grammars allows reuse of the original PicoJava checker. Summarizing, in this thesis we have formulated the requirements for a metalanguage and discussed how to implement a syntax-safe template evaluator. This results in more maintainable template-based code generators and more reliable generated code.
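    As a very loose illustration of the attribute-grammar idea, the toy checker below threads one attribute, the set of declared names, through a tree and flags uses of undeclared names. The tree shape and the check are invented; the actual prototype reuses the PicoJava attribute grammar and is far richer.

```python
# Toy attribute-grammar-style check: an attribute (the set of declared
# names) flows through the tree; uses of undeclared names are reported.

def check(node, declared=frozenset()):
    """Return (errors, declared) for a tiny 'decl'/'use'/'block' tree."""
    errors = []
    kind = node[0]
    if kind == "decl":                       # ("decl", name)
        declared = declared | {node[1]}
    elif kind == "use" and node[1] not in declared:
        errors.append(f"use of undeclared name {node[1]!r}")
    elif kind == "block":                    # ("block", child, child, ...)
        for child in node[1:]:
            child_errors, declared = check(child, declared)
            errors += child_errors
    return errors, declared

errs, _ = check(("block", ("decl", "x"), ("use", "x"), ("use", "y")))
print(errs)  # ["use of undeclared name 'y'"]
```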

    Synthesis of a parallel data stream processor from data flow process networks

    Get PDF
    In this talk, we address the problem of synthesizing Process Network specifications to FPGA execution platforms. The process networks we consider are special cases of Kahn Process Networks. We call them COMPAAN Data Flow Process Networks (CDFPNs) because they are produced by the COMPAAN compiler, a translator that automatically converts affine nested loop programs into input-output equivalent (COMPAAN) process network specifications. The objective is to provide an effective and efficient implementation of CDFPNs on an FPGA execution platform, where our implementation is close to a one-to-one mapping of the originating CDFPN. The execution platform emerges as part of the mapping process, resulting in a dedicated multi-processor execution platform for a given CDFPN specification.
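    To give a feel for the model of computation, here is a minimal Python sketch of a Kahn-style process network: independent processes that communicate only through FIFO channels with blocking reads. The three processes and their channels are invented for illustration; COMPAAN derives the actual network from an affine nested loop program, and the FPGA implementation maps each process to a dedicated processor connected by hardware FIFOs.

```python
# Illustrative Kahn-style process network, not COMPAAN output: three
# processes that communicate only through FIFO channels.

import queue
import threading

def producer(out_ch, n):
    for i in range(n):
        out_ch.put(i)              # write one token per loop iteration
    out_ch.put(None)               # end-of-stream marker

def scale(in_ch, out_ch, factor):
    while (token := in_ch.get()) is not None:   # blocking FIFO read
        out_ch.put(token * factor)
    out_ch.put(None)

def consumer(in_ch, results):
    while (token := in_ch.get()) is not None:
        results.append(token)

a, b = queue.Queue(), queue.Queue()              # the FIFO channels
results = []
threads = [threading.Thread(target=producer, args=(a, 5)),
           threading.Thread(target=scale, args=(a, b, 10)),
           threading.Thread(target=consumer, args=(b, results))]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)                                   # [0, 10, 20, 30, 40]
```

    Because each process blocks only on reads from its input channels, the network's result is independent of scheduling, which is what makes a near one-to-one mapping onto a dedicated multi-processor platform attractive.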

    System-Level Abstraction Semantics

    No full text
    Raising the level of abstraction is widely seen as the solution for closing the productivity gap in system design. The key to the success of this approach, however, is well-defined abstraction levels and models. In this paper, we present such system-level semantics to cover the system design process. We define the properties and features of each model. Formalizing the flow enables design automation for synthesis and verification, achieving the required productivity gains. Through customization, the semantics allow the creation of specific design methodologies. We applied the concepts to the system languages SystemC and SpecC. Using the example of a JPEG encoder, we demonstrate the feasibility and effectiveness of the approach.
