18 research outputs found

    Instruction scheduling in micronet-based asynchronous ILP processors

    Get PDF

    On the Distribution of Control in Asynchronous Processor Architectures

    Get PDF
    Institute for Computing Systems ArchitectureThe effective performance of computer systems is to a large measure determined by the synergy between the processor architecture, the instruction set and the compiler. In the past, the sequencing of information within processor architectures has normally been synchronous: controlled centrally by a clock. However, this global signal could possibly limit the future gains in performance that can potentially be achieved through improvements in implementation technology. This thesis investigates the effects of relaxing this strict synchrony by distributing control within processor architectures through the use of a novel asynchronous design model known as a micronet. The impact of asynchronous control on the performance of a RISC-style processor is explored at different levels. Firstly, improvements in the performance of individual instructions by exploiting actual run-time behaviours are demonstrated. Secondly, it is shown that micronets are able to exploit further (both spatial and temporal) instructionlevel parallelism (ILP) efficiently through the distribution of control to datapath resources. Finally, exposing fine-grain concurrency within a datapath can only be of benefit to a computer system if it can easily be exploited by the compiler. Although compilers for micronet-based asynchronous processors may be considered to be more complex than their synchronous counterparts, it is shown that the variable execution time of an instruction does not adversely affect the compiler's ability to schedule code efficiently. In conclusion, the modelling of a processor's datapath as a micronet permits the exploitation of both finegrain ILP and actual run-time delays, thus leading to the efficient utilisation of functional units and in turn resulting in an improvement in overall system performance

    An integrated soft- and hard-programmable multithreaded architecture

    Get PDF

    A Network-based Asynchronous Architecture for Cryptographic Devices

    Get PDF
    Institute for Computing Systems ArchitectureThe traditional model of cryptography examines the security of the cipher as a mathematical function. However, ciphers that are secure when specified as mathematical functions are not necessarily secure in real-world implementations. The physical implementations of ciphers can be extremely difficult to control and often leak socalled side-channel information. Side-channel cryptanalysis attacks have shown to be especially effective as a practical means for attacking implementations of cryptographic algorithms on simple hardware platforms, such as smart-cards. Adversaries can obtain sensitive information from side-channels, such as the timing of operations, power consumption and electromagnetic emissions. Some of the attack techniques require surprisingly little side-channel information to break some of the best known ciphers. In constrained devices, such as smart-cards, straightforward implementations of cryptographic algorithms can be broken with minimal work. Preventing these attacks has become an active and a challenging area of research. Power analysis is a successful cryptanalytic technique that extracts secret information from cryptographic devices by analysing the power consumed during their operation. A particularly dangerous class of power analysis, differential power analysis (DPA), relies on the correlation of power consumption measurements. It has been proposed that adding non-determinism to the execution of the cryptographic device would reduce the danger of these attacks. It has also been demonstrated that asynchronous logic has advantages for security-sensitive applications. This thesis investigates the security and performance advantages of using a network-based asynchronous architecture, in which the functional units of the datapath form a network. Non-deterministic execution is achieved by exploiting concurrent execution of instructions both with and without data-dependencies; and by forwarding register values between instructions with data-dependencies using randomised routing over the network. The executions of cryptographic algorithms on different architectural configurations are simulated, and the obtained power traces are subjected to DPA attacks. The results show that the proposed architecture introduces a level of non-determinism in the execution that significantly raises the threshold for DPA attacks to succeed. In addition, the performance analysis shows that the improved security does not degrade performance

    Distributed computation in computer networks

    Get PDF
    None provide

    Static dependency analysis of recursive structures for parallelisation

    Get PDF

    Design of an asynchronous processor

    Get PDF

    Ordonnancement des instructions pour un processeur ARM endochrone

    Get PDF
    Les processeurs endochrones, par définition, utilisent des mécanismes locaux de synchronisation leur permettant de s’affranchir du maintien d’un signal d’horloge globale. Cette spécificité les rend moins énergivores comparativement aux processeurs synchrones. Toutefois, les processeurs endochrones sont moins populaires en raison du manque d’outils de design et de vérification ainsi que l’évolution rapide des processeurs synchrones en terme de performance. Ce mémoire s’inscrit dans le cadre du projet AnARM visant à développer un processeur à usage général ARM basé sur une architecture endochrone. Ce mémoire vise plus particuliè-rement l’exploration des méthodes d’ordonnancement des instructions pour développer une stratégie d’ordonnancement, basée sur les caractéristiques architecturales de l’AnARM, dans le but d’en améliorer les performances. L’ordonnancement des instructions est une optimisation du compilateur ayant un grand impact sur la qualité du code généré. Cette optimisation consiste à résoudre un problème NP-complet en tenant compte des contraintes imposées par l’architecture du processeur cible. Tandis que l’ordonnancement des instructions pour les architectures synchrone bénéficie d’une large couverture littéraire, l’ordonnancement pour les architectures asynchrones a été moins abordé, en raison des nouvelles contraintes imposées par les mécanismes de synchronisation utilisées par ces architectures. Ce mémoire présente l’élaboration, l’implémentation et l’évaluation d’une stratégie d’ordonnancement pour le processeur endochrone AnARM. La méthode d’ordonnancement présentée dans ce mémoire utilise un modèle d’ordonnancent dynamique basé sur le comportement spatio-temporel de l’AnARM. Cette méthode a été implémentée au sein d’un compilateur commercial moderne et évaluée comparativement à des méthodes d’ordonnancement usuelles. La méthode d’ordonnancement présentée dans ce mémoire engendre des améliorations de la performance allant de 6,22% à 17,48%, tout en préservant l’avantage énergétique de l’architecture endochrone à l’étude
    corecore