On the Distribution of Control in Asynchronous Processor Architectures
Institute for Computing Systems Architecture
The effective performance of computer systems is to a large measure
determined by the synergy between the processor architecture, the
instruction set and the compiler. In the past, the sequencing of
information within processor architectures has normally been
synchronous: controlled centrally by a clock. However, this global
signal may limit the future performance gains that improvements in
implementation technology could otherwise deliver.
This thesis investigates the effects of relaxing this strict synchrony
by distributing control within processor architectures through the use
of a novel asynchronous design model known as a micronet. The impact
of asynchronous control on the performance of a RISC-style processor
is explored at different levels. Firstly, improvements in the
performance of individual instructions by exploiting actual run-time
behaviours are demonstrated. Secondly, it is shown that micronets are
able to exploit further (both spatial and temporal) instruction-level
parallelism (ILP) efficiently through the distribution of control to
datapath resources. Finally, exposing fine-grain concurrency within a
datapath can only be of benefit to a computer system if it can easily
be exploited by the compiler. Although compilers for micronet-based
asynchronous processors may be considered to be more complex than
their synchronous counterparts, it is shown that the variable
execution time of an instruction does not adversely affect the
compiler's ability to schedule code efficiently. In conclusion, the
modelling of a processor's datapath as a micronet permits the
exploitation of both fine-grain ILP and actual run-time delays, thus
leading to the efficient utilisation of functional units and in turn
resulting in an improvement in overall system performance.
A Network-based Asynchronous Architecture for Cryptographic Devices
Institute for Computing Systems Architecture
The traditional model of cryptography examines the security of the cipher as a
mathematical function. However, ciphers that are secure when specified as mathematical
functions are not necessarily secure in real-world implementations. The physical
implementations of ciphers can be extremely difficult to control and often leak so-called
side-channel information. Side-channel cryptanalysis attacks have been shown to
be especially effective as a practical means for attacking implementations of cryptographic
algorithms on simple hardware platforms, such as smart-cards. Adversaries
can obtain sensitive information from side-channels, such as the timing of operations,
power consumption and electromagnetic emissions. Some of the attack techniques
require surprisingly little side-channel information to break some of the best known
ciphers. In constrained devices, such as smart-cards, straightforward implementations
of cryptographic algorithms can be broken with minimal work. Preventing these attacks
has become an active and challenging area of research.
Power analysis is a successful cryptanalytic technique that extracts secret information
from cryptographic devices by analysing the power consumed during their operation.
A particularly dangerous class of power analysis, differential power analysis
(DPA), relies on the correlation of power consumption measurements. It has been proposed
that adding non-determinism to the execution of the cryptographic device would
reduce the danger of these attacks. It has also been demonstrated that asynchronous
logic has advantages for security-sensitive applications. This thesis investigates the
security and performance advantages of using a network-based asynchronous architecture,
in which the functional units of the datapath form a network. Non-deterministic
execution is achieved by exploiting concurrent execution of instructions both with and
without data-dependencies; and by forwarding register values between instructions
with data-dependencies using randomised routing over the network. The executions of
cryptographic algorithms on different architectural configurations are simulated, and
the obtained power traces are subjected to DPA attacks. The results show that the
proposed architecture introduces a level of non-determinism in the execution that significantly
raises the threshold for DPA attacks to succeed. In addition, the performance
analysis shows that the improved security does not degrade performance.
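The reliance of power analysis on correlation can be illustrated with a minimal sketch. It uses a correlation-based attack with a Hamming-weight leakage model on synthetic traces; the leakage model, the single XOR operation, the Gaussian noise and all function names are assumptions made for illustration and are not the simulation setup of the thesis.

```python
import random


def hamming_weight(x: int) -> int:
    """Number of set bits in x."""
    return bin(x).count("1")


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def simulate_traces(key, n, noise_sigma, rng):
    """Hypothetical device: one power sample per input, equal to the
    Hamming weight of (input XOR key) plus Gaussian noise."""
    inputs = [rng.randrange(256) for _ in range(n)]
    power = [hamming_weight(p ^ key) + rng.gauss(0.0, noise_sigma)
             for p in inputs]
    return inputs, power


def recover_key(inputs, power):
    """Return the key guess whose predicted leakage correlates best
    (signed correlation) with the measured power samples."""
    scores = []
    for guess in range(256):
        model = [hamming_weight(p ^ guess) for p in inputs]
        scores.append(pearson(model, power))
    return max(range(256), key=lambda g: scores[g])


rng = random.Random(0)
inputs, power = simulate_traces(key=0x3C, n=1000, noise_sigma=1.0, rng=rng)
recovered = recover_key(inputs, power)
```

Raising `noise_sigma` lowers the correlation for the correct guess, so more traces are needed. That is only a crude stand-in for non-determinism: the concurrent execution and randomised routing described above are not modelled here.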
Instruction Scheduling for a Self-Timed ARM Processor (Ordonnancement des instructions pour un processeur ARM endochrone)
Self-timed processors, by definition, use local synchronisation mechanisms that free them from having to maintain a global clock signal. This property makes them less power-hungry than synchronous processors. However, self-timed processors are less popular, owing to the lack of design and verification tools and to the rapid performance gains of synchronous processors.
This thesis is part of the AnARM project, which aims to develop a general-purpose ARM processor based on a self-timed architecture. More specifically, it explores instruction scheduling methods in order to develop a scheduling strategy, based on the architectural characteristics of the AnARM, that improves its performance.
Instruction scheduling is a compiler optimisation with a large impact on the quality of the generated code. It consists of solving an NP-complete problem while taking into account the constraints imposed by the architecture of the target processor. While instruction scheduling for synchronous architectures is widely covered in the literature, scheduling for asynchronous architectures has received less attention, because of the new constraints imposed by the synchronisation mechanisms these architectures use.
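Because the scheduling problem is NP-complete, compilers usually rely on heuristics. The sketch below shows greedy list scheduling with a critical-path priority on a single-issue machine. The fixed latencies, the instruction names and the single-issue assumption are hypothetical; this is a conventional baseline technique, not the dynamic spatio-temporal model developed for the AnARM.

```python
def critical_path(deps, latency):
    """Longest latency-weighted path from each instruction to the end
    of the dependency graph. deps maps an instruction to its predecessors."""
    succs = {i: [] for i in deps}
    for i, preds in deps.items():
        for p in preds:
            succs[p].append(i)
    memo = {}

    def length(i):
        if i not in memo:
            memo[i] = latency[i] + max((length(s) for s in succs[i]),
                                       default=0)
        return memo[i]

    return {i: length(i) for i in deps}


def list_schedule(deps, latency):
    """Issue at most one ready instruction per cycle, preferring the one
    with the longest remaining critical path (ties broken by name)."""
    prio = critical_path(deps, latency)
    finish = {}            # instruction -> cycle at which its result is ready
    order = []             # (issue cycle, instruction)
    remaining = set(deps)
    cycle = 0
    while remaining:
        ready = [i for i in remaining
                 if all(p in finish and finish[p] <= cycle for p in deps[i])]
        if ready:
            pick = min(ready, key=lambda i: (-prio[i], i))
            order.append((cycle, pick))
            finish[pick] = cycle + latency[pick]
            remaining.remove(pick)
        cycle += 1         # an empty ready list means a stall cycle
    return order


# Hypothetical basic block: mul needs both loads, add needs mul,
# inc is independent and can fill a slot while load_b completes.
deps = {"load_a": [], "load_b": [], "mul": ["load_a", "load_b"],
        "add": ["mul"], "inc": []}
latency = {"load_a": 2, "load_b": 2, "mul": 3, "add": 1, "inc": 1}
schedule = list_schedule(deps, latency)
```

In this example the independent `inc` is issued in the cycle where `mul` would otherwise wait for `load_b`. With data-dependent instruction delays, as in an asynchronous processor, the `latency` table would no longer be fixed, which is one reason a different scheduling model is needed.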
This thesis presents the design, implementation and evaluation of a scheduling strategy for the self-timed AnARM processor. The method uses a dynamic scheduling model based on the spatio-temporal behaviour of the AnARM. It was implemented in a modern commercial compiler and evaluated against conventional scheduling methods. It yields performance improvements ranging from 6.22% to 17.48%, while preserving the energy advantage of the self-timed architecture under study.