15 research outputs found
Parallelization of SAT on Reconfigurable Hardware
Though very difficult to solve, the Boolean satisfiability problem (SAT) is widely used to model real-world applications. Over the past two decades, researchers have developed a range of tools for finding solutions to this NP-complete problem. These tools fall broadly into two groups: software solvers and reconfigurable hardware solvers. The main algorithms have also been complemented with heuristics of varying sophistication to mitigate the hardness of the problem. The end goal of these tools has been to solve industrial-scale problems, which software solvers have more or less achieved. Initially, reconfigurable hardware was a promising avenue for accelerating SAT solving beyond traditional software-based solutions. However, software solvers grew so sophisticated that they overtook their hardware counterparts, which remained limited to smaller problem instances. Even so, modern state-of-the-art software solvers still fail unpredictably on some instances.
The main focus of this thesis is to explore solving SAT on reconfigurable hardware, in order to understand which ingredients are essential to (and which can be discarded from) an efficient hardware SAT solver that draws its processing power from the raw parallelism of an FPGA platform. The parallel prototype solver implemented in this work is comparable in execution speed to other hardware and software solvers, even though it implements no heuristics or other supporting techniques. We thus show that our approach is a promising avenue for solving large industrial SAT instances that may be difficult for software solvers to handle.
Multilevel techniques and learning automata for the Maximum Satisfiability (MAXSAT) problem
The Maximum Satisfiability (MAXSAT) problem is an optimization problem in propositional
logic that is of great importance in both theory and practice. In
recent years MAXSAT has attracted great interest in industry. Industrial problems
that can be encoded as MAXSAT include circuit design and debugging,
hardware verification, bioinformatics and scheduling. Such instances tend
to be large, and their search space grows exponentially with problem size, so algorithms for
solving them incorporate a range of techniques and methods to search
in a smart and efficient manner.
In this thesis we introduce a range of algorithms that extend the well-known Stochastic
Local Search (SLS) algorithm called WalkSAT. WalkSAT is extended with the multilevel
paradigm and Learning Automata. The multilevel paradigm is a technique that splits large
and difficult problems into smaller problems. These smaller problems are expected to be less complex
and therefore easier to solve. A Learning Automaton, a concept from machine learning,
can be seen as a decision-making entity operating in an unknown environment.
Through feedback from the environment, it tries to learn the optimal
actions.
The core of this thesis is the set of observations and findings on how these dissimilar techniques affect
the performance and behaviour of WalkSAT when solving industrial MAXSAT problem
instances. Extensive experiments confirm that combining multilevel
techniques and Learning Automata with WalkSAT, separately and together, gives promising
results. We compare these composite algorithms with WalkSAT on selected industrial
MAXSAT problems throughout the thesis, and show that all of them
perform better than WalkSAT.
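For readers unfamiliar with the baseline, the following is a minimal sketch of a WalkSAT-style local search for MAXSAT. It is not the thesis's implementation: the function names, the `noise` parameter, the clause encoding (lists of signed integers) and the greedy rule (minimising the total number of unsatisfied clauses rather than the classic break count) are all assumptions made for illustration, and the multilevel and Learning Automata extensions are not shown.

```python
import random


def count_unsat(clauses, assign):
    """Number of clauses with no true literal under `assign`."""
    return sum(
        not any(assign[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def walksat_maxsat(clauses, n_vars, max_flips=10_000, noise=0.5, seed=0):
    """Simplified WalkSAT: returns (best assignment, its unsatisfied count)."""
    rng = random.Random(seed)
    assign = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}
    best, best_cost = dict(assign), count_unsat(clauses, assign)

    def cost_after_flip(var):
        # Temporarily flip `var`, measure the cost, then undo the flip.
        assign[var] = not assign[var]
        cost = count_unsat(clauses, assign)
        assign[var] = not assign[var]
        return cost

    for _ in range(max_flips):
        unsat = [
            c for c in clauses
            if not any(assign[abs(lit)] == (lit > 0) for lit in c)
        ]
        if not unsat:
            return dict(assign), 0  # everything satisfied
        clause = rng.choice(unsat)
        if rng.random() < noise:
            var = abs(rng.choice(clause))  # random-walk move
        else:
            var = min((abs(lit) for lit in clause), key=cost_after_flip)  # greedy move
        assign[var] = not assign[var]
        cost = count_unsat(clauses, assign)
        if cost < best_cost:
            best, best_cost = dict(assign), cost
    return best, best_cost
```

As a usage example, the four clauses `[[1, 2], [-1, 2], [1, -2], [-1, -2]]` over two variables cannot all be satisfied; every assignment leaves exactly one clause unsatisfied, so the search reports a cost of 1.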
A Reconfigurable Computing Solution to the Parameterized Vertex Cover Problem
The field of computational intractability has seen active research over the past two decades. This thesis explores parallel implementations of FPT (fixed-parameter tractable) algorithms on an RC (reconfigurable computing) platform.
Reconfigurable hardware implementations of algorithms for solving NP-complete problems have attracted great research interest in the past few years. However, most of this research targets exact algorithms. Although such implementations have produced good results, the input sizes were small. Moreover, most of these implementations are instance-specific, so a different circuit must be generated for every new problem instance.
In this work, we present an efficient and scalable algorithm that moves away from the conventional instance-specific approach towards a more general parameterized approach to such problems. We present approaches based on the theory of fixed-parameter tractability. The prototype problem used as a case study is the classic vertex cover problem. The hardware implementation has demonstrated speedups on the order of 100x over the software version of the vertex cover algorithm.
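To illustrate what fixed-parameter tractability means for vertex cover, here is a sketch of the textbook bounded search tree algorithm. The abstract does not say which FPT algorithm the thesis implements in hardware, so this is an assumption chosen for illustration; the function name and the edge-list representation are hypothetical. The idea is that for any edge (u, v), at least one endpoint must be in the cover, so the search branches on the two choices and decrements the budget k, giving a tree of at most 2^k leaves.

```python
def vertex_cover(edges, k):
    """Return a vertex cover with at most k vertices as a set, or None if none exists.

    `edges` is a list of (u, v) pairs. Running time is O(2^k * m) for m edges.
    """
    if not edges:
        return set()  # nothing left to cover
    if k == 0:
        return None  # edges remain but the budget is spent
    u, v = edges[0]
    for pick in (u, v):
        # Choosing `pick` covers every edge that touches it.
        remaining = [e for e in edges if pick not in e]
        sub_cover = vertex_cover(remaining, k - 1)
        if sub_cover is not None:
            return sub_cover | {pick}
    return None
```

The exponential part of the running time depends only on the parameter k and not on the graph size, which is what makes the problem tractable for small k. The two branches at each node are independent of each other, which is one reason such search trees are natural candidates for parallel exploration.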
Survey of FPGA applications in the period 2000 – 2015 (Technical Report)
Romoth J, Porrmann M, Rückert U. Survey of FPGA applications in the period 2000 – 2015 (Technical Report). 2017. Since their introduction, FPGAs have appeared in more and more fields of application. Their key advantage is the combination of software-like flexibility with performance otherwise typical of dedicated hardware. Nevertheless, every application field places its own requirements on the computational architecture used. This paper provides an overview of the topics FPGAs have been used for in the last 15 years of research and of why they were chosen over other processing units such as CPUs.
Arquitecturas reconfiguráveis para problemas de optimização combinatória (Reconfigurable architectures for combinatorial optimization problems)
Combinatorial problems have an extremely wide range of practical applications
in a variety of engineering areas, including the testing of electronic circuits,
pattern recognition, logic synthesis, etc. Many of the problems of interest
belong to the classes NP-hard and NP-complete, which implies that the
relevant algorithms have an exponential worst-case complexity. This fact
precludes the solution of many practical problems with conventional
computers. ASIC-based implementations are also not viable, in particular
because of the inherent heterogeneity of combinatorial problems.
Reconfigurable devices offer an alternative: they can be customized
to the requirements of a specific algorithm and reused for other algorithms
simply by reprogramming their internal structure. Implementations based
on reconfigurable hardware permit the execution of the relevant algorithms to
be optimized with the aid of such techniques as parallel processing,
personalized functional units, etc. Such implementations allow the effect of
exponential growth in the computation time to be delayed, thus enabling more
complex problem instances to be solved.
Recently, a few reconfigurable engines for combinatorial problems have been
developed. They are mainly based on the idea of instance-specific hardware,
which assumes that a particular circuit is generated for each problem instance.
In this thesis we explore two alternative approaches. The first, domain-specific,
approach enables a variety of problems in the area of combinatorial
computation to be addressed. For this purpose, a reconfigurable combinatorial
processor has been designed and implemented and a number of methods and
tools that support its partial dynamic reconfiguration have been developed. The
second, application-specific, approach is oriented towards solving individual
combinatorial problems. In particular, a novel architecture is proposed for
solving the Boolean satisfiability problem with the aid of software and
reconfigurable hardware. The adopted technique avoids instance-specific
hardware compilation and permits problems that exceed the available logic
resources to be solved. The possibility of implementing evolutionary strategies
for the traveling salesman problem in reconfigurable hardware is also explored.
This thesis extends the application domain of reconfigurable computing by
demonstrating that it is effective in accelerating algorithms with complex control
flows.
Hybrid FPGA: Architecture and Interface
Hybrid FPGAs (Field Programmable Gate Arrays) are composed of general-purpose logic resources
with different granularities, together with domain-specific coarse-grained units. This thesis proposes
a novel hybrid FPGA architecture with embedded coarse-grained Floating Point Units (FPUs) to
improve the floating point capability of FPGAs. Based on the proposed hybrid FPGA architecture,
we examine three aspects to optimise the speed and area for domain-specific applications.
First, we examine the interface between large coarse-grained embedded blocks (EBs) and fine-grained
elements in hybrid FPGAs. The interface includes parameters for varying: (1) aspect ratio of EBs,
(2) position of the EBs in the FPGA, (3) I/O pins arrangement of EBs, (4) interconnect flexibility of
EBs, and (5) location of additional embedded elements such as memory.
Second, we examine the interconnect structure for hybrid FPGAs. We investigate how large, high-density
EBs affect the routing demand for hybrid FPGAs over a set of domain-specific applications.
We then propose three routing optimisation methods to meet the additional routing demand introduced
by large EBs: (1) identifying the best separation distance between EBs, (2) adding routing switches on
EBs to increase routing flexibility, and (3) introducing wider channel width near the edge of EBs. We
study and compare the trade-offs in delay, area and routability of these three optimisation methods.
Finally, we employ common subgraph extraction to determine the number of floating point adders/subtractors,
multipliers and wordblocks in the FPUs. The wordblocks include registers and can implement fixed
point operations. We study the area, speed and utilisation trade-offs of the selected FPU subgraphs
in a set of floating point benchmark circuits. We develop an optimised coarse-grained FPU, taking
into account both architectural and system-level issues. Furthermore, we investigate the trade-offs
between granularities and performance by composing small FPUs into a large FPU.
The results of this thesis can help in designing a domain-specific hybrid FPGA that meets user requirements
by optimising for speed, area or a combination of the two.
Accelerating Reconfigurable Financial Computing
This thesis proposes novel approaches to the design, optimisation, and management of reconfigurable
computer accelerators for financial computing. There are three contributions. First, we propose novel
reconfigurable designs for derivative pricing using both Monte-Carlo and quadrature methods. Such
designs involve exploring techniques such as control variate optimisation for Monte-Carlo, and multi-dimensional
analysis for quadrature methods. Significant speedups and energy savings are achieved
using our Field-Programmable Gate Array (FPGA) designs compared with both Central Processing Unit (CPU)
and Graphics Processing Unit (GPU) designs. Second, we propose a framework for distributing computing
tasks on multi-accelerator heterogeneous clusters. In this framework, different computational
devices including FPGAs, GPUs and CPUs work collaboratively on the same financial problem based
on a dynamic scheduling policy. The trade-off in speed and in energy consumption of different accelerator
allocations is investigated. Third, we propose a mixed precision methodology for optimising
Monte-Carlo designs, and a reduced precision methodology for optimising quadrature designs. These
methodologies enable us to optimise throughput of reconfigurable designs by using datapaths with
minimised precision, while maintaining the same accuracy of the results as in the original designs.
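The control variate technique mentioned for the Monte-Carlo designs can be sketched on a toy problem. This is a generic textbook illustration, not the thesis's pricing design: the integrand exp(U) with U uniform on [0, 1], the function name and the sample size are all assumptions. The control variate is U itself, whose mean 0.5 is known exactly; subtracting a scaled copy of its sampling error removes the correlated part of the noise from the estimate.

```python
import math
import random


def control_variate_estimate(n=20_000, seed=0):
    """Estimate E[exp(U)], U ~ Uniform(0, 1), with and without a control variate.

    Returns (plain_mean, cv_mean, plain_variance, cv_variance).
    """
    rng = random.Random(seed)
    u = [rng.random() for _ in range(n)]
    y = [math.exp(x) for x in u]  # toy stand-in for a simulated payoff

    mean_u = sum(u) / n
    mean_y = sum(y) / n
    cov_uy = sum((a - mean_u) * (b - mean_y) for a, b in zip(u, y)) / (n - 1)
    var_u = sum((a - mean_u) ** 2 for a in u) / (n - 1)
    beta = cov_uy / var_u  # variance-minimising coefficient, estimated from the samples

    # Correct each sample using the known mean E[U] = 0.5.
    adjusted = [b - beta * (a - 0.5) for a, b in zip(u, y)]
    cv_mean = sum(adjusted) / n

    var_y = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    var_adj = sum((z - cv_mean) ** 2 for z in adjusted) / (n - 1)
    return mean_y, cv_mean, var_y, var_adj
```

Both estimators target the exact value e - 1, but the adjusted samples have a much smaller variance, so fewer simulation paths are needed for a given accuracy.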
Commande algorithmique d'un système mono-onduleur bimachine asynchrone destiné à la traction ferroviaire (Algorithmic control of a single-inverter dual induction machine system for railway traction)
The goal of this work is to model and characterise the behaviour of a single-inverter, dual-induction-motor system applied to railway traction (one bogie of a BB 36000 locomotive), with a view to designing its control. The first part of this thesis gives a detailed description of the overall system. The influence of internal perturbations (parameter variations in one motor) and external perturbations (pantograph detachment, loss of adhesion, stick-slip) is analysed for a classical field-oriented control applied to each motor of the drive (the classical traction structure). In the second part, a novel propulsion structure is proposed. It consists of a single two-level, pulse-width-modulated voltage source inverter supplying two induction motors connected in parallel, which in turn generate the traction force transmitted to the bogie axles. The locomotive body represents a common load for the two motors. Several cooperative control strategies (CS) are studied: the mean CS, the double mean CS, the alternating master-slave CS and the mean differential CS. An observer structure for the electrical modes, suited to these different controls, is then studied. This set of controls is validated using the SABER simulation software. The approach amounts to quasi-experimentation, since the system to be controlled is modelled in the MAST language and the entire discrete-time control is written in C within the SABER environment. The third part is dedicated to removing the mechanical sensor and adapting the previously proposed cooperative control strategies accordingly. The partial speed reconstruction methods are: the fundamental frequency relation, the mechanical Kalman filter, the variable structure observer and MRAS. Finally, the hardware configuration for the experimental realisation is described.