27 research outputs found

    Solving graph coloring and SAT problems using field programmable gate arrays.

    Get PDF
    Chu-Keung Chung.Thesis (M.Phil.)--Chinese University of Hong Kong, 1999.Includes bibliographical references (leaves 88-92).Abstracts in English and Chinese.Abstract --- p.iAcknowledgments --- p.iiiChapter 1 --- Introduction --- p.1Chapter 1.1 --- Motivation and Aims --- p.1Chapter 1.2 --- Contributions --- p.3Chapter 1.3 --- Structure of the Thesis --- p.4Chapter 2 --- Literature Review --- p.6Chapter 2.1 --- Introduction --- p.6Chapter 2.2 --- Complete Algorithms --- p.7Chapter 2.2.1 --- Parallel Checking --- p.7Chapter 2.2.2 --- Mom's --- p.8Chapter 2.2.3 --- Davis-Putnam --- p.9Chapter 2.2.4 --- Nonchronological Backtracking --- p.9Chapter 2.2.5 --- Iterative Logic Array (ILA) --- p.10Chapter 2.3 --- Incomplete Algorithms --- p.11Chapter 2.3.1 --- GENET --- p.11Chapter 2.3.2 --- GSAT --- p.12Chapter 2.4 --- Summary --- p.13Chapter 3 --- Algorithms --- p.14Chapter 3.1 --- Introduction --- p.14Chapter 3.2 --- Tree Search Techniques --- p.14Chapter 3.2.1 --- Depth First Search --- p.15Chapter 3.2.2 --- Forward Checking --- p.16Chapter 3.2.3 --- Davis-Putnam --- p.17Chapter 3.2.4 --- GRASP --- p.19Chapter 3.3 --- Incomplete Algorithms --- p.20Chapter 3.3.1 --- GENET --- p.20Chapter 3.3.2 --- GSAT Algorithm --- p.22Chapter 3.4 --- Summary --- p.23Chapter 4 --- Field Programmable Gate Arrays --- p.24Chapter 4.1 --- Introduction --- p.24Chapter 4.2 --- FPGA --- p.24Chapter 4.2.1 --- Xilinx 4000 series FPGAs --- p.26Chapter 4.2.2 --- Bitstream --- p.31Chapter 4.3 --- Giga Operations Reconfigurable Computing Platform --- p.32Chapter 4.4 --- Annapolis Wildforce PCI board --- p.33Chapter 4.5 --- Summary --- p.35Chapter 5 --- Implementation --- p.36Chapter 5.1 --- Parallel Graph Coloring Machine --- p.36Chapter 5.1.1 --- System Architecture --- p.38Chapter 5.1.2 --- Evaluator --- p.39Chapter 5.1.3 --- Finite State Machine (FSM) --- p.42Chapter 5.1.4 --- Memory --- p.43Chapter 5.1.5 --- Hardware Resources --- p.43Chapter 5.2 --- Serial Graph Coloring Machine --- p.44Chapter 5.2.1 --- System Architecture --- p.44Chapter 5.2.2 --- Input Memory --- p.46Chapter 5.2.3 --- Solution Store --- p.46Chapter 5.2.4 --- Constraint Memory --- p.47Chapter 5.2.5 --- Evaluator --- p.48Chapter 5.2.6 --- Input Mapper --- p.49Chapter 5.2.7 --- Output Memory --- p.49Chapter 5.2.8 --- Backtrack Checker --- p.50Chapter 5.2.9 --- Word Generator --- p.51Chapter 5.2.10 --- State Machine --- p.51Chapter 5.2.11 --- Hardware Resources --- p.54Chapter 5.3 --- Serial Boolean Satisfiability Solver --- p.56Chapter 5.3.1 --- System Architecture --- p.58Chapter 5.3.2 --- Solutions --- p.59Chapter 5.3.3 --- Solution Generator --- p.59Chapter 5.3.4 --- Evaluator --- p.60Chapter 5.3.5 --- AND/OR --- p.62Chapter 5.3.6 --- State Machine --- p.62Chapter 5.3.7 --- Hardware Resources --- p.64Chapter 5.4 --- GSAT Solver --- p.65Chapter 5.4.1 --- System Architecture --- p.65Chapter 5.4.2 --- Variable Memory --- p.65Chapter 5.4.3 --- Flip-Bit Vector --- p.66Chapter 5.4.4 --- Clause Evaluator --- p.67Chapter 5.4.5 --- Adder --- p.70Chapter 5.4.6 --- Random Bit Generator --- p.71Chapter 5.4.7 --- Comparator --- p.71Chapter 5.4.8 --- Sum Register --- p.71Chapter 5.5 --- Summary --- p.71Chapter 6 --- Results --- p.73Chapter 6.1 --- Introduction --- p.73Chapter 6.2 --- Parallel Graph Coloring Machine --- p.73Chapter 6.3 --- Serial Graph Coloring Machine --- p.74Chapter 6.4 --- Serial SAT Solver --- p.74Chapter 6.5 --- GSAT Solver --- p.75Chapter 6.6 --- Summary --- p.76Chapter 7 --- Conclusion --- p.77Chapter 7.1 --- Future Work --- p.78Chapter A --- Software Implementation of Graph Coloring in CHIP --- p.79Chapter B --- Density Improvements Using Xilinx RAM --- p.81Chapter C --- Bit stream Configuration --- p.83Bibliography --- p.88Publications --- p.9

    Decomposition and encoding of finite state machines for FPGA implementation

    Get PDF
    xii+187hlm.;24c

    Exploiting development to enhance the scalability of hardware evolution.

    Get PDF
    Evolutionary algorithms do not scale well to the large, complex circuit design problems typical of the real world. Although techniques based on traditional design decomposition have been proposed to enhance hardware evolution's scalability, they often rely on traditional domain knowledge that may not be appropriate for evolutionary search and might limit evolution's opportunity to innovate. It has been proposed that reliance on such knowledge can be avoided by introducing a model of biological development to the evolutionary algorithm, but this approach has not yet achieved its potential. Prior demonstrations of how development can enhance scalability used toy problems that are not indicative of evolving hardware. Prior attempts to apply development to hardware evolution have rarely been successful and have never explored its effect on scalability in detail. This thesis demonstrates that development can enhance scalability in hardware evolution, primarily through a statistical comparison of hardware evolution's performance with and without development using circuit design problems of various sizes. This is reinforced by proposing and demonstrating three key mechanisms that development uses to enhance scalability: the creation of modules, the reuse of modules, and the discovery of design abstractions. The thesis includes several minor contributions: hardware is evolved using a common reconfigurable architecture at a lower level of abstraction than reported elsewhere. It is shown that this can allow evolution to exploit the architecture more efficiently and perhaps search more effectively. Also the benefits of several features of developmental models are explored through the biases they impose on the evolutionary search. Features that are explored include the type of environmental context development uses and the constraints on symmetry and information transmission they impose, genetic operators that may improve the robustness of gene networks, and how development is mapped to hardware. Also performance is compared against contemporary developmental models

    Embedded System Design

    Get PDF
    A unique feature of this open access textbook is to provide a comprehensive introduction to the fundamental knowledge in embedded systems, with applications in cyber-physical systems and the Internet of things. It starts with an introduction to the field and a survey of specification models and languages for embedded and cyber-physical systems. It provides a brief overview of hardware devices used for such systems and presents the essentials of system software for embedded systems, including real-time operating systems. The author also discusses evaluation and validation techniques for embedded systems and provides an overview of techniques for mapping applications to execution platforms, including multi-core platforms. Embedded systems have to operate under tight constraints and, hence, the book also contains a selected set of optimization techniques, including software optimization techniques. The book closes with a brief survey on testing. This fourth edition has been updated and revised to reflect new trends and technologies, such as the importance of cyber-physical systems (CPS) and the Internet of things (IoT), the evolution of single-core processors to multi-core processors, and the increased importance of energy efficiency and thermal issues

    Quarterly technical progress report (Digital Computer Laboratory). January 1962

    Get PDF

    Embedded System Design

    Get PDF
    A unique feature of this open access textbook is to provide a comprehensive introduction to the fundamental knowledge in embedded systems, with applications in cyber-physical systems and the Internet of things. It starts with an introduction to the field and a survey of specification models and languages for embedded and cyber-physical systems. It provides a brief overview of hardware devices used for such systems and presents the essentials of system software for embedded systems, including real-time operating systems. The author also discusses evaluation and validation techniques for embedded systems and provides an overview of techniques for mapping applications to execution platforms, including multi-core platforms. Embedded systems have to operate under tight constraints and, hence, the book also contains a selected set of optimization techniques, including software optimization techniques. The book closes with a brief survey on testing. This fourth edition has been updated and revised to reflect new trends and technologies, such as the importance of cyber-physical systems (CPS) and the Internet of things (IoT), the evolution of single-core processors to multi-core processors, and the increased importance of energy efficiency and thermal issues

    Design and Implementation of Hardware Accelerators for Neural Processing Applications

    Full text link
    Primary motivation for this work was the need to implement hardware accelerators for a newly proposed ANN structure called Auto Resonance Network (ARN) for robotic motion planning. ARN is an approximating feed-forward hierarchical and explainable network. It can be used in various AI applications but the application base was small. Therefore, the objective of the research was twofold: to develop a new application using ARN and to implement a hardware accelerator for ARN. As per the suggestions given by the Doctoral Committee, an image recognition system using ARN has been implemented. An accuracy of around 94% was achieved with only 2 layers of ARN. The network also required a small training data set of about 500 images. Publicly available MNIST dataset was used for this experiment. All the coding was done in Python. Massive parallelism seen in ANNs presents several challenges to CPU design. For a given functionality, e.g., multiplication, several copies of serial modules can be realized within the same area as a parallel module. Advantage of using serial modules compared to parallel modules under area constraints has been discussed. One of the module often useful in ANNs is a multi-operand addition. One problem in its implementation is that the estimation of carry bits when the number of operands changes. A theorem to calculate exact number of carry bits required for a multi-operand addition has been presented in the thesis which alleviates this problem. The main advantage of the modular approach to multi-operand addition is the possibility of pipelined addition with low reconfiguration overhead. This results in overall increase in throughput for large number of additions, typically seen in several DNN configurations

    The Customizable Virtual FPGA: Generation, System Integration and Configuration of Application-Specific Heterogeneous FPGA Architectures

    Get PDF
    In den vergangenen drei Jahrzehnten wurde die Entwicklung von Field Programmable Gate Arrays (FPGAs) stark von Moore’s Gesetz, Prozesstechnologie (Skalierung) und kommerziellen MĂ€rkten beeinflusst. State-of-the-Art FPGAs bewegen sich einerseits dem Allzweck nĂ€her, aber andererseits, da FPGAs immer mehr traditionelle DomĂ€nen der Anwendungsspezifischen integrierten Schaltungen (ASICs) ersetzt haben, steigen die Effizienzerwartungen. Mit dem Ende der Dennard-Skalierung können Effizienzsteigerungen nicht mehr auf Technologie-Skalierung allein zurĂŒckgreifen. Diese Facetten und Trends in Richtung rekonfigurierbarer System-on-Chips (SoCs) und neuen Low-Power-Anwendungen wie Cyber Physical Systems und Internet of Things erfordern eine bessere Anpassung der Ziel-FPGAs. Neben den Trends fĂŒr den Mainstream-Einsatz von FPGAs in Produkten des tĂ€glichen Bedarfs und Services wird es vor allem bei den jĂŒngsten Entwicklungen, FPGAs in Rechenzentren und Cloud-Services einzusetzen, notwendig sein, eine sofortige PortabilitĂ€t von Applikationen ĂŒber aktuelle und zukĂŒnftige FPGA-GerĂ€te hinweg zu gewĂ€hrleisten. In diesem Zusammenhang kann die Hardware-Virtualisierung ein nahtloses Mittel fĂŒr PlattformunabhĂ€ngigkeit und PortabilitĂ€t sein. Ehrlich gesagt stehen die Zwecke der Anpassung und der Virtualisierung eigentlich in einem Konfliktfeld, da die Anpassung fĂŒr die Effizienzsteigerung vorgesehen ist, wĂ€hrend jedoch die Virtualisierung zusĂ€tzlichen FlĂ€chenaufwand hinzufĂŒgt. Die Virtualisierung profitiert aber nicht nur von der Anpassung, sondern fĂŒgt auch mehr FlexibilitĂ€t hinzu, da die Architektur jederzeit verĂ€ndert werden kann. Diese Besonderheit kann fĂŒr adaptive Systeme ausgenutzt werden. Sowohl die Anpassung als auch die Virtualisierung von FPGA-Architekturen wurden in der Industrie bisher kaum adressiert. Trotz einiger existierenden akademischen Werke können diese Techniken noch als unerforscht betrachtet werden und sind aufstrebende Forschungsgebiete. Das Hauptziel dieser Arbeit ist die Generierung von FPGA-Architekturen, die auf eine effiziente Anpassung an die Applikation zugeschnitten sind. Im Gegensatz zum ĂŒblichen Ansatz mit kommerziellen FPGAs, bei denen die FPGA-Architektur als gegeben betrachtet wird und die Applikation auf die vorhandenen Ressourcen abgebildet wird, folgt diese Arbeit einem neuen Paradigma, in dem die Applikation oder Applikationsklasse fest steht und die Zielarchitektur auf die effiziente Anpassung an die Applikation zugeschnitten ist. Dies resultiert in angepassten anwendungsspezifischen FPGAs. Die drei SĂ€ulen dieser Arbeit sind die Aspekte der Virtualisierung, der Anpassung und des Frameworks. Das zentrale Element ist eine weitgehend parametrierbare virtuelle FPGA-Architektur, die V-FPGA genannt wird, wobei sie als primĂ€res Ziel auf jeden kommerziellen FPGA abgebildet werden kann, wĂ€hrend Anwendungen auf der virtuellen Schicht ausgefĂŒhrt werden. Dies sorgt fĂŒr PortabilitĂ€t und Migration auch auf Bitstream-Ebene, da die Spezifikation der virtuellen Schicht bestehen bleibt, wĂ€hrend die physische Plattform ausgetauscht werden kann. DarĂŒber hinaus wird diese Technik genutzt, um eine dynamische und partielle Rekonfiguration auf Plattformen zu ermöglichen, die sie nicht nativ unterstĂŒtzen. Neben der Virtualisierung soll die V-FPGA-Architektur auch als eingebettetes FPGA in ein ASIC integriert werden, das effiziente und dennoch flexible System-on-Chip-Lösungen bietet. Daher werden Zieltechnologie-Abbildungs-Methoden sowohl fĂŒr Virtualisierung als auch fĂŒr die physikalische Umsetzung adressiert und ein Beispiel fĂŒr die physikalische Umsetzung in einem 45 nm Standardzellen Ansatz aufgezeigt. Die hochflexible V-FPGA-Architektur kann mit mehr als 20 Parametern angepasst werden, darunter LUT-Grösse, Clustering, 3D-Stacking, Routing-Struktur und vieles mehr. Die Auswirkungen der Parameter auf FlĂ€che und Leistung der Architektur werden untersucht und eine umfangreiche Analyse von ĂŒber 1400 BenchmarklĂ€ufen zeigt eine hohe Parameterempfindlichkeit bei Abweichungen bis zu ±95, 9% in der FlĂ€che und ±78, 1% in der Leistung, was die hohe Bedeutung von Anpassung fĂŒr Effizienz aufzeigt. Um die Parameter systematisch an die BedĂŒrfnisse der Applikation anzupassen, wird eine parametrische Entwurfsraum-Explorationsmethode auf der Basis geeigneter FlĂ€chen- und Zeitmodellen vorgeschlagen. Eine Herausforderung von angepassten Architekturen ist der Entwurfsaufwand und die Notwendigkeit fĂŒr angepasste Werkzeuge. Daher umfasst diese Arbeit ein Framework fĂŒr die Architekturgenerierung, die Entwurfsraumexploration, die Anwendungsabbildung und die Evaluation. Vor allem ist der V-FPGA in einem vollstĂ€ndig synthetisierbaren generischen Very High Speed Integrated Circuit Hardware Description Language (VHDL) Code konzipiert, der sehr flexibel ist und die Notwendigkeit fĂŒr externe Codegeneratoren eliminiert. Systementwickler können von verschiedenen Arten von generischen SoC-Architekturvorlagen profitieren, um die Entwicklungszeit zu reduzieren. Alle notwendigen Konstruktionsschritte fĂŒr die Applikationsentwicklung und -abbildung auf den V-FPGA werden durch einen Tool-Flow fĂŒr Entwurfsautomatisierung unterstĂŒtzt, der eine Sammlung von vorhandenen kommerziellen und akademischen Werkzeugen ausnutzt, die durch geeignete Modelle angepasst und durch ein neues Werkzeug namens V-FPGA-Explorer ergĂ€nzt werden. Dieses neue Tool fungiert nicht nur als Back-End-Tool fĂŒr die Anwendungsabbildung auf dem V-FPGA sondern ist auch ein grafischer Konfigurations- und Layout-Editor, ein Bitstream-Generator, ein Architekturdatei-Generator fĂŒr die Place & Route Tools, ein Script-Generator und ein Testbenchgenerator. Eine Besonderheit ist die UnterstĂŒtzung der Just-in-Time-Kompilierung mit schnellen Algorithmen fĂŒr die In-System Anwendungsabbildung. Die Arbeit schliesst mit einigen AnwendungsfĂ€llen aus den Bereichen industrielle Prozessautomatisierung, medizinische Bildgebung, adaptive Systeme und Lehre ab, in denen der V-FPGA eingesetzt wird

    Control Theory in Engineering

    Get PDF
    The subject matter of this book ranges from new control design methods to control theory applications in electrical and mechanical engineering and computers. The book covers certain aspects of control theory, including new methodologies, techniques, and applications. It promotes control theory in practical applications of these engineering domains and shows the way to disseminate researchers’ contributions in the field. This project presents applications that improve the properties and performance of control systems in analysis and design using a higher technical level of scientific attainment. The authors have included worked examples and case studies resulting from their research in the field. Readers will benefit from new solutions and answers to questions related to the emerging realm of control theory in engineering applications and its implementation

    Algorithms in computer-aided design of VLSI circuits.

    Get PDF
    With the increased complexity of Very Large Scale Integrated (VLSI) circuits,Computer Aided Design (CAD) plays an even more important role. Top-downdesign methodology and layout of VLSI are reviewed. Moreover, previouslypublished algorithms in CAD of VLSI design are outlined.In certain applications, Reed-Muller (RM) forms when implemented withAND/XOR or OR/XNOR logic have shown some attractive advantages overthe standard Boolean logic based on AND/OR logic. The RM forms implementedwith OR/XNOR logic, known as Dual Forms of Reed-Muller (DFRM),is the Dual form of traditional RM implemented with AND /XOR.Map folding and transformation techniques are presented for the conversionbetween standard Boolean and DFRM expansions of any polarity. Bidirectionalmulti-segment computer based conversion algorithms are also proposedfor large functions based on the concept of Boolean polarity for canonicalproduct-of-sums Boolean functions. Furthermore, another two tabular basedconversion algorithms, serial and parallel tabular techniques, are presented forthe conversion of large functions between standard Boolean and DFRM expansionsof any polarity. The algorithms were tested for examples of up to 25variables using the MCNC and IWLS'93 benchmarks.Any n-variable Boolean function can be expressed by a Fixed PolarityReed-Muller (FPRM) form. In order to have a compact Multi-level MPRM(MMPRM) expansion, a method called on-set table method is developed.The method derives MMPRM expansions directly from FPRM expansions.If searching all polarities of FPRM expansions, the MMPRM expansions withthe least number of literals can be obtained. As a result, it is possible to findthe best polarity expansion among 2n FPRM expansions instead of searching2n2n-1 MPRM expansions within reasonable time for large functions. Furthermore,it uses on-set coefficients only and hence reduces the usage of memorydramatically.Currently, XOR and XNOR gates can be implemented into Look-Up Tables(LUT) of Field Programmable Gate Arrays (FPGAs). However, FPGAplacement is categorised to be NP-complete. Efficient placement algorithmsare very important to CAD design tools. Two algorithms based on GeneticAlgorithm (GA) and GA with Simulated Annealing (SA) are presented for theplacement of symmetrical FPGA. Both of algorithms could achieve comparableresults to those obtained by Versatile Placement and Routing (VPR) toolsin terms of the number of routing channel tracks
    corecore