323 research outputs found
Introduction to Logic Circuits & Logic Design with VHDL
The overall goal of this book is to fill a void that has appeared in the instruction of digital circuits over
the past decade due to the rapid abstraction of system design. Up until the mid-1980s, digital circuits
were designed using classical techniques. Classical techniques relied heavily on manual design
practices for the synthesis, minimization, and interfacing of digital systems. Corresponding to this design
style, academic textbooks were developed that taught classical digital design techniques. Around 1990,
large-scale digital systems began being designed using hardware description languages (HDL) and
automated synthesis tools. Broad-scale adoption of this modern design approach spread through the
industry during this decade. Around 2000, hardware description languages and the modern digital
design approach began to be taught in universities, mainly at the senior and graduate level. There
were a variety of reasons that the modern digital design approach did not penetrate the lower levels of
academia during this time. First, the design and simulation tools were difficult to use and overwhelmed
freshman and sophomore students. Second, the ability to implement the designs in a laboratory setting
was infeasible. The modern design tools at the time were targeted at custom integrated circuits, which
are cost- and time-prohibitive to implement in a university setting. Between 2000 and 2005, rapid
advances in programmable logic and design tools allowed the modern digital design approach to be
implemented in a university setting, even in lower-level courses. This allowed students to learn the
modern design approach based on HDLs and prototype their designs in real hardware, mainly field
programmable gate arrays (FPGAs). This spurred an abundance of textbooks to be authored teaching
hardware description languages and higher levels of design abstraction. This trend has continued until
today. While abstraction is a critical tool for engineering design, the rapid movement toward teaching only
the modern digital design techniques has left a void for freshman- and sophomore-level courses in digital
circuitry. Legacy textbooks that teach the classical design approach are outdated and do not contain
sufficient coverage of HDLs to prepare the students for follow-on classes. Newer textbooks that teach
the modern digital design approach move immediately into high-level behavioral modeling with minimal
or no coverage of the underlying hardware used to implement the systems. As a result, students are not
being provided the resources to understand the fundamental hardware theory that lies beneath the
modern abstraction such as interfacing, gate-level implementation, and technology optimization.
Students moving too rapidly into high levels of abstraction have little understanding of what is going
on when they click the “compile and synthesize” button of their design tool. This leads to graduates who
can model a breadth of different systems in an HDL but have no depth into how the system is
implemented in hardware. This becomes problematic when an issue arises in a real design and there
is no foundational knowledge for the students to fall back on in order to debug the problem
Introduction to Logic Circuits & Logic Design with Verilog
The overall goal of this book is to fill a void that has appeared in the instruction of digital circuits over
the past decade due to the rapid abstraction of system design. Up until the mid-1980s, digital circuits
were designed using classical techniques. Classical techniques relied heavily on manual design
practices for the synthesis, minimization, and interfacing of digital systems. Corresponding to this design
style, academic textbooks were developed that taught classical digital design techniques. Around 1990,
large-scale digital systems began being designed using hardware description languages (HDL) and
automated synthesis tools. Broad-scale adoption of this modern design approach spread through the
industry during this decade. Around 2000, hardware description languages and the modern digital
design approach began to be taught in universities, mainly at the senior and graduate level. There
were a variety of reasons that the modern digital design approach did not penetrate the lower levels of
academia during this time. First, the design and simulation tools were difficult to use and overwhelmed
freshman and sophomore students. Second, the ability to implement the designs in a laboratory setting
was infeasible. The modern design tools at the time were targeted at custom integrated circuits, which
are cost- and time-prohibitive to implement in a university setting. Between 2000 and 2005, rapid
advances in programmable logic and design tools allowed the modern digital design approach to be
implemented in a university setting, even in lower-level courses. This allowed students to learn the
modern design approach based on HDLs and prototype their designs in real hardware, mainly fieldprogrammable gate arrays (FPGAs). This spurred an abundance of textbooks to be authored, teaching
hardware description languages and higher levels of design abstraction. This trend has continued until
today. While abstraction is a critical tool for engineering design, the rapid movement toward teaching only
the modern digital design techniques has left a void for freshman- and sophomore-level courses in digital
circuitry. Legacy textbooks that teach the classical design approach are outdated and do not contain
sufficient coverage of HDLs to prepare the students for follow-on classes. Newer textbooks that teach
the modern digital design approach move immediately into high-level behavioral modeling with minimal
or no coverage of the underlying hardware used to implement the systems. As a result, students are not
being provided the resources to understand the fundamental hardware theory that lies beneath the
modern abstraction such as interfacing, gate-level implementation, and technology optimization.
Students moving too rapidly into high levels of abstraction have little understanding of what is going
on when they click the “compile and synthesize” button of their design tool. This leads to graduates who
can model a breadth of different systems in an HDL but have no depth into how the system is
implemented in hardware. This becomes problematic when an issue arises in a real design and there
is no foundational knowledge for the students to fall back on in order to debug the problem
Comparison of the tally numbering system to traditional arithmetic systems in field programmable gate arrays
This research explores the use of heterogeneous computing platforms for use in machine learning as well as different neural network architectures. These platforms and architectures can be used to accelerate the complex operations that are required for machine learning, more specifically neural networks. The use of different architectures, implementing different types of numbering and mathematics systems is explored in hopes of accelerating mathematical functions. The heterogeneous computing platform explored in this thesis is a Field Programmable Gate Arrays (FPGA), specifically a SoC/FPGA which is a ARM CPU and a FPGA in the same chip. FPGAs are unique because they are low power, high customizable hardware with bit level control. A new numbering system, called Tally, can be simulated in software but only implemented in hardware with a FPGA, without time consuming and expensive ASIC (application specific integrated circuit) being designed. This thesis explores two different types of neural networks are explored the first a simple XOR (exclusive OR) gate neural network tested in the Tally system, 16-bit fixed point and 32-bit floating point. The MNIST (handwritten numbers) dataset is used with a pre-trained multi-layer perceptron with both 16-bit Fixed Point and 32-bit Floating Point numbers. This study will act as a preliminary exploration of the Tally System and an exercise in learning implementation of neural networks in hardware
Reconfigurable architectures for the next generation of mobile device telecommunications systems
Mobile devices have become a dominant tool in our daily lives. Business and
personal usage has escalated tremendously since the emergence of smartphones
and tablets. The combination of powerful processing in mobile devices, such as
smartphones and the Internet, have established a new era for communications
systems. This has put further pressure on the performance and efficiency of
telecommunications systems in delivering the aspirations of users. Mobile device
users no longer want devices that merely perform phone calls and messaging.
Rather, they look for further interactive applications such as video streaming,
navigation and real time social interaction. Such applications require a new set of
hardware and standards. The WiFi (IEEE 802.11) standard has been at the forefront
of reliable and high-speed internet access telecommunications. This is due to its
high signal quality (quality of service) and speed (throughput). However, its limited
availability and short range highlights the need for further protocols, in particular
when far away from access points or base stations. This led to the emergence of 3G
followed by 4G and the upcoming 5G standard that, if fully realised, will provide
another dimension in “anywhere, anytime internet connectivity.” On the other
hand, the WiMAX (IEEE 802.16) standard promises to exceed the WiFi signal
coverage range. The coverage range could be extended to kilometres at least with a
better or similar WiFi signal level.
This thesis considers a dynamically reconfigurable architecture that is capable of
processing various modules within telecommunications systems. Forward error
correction, coder and navigation modules are deployed in a unified low power
communication platform. These modules have been selected since they are among
those with the highest demand in terms of processing power, strict processing time
or throughput. The modules are mainly realised within WiFi and WiMAX systems
in addition to global positioning systems (GPS). The idea behind the selection of
these modules is to investigate the possibility of designing an architecture capable
of processing various systems and dynamically reconfiguring between them. The
GPS system is a power-hungry application and, at the same time, it is not needed
all of the time. Hence, one key idea presented in this thesis is to effectively exploit
the dynamic reconfiguration capability so as to reconfigure the architecture (GPS)
when it is not needed in order to process another needed application or function
such as WiFi or WiMAX. This will allow lower energy consumption and the
optimum usage of the hardware available on the device.
This work investigates the major current coarse-grain reconfigurable architectures.
A novel multi-rate convolution encoder is then designed and realised as a
reconfigurable fabric. This demonstrates the ability to adapt the algorithms
involved to meet various requirements. A throughput of between 200 and 800
Mbps has been achieved for the rates 1/2 to 7/8, which is a great achievement for
the proposed novel architecture. A reconfigurable interleaver is designed as a
standalone fabric and on a dynamically reconfigurable processor. High throughputs
exceeding 90 Mbps are achieved for the various supported block sizes. The Reed
Solomon coder is the next challenging system to be designed into a dynamically
reconfigurable processor. A novel Galois Field multiplier is designed and
integrated into the developed Reed Solomon reconfigurable processor. As a result
of this work, throughputs of 200Mbps and 93Mbps respectively for RS encoding
and decoding are achieved. A GPS correlation module is also investigated in this
work. This is the main part of the GPS receiver responsible for continuously
tracking GPS satellites and extracting messages from them. The challenging aspect
of this part is its real-time nature and the associated critical time constraints. This
work resulted in a novel dynamically reconfigurable multi-channel GPS correlator
with up to 72 simultaneous channels.
This work is a contribution towards a global unified processing platform that is
capable of processing communication-related operations efficiently and
dynamically with minimum energy consumption
Profile-directed specialisation of custom floating-point hardware
We present a methodology for generating
floating-point arithmetic hardware
designs which are, for suitable applications, much reduced in size, while still
retaining performance and IEEE-754 compliance. Our system uses three
key parts: a profiling tool, a set of customisable
floating-point units and a
selection of system integration methods.
We use a profiling tool for
floating-point behaviour to identify arithmetic
operations where fundamental elements of IEEE-754
floating-point may be
compromised, without generating erroneous results in the common case.
In the uncommon case, we use simple detection logic to determine when
operands lie outside the range of capabilities of the optimised hardware.
Out-of-range operations are handled by a separate, fully capable,
floatingpoint
implementation, either on-chip or by returning calculations to a host
processor. We present methods of system integration to achieve this errorcorrection.
Thus the system suffers no compromise in IEEE-754 compliance,
even when the synthesised hardware would generate erroneous results.
In particular, we identify from input operands the shift amounts required
for input operand alignment and post-operation normalisation. For operations
where these are small, we synthesise hardware with reduced-size
barrel-shifters. We also propose optimisations to take advantage of other
profile-exposed behaviours, including removing the hardware required to
swap operands in a floating-point adder or subtractor, and reducing the
exponent range to fit observed values.
We present profiling results for a range of applications, including a selection
of computational science programs, Spec FP 95 benchmarks and the
FFMPEG media processing tool, indicating which would be amenable to
our method. Selected applications which demonstrate potential for optimisation
are then taken through to a hardware implementation. We show up
to a 45% decrease in hardware size for a
floating-point datapath, with a
correctable error-rate of less then 3%, even with non-profiled datasets
Implementation of a digital optical matrix-vector multiplier using a holographic look-up table and residue arithmetic
The design and implementation of a digital (numerical) optical matrix-vector multiplier are presented. The objective is to demonstrate the operation of an optical processor designed to minimize computation time in performing a practical computing application. This is done by using the large array of processing elements in a Hughes liquid crystal light valve, and relying on the residue arithmetic representation, a holographic optical memory, and position coded optical look-up tables. In the design, all operations are performed in effectively one light valve response time regardless of matrix size. The features of the design allowing fast computation include the residue arithmetic representation, the mapping approach to computation, and the holographic memory. In addition, other features of the work include a practical light valve configuration for efficient polarization control, a model for recording multiple exposures in silver halides with equal reconstruction efficiency, and using light from an optical fiber for a reference beam source in constructing the hologram. The design can be extended to implement larger matrix arrays without increasing computation time
Approximation and Optimization of an Auditory Model for Realization in VLSI Hardware
The Auditory Image Model (AIM) is a software tool set developed to functionally model the role of the ear in the human hearing process. AIM includes detailed filter equations for the major functional portions of the ear. Currently, AIM is run on a workstation and requires 10 to 100 times real-time to process audio information and produce an auditory image. An all-digital approximation of the AIM which is suitable for implementation in very large scale integrated circuits is presented. This document details the mathematical models of AIM and the approximations and optimizations used to simplify the filtering and signal processing accomplished by AIM. Included are the details of an efficient multi-rate architecture designed for sub-micron VLSI technology to carry out the approximated equations. Finally, simulation results which indicate that the architecture, when implemented in 0.8µm CMOS VLSI, will sustain real- time operation on a 32 channel system are included. The same tests also indicate that the chip will be approximately 3.3 mm2, and consume approximately 18 mW. The details of a new and efficient method for computing an approximate logarithm (base two) on binary integers is also presented. The approximate logarithm algorithm is used to convert sound energy into millibels quickly and with low power. Additionally, the algorithm, is easily extended to compute an approximate logarithm in base ten which broadens the class of problems to which it may be applied
Efficient arithmetic for high speed DSP implementation on FPGAs
The author was sponsored by EnTegra Ltd, a company who develop hardware and software products and services for the real time implementation of DSP and RF systems.
The field programmable gate array (FPGA) is being used increasingly in the field of DSP. This is due to the fact that the parallel computing power of such devices is ideal for today’s truly demanding DSP algorithms. Algorithms such as the QR-RLS update are computationally intensive and must be carried out at extremely high speeds (MHz). This means that the DSP processor is simply not an option. ASICs can be used but the expense of developing custom logic is prohibitive.
The increased use of the FPGA in DSP means that there is a significant requirement for efficient arithmetic cores that utilises the resources on such devices. This thesis presents the research and development effort that was carried out to produce fixed point division and square root cores for use in a new Electronic Design Automation (EDA) tool for EnTegra, which is targeted at FPGA implementation of DSP systems. Further to this, a new technique for predicting the accuracy of CORDIC systems computing vector magnitudes and cosines/sines is presented. This work allows the most efficient CORDIC design for a specified level of accuracy to be found quickly and easily without the need to run lengthy simulations, as was the case before. The CORDIC algorithm is a technique using mainly shifts and additions to compute many arithmetic functions and is thus ideal for FPGA implementation
- …