Generating just temperament with ideal rate multiplication
I have developed a new rate multiplication method. I call it ideal because it maximizes the quality of the frequency approximation and extends the range of the frequency scaling factor to all the rational numbers between zero and one. In addition, each input control code produces a unique output frequency. The input code is necessarily not straight binary, so the circuit is not pin compatible with other rate multipliers. I designed the ideal rate multiplier as a method for easily producing just tempered musical intervals. Even a three-bit ideal rate multiplier produces many recognizable musical intervals. I built a prototype of a seven-bit ideal rate multiplier which plays eleven notes of the twelve-tone scale. The design itself is academically interesting, and deals with recursion and rational numbers.
Scalable Energy-Recovery Architectures.
Energy efficiency is a critical challenge for today's integrated circuits, especially for high-end digital signal processing and communications that require both high throughput and low energy dissipation for extended battery life. Charge-recovery logic recovers and reuses charge using inductive elements and has the potential to achieve order-of-magnitude improvement in energy efficiency while maintaining high performance. However, the lack of large-scale high-speed silicon demonstrations and inductor area overheads are two major concerns.
This dissertation focuses on scalable charge-recovery designs. We present a semi-automated design flow to enable the design of large-scale charge-recovery chips. We also present a new architecture that uses in-package inductors, eliminating the area overheads caused by the use of integrated inductors in high-performance charge-recovery chips.
To demonstrate our semi-automated flow, which uses custom-designed standard-cell-like dynamic cells, we have designed a 576-bit charge-recovery low-density parity-check (LDPC) decoder chip. Functioning correctly at clock speeds above 1 GHz, this prototype is the first-ever demonstration of a GHz-speed charge-recovery chip of significant complexity. In terms of energy consumption, this chip improves over recent state-of-the-art LDPCs by at least 1.3 times with comparable or better area efficiency.
To demonstrate our architecture for eliminating inductor overheads, we have designed a charge-recovery LDPC decoder chip with in-package inductors. This test-chip has been fabricated in a 65nm CMOS flip-chip process. A custom 6-layer FC-BGA package substrate has been designed with 16 inductors embedded in the fifth layer of the package substrate, yielding higher Q and significantly improving area efficiency and energy efficiency compared to their on-chip counterparts. From measurements, this chip achieves at least 2.3 times lower energy consumption with better area efficiency over state-of-the-art published designs.
PhD, Electrical Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/116653/1/terryou_1.pd
Introduction to Logic Circuits & Logic Design with Verilog
The overall goal of this book is to fill a void that has appeared in the instruction of digital circuits over
the past decade due to the rapid abstraction of system design. Up until the mid-1980s, digital circuits
were designed using classical techniques. Classical techniques relied heavily on manual design
practices for the synthesis, minimization, and interfacing of digital systems. Corresponding to this design
style, academic textbooks were developed that taught classical digital design techniques. Around 1990,
large-scale digital systems began being designed using hardware description languages (HDL) and
automated synthesis tools. Broad-scale adoption of this modern design approach spread through the
industry during this decade. Around 2000, hardware description languages and the modern digital
design approach began to be taught in universities, mainly at the senior and graduate level. There
were a variety of reasons that the modern digital design approach did not penetrate the lower levels of
academia during this time. First, the design and simulation tools were difficult to use and overwhelmed
freshman and sophomore students. Second, the ability to implement the designs in a laboratory setting
was infeasible. The modern design tools at the time were targeted at custom integrated circuits, which
are cost- and time-prohibitive to implement in a university setting. Between 2000 and 2005, rapid
advances in programmable logic and design tools allowed the modern digital design approach to be
implemented in a university setting, even in lower-level courses. This allowed students to learn the
modern design approach based on HDLs and prototype their designs in real hardware, mainly field-programmable gate arrays (FPGAs). This spurred an abundance of textbooks to be authored, teaching
hardware description languages and higher levels of design abstraction. This trend has continued until
today. While abstraction is a critical tool for engineering design, the rapid movement toward teaching only
the modern digital design techniques has left a void for freshman- and sophomore-level courses in digital
circuitry. Legacy textbooks that teach the classical design approach are outdated and do not contain
sufficient coverage of HDLs to prepare the students for follow-on classes. Newer textbooks that teach
the modern digital design approach move immediately into high-level behavioral modeling with minimal
or no coverage of the underlying hardware used to implement the systems. As a result, students are not
being provided the resources to understand the fundamental hardware theory that lies beneath the
modern abstraction such as interfacing, gate-level implementation, and technology optimization.
Students moving too rapidly into high levels of abstraction have little understanding of what is going
on when they click the “compile and synthesize” button of their design tool. This leads to graduates who
can model a breadth of different systems in an HDL but have no depth into how the system is
implemented in hardware. This becomes problematic when an issue arises in a real design and there
is no foundational knowledge for the students to fall back on in order to debug the problem.
Introduction to Logic Circuits & Logic Design with VHDL
The overall goal of this book is to fill a void that has appeared in the instruction of digital circuits over
the past decade due to the rapid abstraction of system design. Up until the mid-1980s, digital circuits
were designed using classical techniques. Classical techniques relied heavily on manual design
practices for the synthesis, minimization, and interfacing of digital systems. Corresponding to this design
style, academic textbooks were developed that taught classical digital design techniques. Around 1990,
large-scale digital systems began being designed using hardware description languages (HDL) and
automated synthesis tools. Broad-scale adoption of this modern design approach spread through the
industry during this decade. Around 2000, hardware description languages and the modern digital
design approach began to be taught in universities, mainly at the senior and graduate level. There
were a variety of reasons that the modern digital design approach did not penetrate the lower levels of
academia during this time. First, the design and simulation tools were difficult to use and overwhelmed
freshman and sophomore students. Second, the ability to implement the designs in a laboratory setting
was infeasible. The modern design tools at the time were targeted at custom integrated circuits, which
are cost- and time-prohibitive to implement in a university setting. Between 2000 and 2005, rapid
advances in programmable logic and design tools allowed the modern digital design approach to be
implemented in a university setting, even in lower-level courses. This allowed students to learn the
modern design approach based on HDLs and prototype their designs in real hardware, mainly
field-programmable gate arrays (FPGAs). This spurred an abundance of textbooks to be authored, teaching
hardware description languages and higher levels of design abstraction. This trend has continued until
today. While abstraction is a critical tool for engineering design, the rapid movement toward teaching only
the modern digital design techniques has left a void for freshman- and sophomore-level courses in digital
circuitry. Legacy textbooks that teach the classical design approach are outdated and do not contain
sufficient coverage of HDLs to prepare the students for follow-on classes. Newer textbooks that teach
the modern digital design approach move immediately into high-level behavioral modeling with minimal
or no coverage of the underlying hardware used to implement the systems. As a result, students are not
being provided the resources to understand the fundamental hardware theory that lies beneath the
modern abstraction such as interfacing, gate-level implementation, and technology optimization.
Students moving too rapidly into high levels of abstraction have little understanding of what is going
on when they click the “compile and synthesize” button of their design tool. This leads to graduates who
can model a breadth of different systems in an HDL but have no depth into how the system is
implemented in hardware. This becomes problematic when an issue arises in a real design and there
is no foundational knowledge for the students to fall back on in order to debug the problem.
Design and FPGA Implementation of CORDIC-based 8-point 1D DCT Processor
CORDIC or CO-ordinate Rotation DIgital Computer is a fast, simple, efficient and powerful algorithm used for diverse Digital Signal Processing applications. Primarily developed for real-time airborne computations, it uses a unique computing technique which is especially suitable for solving the trigonometric relationships involved in plane co-ordinate rotation and conversion from rectangular to polar form. It comprises a special serial arithmetic unit having three shift registers, three adders/subtractors, Look-Up table and special interconnections. Using a prescribed sequence of conditional additions or subtractions the CORDIC arithmetic unit can be controlled to solve either of the following equations:
Y' = K (Y cos λ + X sin λ)
X' = K (X cos λ - Y sin λ), where K is a constant
In this project:
• A CORDIC-based processor for sine/cosine calculation was designed using VHDL programming in Xilinx ISE 10.1. The CORDIC module was tested for its functionality and correctness by test-bench analysis. Subsequently, FPGA implementation of the CORDIC core followed by ChipScopePro analysis of the output logic waveforms was performed.
• Using this CORDIC core, a DCT processor was designed to calculate the 8-point 1D DCT. The functionality and operational correctness of this processor were tested, first on the test-bench and then via ChipScopePro analysis, post FPGA implementation.
The output obtained in both cases was compared with the actual values to test for consistency, and the percentage of accuracy was established. Power consumption and FPGA resource utilization were observed. The results obtained were discussed.
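The rotation equations quoted in this abstract can be illustrated with a minimal software sketch of rotation-mode CORDIC. This is not the authors' VHDL design: the function names `cordic_rotate` and `cordic_gain`, the iteration count, and the use of floating point in place of fixed-point shift registers are all assumptions made for illustration.

```python
import math


def cordic_gain(iterations=16):
    """Return the constant K accumulated by `iterations` CORDIC micro-rotations."""
    k = 1.0
    for i in range(iterations):
        k *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    return k


def cordic_rotate(x, y, angle, iterations=16):
    """Rotate (x, y) by `angle` radians with conditional add/subtract steps.

    Returns (X', Y') with X' = K (X cos a - Y sin a) and
    Y' = K (Y cos a + X sin a). Converges for |angle| up to about 1.74 rad.
    """
    # Look-up table of elementary angles atan(2^-i).
    atan_table = [math.atan(2.0 ** -i) for i in range(iterations)]
    z = angle  # residual angle still to be rotated
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0  # rotate toward zero residual
        # Multiplying by 2^-i stands in for a right shift in hardware.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * atan_table[i]
    return x, y


if __name__ == "__main__":
    k = cordic_gain()
    # Pre-scaling the input by 1/K yields cosine and sine directly.
    c, s = cordic_rotate(1.0 / k, 0.0, 0.5)
    print(c, s, math.cos(0.5), math.sin(0.5))
```

Starting from (1/K, 0) makes the outputs approximate cos λ and sin λ, which is the usual way a sine/cosine core is derived from the rotation equations; whether the project used this pre-scaling is not stated in the abstract.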
Hybrid receiver study
This report presents the results of a 4-month study to design a hybrid analog/digital receiver for outer planet mission probe communication links. The scope of this study includes functional design of the receiver; comparisons between analog and digital processing; hardware tradeoffs for key components including frequency generators, A/D converters, and digital processors; development and simulation of the processing algorithms for acquisition, tracking, and demodulation; and detailed design of the receiver in order to determine its size, weight, power, reliability, and radiation hardness. In addition, an evaluation was made of the receiver's capabilities to perform accurate measurement of signal strength and frequency for radio science missions.
FPGA Implementation of Fast Binary Multiplication Based on Customized Basic Cells
Multiplication is considered one of the most time-consuming operations and is a key operation in a wide variety of embedded applications. Speeding up this operation has a significant impact on the overall performance of these applications. A vast number of multiplication approaches are found in the literature where the goal is always to achieve a higher performance. One of these approaches builds large multipliers from smaller multiplier blocks that are derived directly from Boolean algebra equations. In this work, we present a methodology for designing binary multipliers where customized partial products generation (CPPG) cells of different sizes are designed and used as smaller building blocks. The sizes of the designed CPPG cells are 2×2, 3×3, 4×4, 5×5, and 6×6. We use these cells to build 8×8, 16×16, 32×32, 64×64, and 128×128 binary multipliers. All of the CPPG cells and the binary multipliers are described using the VHDL language, tested, and implemented using XILINX ISE 14.6 tools targeting different FPGA families. The implementation results show that the best performance is achieved when cell 3×3 is used and Virtex-7 FPGA is targeted. The binary multipliers that are designed using the proposed CPPG cells achieve better performance when compared with the binary multipliers presented in the literature. As an application that utilizes the proposed multiplier, a Multiply-Accumulate (MAC) unit is designed and implemented in Spartan-3E. The implementation results of the MAC unit demonstrate the effectiveness of the proposed multiplier.
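The general idea of composing a wide multiplier from small cells can be sketched in software as follows. This only shows the arithmetic of splitting operands into chunks and summing shifted partial products; it is not the authors' CPPG cell design, and the names `cell_multiply`, `split_chunks`, and `block_multiply` are hypothetical.

```python
def cell_multiply(a, b, cell_bits):
    """Stand-in for one small multiplier cell (e.g. 3x3 when cell_bits == 3)."""
    limit = 1 << cell_bits
    if not (0 <= a < limit and 0 <= b < limit):
        raise ValueError("operand does not fit in the cell")
    return a * b


def split_chunks(value, width, cell_bits):
    """Split `value` into cell_bits-wide chunks, least significant first."""
    mask = (1 << cell_bits) - 1
    count = -(-width // cell_bits)  # ceiling division
    return [(value >> (i * cell_bits)) & mask for i in range(count)]


def block_multiply(a, b, width=8, cell_bits=4):
    """Multiply two `width`-bit unsigned integers using only small cells."""
    a_chunks = split_chunks(a, width, cell_bits)
    b_chunks = split_chunks(b, width, cell_bits)
    total = 0
    for i, a_chunk in enumerate(a_chunks):
        for j, b_chunk in enumerate(b_chunks):
            # Each cell output is weighted by the position of its two chunks.
            partial = cell_multiply(a_chunk, b_chunk, cell_bits)
            total += partial << ((i + j) * cell_bits)
    return total


if __name__ == "__main__":
    print(block_multiply(200, 123, width=8, cell_bits=3))
```

In hardware the additions would be carried out by an adder tree or similar structure, and the choice of cell size trades the number of cells against the depth of that summation; the abstract reports that the 3×3 cell performed best on the targeted devices.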
Efficient FPGA implementation and power modelling of image and signal processing IP cores
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Field Programmable Gate Arrays (FPGAs) are the technology of choice in a number of image
and signal processing application areas such as consumer electronics, instrumentation,
medical data processing and avionics due to their reasonable energy consumption, high performance, security, low design-turnaround time and reconfigurability. Low power FPGA
devices are also emerging as competitive solutions for mobile and thermally constrained platforms. Most computationally intensive image and signal processing algorithms also consume a lot of power leading to a number of issues including reduced mobility, reliability concerns and increased design cost among others. Power dissipation has become one of the most important challenges, particularly for FPGAs. Addressing this problem requires optimisation and awareness at all levels in the design flow. The key achievements of the
work presented in this thesis are summarised here. Behavioural level optimisation strategies have been used for implementing matrix product and inner product through the use of mathematical techniques such as Distributed Arithmetic (DA) and its variations including offset binary coding, sparse factorisation and novel vector level transformations. Applications to test the impact of these algorithmic and arithmetic transformations include the fast Hadamard/Walsh transforms and Gaussian mixture models. Complete design space exploration has been performed on these cores, and where appropriate, they have been shown to clearly outperform comparable existing implementations. At the architectural level, strategies such as parallelism, pipelining and systolisation have been successfully applied for the design and optimisation of a number of
cores including colour space conversion, finite Radon transform, finite ridgelet transform and circular convolution. A pioneering study into the influence of supply voltage scaling for FPGA based designs, used in conjunction with performance enhancing strategies such as parallelism and pipelining, has been performed. Initial results are very promising and indicate significant potential for future research in this area.
A key contribution of this work includes the development of a novel high level power macromodelling technique for design space exploration and characterisation of custom IP cores for FPGAs, called Functional Level Power Analysis and Modelling (FLPAM). FLPAM
is scalable, platform independent and compares favourably with existing approaches. A hybrid, top-down design flow paradigm integrating FLPAM with commercially available design tools for systematic optimisation of IP cores has also been developed.
Embracing Low-Power Systems with Improvement in Security and Energy-Efficiency
As the economies around the world are aligning more towards usage of computing systems, the global energy demand for computing is increasing rapidly. Additionally, the boom in AI based applications and services has already invited the pervasion of specialized computing hardware architectures for AI (accelerators). A big chunk of research in the industry and academia is being focused on providing energy efficiency to all kinds of power hungry computing architectures. This dissertation adds to these efforts.
Aggressive voltage underscaling of chips is one of the effective low-power paradigms for providing energy efficiency. This dissertation identifies and deals with the reliability and performance problems associated with this paradigm and develops novel energy-efficient approaches. Specifically, the properties of a low-power security primitive have been improved, and higher performance has been unlocked in an AI accelerator (Google TPU) in an aggressively voltage-underscaled environment. In addition, novel power-saving opportunities have been unlocked by characterizing the usage pattern of a baseline TPU with rigorous mathematical analysis.
Computing moments of a binary horizontally/vertically convex image using run-time reconfiguration
In this thesis, we present a design for computing moments of a binary horizontally/vertically convex image on an FPGA chip, using run-time reconfiguration. We compute the moments of up to third order for a total of 16 moments. We address how run-time reconfiguration speeds up moment computations without taking up huge hardware resources. Since we are considering a binary horizontally/vertically convex image, we look at an alternative method in moment computations that utilizes constant coefficient multipliers. We divide the image into segments and process one segment at a time. We reconfigure the constant coefficient multipliers before processing the next segment. This thesis also looks at the interactions between different logic units for moment computations. We provide an estimate of the total number of CLBs used to implement this design on an FPGA chip. Finally, we address variations of this particular type of image, such as non-binary and non-convex, and determine whether this design is still applicable in those instances.
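As a rough software illustration of why convexity helps, the moments of a binary image whose rows are each a single run of foreground pixels can be computed from the run endpoints alone. The sketch below assumes the 16 moments are m_pq for p and q each ranging from 0 to 3, which is one reading of the abstract; it does not model the constant coefficient multipliers or the run-time reconfiguration, and the names `power_sum` and `run_moments` are hypothetical.

```python
def power_sum(n, p):
    """Sum of k**p for k = 0..n, using closed forms for p = 0..3."""
    if n < 0:
        return 0
    if p == 0:
        return n + 1
    if p == 1:
        return n * (n + 1) // 2
    if p == 2:
        return n * (n + 1) * (2 * n + 1) // 6
    if p == 3:
        return (n * (n + 1) // 2) ** 2
    raise ValueError("only p = 0..3 is supported")


def run_moments(runs, max_order=3):
    """Compute m_pq = sum of x**p * y**q over all foreground pixels.

    `runs` is a list of (y, x_start, x_end) tuples, one inclusive run per row,
    which is enough to describe a horizontally convex binary image.
    """
    orders = range(max_order + 1)
    moments = {(p, q): 0 for p in orders for q in orders}
    for y, x_start, x_end in runs:
        for p in orders:
            # Sum of x**p over the run, from two closed-form prefix sums.
            row_sum = power_sum(x_end, p) - power_sum(x_start - 1, p)
            for q in orders:
                # For a fixed row, y**q acts as a constant coefficient.
                moments[(p, q)] += row_sum * y ** q
    return moments


if __name__ == "__main__":
    example_runs = [(0, 1, 2), (1, 0, 3)]
    print(run_moments(example_runs)[(0, 0)])
```

Within one row the factor y**q is fixed, which hints at why multipliers with constant coefficients that are updated between segments are a natural fit for this computation.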