576 research outputs found

    FPGA BASED PARALLEL IMPLEMENTATION OF STACKED ERROR DIFFUSION ALGORITHM

    Digital halftoning is a crucial technique used in digital printers to convert a continuous-tone image into a pattern of black and white dots. Halftoning is needed because printers have a limited set of inks and cannot reproduce every color intensity in a continuous-tone image. Error diffusion is a halftoning algorithm that iteratively quantizes pixels in a neighborhood-dependent fashion. This thesis focuses on the design and development of a parallel, scalable hardware architecture for a high-performance implementation of a high-quality Stacked Error Diffusion algorithm. The algorithm is described in C and requires significant processing time when run on a conventional CPU. A new hardware processor architecture is therefore developed to implement the algorithm, and it is implemented and tested on a Xilinx Virtex 5 FPGA chip. The run time of the algorithm decreases dramatically on the proposed parallel FPGA architecture compared with execution on a single CPU. The new parallel architecture is described in the Verilog Hardware Description Language (HDL). Post-synthesis and post-implementation HDL simulation, including performance-based validation of the new parallel architecture, is carried out with the ModelSim CAD simulation tool.
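
The neighborhood-dependent quantization described in this abstract can be illustrated with a minimal sketch. The abstract does not give the details of the Stacked Error Diffusion algorithm, so the code below instead uses the classic Floyd-Steinberg kernel as a stand-in. The function name `floyd_steinberg`, the 0.5 threshold, and the list-of-rows image layout are assumptions made for illustration only.

```python
def floyd_steinberg(image, threshold=0.5):
    """Halftone a grayscale image (rows of floats in [0, 1]) into 0/1 dots.

    Illustrative stand-in: classic Floyd-Steinberg error diffusion,
    NOT the Stacked Error Diffusion algorithm from the thesis.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    work = [list(row) for row in image]          # working copy that accumulates error
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            old = work[y][x]
            new = 1 if old >= threshold else 0   # quantize to a black or white dot
            out[y][x] = new
            err = old - new                      # quantization error to push forward
            if x + 1 < width:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < height:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < width:
                    work[y + 1][x + 1] += err * 1 / 16
    return out
```

Each pixel depends on error pushed from its left and upper neighbors, which is the data dependency a parallel hardware architecture has to schedule around.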

    ASIC implemented MicroBlaze-based Coprocessor for Data Stream Management Systems

    Indiana University-Purdue University Indianapolis (IUPUI). The drastic increase in Internet usage demands real-time data processing with higher efficiency than ever before. The Symbiote Coprocessor Unit (SCU), developed by Dr. Pranav Vaidya, is a hardware accelerator with the potential to provide a data processing speedup of up to 150x compared with traditional data stream processors. However, the SCU implementation is very complex, fixed, and uses an outdated host interface, which limits future improvement. Mr. Tareq S. Alqaisi, an MSECE graduate from IUPUI, worked on curbing these limitations. In his architecture, he used a Xilinx MicroBlaze microcontroller to reduce the complexity of the SCU, along with a few other modifications. The objective of this study is to make the SCU suitable for mass production while reducing its power consumption and delay. To accomplish this, the execution unit of the SCU has been implemented as an application-specific integrated circuit (ASIC), and modules such as the ACG/OCG, sequential comparator, and D-word multiplier/divider are integrated into the design. Furthermore, techniques such as operand isolation, buffer insertion, cell swapping, and cell resizing are applied. As a result, the new design attains a dynamic power of 67.9435 µW, compared with 74.0012 µW before power optimization, along with a small increase in static power, and a clock period of 39.47 ns, as opposed to 52.26 ns before timing optimization.

    Test Quality Analysis and Improvement for an Embedded Asynchronous FIFO

    Embedded First-In First-Out (FIFO) memories are increasingly used in many IC designs. We have created a new full-custom embedded FIFO module with asynchronous read and write clocks, which is at least a factor of two smaller and also faster than SRAM-based and standard-cell-based counterparts. The detection qualities of the FIFO test for both hard and weak resistive shorts and opens have been analyzed by an IFA-like method based on analog simulation. The defect coverage of the initial FIFO test for shorts in the bit-cell matrix has been improved by inclusion of an additional data background and low-voltage testing; for low-resistance shorts, 100% defect coverage is obtained. The defect coverage for opens has been improved by a new test procedure which includes waiting periods.

    Computing with Spintronics: Circuits and architectures

    This thesis makes the following contributions towards the design of computing platforms with spintronic devices. 1) It explores the use of spintronic memories in the design of a domain-specific processor for an emerging class of data-intensive applications, namely recognition, mining and synthesis (RMS). Two different spintronic memory technologies — Domain Wall Memory (DWM) and STT-MRAM — are utilized to realize the different levels in the memory hierarchy of the domain-specific processor, based on their respective access characteristics. Architectural tradeoffs created by the use of spintronic memories are analyzed. The proposed design achieves 1.5X-4X improvements in energy-delay product compared to a CMOS baseline. 2) It describes the first attempt to use DWM in the cache hierarchy of general-purpose processors. DWM promises unparalleled density by packing several bits of data into each bit-cell. TapeCache, the proposed DWM-based cache architecture, utilizes suitable circuit and architectural optimizations to address two key challenges (i) the high energy and latency requirement of write operations and (ii) the need for shift operations to access the data stored in each DWM bit-cell. At the circuit level, DWM bit-cells that are tailored to the distinct design requirements of different levels in the cache hierarchy are proposed. At the architecture level, TapeCache proposes suitable cache organization and management policies to alleviate the performance impact of shift operations required to access data stored in DWM bit-cells. TapeCache achieves more than 7X improvements in both cache area and energy with virtually identical performance compared to an SRAM-based cache hierarchy. 3) It investigates the design of the on-chip memory hierarchy of general-purpose graphics processing units (GPGPUs)—massively parallel processors that are optimized for data-intensive high-throughput workloads—using DWM. 
STAG, a high-density, energy-efficient Spintronic-Tape Architecture for GPGPU cache hierarchies, is described. STAG utilizes different DWM bit-cells to realize different memory arrays in the GPGPU cache hierarchy. To address the challenge of high access latencies due to shifts, STAG predicts upcoming cache accesses by leveraging unique characteristics of GPGPU architectures and workloads, and prefetches data that are both likely to be accessed and require large numbers of shift operations. STAG achieves 3.3X energy reduction and 12.1% performance improvement over CMOS SRAM under iso-area conditions. 4) While the potential of spintronic devices for memories is widely recognized, their utility in realizing logic is much less clear. The thesis presents Spintastic, a new paradigm that utilizes Stochastic Computing (SC) to realize spintronic logic. In SC, data is encoded in the form of pseudo-random bitstreams, such that the probability of a '1' in a bitstream corresponds to the numerical value that it represents. SC can enable compact, low-complexity logic implementations of various arithmetic functions. Spintastic establishes the synergy between stochastic computing and spin-based logic by demonstrating that they mutually alleviate each other's limitations. On the one hand, various building blocks of SC, which incur significant overheads in CMOS implementations, can be efficiently realized by exploiting the physical characteristics of spin devices. On the other hand, the reduced logic complexity and low logic depth of SC circuits alleviate the shortcomings of spintronic logic. Based on this insight, designs of spin-based stochastic arithmetic circuits, bitstream generators, bitstream permuters, and stochastic-to-binary converter circuits are presented. Spintastic achieves 7.1X energy reduction over CMOS implementations for a wide range of benchmarks from the image processing, signal processing, and RMS application domains. 
5) In order to evaluate the proposed spintronic designs, the thesis describes various device-to-architecture modeling frameworks. Starting with device models that are calibrated to measurements, the characteristics of spintronic devices are successively abstracted into circuit-level and architectural models, which are incorporated into suitable simulation frameworks. (Abstract shortened by UMI.)
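
The stochastic computing encoding mentioned in contribution 4 can be sketched in software. In the standard unipolar form, a value in [0, 1] is the probability of a '1' in the bitstream, and a single AND gate multiplies two independent streams. The helper names below (`to_bitstream`, `from_bitstream`, `sc_multiply`) are hypothetical, and the sketch models only the arithmetic idea, not the spin-based circuits of the thesis.

```python
import random


def to_bitstream(value, length, rng):
    """Encode a value in [0, 1] as a pseudo-random bitstream with P(1) = value."""
    return [1 if rng.random() < value else 0 for _ in range(length)]


def from_bitstream(bits):
    """Decode a bitstream back to a number: the fraction of 1s."""
    return sum(bits) / len(bits)


def sc_multiply(a_bits, b_bits):
    """Multiply two independent unipolar streams with a bitwise AND."""
    return [a & b for a, b in zip(a_bits, b_bits)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
a = to_bitstream(0.5, 20000, rng)
b = to_bitstream(0.4, 20000, rng)
estimate = from_bitstream(sc_multiply(a, b))  # approximates 0.5 * 0.4
```

The estimate is only approximate, and its accuracy improves with stream length, which is the usual trade-off between logic simplicity and latency in stochastic computing.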

    A High Frequency Photoacoustic System for Colorectal Cancer Imaging

    Colorectal cancer is the second leading cause of cancer-related deaths in the United States, and early detection is a key factor in its survival rate. Compared to conventional imaging modalities, photoacoustic imaging offers benefits in providing angiographic images, which are valuable for early-stage tumor detection. This thesis presents the design of a 32-channel 80 MHz photoacoustic imaging system, whose relatively high frequency offers particular advantages. The system comprises several modules, including a laser system, ultrasound probe, A/D converter, FPGA-based controller, and a computer. The system requires programs for the FPGA and for the data receiver on the computer. The data transfer accuracy, signal-to-noise ratio, and transmission speed are analyzed to characterize the system's performance. The system was tested with standard phantoms, such as carbon fiber and black tape, and images were reconstructed with the standard delay-and-sum algorithm. As important parameters of this system, the spatial resolution and signal-to-noise ratio were analyzed. In the future, to improve the lateral resolution of this system and broaden its imaging window, the 32-channel system can be expanded to 256 channels simply by duplicating the 32-channel data acquisition module. This improved system can provide detailed ex vivo and in vivo information about colorectal cancer.
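
A minimal sketch of delay-and-sum reconstruction for a single pixel follows. It assumes a linear array at depth zero, one-way acoustic propagation from the absorber to each element (the laser fires at sample 0), and a uniform speed of sound. The function name, the parameters, and the 1540 m/s default are illustrative assumptions, not details taken from the thesis.

```python
import math


def delay_and_sum(channels, element_x, pixel, fs, c=1540.0):
    """Sum each channel's sample at the one-way time of flight to `pixel`.

    channels  : list of per-element sample lists (laser fires at sample 0)
    element_x : lateral position of each element in metres (array at z = 0)
    pixel     : (x, z) position in metres
    fs        : sampling rate in Hz (assumed parameter)
    c         : assumed uniform speed of sound in m/s
    """
    px, pz = pixel
    total = 0.0
    for samples, ex in zip(channels, element_x):
        dist = math.hypot(px - ex, pz)        # straight path from pixel to element
        idx = int(round(dist / c * fs))       # time of flight expressed in samples
        if 0 <= idx < len(samples):
            total += samples[idx]
    return total
```

Evaluating this function over a grid of pixels yields an image in which signals from a true absorber add coherently while other locations sum to lower values.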

    FPGA-BASED IMPLEMENTATION OF DUAL-FREQUENCY PATTERN SCHEME FOR 3-D SHAPE MEASUREMENT

    Structured Light Illumination (SLI) is the process in which spatially varying patterns are projected onto a 3-D surface; based on the distortion introduced by the surface topology, phase information can be calculated and a 3-D model constructed. Phase Measuring Profilometry (PMP) is a particular type of SLI that requires three or more temporally multiplexed patterns. High-speed PMP attempts to scan moving objects whose motion is small enough to have little impact on the 3-D model. Given that practically all machine vision cameras and high-speed cameras employ a Field Programmable Gate Array (FPGA) that interfaces directly to the image sensors, the opportunity exists to do the processing on camera. This thesis focuses on the design, implementation, testing, and evaluation of a camera-projector system that implements a PMP dual-frequency scheme for 3-D shape measurement on a single FPGA chip. The processor architecture is implemented and tested using the Xilinx Spartan 3 FPGA chip on an Opal Kelly development board. The hardware is described using the VHDL and Verilog Hardware Description Languages (HDLs).
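
The per-pixel phase calculation in PMP can be sketched with the standard N-step phase-shifting formula. The sketch assumes each captured intensity follows I_n = A + B*cos(phi - 2*pi*n/N) for N >= 3 equally spaced shifts. That pattern model, its sign convention, and the name `wrapped_phase` are assumptions for illustration, and the dual-frequency unwrapping step is not shown.

```python
import math


def wrapped_phase(intensities):
    """Recover the wrapped phase at one pixel from N >= 3 shifted patterns.

    Assumes I_n = A + B*cos(phi - 2*pi*n/N); returns phi in (-pi, pi].
    """
    n_steps = len(intensities)
    if n_steps < 3:
        raise ValueError("PMP needs at least three patterns")
    # Correlate the samples with sine and cosine of the known shifts;
    # the constant offset A cancels and B scales both sums equally.
    s = sum(i * math.sin(2 * math.pi * n / n_steps) for n, i in enumerate(intensities))
    c = sum(i * math.cos(2 * math.pi * n / n_steps) for n, i in enumerate(intensities))
    return math.atan2(s, c)
```

Because the result is an `atan2` of two sums, it does not depend on the ambient offset A or the modulation B.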

    Design of a distributed data acquisition system for the ITER’s neutral beam

    The International Thermonuclear Experimental Reactor (ITER) is a groundbreaking international collaboration aimed at developing fusion energy as a clean, safe, and virtually limitless source of power. It brings together scientists, engineers, and experts from 35 countries to construct and operate the world's largest experimental fusion reactor. Through the fusion of hydrogen isotopes, ITER seeks to replicate the process that powers the sun and stars, harnessing the immense energy released to generate electricity. With its ambitious goals and cutting-edge technology, ITER represents a significant milestone in the pursuit of sustainable and abundant energy for the future. As part of the ITER project, several plasma heating systems must be developed to achieve fusion conditions and reach plasma ignition. One such heating system is the Heating Neutral Beam (HNB), which is designed to inject an energetic beam of neutral atoms into the plasma and heat the fusion plasma through Coulomb collisions between the beam atoms and the plasma. This system requires several components, such as power supplies, cryopumps, and cooling components, working together to achieve controlled and safe operation of the HNB. It must also operate in coordination with the experimental control, with high availability. The neutral beam control system is therefore responsible for the correct and safe operation of the two HNB units installed at ITER. The project presents an overview of the instrumentation and control system currently being developed for the Neutral Beam units, and presents the design and development of a remote distributed data acquisition system prototype for that system. The performance of the prototype will be measured and evaluated to determine whether such a solution meets ITER requirements and can therefore be implemented in the Neutral Beam control system and in other control systems within the reactor components.
This project was developed under the Traineeship program of the European Joint Undertaking for ITER and the Development of Fusion Energy, Fusion For Energy (F4E). This report presents the work the author performed during that contract, under the guidance of the program's supervisor.