1,192 research outputs found

    Analysis of Hardware Sorting Units in Processor Design

    Get PDF
    Sorting is often computationally intensive and can cause the application in which it is used to run slowly. To date, the quickest software sorting implementations for an N element sorting problem runs at O(nlogn). Current techniques, beyond developing better algorithms, used to accelerate sorting include the use of multiple processors or moving the sorting operation to a GPU. The use of multiple processors or a GPU can lead to increased energy consumption and heat produced by the device as compared to a single-core GPU-less implementation. To address these problems, specialized instructions and hardware units can be added to the processors to accelerate the sorting operation directly. This thesis studies and records the performance implications from implementing a sorting accelerator into a modern RISC-V processor pipeline. This thesis also explores the additional energy and area costs of implementing such hardware units in the processor

    Efficient Real-Time Architectures and FPGA Implementations of Histogram-Based Median Filters for High Definition Videos

    Get PDF
    Digital filtering plays an important role in many signal processing applications. Filtering is performed to recover the original signal from its corrupted version. Median filter is a non-linear digital filter that replaces a sample in a given window by the median value of the samples in the window. For images corrupted with impulse noise, median filter provides a very high quality of filtered images. Several modifications of median filters have been proposed and implemented to achieve high image quality compared to that provided by conventional median filters. When these filters are implemented on hardware platforms such as FPGAs, the performance parameters, namely, the area, power and operating frequency should be taken into consideration in addition to the quality of the filtered image. Therefore, efficient implementation of median filters on FPGAs for image and video processing algorithms has been a topic of much interest. The existing hardware-based median filters for high definition video formats do not always satisfy the real-time throughput requirements or are inefficient with respect to hardware performance parameters, such as the area and frequency. This is due to the fact that most of the existing techniques use sorting-based median calculation, which results in a low hardware performance. In this thesis, architectures that use histogram-based median computation, which is a non-sorting-based operation, are designed with a view of efficient hardware implementation. This is carried out in two parts. We design and implement efficient architectures that satisfy the real-time throughput requirements of full high definition (FHD) videos in the first part and that of ultra high definition (UHD) videos in the second part. In the first part, an efficient real-time histogram-based median filter that uses the concept of bit-plane-slicing and adaptive switching median filter (ASMF) is designed and implemented on FPGAs. We term this architecture as hybrid architecture for median filtering (HAMF). The proposed HAMF computes an approximate median, since it uses only the most significant B-bits of the pixel values for median calculation. As a result, the algorithmic level implementation of the proposed HAMF results in a slight degradation in the filtered image quality compared to that provided by ASMF. The proposed HAMF provides a significant improvement over ASMF in terms of the area and operating frequency, when implemented on different generation FPGAs. Analysis of the different parameters, such as the number of bit-planes used in the computation of the median and the number of pipelining stages, is carried out to study the trade-off between the quality of the filtered image and hardware performance. Although the FPGA implementation of the proposed HAMF provides a very high operating frequency, the quality of the images filtered by its algorithmic level implementation decreases with increasing window size and noise density. This filter may be suitable for applications that require FHD filtering with cost constraints, but not for applications where the output image quality is as important as the hardware performance. Hence, in the second part, we design an efficient and real-time architecture of the hierarchical histogram-based median filter (HHMF). The proposed architecture is designed using a full synchronous pipeline, a synchronous accumulate-and-compare unit, and is scalable. The FPGA implementation of the proposed architecture of HHMF can perform real-time filtering of 4K and 8K UHD videos. The quality of the image filtered by HHMF is not compromised as in the case of HAMF, since HHMF uses all the bit-planes and computes the actual median. Although the FPGA implementation of HHMF results in more area utilization, the proposed implementation is more economical than a GPU-based HHMF implementation and provides a better throughput

    Writer Identification Using Inexpensive Signal Processing Techniques

    Full text link
    We propose to use novel and classical audio and text signal-processing and otherwise techniques for "inexpensive" fast writer identification tasks of scanned hand-written documents "visually". The "inexpensive" refers to the efficiency of the identification process in terms of CPU cycles while preserving decent accuracy for preliminary identification. This is a comparative study of multiple algorithm combinations in a pattern recognition pipeline implemented in Java around an open-source Modular Audio Recognition Framework (MARF) that can do a lot more beyond audio. We present our preliminary experimental findings in such an identification task. We simulate "visual" identification by "looking" at the hand-written document as a whole rather than trying to extract fine-grained features out of it prior classification.Comment: 9 pages; 1 figure; presented at CISSE'09 at http://conference.cisse2009.org/proceedings.aspx ; includes the the application source code; based on MARF described in arXiv:0905.123

    A pipelined configurable gate array for embedded processors

    Get PDF
    In recent years the challenge of high performance, low power retargettable embedded system has been faced with different technological and architectural solutions. In this paper we present a new configurable unit explicitly designed to imple-ment additional reconfigurable pipelined datapaths, suitable for the design of reconfigurable processors. A VLIW recon-figurable processor has been implemented on silicon in a standard 0.18 µm CMOS technology to prove the effective-ness of the proposed unit. Testing on a signal processing algorithms benchmark showed speedups from 4.3x to 13.5x and energy consumption reduction up to 92%

    Dynamically Reconfigurable Architectures and Systems for Time-varying Image Constraints (DRASTIC) for Image and Video Compression

    Get PDF
    In the current information booming era, image and video consumption is ubiquitous. The associated image and video coding operations require significant computing resources for both small-scale computing systems as well as over larger network systems. For different scenarios, power, bitrate and image quality can impose significant time-varying constraints. For example, mobile devices (e.g., phones, tablets, laptops, UAVs) come with significant constraints on energy and power. Similarly, computer networks provide time-varying bandwidth that can depend on signal strength (e.g., wireless networks) or network traffic conditions. Alternatively, the users can impose different constraints on image quality based on their interests. Traditional image and video coding systems have focused on rate-distortion optimization. More recently, distortion measures (e.g., PSNR) are being replaced by more sophisticated image quality metrics. However, these systems are based on fixed hardware configurations that provide limited options over power consumption. The use of dynamic partial reconfiguration with Field Programmable Gate Arrays (FPGAs) provides an opportunity to effectively control dynamic power consumption by jointly considering software-hardware configurations. This dissertation extends traditional rate-distortion optimization to rate-quality-power/energy optimization and demonstrates a wide variety of applications in both image and video compression. In each application, a family of Pareto-optimal configurations are developed that allow fine control in the rate-quality-power/energy optimization space. The term Dynamically Reconfiguration Architecture Systems for Time-varying Image Constraints (DRASTIC) is used to describe the derived systems. DRASTIC covers both software-only as well as software-hardware configurations to achieve fine optimization over a set of general modes that include: (i) maximum image quality, (ii) minimum dynamic power/energy, (iii) minimum bitrate, and (iv) typical mode over a set of opposing constraints to guarantee satisfactory performance. In joint software-hardware configurations, DRASTIC provides an effective approach for dynamic power optimization. For software configurations, DRASTIC provides an effective method for energy consumption optimization by controlling processing times. The dissertation provides several applications. First, stochastic methods are given for computing quantization tables that are optimal in the rate-quality space and demonstrated on standard JPEG compression. Second, a DRASTIC implementation of the DCT is used to demonstrate the effectiveness of the approach on motion JPEG. Third, a reconfigurable deblocking filter system is investigated for use in the current H.264/AVC systems. Fourth, the dissertation develops DRASTIC for all 35 intra-prediction modes as well as intra-encoding for the emerging High Efficiency Video Coding standard (HEVC)
    • …
    corecore