444 research outputs found

    Efficient Radio Resource Allocation Schemes and Code Optimizations for High Speed Downlink Packet Access Transmission

    High Speed Downlink Packet Access (HSDPA), an important enhancement to the Wideband Code Division Multiple Access (WCDMA) air interface of 3G mobile communications, was launched to achieve higher spectral utilization efficiency. It introduces multicode CDMA transmission and Adaptive Modulation and Coding (AMC), which make radio resource allocation both feasible and essential. This thesis studies channel-aware resource allocation schemes, coupled with fast power adjustment and spreading code optimization techniques, for the HSDPA standard operating over a frequency-selective channel. A two-group resource allocation scheme is developed to balance performance enhancement against time efficiency. It requires calculating only two parameters to specify the allocations of discrete bit rates and transmitted symbol energies in all channels. The thesis develops methods for calculating the two parameters for interference-free and interference-present channels, respectively. For interference-present channels, the performance of two-group allocation can be further enhanced by applying a clustering-based channel removal scheme. To make the two-group approach more time-efficient, the thesis then discusses reducing the number of matrix inversions in the optimum energy calculation. When the Minimum Mean Square Error (MMSE) equalizer is applied, the optimum energy allocation can be calculated by iterating a set of eigenvalues and eigenvectors. With the MMSE Successive Interference Cancellation (SIC) receiver, the optimum energies are calculated recursively and combined with an optimum channel ordering scheme to improve both system performance and time efficiency. The thesis then studies signature optimization methods for multipath channels and examines their system performance when combined with different resource allocation methods. 
Two multipath-aware signature optimization methods are developed by applying iterative optimization techniques, for systems using an MMSE equalizer and an MMSE precoder, respectively. A PAM system using complex signature sequences is also examined for improving resource utilization efficiency, and two receiving schemes are proposed to take full advantage of PAM features. In addition, by applying a short chip sampling window, a Singular Value Decomposition (SVD) based interference-free signature design method is presented.

    Run-Time Reconfigurable FFT Engine

    This paper develops a system-level architecture for implementing a cost-efficient, FPGA-based real-time FFT engine. The approach considers both hardware cost (in terms of FPGA resource requirements) and performance (in terms of throughput). These two dimensions are optimized using run-time reconfiguration, double buffering, and hardware virtualization to reuse the available processing components. The system employs sixteen reconfigurable parallel FFT cores. Each core is a 16-complex-point parallel FFT processor running within the continuous real-time FFT engine. As a demonstrator of the design idea, the architecture supports a transform length of 256 complex points, uses fixed-point arithmetic, and has been developed using a radix-4 architecture. The parallel Booth technique is chosen for realizing the complex multiplier (required in the basic butterfly operation) because it saves considerable hardware compared with other techniques. Simulations performed using the VHDL modeling language and ModelSim software show that the full design can be implemented on a single FPGA platform, requiring about 50,000 slices.
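    One way to see how a 256-point transform can be assembled from 16-point transforms is the standard Cooley-Tukey factorization 256 = 16 x 16. The sketch below is only an illustration of that textbook decomposition in floating-point Python; it is not the paper's fixed-point radix-4 hardware, and the names `dft` and `fft256_from_16_point_blocks` are hypothetical.

```python
import cmath


def dft(values):
    """Direct O(n^2) DFT, standing in for one small FFT core."""
    n = len(values)
    return [
        sum(values[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
        for k in range(n)
    ]


def fft256_from_16_point_blocks(x):
    """256-point DFT built from two stages of 16-point DFTs (Cooley-Tukey).

    Input index n = 16*a + b, output index k = k1 + 16*k2.
    """
    n1 = n2 = 16
    n = n1 * n2
    if len(x) != n:
        raise ValueError("expected exactly 256 samples")

    # Stage 1: sixteen independent 16-point DFTs over the decimated inputs
    # x[b], x[16 + b], x[32 + b], ... giving stage1[b][k1].
    stage1 = [dft([x[n2 * a + b] for a in range(n1)]) for b in range(n2)]

    out = [0j] * n
    for k1 in range(n1):
        # Twiddle factors W_256^(b * k1) link the two stages.
        row = [
            stage1[b][k1] * cmath.exp(-2j * cmath.pi * b * k1 / n)
            for b in range(n2)
        ]
        # Stage 2: another 16-point DFT, this time over b.
        column = dft(row)
        for k2 in range(n2):
            out[k1 + n1 * k2] = column[k2]
    return out
```

    The sixteen stage-1 transforms do not depend on each other, so in principle they could run on parallel cores; how the paper actually schedules its cores is not stated in the abstract.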

    Fast Fourier transforms on energy-efficient application-specific processors

    Many of the current applications used in battery-powered devices come from the digital signal processing, telecommunication, and multimedia domains. Traditionally, application-specific fixed-function circuits in the form of application-specific integrated circuits (ASIC) have been used in these designs to reach the required performance and energy-efficiency. The complexity of these applications has increased over the years, and the design complexity has increased even faster, which implies increased design time. At the same time, there are more and more standards to be supported, so using optimised fixed-function implementations for all the functions in all the standards is impractical. The non-recurring engineering costs for integrated circuits have also increased significantly, so manufacturers can afford fewer chip iterations. Although tailoring the circuit for a specific application provides the best performance and/or energy-efficiency, such an approach lacks flexibility. For example, if an error is found after manufacturing, an expensive chip iteration is required. In addition, new functionalities cannot be added afterwards to support the evolution of standards. Flexibility can be obtained with software-based implementation technologies. Unfortunately, general-purpose processors do not provide the energy-efficiency of fixed-function circuit designs. A useful trade-off between flexibility and performance is an implementation based on application-specific processors (ASP), where programmability provides the flexibility and computational resources customised for the given application provide the performance. In this thesis, application-specific processors are considered using the fast Fourier transform as the representative algorithm. The architectural template used here is the transport triggered architecture (TTA), which resembles very long instruction word machines, although its operand execution resembles data flow machines rather than traditional operand triggering. 
The developed TTA processors exploit the inherent parallelism of the application. In addition, several characteristics of the application have been identified and exploited by developing customised functional units that speed up execution. Several customisations are proposed for the data path of the processor, but it is also important to match the memory bandwidth to the computation speed. This calls for a memory organisation supporting parallel memory accesses. The proposed optimisations have been used to improve the energy-efficiency of the processor, and experiments show that a programmable solution can have energy-efficiency comparable to fixed-function ASIC designs.

    Modular quantum signal processing in many variables

    Despite significant advances in quantum algorithms, quantum programs in practice are often expressed at the circuit level, forgoing helpful structural abstractions common to their classical counterparts. Consequently, as many quantum algorithms have been unified with the advent of quantum signal processing (QSP) and quantum singular value transformation (QSVT), an opportunity has appeared to cast these algorithms as modules that can be combined to constitute complex programs. Complicating this, however, is that while QSP/QSVT are often described by the polynomial transforms they apply to the singular values of large linear operators, and the algebraic manipulation of polynomials is simple, the QSP/QSVT protocols realizing analogous manipulations of their embedded polynomials are non-obvious. Here we provide a theory of modular multi-input-output QSP-based superoperators, the basic unit of which we call a gadget, and show they can be snapped together with LEGO-like ease at the level of the functions they apply. To demonstrate this ease, we also provide a Python package for assembling gadgets and compiling them to circuits. Viewed alternately, gadgets both enable the efficient block encoding of large families of useful multivariable functions, and substantiate a functional-programming approach to quantum algorithm design in recasting QSP and QSVT as monadic types. Comment: 15 pages + 9 figures + 4 tables + 45 pages supplement. For codebase, see https://github.com/ichuang/pyqsp/tree/bet

    Parallelization of dynamic programming recurrences in computational biology

    The rapid growth of biosequence databases over the last decade has led to a performance bottleneck in the applications analyzing them. In particular, over the last five years the DNA sequencing capacity of next-generation sequencers has been doubling every six months as costs have plummeted. The data produced by these sequencers is overwhelming traditional compute systems. We believe that in the future compute performance, not sequencing, will become the bottleneck in advancing genome science. In this work, we investigate novel computing platforms to accelerate dynamic programming algorithms, which are popular in bioinformatics workloads. We study algorithm-specific hardware architectures that exploit fine-grained parallelism in dynamic programming kernels using field-programmable gate arrays (FPGAs). We advocate a high-level synthesis approach, using the recurrence equation abstraction to represent dynamic programming and polyhedral analysis to exploit parallelism. We suggest a novel technique within the polyhedral model to optimize for throughput by pipelining independent computations on an array. This design technique improves on the state of the art, which builds latency-optimal arrays. We also suggest a method to dynamically switch between a family of designs using FPGA reconfiguration to achieve a significant performance boost. We have used polyhedral methods to parallelize the Nussinov RNA folding algorithm to build a family of accelerators that can trade resources for parallelism and are between 15 and 130 times faster than a modern dual-core CPU implementation. A Zuker RNA folding accelerator we built on a single workstation with four Xilinx Virtex 4 FPGAs outperforms 198 Intel Core 2 Duo processors running at 3 GHz. Furthermore, our design running on a single FPGA is an order of magnitude faster than competing implementations on similar-generation FPGAs and graphics processors. 
Our work is a step toward the goal of automated synthesis of hardware accelerators for dynamic programming algorithms.
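    The Nussinov RNA folding algorithm named in this abstract is a textbook dynamic programming recurrence. A minimal software sketch of its base-pair-maximization form follows, assuming Watson-Crick and G-U wobble pairs; the function name `nussinov_max_pairs` and the `min_loop` parameter are hypothetical, and this is not the accelerator design described in the work.

```python
def nussinov_max_pairs(seq, min_loop=0):
    """Maximum number of nested base pairs in an RNA sequence (Nussinov DP).

    min_loop is the minimum number of unpaired bases required between
    the two ends of a pair (an illustrative parameter).
    """
    # Assumed pairing rules: Watson-Crick pairs plus G-U wobble pairs.
    pairs = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0

    # table[i][j] holds the best score for the subsequence seq[i..j].
    table = [[0] * n for _ in range(n)]

    # Fill by increasing span; every cell of one span depends only on
    # cells of shorter spans.
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            # Case 1: leave position i or position j unpaired.
            best = max(table[i + 1][j], table[i][j - 1])
            # Case 2: pair i with j, if the bases are compatible.
            if (seq[i], seq[j]) in pairs and j - i > min_loop:
                inner = table[i + 1][j - 1] if i + 1 <= j - 1 else 0
                best = max(best, inner + 1)
            # Case 3: split into two independent substructures.
            for k in range(i + 1, j):
                best = max(best, table[i][k] + table[k + 1][j])
            table[i][j] = best
    return table[0][n - 1]
```

    Because all cells with the same span are independent of one another, each pass of the outer loop could in principle be evaluated in parallel, which is the kind of fine-grained parallelism a hardware array might exploit.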

    An Outlook on Design Technologies for Future Integrated Systems

    The economic and social demand for ubiquitous and multifaceted electronic systems, in combination with the unprecedented opportunities provided by the integration of various manufacturing technologies, is paving the way to a new class of heterogeneous integrated systems, with increased performance and connectedness, that provide us with gateways to the living world. This paper surveys design requirements and solutions for heterogeneous systems and addresses design technologies for realizing them.

    Generalized filtering configurations with applications in digital and optical signal and image processing

    Ankara: Department of Electrical and Electronics Engineering and Institute of Engineering and Sciences, Bilkent Univ., 1999. Thesis (Ph.D.) -- Bilkent University, 1999. Includes bibliographical references. In this thesis, we first give a brief summary of the fractional Fourier transform, which is a generalization of the ordinary Fourier transform, and discuss its importance in optical and digital signal processing and its relation to time-frequency representations. We then introduce the concept of filtering circuits in fractional Fourier domains. This concept unifies the multi-stage (repeated) and multi-channel (parallel) filtering configurations, which are in turn generalizations of single domain filtering in fractional Fourier domains. We show that these filtering configurations allow a cost-accuracy tradeoff by adjusting the number of stages or channels. We then consider the application of these configurations to three important problems in optical and digital signal processing, namely system synthesis, signal synthesis, and signal recovery. In the system and signal synthesis problems, we try to synthesize a desired system characterized by its kernel, or a desired signal characterized by its second order statistics, by using fractional Fourier domain filtering circuits. In the signal recovery problem, we try to recover or estimate a desired signal from its degraded version. In all of the examples we give, significant improvements in performance are obtained with respect to single domain filtering methods, with only modest increases in optical or digital implementation costs. Similarly, when the proposed method is compared with the direct implementation of general linear systems, we see that significant computational savings are obtained with acceptable decreases in performance. Kutay, Mehmet Alper. Ph.D.

    Evolving Networks To Have Intelligence Realized At Nanoscale


    Algorithms for Scheduling Problems

    This edited book presents new results in the area of algorithm development for different types of scheduling problems. Its eleven chapters give algorithms for single machine problems, flow-shop and job-shop scheduling problems (including their hybrid (flexible) variants), the resource-constrained project scheduling problem, scheduling problems in complex manufacturing systems and supply chains, and workflow scheduling problems. The chapters address such subjects as insertion heuristics for energy-efficient scheduling, the re-scheduling of train traffic in real time, control algorithms for short-term scheduling in manufacturing systems, bi-objective optimization of tortilla production, scheduling problems with uncertain (interval) processing times, workflow scheduling for digital signal processor (DSP) clusters, and many more.