444 research outputs found
Efficient Radio Resource Allocation Schemes and Code Optimizations for High Speed Downlink Packet Access Transmission
As an important enhancement to the Wideband Code Division Multiple Access
(WCDMA) air interface of 3G mobile communications, the High Speed Downlink
Packet Access (HSDPA) standard was launched to realize higher spectral
utilization efficiency. It introduces multicode CDMA transmission
and the Adaptive Modulation and Coding (AMC) technique, which make radio resource
allocation both feasible and essential. This thesis studies channel-aware resource
allocation schemes, coupled with fast power adjustment and spreading code optimization
techniques, for the HSDPA standard operating over frequency-selective
channels.
A two-group resource allocation scheme is developed in order to achieve a
promising balance between performance enhancement and time efficiency. It only
requires calculating two parameters to specify the allocations of discrete bit rates
and transmitted symbol energies in all channels. The thesis develops the calculation
methods of the two parameters for interference-free and interference-present
channels, respectively. For the interference-present channels, the performance of
two-group allocation can be further enhanced by applying a clustering-based channel
removal scheme.
In order to make the two-group approach more time-efficient, reduction in
matrix inversions in optimum energy calculation is then discussed. When the
Minimum Mean Square Error (MMSE) equalizer is applied, optimum energy allocation
can be calculated by iterating a set of eigenvalues and eigenvectors. With
the MMSE Successive Interference Cancellation (SIC) receiver, the optimum
energies are calculated recursively and combined with an optimum channel ordering
scheme to improve both system performance and time efficiency.
This thesis then studies signature optimization methods for multipath
channels and examines their system performance when combined with different
resource allocation methods. Two multipath-aware signature optimization methods
are developed by applying iterative optimization techniques, for systems
using an MMSE equalizer and an MMSE precoder, respectively. A PAM system using
complex signature sequences is also examined for improving resource utilization
efficiency, where two receiving schemes are proposed to take full advantage of
PAM features. In addition, by applying a short chip sampling window, a Singular
Value Decomposition (SVD) based interference-free signature design method is
presented.
Run-Time Reconfigurable FFT Engine
This paper develops a system-level architecture for implementing a cost-efficient, FPGA-based real-time FFT engine. The approach considers both hardware cost (in terms of FPGA resource requirements) and performance (in terms of throughput). These two dimensions are optimized using run-time reconfiguration, double buffering, and hardware virtualization to reuse the available processing components. The system employs sixteen reconfigurable parallel FFT cores. Each core is a 16-complex-point parallel FFT processor running in a continuous real-time FFT engine. As a demonstrator of the design idea, the architecture supports a transform length of 256 complex points, uses fixed-point arithmetic, and is built on a radix-4 architecture. The parallel Booth technique is chosen for realizing the complex multiplier (required in the basic butterfly operation) because it saves considerable hardware compared to other techniques. Simulation results, obtained using the VHDL modeling language and ModelSim software, show that the full design can be implemented on a single FPGA platform requiring about 50,000 slices.
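To make the radix-4 butterfly mentioned in this abstract concrete, here is a minimal software sketch, not the paper's hardware design. It assumes floating-point arithmetic instead of the fixed-point Booth multipliers described above, and the names `fft_radix4` and `dft_naive` are hypothetical helpers introduced only for illustration.

```python
import cmath


def fft_radix4(x):
    """Recursive radix-4 decimation-in-time FFT; len(x) must be a power of 4."""
    n = len(x)
    if n == 1:
        return list(x)
    # Split into four interleaved sub-sequences and transform each recursively.
    subs = [fft_radix4(x[r::4]) for r in range(4)]
    q = n // 4
    out = [0j] * n
    for k in range(q):
        # Twiddle factors W_n^(r*k) scale the output of sub-transform r.
        t = [subs[r][k] * cmath.exp(-2j * cmath.pi * r * k / n) for r in range(4)]
        # Radix-4 butterfly: a 4-point DFT whose multipliers are only 1, -1, j, -j.
        out[k] = t[0] + t[1] + t[2] + t[3]
        out[k + q] = t[0] - 1j * t[1] - t[2] + 1j * t[3]
        out[k + 2 * q] = t[0] - t[1] + t[2] - t[3]
        out[k + 3 * q] = t[0] + 1j * t[1] - t[2] - 1j * t[3]
    return out


def dft_naive(x):
    """Direct O(n^2) DFT, used only as a reference for checking."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
        for k in range(n)
    ]


if __name__ == "__main__":
    # 256 complex points, the transform length quoted in the abstract (256 = 4**4).
    signal = [complex(i % 7, -(i % 5)) for i in range(256)]
    fast = fft_radix4(signal)
    slow = dft_naive(signal)
    print(max(abs(a - b) for a, b in zip(fast, slow)))
```

The butterfly itself needs no general multiplications, so the complex multiplier is only exercised by the twiddle factors. That is presumably why the multiplier design matters for hardware cost, although the sketch says nothing about FPGA resource usage.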
Fast Fourier transforms on energy-efficient application-specific processors
Many of the current applications used in battery-powered devices come from the digital signal processing, telecommunication, and multimedia domains. Traditionally, application-specific fixed-function circuits have been used in these designs, in the form of application-specific integrated circuits (ASIC), to reach the required performance and energy-efficiency. The complexity of these applications has increased over the years, and the design complexity has increased even faster, which implies increased design time. At the same time, there are more and more standards to be supported, so using optimised fixed-function implementations for all the functions in all the standards is impractical. The non-recurring engineering costs for integrated circuits have also increased significantly, so manufacturers can afford fewer chip iterations. Although tailoring the circuit for a specific application provides the best performance and/or energy-efficiency, such an approach lacks flexibility. For example, if an error is found after manufacturing, an expensive chip iteration is required. In addition, new functionalities cannot be added afterwards to support the evolution of standards.
Flexibility can be obtained with software-based implementation technologies. Unfortunately, general-purpose processors do not provide the energy-efficiency of fixed-function circuit designs. A useful trade-off between flexibility and performance is an implementation based on application-specific processors (ASP), where programmability provides the flexibility and computational resources customised for the given application provide the performance.
In this thesis, application-specific processors are considered by using the fast Fourier transform as the representative algorithm. The architectural template used here is the transport triggered architecture (TTA), which resembles very long instruction word machines, but the operand execution resembles data flow machines rather than traditional operand triggering. The developed TTA processors exploit the inherent parallelism of the application. In addition, several characteristics of the application have been identified, and these are exploited by developing customised functional units to speed up the execution. Several customisations are proposed for the data path of the processor, but it is also important to match the memory bandwidth to the computation speed. This calls for a memory organisation supporting parallel memory accesses. The proposed optimisations have been used to improve the energy-efficiency of the processor, and experiments show that a programmable solution can have energy-efficiency comparable to fixed-function ASIC designs.
Modular quantum signal processing in many variables
Despite significant advances in quantum algorithms, quantum programs in
practice are often expressed at the circuit level, forgoing helpful structural
abstractions common to their classical counterparts. Consequently, as many
quantum algorithms have been unified with the advent of quantum signal
processing (QSP) and quantum singular value transformation (QSVT), an
opportunity has appeared to cast these algorithms as modules that can be
combined to constitute complex programs. Complicating this, however, is that
while QSP/QSVT are often described by the polynomial transforms they apply to
the singular values of large linear operators, and the algebraic manipulation
of polynomials is simple, the QSP/QSVT protocols realizing analogous
manipulations of their embedded polynomials are non-obvious. Here we provide a
theory of modular multi-input-output QSP-based superoperators, the basic unit
of which we call a gadget, and show they can be snapped together with LEGO-like
ease at the level of the functions they apply. To demonstrate this ease, we
also provide a Python package for assembling gadgets and compiling them to
circuits. Viewed alternately, gadgets both enable the efficient block encoding
of large families of useful multivariable functions, and substantiate a
functional-programming approach to quantum algorithm design in recasting QSP
and QSVT as monadic types.
Comment: 15 pages + 9 figures + 4 tables + 45 pages supplement. For codebase,
see https://github.com/ichuang/pyqsp/tree/bet
Parallelization of dynamic programming recurrences in computational biology
The rapid growth of biosequence databases over the last decade has led to a performance bottleneck in the applications analyzing them. In particular, over the last five years the DNA sequencing capacity of next-generation sequencers has been doubling every six months as costs have plummeted. The data produced by these sequencers is overwhelming traditional compute systems. We believe that in the future compute performance, not sequencing, will become the bottleneck in advancing genome science. In this work, we investigate novel computing platforms to accelerate dynamic programming algorithms, which are popular in bioinformatics workloads. We study algorithm-specific hardware architectures that exploit fine-grained parallelism in dynamic programming kernels using field-programmable gate arrays (FPGAs). We advocate a high-level synthesis approach, using the recurrence equation abstraction to represent dynamic programming and polyhedral analysis to exploit parallelism. We suggest a novel technique within the polyhedral model to optimize for throughput by pipelining independent computations on an array. This design technique improves on the state of the art, which builds latency-optimal arrays. We also suggest a method to dynamically switch between a family of designs using FPGA reconfiguration to achieve a significant performance boost. We have used polyhedral methods to parallelize the Nussinov RNA folding algorithm to build a family of accelerators that can trade resources for parallelism and are 15x to 130x faster than a modern dual-core CPU implementation. A Zuker RNA folding accelerator we built on a single workstation with four Xilinx Virtex 4 FPGAs outperforms 198 Intel Core 2 Duo processors running at 3 GHz. Furthermore, our design running on a single FPGA is an order of magnitude faster than competing implementations on similar-generation FPGAs and graphics processors.
Our work is a step toward the goal of automated synthesis of hardware accelerators for dynamic programming algorithms.
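As an illustration of the kind of recurrence this abstract refers to, below is a minimal sequential sketch of the textbook Nussinov base-pair maximization recurrence. It is not the authors' accelerator design. The function name `nussinov`, the `min_loop` parameter, and the choice to allow G-U wobble pairs are assumptions made for this example.

```python
def nussinov(seq, min_loop=0):
    """Return the maximum number of nested base pairs in an RNA sequence.

    Assumptions for this sketch: Watson-Crick and G-U wobble pairs are allowed,
    and positions i < j may pair only if j - i > min_loop.
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # score[i][j] holds the best pair count for the subsequence seq[i..j].
    score = [[0] * n for _ in range(n)]
    # Fill the table by increasing span, i.e. one anti-diagonal at a time.
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            # Case 1 and 2: leave position i or position j unpaired.
            best = max(score[i + 1][j], score[i][j - 1])
            # Case 3: pair i with j around the enclosed subsequence.
            if span > min_loop and (seq[i], seq[j]) in pairs:
                inner = score[i + 1][j - 1] if i + 1 <= j - 1 else 0
                best = max(best, inner + 1)
            # Case 4: split into two independent sub-structures at k.
            for k in range(i + 1, j):
                best = max(best, score[i][k] + score[k + 1][j])
            score[i][j] = best
    return score[0][n - 1]


if __name__ == "__main__":
    print(nussinov("GGGAAAUCC"))
```

Every cell on a given anti-diagonal depends only on cells with a smaller span, so all cells of one span can in principle be computed at the same time. This is the fine-grained parallelism that array-style hardware can exploit. The sketch itself runs sequentially in O(n^3) time.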
An Outlook on Design Technologies for Future Integrated Systems
The economic and social demand for ubiquitous and multifaceted electronic systems, in combination with the unprecedented opportunities provided by the integration of various manufacturing technologies, is paving the way to a new class of heterogeneous integrated systems, with increased performance and connectedness and providing us with gateways to the living world. This paper surveys design requirements and solutions for heterogeneous systems and addresses design technologies for realizing them.
Generalized filtering configurations with applications in digital and optical signal and image processing
Ankara: Department of Electrical and Electronics Engineering and Institute of Engineering and Sciences, Bilkent Univ., 1999. Thesis (Ph.D.) -- Bilkent University, 1999. Includes bibliographical references.
In this thesis, we first give a brief summary of the fractional Fourier transform, which
is the generalization of the ordinary Fourier transform, discuss its importance in
optical and digital signal processing and its relation to time-frequency representations.
We then introduce the concept of filtering circuits in fractional Fourier domains.
This concept unifies the multi-stage (repeated) and multi-channel (parallel) filtering
configurations which are in turn generalizations of single domain filtering in fractional
Fourier domains. We show that these filtering configurations allow a cost-accuracy tradeoff
by adjusting the number of stages or channels. We then consider the application
of these configurations to three important problems, namely system synthesis, signal
synthesis, and signal recovery, in optical and digital signal processing. In the system
and signal synthesis problems, we try to synthesize a desired system characterized by its
kernel, or a desired signal characterized by its second order statistics by using fractional
Fourier domain filtering circuits. In the signal recovery problem, we try to recover or
estimate a desired signal from its degraded version. In all of the examples we give,
significant improvements in performance are obtained with respect to single domain
filtering methods with only modest increases in optical or digital implementation costs.
Similarly, when the proposed method is compared with the direct implementation of
general linear systems, we see that significant computational savings are obtained with
acceptable decreases in performance.
Kutay, Mehmet Alper, Ph.D.
Algorithms for Scheduling Problems
This edited book presents new results in the area of algorithm development for different types of scheduling problems. Its eleven chapters give algorithms for single machine problems, flow-shop and job-shop scheduling problems (including their hybrid (flexible) variants), the resource-constrained project scheduling problem, scheduling problems in complex manufacturing systems and supply chains, and workflow scheduling problems. The chapters address such subjects as insertion heuristics for energy-efficient scheduling, the re-scheduling of train traffic in real time, control algorithms for short-term scheduling in manufacturing systems, bi-objective optimization of tortilla production, scheduling problems with uncertain (interval) processing times, workflow scheduling for digital signal processor (DSP) clusters, and many more.