Efficient Detectors for MIMO-OFDM Systems under Spatial Correlation Antenna Arrays
This work analyzes the performance of implementable detectors for the
multiple-input-multiple-output (MIMO) orthogonal frequency division
multiplexing (OFDM) technique under specific and realistic system operating
conditions, including antenna correlation and array configuration.
A time-domain channel model was used to evaluate the system performance
under realistic communication channel and system scenarios, including different
channel correlations, modulation orders and antenna array configurations.
Several MIMO-OFDM detectors were analyzed with the aim of achieving high
performance combined with high system capacity and manageable computational
complexity. Numerical Monte-Carlo simulations (MCS) demonstrate the channel
selectivity effect, while the impact of the number of antennas, the adoption of
linear versus heuristic-based detection schemes, and the spatial correlation
effect under linear and planar antenna arrays are analyzed in the MIMO-OFDM
context.
Comment: 26 pgs, 16 figures and 5 tables
Minimizing Test Power in SRAM through Reduction of Pre-charge Activity
In this paper we analyze the test power of SRAM memories and demonstrate that the full functional pre-charge activity is not necessary during test mode because of the predictable addressing sequence. We exploit this observation to minimize power dissipation during test by eliminating the unnecessary power consumption associated with the pre-charge activity. This is achieved through modified pre-charge control circuitry that exploits the first degree of freedom of March tests, which allows choosing a specific addressing sequence. The efficiency of the proposed solution is validated through extensive SPICE simulations.
Selection from read-only memory with limited workspace
Given an unordered array of N elements drawn from a totally ordered set and
an integer k in the range from 1 to N, in the classic selection problem
the task is to find the k-th smallest element in the array. We study the
complexity of this problem in the space-restricted random-access model: The
input array is stored on read-only memory, and the algorithm has access to a
limited amount of workspace. We prove that the linear-time prune-and-search
algorithm---presented in most textbooks on algorithms---can be modified to use
bits instead of words of extra space. Prior to our
work, the best known algorithm by Frederickson could perform the task with
bits of extra space in time. Our result separates
the space-restricted random-access model and the multi-pass streaming model,
since we can surpass the lower bound known for the latter
model. We also generalize our algorithm for the case when the size of the
workspace is bits, where . The running time
of our generalized algorithm is ,
slightly improving over the
bound of Frederickson's algorithm. To obtain the improvements mentioned above,
we developed a new data structure, called the wavelet stack, that we use for
repeated pruning. We expect the wavelet stack to be a useful tool in other
applications as well.
Comment: 16 pages, 1 figure, Preliminary version appeared in COCOON-201
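For orientation, the textbook linear-time prune-and-search algorithm that the abstract refers to can be sketched as below. This is only the classic version (median-of-medians pivoting with list copies, so it uses a linear number of extra words); it is not the space-restricted variant or the wavelet stack from the paper. The function name `select` and the group size of five are illustrative choices.

```python
def select(items, k):
    """Return the k-th smallest element (1-based) of items.

    Classic prune-and-search: pick a pivot via the median of medians,
    discard the side that cannot contain the answer, and repeat.
    """
    if not 1 <= k <= len(items):
        raise ValueError("k must be in the range 1..len(items)")
    values = list(items)  # working copy; the input itself is never modified
    while True:
        if len(values) <= 5:
            return sorted(values)[k - 1]
        # Median of each group of (at most) five consecutive elements.
        medians = []
        for start in range(0, len(values), 5):
            group = sorted(values[start:start + 5])
            medians.append(group[len(group) // 2])
        # The pivot is the median of the group medians, found recursively.
        pivot = select(medians, (len(medians) + 1) // 2)
        smaller = [v for v in values if v < pivot]
        larger = [v for v in values if v > pivot]
        n_equal = len(values) - len(smaller) - len(larger)
        if k <= len(smaller):
            values = smaller            # prune everything >= pivot
        elif k <= len(smaller) + n_equal:
            return pivot                # the answer equals the pivot
        else:
            k -= len(smaller) + n_equal
            values = larger             # prune everything <= pivot
```

Calling `select(data, k)` gives the same result as `sorted(data)[k - 1]`, but each round discards a constant fraction of the remaining elements, which is what makes the running time linear.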
Executing matrix multiply on a process oriented data flow machine
The Process-Oriented Dataflow System (PODS) is an execution model that combines the von Neumann and dataflow models of computation to gain the benefits of each. Central to PODS is the concept of array distribution and its effects on partitioning and mapping of processes. In PODS, arrays are partitioned by simply assigning consecutive elements to each processing element (PE) equally. Since PODS uses single assignment, there will be only one producer of each element. This producing PE owns that element and will perform the necessary computations to assign it. Using this approach, the filling loop is distributed across the PEs. This simple partitioning and mapping scheme provides excellent results for executing scientific code on MIMD machines. In this way PODS allows MIMD machines to exploit vector and data parallelism easily while still providing the flexibility of MIMD over SIMD for multi-user systems. In this paper, the classic matrix multiply algorithm, with 1024 data points, is executed on a PODS simulator and the results are presented and discussed. Matrix multiply is a good example because it has several interesting properties: there are multiple code-blocks; a new array must be dynamically allocated and distributed; there is a loop-carried dependency in the innermost loop; the two input arrays have different access patterns; and the sizes of the input arrays are not known at compile time. Matrix multiply also forms the basis for many important scientific algorithms such as LU decomposition, convolution, and the Fast Fourier Transform. The results show that PODS is comparable to both Iannucci's Hybrid Architecture and MIT's TTDA in terms of overhead and instruction power. They also show that PODS easily distributes the work load evenly across the PEs. The key result is that PODS can scale matrix multiply in a near-linear fashion until there is little or no work to be performed for each PE. Then overhead and message passing become a major component of the execution time. With larger problems (e.g., ≥16k data points), this limit would be reached at around 256 PEs.
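The partitioning rule described in the abstract (consecutive elements assigned to each PE, with the owning PE computing the elements it owns) can be illustrated with a small sequential simulation. This is a sketch under assumptions, not the PODS simulator: the names `block_ranges` and `matmul_owner_computes` are hypothetical, the PEs are simulated by an ordinary loop, and handing the first PEs one extra element when the split is uneven is an assumption the abstract does not spell out.

```python
def block_ranges(n_elements, n_pes):
    """Split indices 0..n_elements-1 into n_pes consecutive blocks.

    Assumption: when the split is uneven, the first (n_elements % n_pes)
    PEs receive one extra element each.
    """
    base, extra = divmod(n_elements, n_pes)
    ranges, start = [], 0
    for pe in range(n_pes):
        size = base + (1 if pe < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


def matmul_owner_computes(a, b, n_pes):
    """Multiply matrices a and b, filling the result block by block.

    Each simulated PE writes only the result elements it owns, and every
    element is written exactly once (single assignment).
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    flat = [None] * (rows * cols)  # result array stored in row-major order
    for owned in block_ranges(rows * cols, n_pes):
        for idx in owned:  # this PE's share of the filling loop
            i, j = divmod(idx, cols)
            flat[idx] = sum(a[i][k] * b[k][j] for k in range(inner))
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]
```

With 1024 result elements and 256 PEs, `block_ranges` gives each PE only four elements, which matches the abstract's point that scaling stops once there is little work left per PE.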
Fast k-NNG construction with GPU-based quick multi-select
In this paper we describe a new brute force algorithm for building the
k-Nearest Neighbor Graph (k-NNG). The k-NNG algorithm has many
applications in areas such as machine learning, bio-informatics, and clustering
analysis. While there are very efficient algorithms for data of low dimensions,
for high dimensional data the brute force search is the best algorithm. There
are two main parts to the algorithm: the first part is finding the distances
between the input vectors, which may be formulated as a matrix multiplication
problem. The second is the selection of the k-NNs for each of the query
vectors. For the second part, we describe a novel graphics processing unit
(GPU)-based multi-select algorithm based on quick sort. Our optimization makes
clever use of warp voting functions available on the latest GPUs along with
user-controlled cache. Benchmarks show significant improvement over
state-of-the-art implementations of the k-NN search on GPUs.
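The two parts of the brute force approach can be sketched on the CPU as follows. This is an illustration only: it is not the paper's GPU implementation, and `heapq.nsmallest` stands in for the quick multi-select step. The function name `knn_graph` and the tie-breaking by index are assumptions. The distance step uses the identity ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x·y, whose dot products are the entries of a matrix product.

```python
import heapq


def knn_graph(points, k):
    """Return, for each point, the indices of its k nearest other points.

    Part 1: squared distances from norms and dot products (the dot
    products form one row of the Gram matrix, i.e. a matrix multiply).
    Part 2: selection of the k smallest distances per query point.
    """
    n = len(points)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of points")
    norms = [sum(x * x for x in p) for p in points]
    graph = []
    for i, p in enumerate(points):
        # Row i of the Gram matrix: dot products of point i with all points.
        dots = [sum(x * y for x, y in zip(p, q)) for q in points]
        dist2 = [norms[i] + norms[j] - 2 * dots[j] for j in range(n)]
        # Stand-in for multi-select; ties are broken by the smaller index.
        candidates = (j for j in range(n) if j != i)
        nearest = heapq.nsmallest(k, candidates, key=lambda j: (dist2[j], j))
        graph.append(nearest)
    return graph
```

Each row of the result lists neighbors from nearest to farthest. The per-point selection is independent across query points, which is the property a parallel multi-select can exploit.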
Memory-Adjustable Navigation Piles with Applications to Sorting and Convex Hulls
We consider space-bounded computations on a random-access machine (RAM) where
the input is given on a read-only random-access medium, the output is to be
produced to a write-only sequential-access medium, and the available workspace
allows random reads and writes but is of limited capacity. The length of the
input is elements, the length of the output is limited by the computation,
and the capacity of the workspace is bits for some predetermined
parameter . We present a state-of-the-art priority queue---called an
adjustable navigation pile---for this restricted RAM model. Under some
reasonable assumptions, our priority queue supports and
in worst-case time and in worst-case time for any . We show how to use this
data structure to sort elements and to compute the convex hull of
points in the two-dimensional Euclidean space in
worst-case time for any . Following a known lower bound for the
space-time product of any branching program for finding unique elements, both
our sorting and convex-hull algorithms are optimal. The adjustable navigation
pile has turned out to be useful when designing other space-efficient
algorithms, and we expect that it will find its way to yet other applications.
Comment: 21 pages