2,488 research outputs found

    High-Throughput Soft-Output MIMO Detector Based on Path-Preserving Trellis-Search Algorithm

    Get PDF
    In this paper, we propose a novel path-preserving trellis-search (PPTS) algorithm and its high-speed VLSI architecture for soft-output multiple-input-multiple-output (MIMO) detection. We represent the search space of the MIMO signal with an unconstrained trellis, where each node in stage of the trellis maps to a possible complex-valued symbol transmitted by antenna. Based on the trellis model, we convert the soft-output MIMO detection problem into a multiple shortest paths problem subject to the constraint that every trellis node must be covered in this set of paths. The PPTS detector is guaranteed to have soft information for every possible symbol transmitted on every antenna so that the log-likelihood ratio (LLR) for each transmitted data bit can be more accurately formed. Simulation results show that the PPTS algorithm can achieve near-optimal error performance with a low search complexity. The PPTS algorithm is a hardware-friendly data-parallel algorithm because the search operations are evenly distributed among multiple trellis nodes for parallel processing. As a case study, we have designed and synthesized a fully-parallel systolic-array detector and two folded detectors for a 4x4 16-QAM system using a 1.08 V TSMC 65-nm CMOS technology.With a 1.18 mm2 core area, the folded detector can achieve a throughput of 2.1 Gbps.With a 3.19 mm2 core area, the fully-parallel systolic-array detector can achieve a throughput of 6.4 Gbps

    Optimum non linear binary image restoration through linear grey-scale operations

    Get PDF
    Non-linear image processing operators give excellent results in a number of image processing tasks such as restoration and object recognition. However they are frequently excluded from use in solutions because the system designer does not wish to introduce additional hardware or algorithms and because their design can appear to be ad hoc. In practice the median filter is often used though it is rarely optimal. This paper explains how various non-linear image processing operators may be implemented on a basic linear image processing system using only convolution and thresholding operations. The paper is aimed at image processing system developers wishing to include some non-linear processing operators without introducing additional system capabilities such as extra hardware components or software toolboxes. It may also be of benefit to the interested reader wishing to learn more about non-linear operators and alternative methods of design and implementation. The non-linear tools include various components of mathematical morphology, median and weighted median operators and various order statistic filters. As well as describing novel algorithms for implementation within a linear system the paper also explains how the optimum filter parameters may be estimated for a given image processing task. This novel approach is based on the weight monotonic property and is a direct rather than iterated method

    Communication architectures for single-chip data routers.

    Get PDF
    This dissertation presents a novel architectural technique for systolic architectures for applications which traditionally use high wire organizations in VLSI. Following a review of current VLSI research and VLSI models, this dissertation argues for a particular computational model (Chazelle\u27s model) as being appropriate for today\u27s VLSI and ULSI technology. Systolic arrays are particularly suited for applications where only local interprocessor communication of data is required. In areas where non local data communication is predominant, the so called high wire organizations are traditionally used. Such networks include sorting arrays, interconnection arrays. Using Chazelle\u27s model, an analysis of well known interconnection networks shows that inefficient systolic arrays, for routing and for sorting, outperform, so far as asymptotic performance metrics are concerned, high wire organizations traditionally used for such applications. This dissertation then proposes a new systolic architecture using the novel design philosophy of locally long but globally short connections. This involves designing arrays using large, complex cells instead of fine grained cells. This is termed systolic architectures using cells of controllable complexity since the latency and/or pipeline period requirement of a user determines the size and hence the interconnection complexity of the cells in a systolic array of complex cells. It turns out that many important application areas (e.g., interconnection networks, sorting networks and FFT) are suitable candidates for this approach. This class of architectures is well suited for ULSI implementation. An experiment in designing interconnection networks show that this concept of arrays using cells of controllable complexity is useful, even in current 1.2μ\mu VLSI technology.Dept. of Electrical and Computer Engineering. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis1992 .P358. Source: Dissertation Abstracts International, Volume: 53-12, Section: B, page: 6471. Co-Supervisors: G. A. Jullien; D. Bandyopadhay. Thesis (Ph.D.)--University of Windsor (Canada), 1991

    VLSI-sorting evaluated under the linear model

    Get PDF
    AbstractThere are several different models of computation used on which to base evaluations of VLSI sorting algorithms and there are different measures of complexity. This paper revises complexity results under the linear model that have been gained under the constant model. This approach is due to expected technological development (see Mangir, 1983; Thompson and Raghavan, 1984; Vitanyi, 1984a, 1984b).For the constant model we know that for medium sized keys there are AT2and AP2 optimal sorting algorithms with T ranging from ω(log n) to O(√nk) and P ranging from Ω(1) to O(√nk) (Bilardi, 1984). The main results of asymptotic analysis of sorting algorithms under the linear model are that the lower bounds allow AT2 optimal sorting algorithms only for T = Θ(√nk) but allow AP2 algorithms in the same range as under the constant model. Furthermore the sorting algorithms presented in this paper meet these lower bounds. This proves that these bounds cannot be improved for k = Θ (log n). The building block for the realization of these sorting algorithms is a comparison exchange module that compares r × s bit matrices in time TC = Θ(r + s) on an area AC = Θ(r2) (not including the storage area for the keys).For problem sizes that exceed realistic chip capacities, chip-external sorting algorithms can be used. In this paper two different chip-external sorting algorithms (BBB(S) and TWB(S)) are presented. They are designed to be implemented on a single board. They use a sorting chip S to perform the sort-split operation on blocks of data BBB(S) and TWB(S) are systolic algorithms using local communication only so that their evaluation does not depend on whether the constant or the linear model is used. Furthermore it seems obvious that their design is technically feasible whenever the sorting chip S is technically feasible.TWB has optimal asymptotic time complexity, so its existence proves that under the linear model external sorting can be done asymptotically as fast as under the constant model. The time complexity of TWB(S) is linearly dependent on the speed gs = nsts. It is shown that the speed if looked at as a function of the chip capacity C is asymptotically maximal for AT2 optimal sorting algorithms. Thus S should be a sorting algorithm similar to the M-M-sorter presented in this paper. A major disadvantage of TWB(S) is that it cannot exploit the maximal throughput ds = ns/ps of a systolic sorting algorithm S.Therefore algorithm BBB(S) is introduced. The time complexity of BBB(S) is linearly dependent on ds. It is shown that the throughput is maximal for AP2 optimal algorithms. There is a wide range of such sorting algorithms including algorithms that can be realized in a way that is independent of the length of the keys. For example, BBB(S) with S being a highly parallel version of odd-even transposition sort has this kind of flexibility. A disadvantage of BBB(S) is that it is asymptotically slower than TWB(S)

    Design of testbed and emulation tools

    Get PDF
    The research summarized was concerned with the design of testbed and emulation tools suitable to assist in projecting, with reasonable accuracy, the expected performance of highly concurrent computing systems on large, complete applications. Such testbed and emulation tools are intended for the eventual use of those exploring new concurrent system architectures and organizations, either as users or as designers of such systems. While a range of alternatives was considered, a software based set of hierarchical tools was chosen to provide maximum flexibility, to ease in moving to new computers as technology improves and to take advantage of the inherent reliability and availability of commercially available computing systems
    • …
    corecore