
    Hybrid analog-digital transmit beamforming for spectrum sharing backhaul networks

    © 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
    This paper deals with the problem of analog-digital transmit beamforming under spectrum sharing constraints for backhaul systems. In fully digital designs, the spatial processing is done at the baseband unit, with all the flexible computational resources of digital processors. In contrast, analog-digital beamforming schemes require that certain processing be done by analog components, such as phase shifters or switches. These analog components do not have the processing flexibility of the digital processor, but they can substantially reduce the cost and complexity of the beamforming solution. This paper presents the joint optimization of the analog and digital parts, which results in a nonconvex, NP-hard, and coupled problem. To solve it, an alternating optimization with a penalized convex-concave method is proposed. According to the simulation results, this novel iterative procedure finds a solution that performs close to the upper bound set by fully digital beamforming.
    Peer Reviewed. Postprint (author's final draft)

    Flexible compiler-managed L0 buffers for clustered VLIW processors

    Wire delays are a major concern for current and forthcoming processors. One approach to this problem is to divide the processor into semi-independent units referred to as clusters. A cluster usually consists of a local register file and a subset of the functional units, while the data cache remains centralized. However, as technology evolves, the latency of such a centralized cache increases, leading to a significant performance impact. In this paper, we propose to include flexible low-latency buffers in each cluster in order to reduce the performance impact of higher cache latencies. The reduced number of entries in each buffer permits the design of flexible ways to map data from L1 to these buffers. The proposed L0 buffers are managed by the compiler, which is responsible for deciding which memory instructions make use of them. Effective instruction scheduling techniques are proposed to generate code that exploits these buffers. Results for the Mediabench benchmark suite show that the performance of a clustered VLIW processor with a unified L1 data cache is improved by 16% when such buffers are used. In addition, the proposed architecture shows significant advantages over both MultiVLIW processors and clustered processors with a word-interleaved cache, two state-of-the-art designs with a distributed L1 data cache.
    Peer Reviewed. Postprint (published version)

    Distributed data cache designs for clustered VLIW processors

    Wire delays are a major concern for current and forthcoming processors. One approach to deal with this problem is to divide the processor into semi-independent units referred to as clusters. A cluster usually consists of a local register file and a subset of the functional units, while the L1 data cache typically remains centralized in what we call partially distributed architectures. However, as technology evolves, the relative latency of such a centralized cache will increase, leading to a significant impact on performance. In this paper, we propose partitioning the L1 data cache among clusters for clustered VLIW processors. We refer to this kind of design as fully distributed processors. In particular, we propose and evaluate three different configurations: a snoop-based cache coherence scheme, a word-interleaved cache, and flexible L0 buffers managed by the compiler. For each alternative, instruction scheduling techniques targeted to cyclic code are developed. Results for the Mediabench suite show that the performance of such fully distributed architectures is always better than the performance of a partially distributed one with the same amount of resources. In addition, the key aspects of each fully distributed configuration are explored.
    Peer Reviewed. Postprint (published version)

    B+-tree Index Optimization by Exploiting Internal Parallelism of Flash-based Solid State Drives

    Previous research addressed the potential problems that the hard-disk-oriented design of DBMSs poses on flashSSDs. In this paper, we focus on exploiting the potential benefits of flashSSDs. First, we examine the internal parallelism of flashSSDs by benchmarking various flashSSDs. Then, we suggest algorithm-design principles for making the best use of the internal parallelism. We present a new I/O request concept, called psync I/O, that can exploit the internal parallelism of flashSSDs in a single process. Based on these ideas, we introduce B+-tree optimization methods that utilize internal parallelism. By integrating the results of these methods, we present a B+-tree variant, PIO B-tree. We confirmed that each optimization method substantially enhances the index performance. Consequently, PIO B-tree enhanced B+-tree's insert performance by a factor of up to 16.3, while improving point-search performance by a factor of 1.2. The range search of PIO B-tree was up to 5 times faster than that of the B+-tree. Moreover, PIO B-tree outperformed other flash-aware indexes in various synthetic workloads. We also confirmed that PIO B-tree outperforms B+-tree on index traces collected inside the PostgreSQL DBMS with the TPC-C benchmark.
    Comment: VLDB201

    Large-scale multilayer architecture of single-atom arrays with individual addressability

    We report on the realization of large-scale 3D multilayer configurations of planar arrays of individual neutral atoms with immediate applications in quantum science and technology: a microlens-generated Talbot optical lattice. In this novel platform, the single-beam illumination of a microlens array constitutes a structurally robust and wavelength-universal method for the realization of 3D atom arrays with favourable scaling properties due to the inherent self-imaging of the focal structure. Thus, 3D scaling comes without the requirement of extra resources. We demonstrate the trapping and imaging of individual rubidium atoms and the in-plane assembly of defect-free single-atom arrays in several Talbot planes. We present interleaved lattices with dynamic position control and parallelized sub-lattice addressing of spin states.

    Empirical Comparison of Chirp and Multitones on Experimental UWB Software Defined Radar Prototype

    This paper proposes and tests an approach for an unbiased study of radar waveform performance. Using an ultrawideband software defined radar prototype, the performance of Chirp and Multitones waveforms is compared in terms of range profile and detection range. The architecture was implemented and performs comparably to state-of-the-art software defined radar prototypes. The experimental results are consistent with the simulations.