1,872 research outputs found

    AirSync: Enabling Distributed Multiuser MIMO with Full Spatial Multiplexing

    Full text link
    The enormous success of advanced wireless devices is pushing the demand for higher wireless data rates. Denser spectrum reuse through the deployment of more access points per square mile has the potential to successfully meet the increasing demand for more bandwidth. In theory, the best approach to density increase is via distributed multiuser MIMO, where several access points are connected to a central server and operate as a large distributed multi-antenna access point, ensuring that all transmitted signal power serves the purpose of data transmission, rather than creating "interference." In practice, while enterprise networks offer a natural setup in which distributed MIMO might be possible, there are serious implementation difficulties, the primary one being the need to eliminate phase and timing offsets between the jointly coordinated access points. In this paper we propose AirSync, a novel scheme which provides not only time but also phase synchronization, thus enabling distributed MIMO with full spatial multiplexing gains. AirSync locks the phase of all access points using a common reference broadcasted over the air in conjunction with a Kalman filter which closely tracks the phase drift. We have implemented AirSync as a digital circuit in the FPGA of the WARP radio platform. Our experimental testbed, comprised of two access points and two clients, shows that AirSync is able to achieve phase synchronization within a few degrees, and allows the system to nearly achieve the theoretical optimal multiplexing gain. We also discuss MAC and higher layer aspects of a practical deployment. To the best of our knowledge, AirSync offers the first ever realization of the full multiuser MIMO gain, namely the ability to increase the number of wireless clients linearly with the number of jointly coordinated access points, without reducing the per client rate.Comment: Submitted to Transactions on Networkin

    Code transformations based on speculative SDC scheduling

    Get PDF
    Code motion and speculations are usually exploited in the High Level Synthesis of control dominated applications to improve the performances of the synthesized designs. Selecting the transformations to be applied is not a trivial task: their effects can indeed indirectly spread across the whole design, potentially worsening the quality of the results. In this paper we propose a code transformation flow, based on a new extension of the System of Difference Constraints (SDC) scheduling algorithm, which introduces a large number of transformations, whose profitability is guaranteed by SDC formulation. Experimental results show that the proposed technique in average reduces the execution time of control dominated applications by 37% with respect to a commercial tool without increasing the area usage

    Advances in ILP-based Modulo Scheduling for High-Level Synthesis

    Get PDF
    In today's heterogenous computing world, field-programmable gate arrays (FPGA) represent the energy-efficient alternative to generic processor cores and graphics accelerators. However, due to their radically different computing model, automatic design methods, such as high-level synthesis (HLS), are needed to harness their full power. HLS raises the abstraction level to behavioural descriptions of algorithms, thus freeing designers from dealing with tedious low-level concerns, and enabling a rapid exploration of different microarchitectures for the same input specification. In an HLS tool, scheduling is the most influential step for the performance of the generated accelerator. Specifically, modulo schedulers enable a pipelined execution, which is a key technique to speed up the computation by extracting more parallelism from the input description. In this thesis, we make a case for the use of integer linear programming (ILP) as a framework for modulo scheduling approaches. First, we argue that ILP-based modulo schedulers are practically usable in the HLS context. Secondly, we show that the ILP framework enables a novel approach for the automatic design of FPGA accelerators. We substantiate the first claim by proposing a new, flexible ILP formulation for the modulo scheduling problem, and evaluate it experimentally with a diverse set of realistic test instances. While solving an ILP may incur an exponential runtime in the worst case, we observe that simple countermeasures, such as setting a time limit, help to contain the practical impact of outlier instances. Furthermore, we present an algorithm to compress problems before the actual scheduling. An HLS-generated microarchitecture is comprised of operators, i.e. single-purpose functional units such as a floating-point multiplier. Usually, the allocation of operators is determined before scheduling, even though both problems are interdependent. To that end, we investigate an extension of the modulo scheduling problem that combines both concerns in a single model. Based on the extension, we present a novel multi-loop scheduling approach capable of finding the fastest microarchitecture that still fits on a given FPGA device - an optimisation problem that current commercial HLS tools cannot solve. This proves our second claim

    Optimality of Treating Interference as Noise: A Combinatorial Perspective

    Get PDF
    For single-antenna Gaussian interference channels, we re-formulate the problem of determining the Generalized Degrees of Freedom (GDoF) region achievable by treating interference as Gaussian noise (TIN) derived in [3] from a combinatorial perspective. We show that the TIN power control problem can be cast into an assignment problem, such that the globally optimal power allocation variables can be obtained by well-known polynomial time algorithms. Furthermore, the expression of the TIN-Achievable GDoF region (TINA region) can be substantially simplified with the aid of maximum weighted matchings. We also provide conditions under which the TINA region is a convex polytope that relax those in [3]. For these new conditions, together with a channel connectivity (i.e., interference topology) condition, we show TIN optimality for a new class of interference networks that is not included, nor includes, the class found in [3]. Building on the above insights, we consider the problem of joint link scheduling and power control in wireless networks, which has been widely studied as a basic physical layer mechanism for device-to-device (D2D) communications. Inspired by the relaxed TIN channel strength condition as well as the assignment-based power allocation, we propose a low-complexity GDoF-based distributed link scheduling and power control mechanism (ITLinQ+) that improves upon the ITLinQ scheme proposed in [4] and further improves over the heuristic approach known as FlashLinQ. It is demonstrated by simulation that ITLinQ+ provides significant average network throughput gains over both ITLinQ and FlashLinQ, and yet still maintains the same level of implementation complexity. More notably, the energy efficiency of the newly proposed ITLinQ+ is substantially larger than that of ITLinQ and FlashLinQ, which is desirable for D2D networks formed by battery-powered devices.Comment: A short version has been presented at IEEE International Symposium on Information Theory (ISIT 2015), Hong Kon

    High-level synthesis of fine-grained weakly consistent C concurrency

    Get PDF
    High-level synthesis (HLS) is the process of automatically compiling high-level programs into a netlist (collection of gates). Given an input program, HLS tools exploit its inherent parallelism and pipelining opportunities to generate efficient customised hardware. C-based programs are the most popular input for HLS tools, but these tools historically only synthesise sequential C programs. As the appeal for software concurrency rises, HLS tools are beginning to synthesise concurrent C programs, such as C/C++ pthreads and OpenCL. Although supporting software concurrency leads to better hardware parallelism, shared memory synchronisation is typically serialised to ensure correct memory behaviour, via locks. Locks are safety resources that ensure exclusive access of shared memory, eliminating data races and providing synchronisation guarantees for programmers.Ā  As an alternative to lock-based synchronisation, the C memory model also defines the possibility of lock-free synchronisation via fine-grained atomic operations (`atomics'). However, most HLS tools either do not support atomics at all or implement atomics using locks. Instead, we treat the synthesis of atomics as a scheduling problem. We show that we can augment the intra-thread memory constraints during memory scheduling of concurrent programs to support atomics. On average, hardware generated by our method is 7.5x faster than the state-of-the-art, for our set of experiments. Our method of synthesising atomics enables several unique possibilities. Chiefly, we are capable of supporting weakly consistent (`weak') atomics, which necessitate fewer ordering constraints compared to sequentially consistent (SC) atomics. However, implementing weak atomics is complex and error-prone and hence we formally verify our methods via automated model checking to ensure our generated hardware is correct. Furthermore, since the C memory model defines memory behaviour globally, we can globally analyse the entire program to generate its memory constraints. Additionally, we can also support loop pipelining by extending our methods to generate inter-iteration memory constraints. On average, weak atomics, global analysis and loop pipelining improve performance by 1.6x, 3.4x and 1.4x respectively, for our set of experiments. Finally, we present a case study of a real-world example via an HLS-based Google PageRank algorithm, whose performance improves by 4.4x via lock-free streaming and work-stealing.Open Acces

    Modelling and Analysis for Cyber-Physical Systems: An SMT-based approach

    Get PDF

    Autonomous Vehicle Coordination with Wireless Sensor and Actuator Networks

    Get PDF
    A coordinated team of mobile wireless sensor and actuator nodes can bring numerous benefits for various applications in the field of cooperative surveillance, mapping unknown areas, disaster management, automated highway and space exploration. This article explores the idea of mobile nodes using vehicles on wheels, augmented with wireless, sensing, and control capabilities. One of the vehicles acts as a leader, being remotely driven by the user, the others represent the followers. Each vehicle has a low-power wireless sensor node attached, featuring a 3D accelerometer and a magnetic compass. Speed and orientation are computed in real time using inertial navigation techniques. The leader periodically transmits these measures to the followers, which implement a lightweight fuzzy logic controller for imitating the leader's movement pattern. We report in detail on all development phases, covering design, simulation, controller tuning, inertial sensor evaluation, calibration, scheduling, fixed-point computation, debugging, benchmarking, field experiments, and lessons learned
    • ā€¦
    corecore