Dilated Convolution based CSI Feedback Compression for Massive MIMO Systems
Although the frequency-division duplex (FDD) massive multiple-input
multiple-output (MIMO) system can offer high spectral and energy efficiency, it
requires users to feed the downlink channel state information (CSI) back to
the base station (BS) so that the BS can carry out its precoding design.
However, the large dimension of CSI matrices in the massive MIMO system makes
CSI feedback very challenging, so the feedback CSI urgently needs to be
compressed. To this end, this paper proposes a novel dilated convolution based CSI
feedback network, namely DCRNet. Specifically, the dilated convolutions are
used to enhance the receptive field (RF) of the proposed DCRNet without
increasing the convolution size. Moreover, advanced encoder and decoder blocks
are designed to improve the reconstruction performance and reduce computational
complexity as well. Numerical results are presented to show the superiority of
the proposed DCRNet over the conventional networks. In particular, the proposed
DCRNet can achieve nearly state-of-the-art (SOTA) performance with much
lower floating point operations (FLOPs). The open source code and checkpoint of
this work are available at https://github.com/recusant7/DCRNet.
Comment: This work has been submitted to the IEEE for possible publication.
Copyright may be transferred without notice, after which this version may no
longer be accessible.
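The claim that dilation enlarges the receptive field without enlarging the convolution kernel can be checked with a little arithmetic. The sketch below assumes stride-1 layers, and the kernel sizes and dilation rates are illustrative values only; they are not taken from the DCRNet architecture.

```python
def effective_kernel(kernel_size: int, dilation: int) -> int:
    """Span covered by one dilated kernel along one axis.

    A kernel with k taps and dilation d skips d - 1 positions between taps,
    so it spans k + (k - 1) * (d - 1) input positions while still holding
    only k weights.
    """
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def receptive_field(layers) -> int:
    """Receptive field of a stack of stride-1 convolutions along one axis.

    `layers` is a list of (kernel_size, dilation) pairs. Each layer widens
    the receptive field by (kernel_size - 1) * dilation positions.
    """
    rf = 1
    for kernel_size, dilation in layers:
        rf += (kernel_size - 1) * dilation
    return rf


# Hypothetical stacks: three 3-tap layers, without and with dilation.
plain = receptive_field([(3, 1), (3, 1), (3, 1)])
dilated = receptive_field([(3, 1), (3, 2), (3, 3)])

print("plain stack:", plain)      # 7 positions
print("dilated stack:", dilated)  # 13 positions, same number of weights
```

Both stacks hold the same number of weights per layer; only the spacing between the taps differs.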
Optimality study of resource binding with multi-Vdds
Deploying multiple supply voltages (multi-Vdds) on one chip is an important technique to reduce dynamic power consumption. In this work we present an optimality study for resource binding targeting designs with multi-Vdds. This is similar to the voltage-island design concept, except that the granularity of our voltage island is at the functional-unit level as opposed to the core level. We are interested in achieving the maximum number of low-Vdd operations and, at the same time, minimizing switching activity during functional unit binding. To the best of our knowledge, there is no known optimal solution to this problem. To compute an optimal solution and examine the quality gap between our solution and previous heuristic solutions, we formulate this problem as a min-cost network flow problem with special equal-flow constraints. This formulation leads to an easy reduction to an integer linear programming (ILP) solution and also enables an efficient approximate solution by Lagrangian relaxation. Experimental results show that the optimal solution computed based on our formulation provides 7% more low-Vdd operations and also reduces the total switching activity by 20% compared to one of the best known heuristic algorithms that consider multi-Vdd assignments only. Copyright 2006 ACM.
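To make the min-cost network flow idea concrete, here is a minimal sketch of a plain min-cost flow solver (successive shortest paths) applied to a toy assignment of operations to a low-Vdd and a high-Vdd functional unit. It is not the paper's formulation: the equal-flow constraints mentioned in the abstract are not modelled, and the node layout, capacities, and costs are hypothetical numbers chosen for illustration.

```python
from collections import deque


class MinCostFlow:
    """Successive-shortest-path min-cost flow on a small directed graph."""

    def __init__(self, n):
        self.n = n
        # Each edge is stored as [to, residual_capacity, cost, index_of_reverse_edge].
        self.graph = [[] for _ in range(n)]

    def add_edge(self, u, v, cap, cost):
        self.graph[u].append([v, cap, cost, len(self.graph[v])])
        self.graph[v].append([u, 0, -cost, len(self.graph[u]) - 1])

    def solve(self, s, t, max_flow):
        flow = cost = 0
        while flow < max_flow:
            # Queue-based Bellman-Ford: handles the negative-cost reverse edges.
            dist = [float("inf")] * self.n
            prev = [None] * self.n
            in_queue = [False] * self.n
            dist[s] = 0
            queue = deque([s])
            while queue:
                u = queue.popleft()
                in_queue[u] = False
                for i, (v, cap, c, _) in enumerate(self.graph[u]):
                    if cap > 0 and dist[u] + c < dist[v]:
                        dist[v] = dist[u] + c
                        prev[v] = (u, i)
                        if not in_queue[v]:
                            queue.append(v)
                            in_queue[v] = True
            if prev[t] is None:
                break  # no augmenting path left
            # Find the bottleneck on the path, then push flow along it.
            push = max_flow - flow
            v = t
            while v != s:
                u, i = prev[v]
                push = min(push, self.graph[u][i][1])
                v = u
            v = t
            while v != s:
                u, i = prev[v]
                self.graph[u][i][1] -= push
                self.graph[v][self.graph[u][i][3]][1] += push
                v = u
            flow += push
            cost += push * dist[t]
        return flow, cost


# Hypothetical toy instance: node 0 = source, 1..3 = operations,
# 4 = low-Vdd unit (one slot), 5 = high-Vdd unit (two slots), 6 = sink.
SOURCE, LOW, HIGH, SINK = 0, 4, 5, 6
ops = {"op1": 1, "op2": 2, "op3": 3}
# Made-up (low-Vdd cost, high-Vdd cost) per operation.
costs = {"op1": (2, 5), "op2": (3, 4), "op3": (1, 6)}

net = MinCostFlow(7)
for name, node in ops.items():
    net.add_edge(SOURCE, node, 1, 0)
    net.add_edge(node, LOW, 1, costs[name][0])
    net.add_edge(node, HIGH, 1, costs[name][1])
net.add_edge(LOW, SINK, 1, 0)
net.add_edge(HIGH, SINK, 2, 0)

flow, total_cost = net.solve(SOURCE, SINK, max_flow=len(ops))

# A saturated operation-to-unit edge means the operation was bound to that unit.
assignment = {}
for name, node in ops.items():
    for to, cap, _, _ in net.graph[node]:
        if to in (LOW, HIGH) and cap == 0:
            assignment[name] = "low" if to == LOW else "high"

print(flow, total_cost, assignment)
```

With only one low-Vdd slot available, the solver gives it to the operation that benefits most. A formulation with extra side constraints, such as the equal-flow constraints in the abstract, would need an ILP solver or a relaxation scheme on top of this.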
Characterization and gene expression patterns analysis implies BSK family genes respond to salinity stress in cotton
Identification, evolution, and expression pattern analysis of BSK (BR signaling kinase) family genes revealed that BSKs participate in the response of cotton to abiotic stress and help maintain cotton growth in extreme environments. The steroidal hormones brassinosteroids (BRs) play important roles in different plant biological processes. This study focused on BSKs, which are downstream regulatory elements of BR, in order to help decipher the functions of cotton BSK genes in growth, development, and responses to abiotic stresses, and to learn the evolutionary relationships of cotton BSKs. BSKs are a class of plant-specific receptor-like cytoplasmic kinases involved in BR signal transduction. In this study, bioinformatics methods were used to identify the BSK gene family at the cotton genome level, and the gene structure, promoter elements, protein structure and properties, gene expression patterns, and candidate interacting proteins were analyzed. A total of 152 BSKs were identified by a genome-wide search in four cotton species and 11 other plant species, and phylogenetic analysis revealed three evolutionary clades. BSKs contain typical PKc and TPR domains, and the N-terminus is composed of extended chains and helical structures. Cotton BSK genes show different expression patterns in different tissues and organs. The gene promoters contain numerous cis-acting elements induced by hormones and abiotic stress; elements related to the hormone ABA and to cold induction have the highest counts, indicating that cotton BSK genes may be regulated by various hormones at different growth stages and may be involved in regulating the response of cotton to various stresses. Expression analysis of BSKs in cotton showed that the expression levels of GhBSK06, GhBSK10, GhBSK21 and GhBSK24 increased significantly under salt induction.
This study is helpful for analyzing the function of cotton BSK genes in growth and development and in response to stress.
Simultaneous FU and Register Binding Based on Network Flow Method
With the rapid increase of design complexity and the decrease of device feature sizes in nano-scale technologies, interconnection optimization in digital systems becomes more and more important. In this paper we develop a simultaneous FU and register (SFR) binding algorithm for multiplexer optimization based on min-cost network flow. Unlike most prior approaches, in which functional unit binding and register binding are performed sequentially, our approach performs these two highly correlated tasks gradually and concurrently. We also present an ILP formulation of the combined functional unit and register binding problem for the optimality study of heuristics. Experimental results show that, compared to traditional binding algorithms, our simultaneous resource binding algorithm is close to optimal solutions for small designs (only 5% more MUX) and achieves significant reductions in MUX area (12%) and timing (10%) for a set of real-life benchmark designs.
Lower-Bound Estimation for Multi-Bitwidth Scheduling
In high-level synthesis, accurate lower-bound estimation helps to explore the search space efficiently and to evaluate the quality of heuristic algorithms. For lower-bound estimation in scheduling problems, previous work mainly focuses on the number of resources with uniform bitwidth. In this paper, we study the problem of lower-bound estimation on the bitwidth summation of functional units for multi-bitwidth scheduling, where datapaths are composed of operations with various bitwidths. An integer linear programming (ILP) formulation and a polynomial-time algorithm are presented. Experimental results indicate that the proposed algorithm produces good estimates, only 2% lower than the optimal results obtained from the ILP.
Optimal Module and Voltage Assignment for Low-Power
Reducing power consumption through high-level synthesis has attracted growing interest from researchers due to its large potential for power reduction. In this work we study functional unit binding (or module assignment) given a scheduled data flow graph under a dual-Vdd framework. We assume that each functional unit can be driven by a low Vdd or a high Vdd dynamically during run time to save dynamic power. We develop a polynomial-time optimal algorithm for assigning low Vdd to as many operations as possible under the resource and time constraints, while at the same time minimizing total switching activity through functional unit binding. Our algorithm shows consistent improvement over a design flow that separates voltage assignment from functional unit binding. We also change the initial scheduling to examine power-latency tradeoff scenarios. Experimental results show that we can achieve a 21% power reduction when the latency bound is tight. When latency is relaxed by 10 to 100%, the power reduction is 31 to 73% compared to the synthesis results for the case of single high Vdd without latency relaxation. We also show comparison data of energy consumption under the same experimental setting.
Coordinated Resource Optimization in Behavioral Synthesis
Reducing resource usage is one of the most important optimization objectives in behavioral synthesis due to its direct impact on power, performance and cost. The datapath in a typical design is composed of different kinds of components, including functional units, registers and multiplexers. To optimize the overall resource usage, a behavioral synthesis tool should consider all kinds of components at the same time. However, most previous work on behavioral synthesis has the limitations of (i) not being able to consider all kinds of resources globally, and/or (ii) separating the synthesis process into a sequence of optimization steps without a consistent optimization objective. In this paper we present a behavioral synthesis flow in which all types of components in the datapath are modeled and optimized consistently. The key idea is to feed the scheduler the intentions for sharing functional units and registers in favor of the global optimization goal (such as total area), so that the scheduler can generate a schedule that makes the sharing intentions feasible. Experiments show that, compared to the solution of minimizing functional unit requirements in scheduling and using the least number of functional units and registers in binding, our solution achieves a 24% reduction in total area; compared to the online tool provided by c-to-verilog.com, our solution achieves a 30% reduction on average.
Optimal Simultaneous Module and Multi-Voltage Assignment for Low Power
Reducing power consumption through high-level synthesis has attracted growing interest from researchers due to its large potential for power reduction. In this work we study functional unit binding (or module assignment) given a scheduled data flow graph under a multi-Vdd framework. We assume that each functional unit can be driven by different Vdd levels dynamically during run time to save dynamic power. We develop a polynomial-time optimal algorithm for assigning low Vdds to as many operations as possible under the resource and latency constraints, while at the same time minimizing total switching activity through functional unit binding. Our algorithm shows consistent improvement over a design flow that separates voltage assignment from functional unit binding. We also change the initial scheduling to examine power/energy-latency tradeoff scenarios under different voltage level combinations. Experimental results show that we can achieve 28.1% and 33.4% power reductions when the latency bound is the tightest with two and three Vdd levels, respectively, compared with the single-Vdd case. When latency is relaxed, multi-Vdd offers larger power reductions (up to 46.7%). We also show comparison data of energy consumption under the same experimental settings.
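The reason that moving operations to a lower Vdd saves dynamic power follows from the standard first-order CMOS model, in which dynamic power scales with the square of the supply voltage (P = a * C * Vdd^2 * f). The sketch below applies that model to made-up numbers. The voltage levels and activity shares are assumptions for illustration, not values from the paper, and level-converter overhead and delay effects are ignored.

```python
def relative_dynamic_power(share_by_vdd, v_ref):
    """Dynamic power relative to running everything at v_ref.

    `share_by_vdd` maps a supply voltage to the share of switched
    capacitance (activity times capacitance) running at that voltage.
    Shares must sum to 1. Uses the first-order model P ~ Vdd^2.
    """
    total_share = sum(share_by_vdd.values())
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return sum(share * (vdd / v_ref) ** 2 for vdd, share in share_by_vdd.items())


# Hypothetical levels: 1.2 V high Vdd, 0.9 V low Vdd.
single = relative_dynamic_power({1.2: 1.0}, v_ref=1.2)
dual = relative_dynamic_power({1.2: 0.5, 0.9: 0.5}, v_ref=1.2)
saving_percent = (1.0 - dual) * 100

print(f"single-Vdd relative power: {single:.3f}")
print(f"dual-Vdd relative power:   {dual:.3f}")
print(f"saving: {saving_percent:.1f}%")  # roughly 21.9% for these made-up numbers
```

Because the saving grows with the share of activity moved to the low supply, a binding algorithm that maximizes the number of low-Vdd operations directly targets the quadratic term in this model.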
- …