134 research outputs found

    Competitive MA-DRL for Transmit Power Pool Design in Semi-Grant-Free NOMA Systems

    In this paper, we exploit the capability of multi-agent deep reinforcement learning (MA-DRL) to generate a transmit power pool (PP) for Internet of things (IoT) networks with semi-grant-free non-orthogonal multiple access (SGF-NOMA). The PP is mapped to each resource block (RB) to achieve distributed transmit power control (DPC). We first formulate the resource (sub-channel and transmit power) selection problem as a stochastic Markov game, and then solve it using two competitive MA-DRL algorithms, namely double deep Q network (DDQN) and Dueling DDQN. Each GF user, acting as an agent, tries to find the optimal transmit power level and RB to form the desired PP. With the aid of dueling processes, the learning process can be enhanced by evaluating the valuable state without considering the effect of each action at each state. Therefore, DDQN is designed for communication scenarios with a small-size action-state space, while Dueling DDQN is for a large-size case. Our results show that the proposed MA-Dueling DDQN based SGF-NOMA with DPC outperforms the SGF-NOMA system with the fixed-power-control mechanism and networks with pure GF protocols, with 17.5% and 22.2% gains in system throughput, respectively. Moreover, to decrease the training time, we eliminate invalid actions (high transmit power levels) to reduce the action space. We show that our proposed algorithm is computationally scalable to massive IoT networks. Finally, to control the interference and guarantee the quality-of-service requirements of grant-based users, we find the optimal number of GF users for each sub-channel.
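
    The abstract above names Dueling DDQN but does not spell out its equations. As a rough illustration only, the sketch below shows the textbook dueling aggregation (state value plus mean-centred advantages) and the textbook double-DQN target. The function names and toy numbers are assumptions for illustration and are not taken from the paper.

```python
def dueling_q_values(state_value, advantages):
    """Combine V(s) and A(s, a) into Q(s, a) with mean-centred advantages.

    Standard dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).
    """
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def double_dqn_target(reward, gamma, q_online_next, q_target_next, done=False):
    """Double-DQN target: the online network picks the next action,
    the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    return reward + gamma * q_target_next[best_action]


# Toy example: three hypothetical (power level, RB) actions for one agent.
q = dueling_q_values(2.0, [1.0, 0.0, -1.0])
target = double_dqn_target(1.0, 0.5, [0.1, 0.9], [4.0, 2.0])
```

    In this sketch the state value is shared by all actions, so a state can be scored without learning every action's effect separately, which is the property the abstract attributes to the dueling process.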

    Intelligent Trajectory Design for RIS-NOMA aided Multi-robot Communications

    A novel reconfigurable intelligent surface (RIS)-aided multi-robot network is proposed, where multiple mobile robots are served by an access point (AP) through non-orthogonal multiple access (NOMA). The goal is to maximize the sum-rate over the whole trajectories of the multi-robot system by jointly optimizing the trajectories and NOMA decoding orders of the robots, the phase-shift coefficients of the RIS, and the power allocation of the AP, subject to the predicted initial and final positions of the robots and the quality of service (QoS) of each robot. To tackle this problem, an integrated machine learning (ML) scheme is proposed, which combines a long short-term memory (LSTM)-autoregressive integrated moving average (ARIMA) model and a dueling double deep Q-network (D3QN) algorithm. For predicting the initial and final positions of the robots, the LSTM-ARIMA model is able to overcome the vanishing-gradient problem for non-stationary and non-linear data sequences. For jointly determining the phase shift matrix and the robots' trajectories, D3QN is invoked to solve the problem of action value overestimation. Based on the proposed scheme, each robot holds a globally optimal trajectory based on the maximum sum-rate of a whole trajectory, which reveals that robots pursue long-term benefits for whole trajectory design. Numerical results demonstrate that: 1) the LSTM-ARIMA model provides a highly accurate prediction model; 2) the proposed D3QN algorithm can achieve fast average convergence; 3) an RIS with higher-resolution bits offers a larger trajectory sum-rate than one with lower-resolution bits; and 4) RIS-NOMA networks have superior network performance compared to RIS-aided orthogonal counterparts.

    Federated Learning for Energy-limited Wireless Networks: A Partial Model Aggregation Approach

    Limited communication resources, e.g., bandwidth and energy, and data heterogeneity across devices are two of the main bottlenecks for federated learning (FL). To tackle these challenges, we first devise a novel FL framework with partial model aggregation (PMA), which only aggregates the lower layers of neural networks responsible for feature extraction while the upper layers corresponding to complex pattern recognition remain at devices for personalization. The proposed PMA-FL is able to address the data heterogeneity and reduce the transmitted information in wireless channels. We then obtain a convergence bound of the framework under a non-convex loss function setting. With the aid of this bound, we define a new objective function, named the scheduled data sample volume, to transform the original inexplicit optimization problem into a tractable one for device scheduling, bandwidth allocation, and computation and communication time division. Our analysis reveals that the optimal time division is achieved when the communication and computation parts of PMA-FL have the same power. We also develop a bisection method to solve for the optimal bandwidth allocation policy and use the set expansion algorithm to address the optimal device scheduling. Compared with the state-of-the-art benchmarks, the proposed PMA-FL improves accuracy by 2.72% and 11.6% on two typical heterogeneous datasets, i.e., MNIST and CIFAR-10, respectively. In addition, the proposed joint dynamic device scheduling and resource optimization approach achieves slightly higher accuracy than the considered benchmarks while providing a satisfactory energy and time reduction: 29% energy or 20% time reduction on MNIST, and 25% energy or 12.5% time reduction on CIFAR-10.
    Comment: 32 pages, 7 figures
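
    To make the partial model aggregation idea above concrete, here is a minimal sketch in which only the named shared (lower) layers are averaged across devices and every other layer stays local. The sample-count weighting follows the common FedAvg convention and is an assumption, as are the layer names and the flat-list parameter format; the abstract does not specify any of them.

```python
def partial_aggregate(device_models, sample_counts, shared_layers):
    """Average only the shared layers across devices; keep the rest local.

    device_models: list of dicts mapping layer name -> list of floats.
    sample_counts: number of local samples per device (assumed weighting).
    shared_layers: names of the lower layers to aggregate.
    """
    total = sum(sample_counts)
    updated = [dict(model) for model in device_models]  # shallow copy per device
    for name in shared_layers:
        size = len(device_models[0][name])
        averaged = [
            sum(n * model[name][i] for model, n in zip(device_models, sample_counts)) / total
            for i in range(size)
        ]
        for model in updated:
            model[name] = list(averaged)  # every device receives the same shared layer
    return updated


# Hypothetical two-device example: "conv" is shared, "head" stays personalized.
models = [
    {"conv": [1.0, 2.0], "head": [10.0]},
    {"conv": [3.0, 4.0], "head": [20.0]},
]
result = partial_aggregate(models, sample_counts=[1, 3], shared_layers=["conv"])
```

    Only the "conv" parameters would need to be transmitted in this sketch, which mirrors the abstract's point that aggregating the lower layers alone reduces the information sent over the wireless channel.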

    STAR-IOS Aided NOMA Networks: Channel Model Approximation and Performance Analysis

    Simultaneous transmitting and reflecting intelligent omni-surfaces (STAR-IOSs) are able to achieve full-coverage "smart radio environments". By splitting the energy or altering the active number of STAR-IOS elements, STAR-IOSs provide high flexibility of successive interference cancellation (SIC) orders for non-orthogonal multiple access (NOMA) systems. Based on the aforementioned advantages, this paper investigates a STAR-IOS-aided downlink NOMA network with randomly deployed users. We first propose three tractable channel models for different application scenarios, namely the central limit model, the curve fitting model, and the M-fold convolution model. More specifically, the central limit model fits the scenarios with large-size STAR-IOSs, while the curve fitting model is extended to evaluate multi-cell networks. However, these two models cannot obtain accurate diversity orders. Hence, we develop the M-fold convolution model to derive accurate diversity orders. We consider three protocols for STAR-IOSs, namely, the energy splitting (ES) protocol, the time switching (TS) protocol, and the mode switching (MS) protocol. Based on the ES protocol, we derive analytical outage probability expressions for the paired NOMA users using the central limit model and the curve fitting model. Based on the three STAR-IOS protocols, we derive the diversity gains of NOMA users using the M-fold convolution model. The analytical results reveal that the diversity gain of NOMA users is equal to the active number of STAR-IOS elements. Numerical results indicate that 1) in high signal-to-noise ratio regions, the central limit model serves as an upper bound, while a lower bound is obtained by the curve fitting model; 2) the TS protocol has the best performance but requires more time blocks than the other protocols; and 3) the ES protocol outperforms the MS protocol as the ES protocol has higher diversity gains.

    DRL Enabled Coverage and Capacity Optimization in STAR-RIS Assisted Networks

    Simultaneously transmitting and reflecting reconfigurable intelligent surfaces (STAR-RISs) are promising passive devices that contribute to full-space coverage by transmitting and reflecting the incident signal simultaneously. As STAR-RISs are a new paradigm in wireless communications, analyzing their coverage and capacity performance is essential but challenging. To solve the coverage and capacity optimization (CCO) problem in STAR-RIS assisted networks, a multi-objective proximal policy optimization (MO-PPO) algorithm is proposed, which handles long-term benefits better than conventional optimization algorithms. To strike a balance between the objectives, the MO-PPO algorithm provides a set of optimal solutions to form a Pareto front (PF), where any solution on the PF is regarded as an optimal result. Moreover, in order to improve the performance of the MO-PPO algorithm, two update strategies, i.e., an action-value-based update strategy (AVUS) and a loss function-based update strategy (LFUS), are investigated. For the AVUS, the improvement is to integrate the action values of both coverage and capacity and then update the loss function. For the LFUS, the improvement is only to assign dynamic weights to the loss functions of coverage and capacity, where the weights are calculated by a min-norm solver at every update. The numerical results demonstrate that the investigated update strategies outperform the fixed-weight MO optimization algorithms in different cases, which include different numbers of sample grids, different numbers of STAR-RISs, different numbers of elements in the STAR-RISs, and different sizes of STAR-RISs. Additionally, the STAR-RIS assisted networks achieve better performance than conventional wireless networks without STAR-RISs. Moreover, with the same bandwidth, millimeter wave is able to provide higher capacity than sub-6 GHz, but at the cost of smaller coverage.
    Comment: arXiv admin note: text overlap with arXiv:2204.0639
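
    The Pareto front mentioned above can be illustrated with a small non-dominated filter over (coverage, capacity) pairs, both treated as objectives to maximize. This is a generic sketch of Pareto dominance, not the MO-PPO algorithm itself, and the sample points are made-up values for illustration.

```python
def dominates(q, p):
    """True if q is at least as good as p in every objective and strictly
    better in at least one (all objectives are maximized)."""
    return all(qi >= pi for qi, pi in zip(q, p)) and any(qi > pi for qi, pi in zip(q, p))


def pareto_front(points):
    """Return the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (coverage, capacity) outcomes of four candidate solutions.
candidates = [(0.9, 1.0), (0.5, 3.0), (0.4, 2.0), (0.9, 0.5)]
front = pareto_front(candidates)
```

    Each point kept by the filter trades one objective against the other, which is why any solution on the front can be regarded as an optimal result.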