    Power Allocation for Conventional and Buffer-Aided Link Adaptive Relaying Systems with Energy Harvesting Nodes

    Energy harvesting (EH) nodes can play an important role in cooperative communication systems that do not have a continuous power supply. In this paper, we consider the optimization of conventional and buffer-aided link adaptive EH relaying systems, where an EH source communicates with the destination via an EH decode-and-forward relay. In conventional relaying, the source and relay transmit signals in consecutive time slots, whereas in buffer-aided link adaptive relaying, the state of the source-relay and relay-destination channels determines whether the source or the relay is selected for transmission. Our objective is to maximize the system throughput over a finite number of transmission time slots for both relaying protocols. In the case of conventional relaying, we propose an offline and several online joint source and relay transmit power allocation schemes. For offline power allocation, we formulate an optimization problem which can be solved optimally. For the online case, we propose a dynamic programming (DP) approach to compute the optimal online transmit power. To alleviate the complexity inherent to DP, we also propose several suboptimal online power allocation schemes. For buffer-aided link adaptive relaying, we show that the joint offline optimization of the source and relay transmit powers along with the link selection results in a mixed integer non-linear program, which we solve optimally using the spatial branch-and-bound method. We also propose an efficient online power allocation scheme and a naive online power allocation scheme for buffer-aided link adaptive relaying. Our results show that link adaptive relaying provides performance improvement over conventional relaying at the expense of a higher computational complexity. Comment: Submitted to IEEE Transactions on Wireless Communications

    Batch Policy Learning under Constraints

    When learning policies for real-world domains, two important questions arise: (i) how to efficiently use pre-collected off-policy, non-optimal behavior data; and (ii) how to mediate among different competing objectives and constraints. We thus study the problem of batch policy learning under multiple constraints, and offer a systematic solution. We first propose a flexible meta-algorithm that admits any batch reinforcement learning and online learning procedure as subroutines. We then present a specific algorithmic instantiation and provide performance guarantees for the main objective and all constraints. To certify constraint satisfaction, we propose a new and simple method for off-policy policy evaluation (OPE) and derive PAC-style bounds. Our algorithm achieves strong empirical results in different domains, including a challenging simulated car driving problem subject to multiple constraints such as lane keeping and smooth driving. We also show experimentally that our OPE method outperforms other popular OPE techniques on a standalone basis, especially in a high-dimensional setting.

    A Bandit Approach to Maximum Inner Product Search

    There has been substantial research on sub-linear time approximate algorithms for Maximum Inner Product Search (MIPS). To achieve fast query time, state-of-the-art techniques require significant preprocessing, which can be a burden when the number of subsequent queries is not sufficiently large to amortize the cost. Furthermore, existing methods cannot directly control the suboptimality of their approximate results with theoretical guarantees. In this paper, we propose the first approximate algorithm for MIPS that does not require any preprocessing and allows users to control and bound the suboptimality of the results. We cast MIPS as a Best Arm Identification problem, and introduce a new bandit setting that can fully exploit the special structure of MIPS. Our approach outperforms state-of-the-art methods on both synthetic and real-world datasets. Comment: AAAI 201
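
The idea of casting MIPS as Best Arm Identification can be illustrated with a minimal sketch. It is not the algorithm from the paper: it is a generic successive-elimination loop in which each candidate vector is an arm and one "pull" reveals one coordinate product. The function names `exact_mips` and `bandit_mips`, the parameters `batch` and `margin`, and the confidence radius are all assumptions made for illustration; the radius in particular is a heuristic with no guarantee attached.

```python
import random


def exact_mips(query, atoms):
    """Exhaustive baseline: index of the atom with the largest inner product."""
    return max(
        range(len(atoms)),
        key=lambda i: sum(q * a for q, a in zip(query, atoms[i])),
    )


def bandit_mips(query, atoms, batch=8, margin=0.5, seed=0):
    """Successive-elimination sketch (hypothetical, not the paper's method).

    Each atom is an arm. Coordinates are visited in one shared random order,
    so once all d coordinates have been read the estimates are exact.
    """
    rng = random.Random(seed)
    d = len(query)
    order = list(range(d))
    rng.shuffle(order)

    alive = list(range(len(atoms)))
    sums = {i: 0.0 for i in alive}
    pulled = 0
    while pulled < d and len(alive) > 1:
        coords = order[pulled:pulled + batch]
        for i in alive:
            sums[i] += sum(query[j] * atoms[i][j] for j in coords)
        pulled += len(coords)

        best = max(sums[i] for i in alive) / pulled
        # Heuristic radius (an assumption): shrinks with more pulls and
        # reaches zero when every coordinate has been sampled.
        radius = margin * (1.0 - pulled / d) ** 0.5 / pulled ** 0.5
        alive = [i for i in alive if sums[i] / pulled >= best - 2 * radius]
    return max(alive, key=lambda i: sums[i])
```

A larger `margin` eliminates arms more cautiously and reads more coordinates; a smaller one is faster but may discard the true maximizer early. No preprocessing of `atoms` is needed, which is the property the abstract emphasizes.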