1,683 research outputs found

    Bounded perturbation resilience of extragradient-type methods and their applications

    In this paper we study the bounded perturbation resilience of the extragradient and subgradient extragradient methods for solving variational inequality (VI) problems in real Hilbert spaces. This important property guarantees that the scheme converges under summable errors, so inexact versions of the methods can also be considered. Moreover, once an algorithm is proved to be bounded perturbation resilient, superiorization can be used, which allows flexibility in choosing the bounded perturbations in order to obtain a superior solution, as explained in the paper. We also discuss some inertial extragradient methods. Under mild and standard assumptions of monotonicity and Lipschitz continuity of the VI's associated mapping, convergence of the perturbed extragradient and subgradient extragradient methods is proved. In addition, we show that the perturbed algorithms converge at the rate of $O(1/t)$. Numerical illustrations are given to demonstrate the performance of the algorithms.
    Comment: Accepted for publication in The Journal of Inequalities and Applications. arXiv admin note: text overlap with arXiv:1711.01936 and text overlap with arXiv:1507.07302 by other authors
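The idea of running the extragradient iteration with summable bounded perturbations can be sketched as follows. This is a minimal illustration, not the paper's algorithm or experiments: the feasible set (a box), the affine monotone mapping `F`, the step size `tau`, the perturbation weights `beta_k = 1/(k+1)^2`, and the perturbation direction `shrink` are all assumptions chosen for the example.

```python
import numpy as np

def project_box(x, lo, hi):
    """Euclidean projection onto the box [lo, hi]^n (assumed feasible set)."""
    return np.clip(x, lo, hi)

def extragradient(F, x0, tau, lo, hi, n_iter=2000, perturb=None):
    """Extragradient iteration, optionally with bounded perturbations.

    Each step first shifts x by beta_k * v_k, where beta_k = 1/(k+1)^2 is a
    summable sequence (an assumed choice) and v_k = perturb(x) is bounded.
    """
    x = np.asarray(x0, dtype=float)
    for k in range(n_iter):
        if perturb is not None:
            beta_k = 1.0 / (k + 1) ** 2
            x = x + beta_k * perturb(x)
        y = project_box(x - tau * F(x), lo, hi)   # prediction step
        x = project_box(x - tau * F(y), lo, hi)   # correction step
    return x

# Hypothetical test problem: F(x) = M x + q with M skew-symmetric, so F is
# monotone and Lipschitz with constant ||M||_2 = 1. F vanishes at (0.2, 0.5),
# which lies inside the box and therefore solves the VI.
M = np.array([[0.0, 1.0], [-1.0, 0.0]])
q = np.array([-0.5, 0.2])
F = lambda x: M @ x + q

def shrink(x):
    """Bounded direction (norm at most 1) pointing toward the origin."""
    return -x / max(np.linalg.norm(x), 1.0)

x_plain = extragradient(F, [1.0, 1.0], tau=0.5, lo=-1.0, hi=1.0)
x_pert = extragradient(F, [1.0, 1.0], tau=0.5, lo=-1.0, hi=1.0, perturb=shrink)
print(x_plain, x_pert)
```

The step size is kept below the reciprocal of the Lipschitz constant, which is the usual condition for the unperturbed method. Because the perturbation weights are summable, their influence dies out and both runs approach the same solution in this toy problem.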

    Near-Optimal Differentially Private Reinforcement Learning

    Motivated by personalized healthcare and other applications involving sensitive data, we study online exploration in reinforcement learning with differential privacy (DP) constraints. Existing work on this problem established that no-regret learning is possible under joint differential privacy (JDP) and local differential privacy (LDP) but did not provide an algorithm with optimal regret. We close this gap for the JDP case by designing an $\epsilon$-JDP algorithm with a regret of $\widetilde{O}(\sqrt{SAH^2T}+S^2AH^3/\epsilon)$, which matches the information-theoretic lower bound of non-private learning for all choices of $\epsilon > S^{1.5}A^{0.5}H^2/\sqrt{T}$. In the above, $S$ and $A$ denote the number of states and actions, $H$ denotes the planning horizon, and $T$ is the number of steps. To the best of our knowledge, this is the first private RL algorithm that achieves "privacy for free" asymptotically as $T\rightarrow\infty$. Our techniques, which could be of independent interest, include privately releasing Bernstein-type exploration bonuses and an improved method for releasing visitation statistics. The same techniques also imply a slightly improved regret bound for the LDP case.
    Comment: 38 pages
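A short check, using only the quantities stated in the abstract, of why the privacy term is dominated in the stated regime of $\epsilon$:

```latex
\[
\epsilon > \frac{S^{1.5}A^{0.5}H^{2}}{\sqrt{T}}
\quad\Longrightarrow\quad
\frac{S^{2}AH^{3}}{\epsilon}
< \frac{S^{2}AH^{3}\sqrt{T}}{S^{1.5}A^{0.5}H^{2}}
= S^{0.5}A^{0.5}H\sqrt{T}
= \sqrt{SAH^{2}T}.
\]
```

In that regime the second term of the regret bound is smaller than the first, so the whole bound is $\widetilde{O}(\sqrt{SAH^2T})$. For any fixed $\epsilon$ the threshold $S^{1.5}A^{0.5}H^2/\sqrt{T}$ tends to zero as $T\rightarrow\infty$, which is the sense in which privacy is asymptotically free.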

    Performance Analysis of Indoor THz Communications with One-Bit Precoding

    In this paper, the performance of indoor Terahertz (THz) communication systems with one-bit digital-to-analog converters (DACs) is investigated. An array-of-subarrays architecture is assumed for the antennas at the access points, where each RF chain uniquely activates a disjoint subset of antennas, each of which is connected to an exclusive phase shifter. Hybrid precoding, including maximum ratio transmission (MRT) and zero-forcing (ZF) precoding, is considered. The best beamsteering direction for the phase shifter in the large subarray antenna regime is first proved to be the direction of the line-of-sight (LoS) path. Subsequently, a closed-form expression for the lower bound of the achievable rate in the large subarray antenna regime is derived, which is the same for both MRT and ZF and is independent of the transmit power. Numerical results validating the analysis are provided as well.