
36 research outputs found

Adaptive Trust Region Policy Optimization: Global Convergence and Faster Rates for Regularized MDPs

Authors: Efroni Yonathan; Mannor Shie; Shani Lior
Publication date: 12/12/2019

Trust region policy optimization (TRPO) is a popular and empirically successful policy search algorithm in Reinforcement Learning (RL) in which a surrogate problem, that restricts consecutive policies to be 'close' to one another, is iteratively solved. Nevertheless, TRPO has been considered a heuristic algorithm inspired by Conservative Policy Iteration (CPI). We show that the adaptive scaling mechanism used in TRPO is in fact the natural "RL version" of traditional trust-region methods from convex analysis. We first analyze TRPO in the planning setting, in which we have access to the model and the entire state space. Then, we consider sample-based TRPO and establish $\tilde{O}(1/\sqrt{N})$ convergence rate to the global optimum. Importantly, the adaptive scaling mechanism allows us to analyze TRPO in regularized MDPs for which we prove fast rates of $\tilde{O}(1/N)$, much like results in convex optimization. This is the first result in RL of better rates when regularizing the instantaneous cost or reward.

Comment: Published at AAAI-2020, 58 pages

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Using Wikipedia to boost collaborative filtering techniques

Authors: Bracha Shapira; Gilad Katz; Guy Shani; Lior Rokach; Nir Ofek
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2011

One important challenge in the field of recommender systems is the sparsity of available data. This problem limits the ability of recommender systems to provide accurate predictions of user ratings. We overcome this problem by using the publicly available user-generated information contained in Wikipedia. We identify similarities between items by mapping them to Wikipedia pages and finding similarities in the text and commonalities in the links and categories of each page. These similarities can be used in the recommendation process and improve ranking predictions. We find that this method is most effective in cases where ratings are extremely sparse or nonexistent. Preliminary experimental results on the MovieLens dataset are encouraging.
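The idea in the abstract above, using text similarity between item descriptions to stand in for missing rating data, can be illustrated with a minimal sketch. This is not the authors' method: the bag-of-words cosine similarity, the function names (`bag_of_words`, `cosine`, `predict_rating`), and the toy item texts are all assumptions made for illustration, and the links and categories the abstract mentions are ignored here.

```python
import math
from collections import Counter


def bag_of_words(text: str) -> Counter:
    # Hypothetical stand-in for the text of an item's encyclopedia page.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two term-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_rating(user_ratings, target, item_texts):
    # Similarity-weighted average of the user's known ratings, where the
    # weights come from text similarity rather than from co-rating data.
    target_vec = bag_of_words(item_texts[target])
    num = den = 0.0
    for item, rating in user_ratings.items():
        sim = cosine(target_vec, bag_of_words(item_texts[item]))
        num += sim * rating
        den += sim
    # None signals that no rated item shares any text with the target.
    return num / den if den else None


# Toy data (assumed, not taken from MovieLens or Wikipedia).
item_texts = {
    "A": "space opera science fiction film",
    "B": "science fiction space adventure film",
    "C": "romantic comedy wedding",
}
user_ratings = {"B": 5.0, "C": 1.0}
prediction = predict_rating(user_ratings, "A", item_texts)
```

Because item "A" has no ratings at all in this toy setup, a purely rating-based neighbourhood method would have nothing to work with; the text-derived weights are what make a prediction possible, which matches the sparse-data case the abstract highlights.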

CiteSeerX

Crossref

Diffusive and Ballistic Transport in Ultra-thin InSb Nanowire Devices Using a Few-layer-Graphene-AlOx Gate

Authors: Badawy Ghada; Bakkers Erik P. A. M.; Crowell Paul; Gupta Mohit; Hackbarth Frey; Jung Jason; Littman Tyler; Lueb Pim; Menning Gavin; Pribiag Vlad S.; Riggert Colin; Rossi Marco; Shani Lior; Verheijen Marcel A.
Publication date: 31/05/2023

Quantum devices based on InSb nanowires (NWs) are a prime candidate system for realizing and exploring topologically-protected quantum states and for electrically-controlled spin-based qubits. The influence of disorder on achieving reliable topological regimes has been studied theoretically, highlighting the importance of optimizing both growth and nanofabrication. In this work we investigate both aspects. We developed InSb nanowires with ultra-thin diameters, as well as a new gating approach, involving few-layer graphene (FLG) and Atomic Layer Deposition (ALD)-grown AlOx. Low-temperature electronic transport measurements of these devices reveal conductance plateaus and Fabry-Pérot interference, evidencing phase-coherent transport in the regime of few quantum modes. The approaches developed in this work could help mitigate the role of material and fabrication-induced disorder in semiconductor-based quantum devices.

Comment: 14 pages, 5 figures

arXiv.org e-Print Archive