Adaptive Trust Region Policy Optimization: Global Convergence and Faster Rates for Regularized MDPs
Trust region policy optimization (TRPO) is a popular and empirically
successful policy search algorithm in Reinforcement Learning (RL) in which a
surrogate problem, that restricts consecutive policies to be 'close' to one
another, is iteratively solved. Nevertheless, TRPO has been considered a
heuristic algorithm inspired by Conservative Policy Iteration (CPI). We show
that the adaptive scaling mechanism used in TRPO is in fact the natural "RL
version" of traditional trust-region methods from convex analysis. We first
analyze TRPO in the planning setting, in which we have access to the model and
the entire state space. Then, we consider sample-based TRPO and establish an
$\tilde O(1/\sqrt{N})$ convergence rate to the global optimum. Importantly, the
adaptive scaling mechanism allows us to analyze TRPO in regularized MDPs for
which we prove fast rates of $\tilde O(1/N)$, much like results in convex
optimization. This is the first result in RL of better rates when regularizing
the instantaneous cost or reward.
Comment: Published at AAAI-2020, 58 pages
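The idea of keeping consecutive policies "close" can be illustrated with a generic KL-proximal (exponentiated-gradient) update for a single state. This is a minimal sketch under stated assumptions, not the algorithm analyzed in the paper: the function name `kl_proximal_step`, the tabular single-state setting, the fixed `step_size`, and the convention that `q_values` are costs (lower is better) are all hypothetical choices made for illustration.

```python
import math


def kl_proximal_step(policy, q_values, step_size):
    """One KL-proximal update of a single-state policy (illustrative only).

    policy    -- current action probabilities (assumed to sum to 1)
    q_values  -- assumed per-action costs; lower is better
    step_size -- assumed positive step size; smaller values keep the new
                 policy closer to the old one
    """
    # Reweight each action by exp(-step_size * cost), then renormalize.
    weights = [p * math.exp(-step_size * q) for p, q in zip(policy, q_values)]
    total = sum(weights)
    return [w / total for w in weights]


# Usage: starting from a uniform policy, probability mass shifts toward
# the cheaper action, and a smaller step size produces a smaller shift.
new_policy = kl_proximal_step([0.5, 0.5], [1.0, 0.0], step_size=1.0)
```

A small step size leaves the policy nearly unchanged, which is the sense in which the step size plays the role of a trust-region radius in this sketch.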
Using Wikipedia to boost collaborative filtering techniques
One important challenge in the field of recommender systems is the sparsity of available data. This problem limits the ability of recommender systems to provide accurate predictions of user ratings. We overcome this problem by using the publicly available user-generated information contained in Wikipedia. We identify similarities between items by mapping them to Wikipedia pages and finding similarities in the text and commonalities in the links and categories of each page. These similarities can be used in the recommendation process and improve ranking predictions. We find that this method is most effective in cases where ratings are extremely sparse or nonexistent. Preliminary experimental results on the MovieLens dataset are encouraging.
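As a rough illustration of how item similarities derived from page metadata could feed a rating prediction, the sketch below uses Jaccard overlap between sets of links or categories and a similarity-weighted average of a user's known ratings. The abstract does not specify the similarity measure or the prediction rule, so `jaccard`, `predict_rating`, the feature sets, and the `default` fallback are all assumptions chosen for illustration.

```python
def jaccard(a, b):
    """Jaccard overlap between two collections of features (e.g. page categories)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def predict_rating(target, user_ratings, item_features, default=3.0):
    """Similarity-weighted average of a user's ratings on other items.

    target        -- item to predict a rating for
    user_ratings  -- dict: item -> rating already given by the user
    item_features -- dict: item -> iterable of features (assumed to come
                     from the item's mapped page: links, categories, ...)
    default       -- assumed fallback when no rated item is similar
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for item, rating in user_ratings.items():
        if item == target:
            continue
        sim = jaccard(item_features[target], item_features[item])
        weighted_sum += sim * rating
        weight_total += sim
    return weighted_sum / weight_total if weight_total > 0 else default


# Usage with made-up items and features (hypothetical data):
features = {
    "A": {"science fiction", "1999 films"},
    "B": {"science fiction", "1999 films", "action"},
    "C": {"romance"},
}
prediction = predict_rating("A", {"B": 5.0, "C": 1.0}, features)
```

Because the similarity comes from item metadata rather than co-ratings, a prediction is available even for an item that no user has rated yet, which matches the sparse-data setting the abstract describes.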
Diffusive and Ballistic Transport in Ultra-thin InSb Nanowire Devices Using a Few-layer-Graphene-AlOx Gate
Quantum devices based on InSb nanowires (NWs) are a prime candidate system
for realizing and exploring topologically-protected quantum states and for
electrically-controlled spin-based qubits. The influence of disorder on
achieving reliable topological regimes has been studied theoretically,
highlighting the importance of optimizing both growth and nanofabrication. In
this work we investigate both aspects. We developed InSb nanowires with
ultra-thin diameters, as well as a new gating approach, involving few-layer
graphene (FLG) and Atomic Layer Deposition (ALD)-grown AlOx. Low-temperature
electronic transport measurements of these devices reveal conductance plateaus
and Fabry-Pérot interference, evidencing phase-coherent transport in the
regime of few quantum modes. The approaches developed in this work could help
mitigate the role of material and fabrication-induced disorder in
semiconductor-based quantum devices.
Comment: 14 pages, 5 figures