Data-efficient learning of feedback policies from image pixels using deep dynamical models
Data-efficient reinforcement learning (RL) in continuous state-action spaces using very high-dimensional observations remains a key challenge in developing fully autonomous systems. We consider a particularly important instance of this challenge, the pixels-to-torques problem, where an RL agent learns a closed-loop control policy (torques) from pixel information only. We introduce a data-efficient, model-based reinforcement learning algorithm that learns such a closed-loop policy directly from pixel information. The key ingredient is a deep dynamical model for learning a low-dimensional feature embedding of images jointly with a predictive model in this low-dimensional feature space. Joint learning is crucial for long-term predictions, which lie at the core of the adaptive nonlinear model predictive control strategy that we use for closed-loop control. Compared to state-of-the-art RL methods for continuous states and actions, our approach learns quickly, scales to high-dimensional state spaces, is lightweight, and is an important step toward fully autonomous end-to-end learning from pixels to torques.
Parameterized Complexity and Kernelizability of Max Ones and Exact Ones Problems
For a finite set Gamma of Boolean relations, MAX ONES SAT(Gamma) and EXACT ONES SAT(Gamma) are generalized satisfiability problems where every constraint relation is from Gamma, and the task is to find a satisfying assignment with at least/exactly k variables set to 1, respectively. We study the parameterized complexity of these problems, including the question of whether they admit polynomial kernels. For MAX ONES SAT(Gamma), we give a classification into five different complexity levels: polynomial-time solvable, admits a polynomial kernel, fixed-parameter tractable, solvable in polynomial time for fixed k, and NP-hard already for k = 1. For EXACT ONES SAT(Gamma), we refine the classification obtained earlier by taking a closer look at the fixed-parameter tractable cases and classifying the sets Gamma for which EXACT ONES SAT(Gamma) admits a polynomial kernel.
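To make the two problem definitions concrete, here is a minimal brute-force sketch in Python. It is not the paper's method (it enumerates all assignments and so takes exponential time); the function names, the constraint encoding, and the example relation EQUAL are assumptions chosen for illustration.

```python
from itertools import product

def satisfying_assignments(num_vars, constraints):
    """Yield every 0/1 assignment that satisfies all constraints.

    Each constraint is a pair (relation, scope): relation is a set of
    allowed 0/1 tuples, scope is the tuple of variable indices it reads.
    This encoding is an assumption made for the sketch.
    """
    for assignment in product((0, 1), repeat=num_vars):
        if all(tuple(assignment[v] for v in scope) in relation
               for relation, scope in constraints):
            yield assignment

def max_ones(num_vars, constraints, k):
    """True if some satisfying assignment sets at least k variables to 1."""
    return any(sum(a) >= k
               for a in satisfying_assignments(num_vars, constraints))

def exact_ones(num_vars, constraints, k):
    """True if some satisfying assignment sets exactly k variables to 1."""
    return any(sum(a) == k
               for a in satisfying_assignments(num_vars, constraints))

# Hypothetical instance: Gamma contains only the equality relation.
EQUAL = {(0, 0), (1, 1)}
constraints = [(EQUAL, (0, 1))]  # x0 = x1

print(max_ones(2, constraints, 1))    # assignment (1, 1) has two ones
print(exact_ones(2, constraints, 1))  # no satisfying assignment has one 1
```

On this instance the only satisfying assignments are (0, 0) and (1, 1), so the "at least k" and "exactly k" questions give different answers for k = 1.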
Fixed-Parameter Tractability of Multicut in Directed Acyclic Graphs
The Multicut problem, given a graph G, a set of terminal pairs T = {(s_i, t_i) : 1 <= i <= r}, and an integer p, asks whether one can find a cutset consisting of at most p nonterminal vertices that separates all the terminal pairs, i.e., after removing the cutset, t_i is not reachable from s_i for each 1 <= i <= r. The fixed-parameter tractability of Multicut in undirected graphs, parameterized by the size of the cutset only, has been recently proved by Marx and Razgon [SIAM J. Comput., 43 (2014), pp. 355--388] and, independently, by Bousquet, Daligault, and Thomassé [Proceedings of STOC, ACM, 2011, pp. 459--468], after resisting attacks as a long-standing open problem. In this paper we prove that Multicut is fixed-parameter tractable on directed acyclic graphs when parameterized both by the size of the cutset and the number of terminal pairs. We complement this result by showing that this is implausible for parameterization by the size of the cutset only, as this version of the problem remains W[1]-hard.
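As an illustration of the problem statement only, the following Python sketch checks whether a candidate vertex set is a valid multicut in a directed graph: it must fit the size budget, contain no terminals, and leave the second vertex of every terminal pair unreachable from the first. It verifies a given solution and does not implement the fixed-parameter algorithm from the paper; the names `adj`, `terminal_pairs`, `cutset`, `budget`, and the small example graph are assumptions.

```python
def reachable(adj, source, removed):
    """Return the vertices reachable from source without entering removed ones."""
    seen = {source}
    stack = [source]
    while stack:
        u = stack.pop()
        for v in adj.get(u, ()):
            if v not in removed and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def is_multicut(adj, terminal_pairs, cutset, budget):
    """Check that cutset is a nonterminal vertex set of size at most budget
    whose removal separates every (source, target) terminal pair."""
    terminals = {x for pair in terminal_pairs for x in pair}
    cutset = set(cutset)
    if len(cutset) > budget or cutset & terminals:
        return False
    return all(t not in reachable(adj, s, cutset)
               for s, t in terminal_pairs)

# Hypothetical DAG given as adjacency lists.
adj = {"s1": ["a"], "s2": ["a", "b"], "a": ["t1", "t2"], "b": ["t2"]}
pairs = [("s1", "t1"), ("s2", "t2")]

print(is_multicut(adj, pairs, {"a"}, 1))       # s2 still reaches t2 via b
print(is_multicut(adj, pairs, {"a", "b"}, 2))  # both pairs are separated
```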
Au/TiO2(110) interfacial reconstruction stability from ab initio
We determine the stability and properties of interfaces of low-index Au surfaces adhered to TiO2(110), using density functional theory energy density calculations. We consider Au(100) and Au(111) epitaxies on the rutile TiO2(110) surface, as observed in experiments. For each epitaxy, we consider several different interfaces: Au(111)//TiO2(110) and Au(100)//TiO2(110), with and without bridging oxygen, Au(111) on 1x2 added-row TiO2(110) reconstruction, and Au(111) on a proposed 1x2 TiO reconstruction. The density functional theory energy density method computes the energy changes on each of the atoms while forming the interface, and evaluates the work of adhesion to determine the equilibrium interfacial structure. Comment: 20 pages, 11 figures
Bonding of gold nanoclusters to oxygen vacancies on rutile TiO2(110)
Through an interplay between scanning tunneling microscopy (STM) and density functional theory (DFT) calculations, we show that bridging oxygen vacancies are the active nucleation sites for Au clusters on the rutile TiO2(110) surface. We find that a direct correlation exists between a decrease in density of vacancies and the amount of Au deposited. From the DFT calculations we find that the oxygen vacancy is indeed the strongest Au binding site. We show both experimentally and theoretically that a single oxygen vacancy can bind 3 Au atoms on average. In view of the presented results, a new growth model for the TiO2(110) system involving vacancy-cluster complex diffusion is presented.
Controlling the spectrum of x-rays generated in a laser-plasma accelerator by tailoring the laser wavefront
By tailoring the wavefront of the laser pulse used in a laser-wakefield accelerator, we show that the properties of the x-rays produced due to the electron beam's betatron oscillations in the plasma can be controlled. By creating a wavefront with coma, we find that the critical energy of the synchrotron-like x-ray spectrum can be significantly increased. The coma does not substantially change the energy of the electron beam, but does increase its divergence and produces an energy-dependent exit angle, indicating that changes in the x-ray spectrum are due to an increase in the electron beam's oscillation amplitude within the wakefield. Comment: 7 pages, 2 figures, submitted to Appl. Phys. Lett.