17 research outputs found

    Parameter Sharing in Coagent Networks

    In this paper, we aim to prove a theorem that generalizes the Coagent Network Policy Gradient Theorem (Kostas et al., 2019) to the setting where parameters are shared among the function approximators involved. This provides the theoretical foundation for using any pattern of parameter sharing and for leveraging the freedom in the graph structure of the network to possibly exploit relational bias in a given task. As another application, we apply our result to give a more intuitive proof of the Hierarchical Option Critic Policy Gradient Theorem, first shown in (Riemer et al., 2019).

    The Smallest Interacting Universe

    The co-emergence of locality between the Hamiltonian and initial state of the universe is studied in a simple toy model. We hypothesize a fundamental loss functional for the combined Hamiltonian and quantum state and minimize it by gradient descent. This minimization yields a tensor product structure simultaneously respected by both the Hamiltonian and the state, suggesting that locality can emerge by a process analogous to spontaneous symmetry breaking. We discuss the relevance of this program to the arrow of time problem. In our toy model, we interpret the emergence of a tensor factorization as the appearance of individual degrees of freedom within a previously undifferentiated (raw) Hilbert space. Earlier work [5, 6] looked at the emergence of locality in Hamiltonians only, and found strong numerical confirmation that raw Hilbert spaces of $\dim = n$ are unstable and prefer to settle on a tensor factorization when $n=pq$ is not prime; in [6] even primes were seen to "factor" after first shedding a small summand, e.g. $7=1+2\cdot 3$. This was found in the context of a rather general potential functional $F$ on the space of metrics $\{g_{ij}\}$ on $\mathfrak{su}(n)$, the Lie algebra of symmetries. This emergence of qunits through operator-level spontaneous symmetry breaking (SSB) may help us understand why the world seems to consist of myriad interacting degrees of freedom. But understanding why the universe has an initial Hamiltonian $H_0$ with a many-body structure is of limited conceptual value unless the initial state, $|\psi_0\rangle$, is also structured by this tensor decomposition. Here we adapt $F$ to become a functional on $\{g,|\psi_0\rangle\}=(\text{metrics})\times (\text{initial states})$, and find that SSB now produces a conspiracy between $g$ and $|\psi_0\rangle$, where they simultaneously attain low entropy by settling on the same qubit decomposition.

    Quantum computing with Octonions

    There are two schools of "measurement-only quantum computation". The first ([11]) uses prepared entanglement (cluster states) and the second ([4]) uses collections of anyons, which, according to how they were produced, also have an entanglement pattern. We abstract the common principle behind both approaches and find the notion of a graph or even continuous family of equiangular projections. This notion is the leading character in the paper. The largest continuous family, in a sense made precise in Corollary 4.2, is associated with the octonions, and this example leads to a universal computational scheme. Adiabatic quantum computation also fits into this rubric as a limiting case: nearby projections are nearly equiangular, so as a gapped ground state space is slowly varied the corrections to unitarity are small. Comment: Added some new results in section

    Improved Bayesian Regret Bounds for Thompson Sampling in Reinforcement Learning

    In this paper, we prove the first Bayesian regret bounds for Thompson Sampling in reinforcement learning in a multitude of settings. We simplify the learning problem using a discrete set of surrogate environments, and present a refined analysis of the information ratio using posterior consistency. This leads to an upper bound of order $\widetilde{O}(H\sqrt{d_{l_1}T})$ in the time inhomogeneous reinforcement learning problem, where $H$ is the episode length and $d_{l_1}$ is the Kolmogorov $l_1$-dimension of the space of environments. We then find concrete bounds of $d_{l_1}$ in a variety of settings, such as tabular, linear and finite mixtures, and discuss how our results are either the first of their kind or improve the state-of-the-art. Comment: 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

    Quantum simulation of battery materials using ionic pseudopotentials

    Ionic pseudopotentials are widely used in classical simulations of materials to model the effective potential due to the nucleus and the core electrons. Modeling fewer electrons explicitly results in a reduction in the number of plane waves needed to accurately represent the states of a system. In this work, we introduce a quantum algorithm that uses pseudopotentials to reduce the cost of simulating periodic materials on a quantum computer. We use a qubitization-based quantum phase estimation algorithm that employs a first-quantization representation of the Hamiltonian in a plane-wave basis. We address the challenge of incorporating the complexity of pseudopotentials into quantum simulations by developing highly optimized compilation strategies for the qubitization of the Hamiltonian. This includes a linear combination of unitaries decomposition that leverages the form of separable pseudopotentials. Our strategies make use of quantum read-only memory subroutines as a more efficient alternative to quantum arithmetic. We estimate the computational cost of applying our algorithm to simulating lithium-excess cathode materials for batteries, where more accurate simulations are needed to inform strategies for gaining reversible access to the excess capacity they offer. We estimate the number of qubits and Toffoli gates required to perform sufficiently accurate simulations with our algorithm for three materials: lithium manganese oxide, lithium nickel-manganese oxide, and lithium manganese oxyfluoride. Our optimized compilation strategies result in a pseudopotential-based quantum algorithm with a total runtime four orders of magnitude lower than the previous state of the art for a fixed target accuracy.