Distributed Basis Pursuit

Authors: Aguiar Pedro M. Q.; Mota João F. C.; Püschel Markus; Xavier João M. F.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2012

We propose a distributed algorithm for solving the Basis Pursuit (BP) optimization problem. BP finds the least L1-norm solution of the underdetermined linear system Ax = b and is used, for example, in compressed sensing for reconstruction. Our algorithm solves BP on a distributed platform such as a sensor network, and is designed to minimize the communication between nodes. The algorithm only requires the network to be connected, has no notion of a central processing node, and no node has access to the entire matrix A at any time. We consider two scenarios in which either the columns or the rows of A are distributed among the compute nodes. Our algorithm, named D-ADMM, is a decentralized implementation of the alternating direction method of multipliers. We show through numerical simulation that our algorithm requires considerably fewer communications between the nodes than state-of-the-art algorithms.

Comment: Preprint of the journal version of the paper; IEEE Transactions on Signal Processing, Vol. 60, Issue 4, April, 201
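
For orientation, the problem described in this abstract can be written out as follows. This is only a sketch of the standard BP formulation and of the two partitioning scenarios; the dimensions m and n, the node count P, and the block names A_p and b_p are notation assumed here and do not appear in the abstract. Splitting b along with the rows of A is likewise an assumption.

```latex
% Basis Pursuit: least L1-norm solution of an underdetermined system
\begin{equation*}
  \min_{x \in \mathbb{R}^{n}} \; \|x\|_{1}
  \quad \text{subject to} \quad Ax = b,
  \qquad A \in \mathbb{R}^{m \times n}, \; m < n .
\end{equation*}

% Column partition (assumed notation): node p holds the block A_p
\begin{equation*}
  A = \begin{bmatrix} A_{1} & A_{2} & \cdots & A_{P} \end{bmatrix},
  \qquad Ax = \sum_{p=1}^{P} A_{p} x_{p} .
\end{equation*}

% Row partition (assumed notation): node p holds A_p and the matching b_p
\begin{equation*}
  A = \begin{bmatrix} A_{1} \\ \vdots \\ A_{P} \end{bmatrix},
  \quad
  b = \begin{bmatrix} b_{1} \\ \vdots \\ b_{P} \end{bmatrix},
  \qquad Ax = b \iff A_{p} x = b_{p} \ \text{for all } p .
\end{equation*}
```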

arXiv.org e-Print Archive

CiteSeerX

Crossref

Heriot Watt Pure

Loop optimization for tensor network renormalization

Authors: Gu Zheng-Cheng; Wen Xiao-Gang; Yang Shuo
Publication venue: 'American Physical Society (APS)'
Publication date: 25/02/2017

We introduce a tensor renormalization group scheme for coarse-graining a two-dimensional tensor network that can be successfully applied to both classical and quantum systems on and off criticality. The key innovation in our scheme is to deform a 2D tensor network into small loops and then optimize the tensors on each loop. In this way, we remove short-range entanglement at each iteration step and significantly improve the accuracy and stability of the renormalization flow. We demonstrate our algorithm in the classical Ising model and a frustrated 2D quantum model.

Comment: 15 pages, 11 figures, accepted version for Phys. Rev. Let

arXiv.org e-Print Archive

DSpace@MIT

Data-efficient learning of feedback policies from image pixels using deep dynamical models

Authors: Assael J-AM; Deisenroth MP; Schön TB; Wahlström N
Publication date: 08/10/2015

Data-efficient reinforcement learning (RL) in continuous state-action spaces using very high-dimensional observations remains a key challenge in developing fully autonomous systems. We consider a particularly important instance of this challenge, the pixels-to-torques problem, where an RL agent learns a closed-loop control policy (torques) from pixel information only. We introduce a data-efficient, model-based reinforcement learning algorithm that learns such a closed-loop policy directly from pixel information. The key ingredient is a deep dynamical model for learning a low-dimensional feature embedding of images jointly with a predictive model in this low-dimensional feature space. Joint learning is crucial for long-term predictions, which lie at the core of the adaptive nonlinear model predictive control strategy that we use for closed-loop control. Compared to state-of-the-art RL methods for continuous states and actions, our approach learns quickly, scales to high-dimensional state spaces, is lightweight, and is an important step toward fully autonomous end-to-end learning from pixels to torques.

arXiv.org e-Print Archive

Spiral - Imperial College Digital Repository

A Quality and Cost Approach for Comparison of Small-World Networks

Authors: Demichev A.; Ilyin V.; Kryukov A.; Polyakov S.
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 04/01/2013

We propose an approach, based on the analysis of cost-quality tradeoffs, for comparing the efficiency of algorithms for small-world network construction. Several algorithms for constructing complex small-world networks, both known from the literature and original, are briefly reviewed and compared. The networks constructed by these algorithms have the basic structure of a 1D regular lattice with additional shortcuts that provide the small-world properties. It is shown that the networks proposed in this work have the best cost-quality ratio in the considered class.

Comment: 27 pages, 16 figures, 1 tabl
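
The construction described above (a 1D regular lattice plus shortcuts) can be illustrated with a small sketch. It is not the authors' method: the random shortcut rule, the use of edge count as "cost", the use of average shortest-path length as "quality", and all function names are assumptions made for illustration only.

```python
import random
from collections import deque


def ring_lattice(n, k=1):
    """1D regular lattice with periodic boundary: node i links to i+1..i+k."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for d in range(1, k + 1):
            j = (i + d) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def add_random_shortcuts(adj, m, seed=0):
    """Copy adj and add m random extra edges (hypothetical shortcut rule).

    Assumes the graph is far from complete, so m new edges can be found.
    """
    rng = random.Random(seed)
    new = {u: set(vs) for u, vs in adj.items()}
    nodes = list(new)
    added = 0
    while added < m:
        u, v = rng.sample(nodes, 2)
        if v not in new[u]:
            new[u].add(v)
            new[v].add(u)
            added += 1
    return new


def edge_count(adj):
    """Assumed 'cost' measure: number of undirected edges."""
    return sum(len(vs) for vs in adj.values()) // 2


def average_path_length(adj):
    """Assumed 'quality' measure: mean shortest-path length (connected graph)."""
    n = len(adj)
    total = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:  # breadth-first search from s
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    return total / (n * (n - 1))


lattice = ring_lattice(20)
small_world = add_random_shortcuts(lattice, 3)
print(edge_count(lattice), average_path_length(lattice))
print(edge_count(small_world), average_path_length(small_world))
```

Adding edges can never lengthen a shortest path, so the second network always has an average path length no larger than the first, at the price of three extra edges. A cost-quality comparison weighs that gain against the added cost.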

arXiv.org e-Print Archive

CiteSeerX

Maximizing CNN Accelerator Efficiency Through Resource Partitioning

Authors: Alwani M.; Krizhevsky Alex; Li Huimin; van den Oord Aäron
Publication date: 12/04/2018

Convolutional neural networks (CNNs) are revolutionizing machine learning, but they present significant computational challenges. Recently, many FPGA-based accelerators have been proposed to improve the performance and efficiency of CNNs. Current approaches construct a single processor that computes the CNN layers one at a time; the processor is optimized to maximize the throughput at which the collection of layers is computed. However, this approach leads to inefficient designs because the same processor structure is used to compute CNN layers of radically varying dimensions. We present a new CNN accelerator paradigm and an accompanying automated design methodology that partitions the available FPGA resources into multiple processors, each of which is tailored for a different subset of the CNN convolutional layers. Using the same FPGA resources as a single large processor, multiple smaller specialized processors increase computational efficiency and lead to a higher overall throughput. Our design methodology achieves 3.8x higher throughput than the state-of-the-art approach when evaluating the popular AlexNet CNN on a Xilinx Virtex-7 FPGA. For the more recent SqueezeNet and GoogLeNet, the speedups are 2.2x and 2.0x.
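
Why one processor structure can fit some layers poorly may be easier to see with a toy calculation. The sketch below assumes a simple tiling model that is not stated in the abstract: a processor handles `t_n` input channels and `t_m` output channels per cycle, and partial tiles waste the unused multiply-accumulate units. The layer shapes and tile sizes are hypothetical numbers chosen for illustration.

```python
from math import ceil


def utilization(n_in, m_out, t_n, t_m):
    """Fraction of compute units doing useful work under the assumed tiling model.

    n_in, m_out: input/output channel counts of a convolutional layer.
    t_n, t_m:    channels the processor handles in parallel per cycle.
    """
    cycles = ceil(n_in / t_n) * ceil(m_out / t_m)
    return (n_in * m_out) / (cycles * t_n * t_m)


# Hypothetical layer shapes (input channels, output channels).
early_layer = (3, 48)
late_layer = (192, 128)

# One shared processor with 7 * 64 = 448 units (hypothetical).
shared = (7, 64)
shared_early = utilization(*early_layer, *shared)
shared_late = utilization(*late_layer, *shared)

# Two tailored processors within the same budget: 3*16 + 12*32 = 432 units.
tailored_early = utilization(*early_layer, 3, 16)
tailored_late = utilization(*late_layer, 12, 32)

print(shared_early, shared_late)
print(tailored_early, tailored_late)
```

In this toy setting the shared processor leaves most of its units idle on the early layer, while the two smaller processors divide their layers evenly. The sketch covers utilization only and makes no claim about the throughput figures reported in the abstract.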

arXiv.org e-Print Archive

Crossref