Fast, Exact and Multi-Scale Inference for Semantic Image Segmentation with Deep Gaussian CRFs

Author: Chandra Siddhartha
Kokkinos Iasonas
Publication date: 07/09/2016

In this work we propose a structured prediction technique that combines the virtues of Gaussian Conditional Random Fields (G-CRF) with Deep Learning: (a) our structured prediction task has a unique global optimum that is obtained exactly from the solution of a linear system; (b) the gradients of our model parameters are analytically computed using closed-form expressions, in contrast to the memory-demanding contemporary deep structured prediction approaches that rely on back-propagation-through-time; (c) our pairwise terms do not have to be simple hand-crafted expressions, as in the line of works building on the DenseCRF, but can rather be 'discovered' from data through deep architectures; and (d) our system can be trained in an end-to-end manner. Building on standard tools from numerical analysis, we develop very efficient algorithms for inference and learning, as well as a customized technique adapted to the semantic segmentation task. This efficiency allows us to explore more sophisticated architectures for structured prediction in deep learning: we introduce multi-resolution architectures to couple information across scales in a joint optimization framework, yielding systematic improvements. We demonstrate the utility of our approach on the challenging VOC PASCAL 2012 image segmentation benchmark, showing substantial improvements over strong baselines. We make all of our code and experiments available at https://github.com/siddharthachandra/gcrf
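Point (a) of the abstract, exact inference from a linear system, can be illustrated with a small sketch. It assumes the quadratic energy E(x) = ½ xᵀ(A + λI)x − bᵀx, whose unique minimizer solves (A + λI)x = b when A + λI is positive definite. The toy matrix `A`, the vector `b`, the value of `lam`, and the choice of conjugate gradient as the solver are illustrative assumptions, not taken from the authors' code.

```python
# Minimal sketch: exact G-CRF-style inference as a linear solve.
# All names and numbers below are hypothetical, for illustration only.

def matvec(M, v):
    """Dense matrix-vector product."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def conjugate_gradient(M, b, tol=1e-10, max_iter=100):
    """Solve M x = b for a symmetric positive definite M."""
    x = [0.0] * len(b)
    r = list(b)              # residual b - M x, with x = 0
    p = list(r)              # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Mp = matvec(M, p)
        alpha = rs_old / dot(p, Mp)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * mi for ri, mi in zip(r, Mp)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x

# Toy pairwise term: graph Laplacian of a 3-node chain (positive semidefinite).
A = [[1.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 1.0]]
lam = 0.5                    # assumed regularizer making A + lam*I positive definite
M = [[A[i][j] + (lam if i == j else 0.0) for j in range(3)] for i in range(3)]
b = [1.0, 0.0, -1.0]         # toy unary scores

x_star = conjugate_gradient(M, b)
print([round(v, 4) for v in x_star])   # [0.6667, 0.0, -0.6667]
```

Because the energy is strictly convex, the solve returns the global optimum rather than a local one, which is the property the abstract emphasizes.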

arXiv.org e-Print Archive

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Rennes 1

Exascale Deep Learning for Climate Analytics

Author: Deslippe Jack
Fatica Massimiliano
Houston Michael
Kurth Thorsten
Luehr Nathan
Mahesh Ankur
Matheson Michael
Mudigonda Mayur
Phillips Everett
Prabhat
Romero Joshua
Treichler Sean
Publication date: 03/10/2018

We extract pixel-level masks of extreme weather patterns using variants of Tiramisu and DeepLabv3+ neural networks. We describe improvements to the software frameworks, input pipeline, and the network training algorithms necessary to efficiently scale deep learning on the Piz Daint and Summit systems. The Tiramisu network scales to 5300 P100 GPUs with a sustained throughput of 21.0 PF/s and parallel efficiency of 79.0%. DeepLabv3+ scales up to 27360 V100 GPUs with a sustained throughput of 325.8 PF/s and a parallel efficiency of 90.7% in single precision. By taking advantage of the FP16 Tensor Cores, a half-precision version of the DeepLabv3+ network achieves a peak and sustained throughput of 1.13 EF/s and 999.0 PF/s respectively.

Comment: 12 pages, 5 tables, 4 figures, Super Computing Conference, November 11-16, 2018, Dallas, TX, US

arXiv.org e-Print Archive

eScholarship - University of California

Convergence Rate Analysis of Distributed Gossip (Linear Parameter) Estimation: Fundamental Limits and Tradeoffs

Author: Kar Soummya
Moura José M. F.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 07/11/2010

The paper considers gossip distributed estimation of a (static) distributed random field (a.k.a. a large-scale unknown parameter vector) observed by sparsely interconnected sensors, each of which observes only a small fraction of the field. We consider linear distributed estimators whose structure combines the information flow among sensors (the consensus term resulting from the local gossiping exchange among sensors when they are able to communicate) and the information gathering measured by the sensors (the sensing or innovations term). This leads to mixed time scale algorithms: one time scale associated with the consensus and the other with the innovations. The paper establishes a distributed observability condition (global observability plus mean connectedness) under which the distributed estimates are consistent and asymptotically normal. We introduce the distributed notion equivalent to the (centralized) Fisher information rate, which is a bound on the mean square error reduction rate of any distributed estimator; we show that under the appropriate modeling and structural network communication conditions (gossip protocol) the distributed gossip estimator attains this distributed Fisher information rate, asymptotically achieving the performance of the optimal centralized estimator. Finally, we study the behavior of the distributed gossip estimator when the measurements fade (noise variance grows) with time; in particular, we consider the maximum rate at which the noise variance can grow with the distributed estimator still being consistent, by showing that, as long as the centralized estimator is consistent, the distributed estimator remains consistent.

Comment: Submitted for publication, 30 pages
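The consensus-plus-innovations structure described in the abstract can be sketched as a simple simulation. This is a simplified illustration, not the paper's algorithm: it assumes synchronous exchange with all neighbors instead of random gossip, and each sensor observes exactly one coordinate of the parameter vector. The weight sequences `alpha` and `beta`, the ring topology, and every name below are hypothetical.

```python
import random

def gossip_estimate(theta, neighbors, observed, steps=3000, sigma=0.5, seed=0):
    """Each sensor i keeps a full estimate of theta but measures only
    the coordinate observed[i], corrupted by Gaussian noise."""
    rng = random.Random(seed)
    n, d = len(neighbors), len(theta)
    x = [[0.0] * d for _ in range(n)]
    for t in range(steps):
        alpha = 2.0 / (t + 1)          # innovations weight (assumed), decays faster
        beta = 0.2 / (t + 1) ** 0.5    # consensus weight (assumed), decays slower
        new_x = []
        for i in range(n):
            xi = list(x[i])
            # Consensus term: move toward the neighbors' current estimates.
            for k in range(d):
                xi[k] -= beta * sum(x[i][k] - x[j][k] for j in neighbors[i])
            # Innovations term: correct the one coordinate this sensor measures.
            k = observed[i]
            z = theta[k] + rng.gauss(0.0, sigma)
            xi[k] += alpha * (z - x[i][k])
            new_x.append(xi)
        x = new_x
    return x

theta = [2.0, -1.0]                            # unknown field (toy values)
neighbors = [[1, 3], [0, 2], [1, 3], [2, 0]]   # ring of four sensors
observed = [0, 1, 0, 1]                        # coordinate each sensor measures
estimates = gossip_estimate(theta, neighbors, observed)
for i, est in enumerate(estimates):
    print(i, [round(v, 2) for v in est])
```

No single sensor can recover both coordinates from its own measurements; the estimates only approach `theta` because the network is connected and the sensors jointly observe every coordinate, a toy analogue of the distributed observability condition. The two weights decay at different rates, which is what the abstract calls a mixed time scale.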

arXiv.org e-Print Archive

CiteSeerX

A Method Based on Total Variation for Network Modularity Optimization using the MBO Scheme

Author: Andrea L. Bertozzi
Huiyi Hu
Mason A. Porter
Thomas Laurent
Publication date: 01/01/2013

The study of network structure is pervasive in sociology, biology, computer science, and many other disciplines. One of the most important areas of network science is the algorithmic detection of cohesive groups of nodes called "communities". One popular approach to finding communities is to maximize a quality function known as modularity to achieve some sort of optimal clustering of nodes. In this paper, we interpret the modularity function from a novel perspective: we reformulate modularity optimization as a minimization problem of an energy functional that consists of a total variation term and an $\ell_2$ balance term. By employing numerical techniques from image processing and $\ell_1$ compressive sensing, such as convex splitting and the Merriman-Bence-Osher (MBO) scheme, we develop a variational algorithm for the minimization problem. We present our computational results using both synthetic benchmark networks and real data.

Comment: 23 pages
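For readers unfamiliar with the quality function being optimized, the sketch below evaluates the standard Newman-Girvan modularity of a fixed partition of an unweighted, undirected graph. It illustrates only the objective, not the paper's total variation reformulation or the MBO scheme. The toy graph and all names are hypothetical.

```python
def modularity(edges, community):
    """Q = sum over communities c of [ L_c / m - (d_c / (2m))^2 ], where
    m is the number of edges, L_c the number of edges inside c, and
    d_c the total degree of the nodes in c."""
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    total_degree = {}
    for node, k in degree.items():
        c = community[node]
        total_degree[c] = total_degree.get(c, 0) + k
    internal = {}
    for u, v in edges:
        if community[u] == community[v]:
            c = community[u]
            internal[c] = internal.get(c, 0) + 1
    return sum(internal.get(c, 0) / m - (total_degree[c] / (2 * m)) ** 2
               for c in total_degree)

# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
community = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
q = modularity(edges, community)
print(round(q, 4))   # 0.3571, i.e. 5/14
```

Placing all six nodes in one community gives Q = 0, so splitting the graph at the bridge scores higher, which is the behavior a modularity maximizer exploits.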

arXiv.org e-Print Archive

CiteSeerX

eScholarship - University of California

Loyola Marymount University