168 research outputs found

Memory-efficient array redistribution through portable collective communication

Author: Paszke Adam
Rink Norman A.
Schmid Georg Stefan
Vytiniotis Dimitrios
Publication date: 28/11/2022

Modern large-scale deep learning workloads highlight the need for parallel execution across many devices in order to fit model data into hardware accelerator memories. In these settings, array redistribution may be required during a computation, but can also become a bottleneck if not done efficiently. In this paper we address the problem of redistributing multi-dimensional array data in SPMD computations, the most prevalent form of parallelism in deep learning. We present a type-directed approach to synthesizing array redistributions as sequences of MPI-style collective operations. We prove formally that our synthesized redistributions are memory-efficient and perform no excessive data transfers. Array redistribution for SPMD computations using collective operations has also been implemented in the context of the XLA SPMD partitioner, a production-grade tool for partitioning programs across accelerator systems. We evaluate our approach against the XLA implementation and find that our approach delivers a geometric mean speedup of $1.22\times$, with maximum speedups as high as $5.7\times$, while offering provable memory guarantees, making our system particularly appealing for large-scale models. Comment: minor errata fixed

arXiv.org e-Print Archive

BATS: Binary ArchitecTure Search

Author: A Paszke
C Liu
D Zhang
M Rastegari
N Ma
Publication date: 23/07/2020

This paper proposes Binary ArchitecTure Search (BATS), a framework that drastically reduces the accuracy gap between binary neural networks and their real-valued counterparts by means of Neural Architecture Search (NAS). We show that directly applying NAS to the binary domain provides very poor results. To alleviate this, we describe, to our knowledge, for the first time, the 3 key ingredients for successfully applying NAS to the binary domain. Specifically, we (1) introduce and design a novel binary-oriented search space, (2) propose a new mechanism for controlling and stabilising the resulting searched topologies, (3) propose and validate a series of new search strategies for binary networks that lead to faster convergence and lower search times. Experimental results demonstrate the effectiveness of the proposed approach and the necessity of searching in the binary space directly. Moreover, (4) we set a new state-of-the-art for binary neural networks on CIFAR10, CIFAR100 and ImageNet datasets. Code will be made available at https://github.com/1adrianb/binary-nas. Comment: accepted to ECCV 202

arXiv.org e-Print Archive

Crossref

Experimental comparison of features and classifiers for Android malware detection

Author: Barandiaran I.
Chen K.
Huang C.-Y.
Paszke A.
Witten I. H.
Yuan Z.
Zhang H.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2020

National Research Foundation (NRF) Singapore

Crossref

Archivio della ricerca - Fondazione Bruno Kessler

Institutional Knowledge at Singapore Management University

Catalogo dei prodotti della ricerca

Metastable neural dynamics underlies cognitive performance across multiple behavioural paradigms

Author: Bassett D. S.
Castelli F.
Gabor D.
Kelso J. A. S.
Kucyi A.
Laumann T. O.
Panter P.
Paszke A.
Srivastava N.
Publication venue: 'Wiley'
Publication date: 15/08/2020

Crossref

Ulster University's Research Portal

Measurements, Models, Systems and Design

Author: Adamski M. Węgrzyn, M. Węgrzyn, A.
Barkalov A. Titarenko, L.
Benysek G. Jarnut, M. Rusiński, J.
Fedyczak Z. Szcześniak, P. Kaniewski, J.
Furmankiewicz L. Kozioł, M. Kłosiński, R.
Gałkowski K. Paszke, W. Sulikowski, B.
Gielerak R. Kuriata, E. Sawerwain, M. Pawłowski, K.
Kempski A. Smoleński, R. Kot, E.
Korbicz J. Witczak, M. Patan, K. Janczak, A. Mrugalski, M.
Korotyeyev I. Kasperek, R.
Michta E. Markowski, A.
Miczulski W. Szulim, R.
Nikiel S. Steć, P.
Obuchowicz A. Pieczyński, A. Kowal, M. Prętki, P.
Olencki A. Szmytkiewicz, J. Urbański, K.
Popławski A. Zając, W.
Rybski R. Kaczmarek J. Lal-Jadziak, J.
Uciński D. Patan, M. Kuczewski, B.
Publication venue: Wydawnictwa Komunikacji i Łączności, Warszawa
Publication date: 01/01/2007

531 pages

Zielonogórska Biblioteka Cyfrowa (Digital Library of Zielona Gora)

PartIR: Composing SPMD Partitioning Strategies for Machine Learning

Author: Alabed Sami
Belov Daniel
Chrzaszcz Bart
Franco Juliana
Grewe Dominik
Maclaurin Dougal
Molloy James
Natan Tom
Norman Tamara
Pan Xiaoyue
Paszke Adam
Rink Norman A.
Schaarschmidt Michael
Sitdikov Timur
Swietlik Agnieszka
Vytiniotis Dimitrios
Wee Joel
Publication date: 03/03/2024

Training of modern large neural networks (NN) requires a combination of parallelization strategies encompassing data, model, or optimizer sharding. When strategies increase in complexity, it becomes necessary for partitioning tools to be 1) expressive, allowing the composition of simpler strategies, and 2) predictable to estimate performance analytically. We present PartIR, our design for a NN partitioning system. PartIR is focused on an incremental approach to rewriting and is hardware-and-runtime agnostic. We present a simple but powerful API for composing sharding strategies and a simulator to validate them. The process is driven by high-level programmer-issued partitioning tactics, which can be both manual and automatic. Importantly, the tactics are specified separately from the model code, making them easy to change. We evaluate PartIR on several different models to demonstrate its predictability, expressibility, and ability to reach peak performance.

arXiv.org e-Print Archive

DAMO: Deep Agile Mask Optimization for Full Chip Scale

Author: Chen Q.
Chen T.
Gao J.-R.
Goodfellow I.
Isola P.
Jiang B.
Johnson J.
Kingma D. P.
Kuang J.
Li H.
Ma Y.
Mirza M.
Pang L.
Park J.-S.
Paszke A.
Radford A.
Ronneberger O.
Wang T.-C.
Xu B.
Yang H.
Yang H.
Yang H.
Ye W.
Zhong W.
Zhou Z.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 16/11/2020

Continuous scaling of the VLSI system leaves a great challenge on manufacturing, and optical proximity correction (OPC) is widely applied in conventional design flow for manufacturability optimization. Traditional techniques conducted OPC by leveraging a lithography model and suffered from prohibitive computational overhead, and mostly focused on optimizing a single clip without addressing how to tackle the full chip. In this paper, we present DAMO, a high performance and scalable deep learning-enabled OPC system for full chip scale. It is an end-to-end mask optimization paradigm which contains a Deep Lithography Simulator (DLS) for lithography modeling and a Deep Mask Generator (DMG) for mask pattern generation. Moreover, a novel layout splitting algorithm customized for DAMO is proposed to handle the full chip OPC problem. Extensive experiments show that DAMO outperforms the state-of-the-art OPC solutions in both academia and industrial commercial toolkit.

arXiv.org e-Print Archive

Crossref

Predicting the Propagation of Acoustic Waves using Deep Convolutional Neural Networks

Author: Goldstein M. E.
Kingma D. P.
Malaspinas O.
Mathieu M.
Nair V.
Paszke A.
Ronneberger O.
Sorteberg W. E.
Tompson J.
Wang S.
Zhu W.
Publication venue: 'American Institute of Aeronautics and Astronautics (AIAA)'
Publication date: 01/01/2020

A novel approach for numerically propagating acoustic waves in two-dimensional quiescent media has been developed through a fully convolutional multi-scale neural network. This data-driven method managed to produce accurate results for long simulation times with a database of Lattice Boltzmann temporal simulations of propagating Gaussian Pulses, even in the case of initial conditions unseen during training time, such as the plane wave configuration or the two initial Gaussian pulses of opposed amplitudes. Two different choices of optimization objectives are compared, resulting in an improved prediction accuracy when adding the spatial gradient difference error to the traditional mean squared error loss function. Further accuracy gains are observed when performing an a posteriori correction on the neural network prediction based on the conservation of acoustic energy, indicating the benefit of including physical information in data-driven methods.

Crossref

Open Archive Toulouse Archive Ouverte

HAL Descartes

Hal-Diderot

Constant Velocity Constraints for Self-Supervised Monocular Depth Estimation

Author: Aleotti Filippo
Babu V Madhu
Eigen David
Fu Huan
Geiger Andreas
Godard Clément
He Kaiming
Klodt Maria
Kuznietsov Yevhen
Laina Iro
Li Bo
Li Ruihao
Jaderberg Max
Newcombe A.
P.
Paszke Adam
Ranjan Anurag
Yin Zhichao
Zhan Huangying
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 07/12/2020

We present a new method for self-supervised monocular depth estimation. Contemporary monocular depth estimation methods use a triplet of consecutive video frames to estimate the central depth image. We make the assumption that the ego-centric view progresses linearly in the scene, based on the kinematic and physical properties of the camera. During the training phase, we can exploit this assumption to create a depth estimation for each image in the triplet. We then apply a new geometry constraint that supports novel synthetic views, thus providing a strong supervisory signal. Our contribution is simple to implement, requires no additional trainable parameter, and produces competitive results when compared with other state-of-the-art methods on the popular KITTI corpus.

Crossref

University of East Anglia digital repository

On 2D integro-differential systems. Stability and sensitivity analysis

Author: A Bielecki
BV Limaye
D Bors
D Idczak
E Berkson
E Fornasini
J Šremr
J-P Aubin
K Galkowski
Marek Majewski
Monika Bartkiewicz
S Walczak
Stanisław Walczak
T Kaczorek
V Lakshmikantham
V Lomadze
V Singh
W Paszke
Publication venue: 'Springer Science and Business Media LLC'

Crossref