23 research outputs found

    Shared Microexponents: A Little Shifting Goes a Long Way

    Full text link
This paper introduces Block Data Representations (BDR), a framework for exploring and evaluating a wide spectrum of narrow-precision formats for deep learning. It enables comparison of popular quantization standards, and through BDR, new formats based on shared microexponents (MX) are identified, which outperform other state-of-the-art quantization approaches, including narrow-precision floating-point and block floating-point. MX utilizes multiple levels of quantization scaling with ultra-fine scaling factors based on shared microexponents in hardware. The effectiveness of MX is demonstrated on real-world models including large-scale generative pretraining and inferencing, and production-scale recommendation systems.
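    To make the "multiple levels of scaling" idea concrete, below is a minimal numerical sketch of two-level shared-exponent scaling in the spirit of MX: a coarse exponent shared by a block, plus a small per-sub-block microexponent offset. Block/sub-block sizes and bit widths are illustrative assumptions, not the paper's exact format definition.

```python
import numpy as np

def quantize_mx_block(x, block=16, sub=2, elem_bits=4, micro_bits=1):
    """Quantize one block with a coarse shared exponent plus per-sub-block
    microexponent offsets; returns the dequantized (lossy) values."""
    x = np.asarray(x, dtype=np.float64)
    assert x.size == block and block % sub == 0
    # Level 1: coarse exponent shared by the whole block.
    coarse = int(np.ceil(np.log2(np.max(np.abs(x)) + 1e-30)))
    qmax = 2 ** (elem_bits - 1) - 1          # symmetric integer element range
    out = np.empty_like(x)
    for i in range(0, block, sub):
        chunk = x[i:i + sub]
        # Level 2: small per-sub-block offset (the "microexponent").
        local = int(np.ceil(np.log2(np.max(np.abs(chunk)) + 1e-30)))
        offset = np.clip(coarse - local, 0, 2 ** micro_bits - 1)
        scale = 2.0 ** (coarse - offset) / qmax
        q = np.clip(np.round(chunk / scale), -qmax, qmax)
        out[i:i + sub] = q * scale           # dequantize for inspection
    return out

x = np.random.randn(16)
print(np.abs(x - quantize_mx_block(x)).max())  # quantization error of one block
```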

    Microscaling Data Formats for Deep Learning

    Full text link
    Narrow bit-width data formats are key to reducing the computational and storage costs of modern deep learning applications. This paper evaluates Microscaling (MX) data formats that combine a per-block scaling factor with narrow floating-point and integer types for individual elements. MX formats balance the competing needs of hardware efficiency, model accuracy, and user friction. Empirical results on over two dozen benchmarks demonstrate the practicality of MX data formats as a drop-in replacement for baseline FP32 for AI inference and training with low user friction. We also show the first instance of training generative language models at sub-8-bit weights, activations, and gradients with minimal accuracy loss and no modifications to the training recipe.
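    For intuition, here is a minimal sketch of a single-level Microscaling-style block format: one shared power-of-two scale per 32-element block with narrow integer elements. The block size, element width, and scale encoding are illustrative assumptions rather than the exact OCP MX specification.

```python
import numpy as np

BLOCK, ELEM_BITS = 32, 8
QMAX = 2 ** (ELEM_BITS - 1) - 1

def mx_quantize(x):
    """Return (integer elements, per-block power-of-two scales) for a 1-D array
    whose length is assumed to be a multiple of BLOCK."""
    x = np.asarray(x, dtype=np.float32).reshape(-1, BLOCK)
    exp = np.ceil(np.log2(np.max(np.abs(x), axis=1, keepdims=True) + 1e-30))
    scale = (2.0 ** exp / QMAX).astype(np.float32)   # one scale per block
    q = np.clip(np.round(x / scale), -QMAX, QMAX).astype(np.int8)
    return q, scale

def mx_dequantize(q, scale):
    return (q.astype(np.float32) * scale).reshape(-1)

x = np.random.randn(4 * BLOCK).astype(np.float32)
q, s = mx_quantize(x)
print(np.abs(x - mx_dequantize(q, s)).max())         # reconstruction error
```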

    DropBack: continuous pruning during deep neural network training

    No full text
    In recent years, neural networks have regained popularity in a variety of fields such as image recognition and speech transcription. As deep neural networks grow more popular for solving everyday tasks, deployment on small embedded devices such as phones is becoming increasingly common. Moreover, many applications, such as face recognition or health applications, require personalization, which means that networks must be retrained after they have been deployed. Because today's state-of-the-art networks are too large to fit on mobile devices and exceed mobile device power envelopes, techniques such as pruning and quantization have been developed to shrink pre-trained networks by about an order of magnitude. However, they all assume that the network is first fully trained off-line on datacenter-class GPUs, then pruned in a post-processing step, and only then deployed to the mobile device. In this thesis, we introduce DropBack, a technique that significantly reduces the storage and computation required during both inference and training. In contrast to existing pruning schemes, which retain the weights with the largest values and set the rest to zero, DropBack identifies the weights that have changed the most and recomputes the original initialization values for all other weights. This means that only the most important weights must be stored in off-chip memory during both inference and training, reducing off-chip memory accesses (responsible for a majority of the power usage) by up to 72×. Crucially, networks pruned using DropBack maintain high accuracy even for challenging network architectures: on modern, compact architectures such as DenseNet and WRN-28-10, DropBack outperforms current state-of-the-art pruning techniques in both accuracy and the off-chip memory required for weights. On the CIFAR-10 dataset, we observe a 5× reduction in weights on an already 9×-reduced VGG-16 network, which we call VGG-S, and 4.5× on DenseNet and WRN-28-10, all with zero or negligible accuracy loss, or 19×, 27×, and 36×, respectively, with a minor impact on accuracy. When the recomputed initial weights are decayed to zero, the weight memory footprint of WRN-28-10 can be reduced by up to 72×.
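    The core mechanism is simple to sketch: keep only the k weights that have drifted furthest from their initial values, and regenerate the rest from the initialization PRNG seed so they need no storage. The following is a minimal sketch with hypothetical sizes and a plain SGD step, not the thesis's actual implementation.

```python
import numpy as np

SEED, N, K = 0, 10_000, 1_000   # hypothetical layer size and tracked-weight budget

def initial_weights():
    # Re-derivable from the seed, so untracked weights need no off-chip storage.
    return np.random.default_rng(SEED).standard_normal(N).astype(np.float32)

def dropback_step(w, grad, lr=0.01, k=K):
    w = w - lr * grad                      # ordinary SGD update
    w_init = initial_weights()
    drift = np.abs(w - w_init)             # how far each weight has moved from init
    keep = np.argpartition(drift, -k)[-k:] # the k most-changed weights survive
    pruned = w_init.copy()                 # all other weights revert to their init values
    pruned[keep] = w[keep]
    return pruned

w = initial_weights()
g = np.random.default_rng(1).standard_normal(N).astype(np.float32)
w = dropback_step(w, g)
print((w != initial_weights()).sum())      # at most K weights differ from initialization
```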

    Shaping Human-AI Collaboration: Varied Scaffolding Levels in Co-writing with Language Models

    Full text link
    Advances in language modeling have paved the way for novel human-AI co-writing experiences. This paper explores how varying levels of scaffolding from large language models (LLMs) shape the co-writing process. Employing a within-subjects field experiment with a Latin square design, we asked participants (N=131) to respond to argumentative writing prompts under three randomly sequenced conditions: no AI assistance (control), next-sentence suggestions (low scaffolding), and next-paragraph suggestions (high scaffolding). Our findings reveal a U-shaped impact of scaffolding on writing quality and productivity (words/time). While low scaffolding did not significantly improve writing quality or productivity, high scaffolding led to significant improvements, especially benefiting non-regular writers and less tech-savvy users. No significant cognitive burden was observed while using the scaffolded writing tools, but a moderate decrease in text ownership and satisfaction was noted. Our results have broad implications for the design of AI-powered writing tools, including the need for personalized scaffolding mechanisms.
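    For readers unfamiliar with the design, a 3×3 Latin square simply rotates the order of the three conditions so each appears once in every session position. The sketch below shows one such counterbalancing scheme; the condition labels and participant handling are hypothetical, not the study's actual assignment code.

```python
# Minimal sketch of Latin-square counterbalancing for three writing conditions.
CONDITIONS = ["control", "low_scaffolding", "high_scaffolding"]

# A 3x3 Latin square: each condition appears once per row (order variant)
# and once per column (session position), balancing simple order effects.
LATIN_SQUARE = [
    [CONDITIONS[(g + p) % 3] for p in range(3)]
    for g in range(3)
]

def assign_order(participant_id: int) -> list[str]:
    """Cycle participants through the three row orders."""
    return LATIN_SQUARE[participant_id % 3]

for pid in range(6):
    print(pid, assign_order(pid))
```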

    Terahertz Photogalvanics in Twisted Bilayer Graphene Close to the Second Magic Angle

    No full text
    We report on the observation of photogalvanic effects in twisted bilayer graphene (tBLG) with a twist angle of 0.6 degrees. We show that excitation of the tBLG bulk causes a photocurrent whose sign and magnitude are controlled by the orientation of the radiation electric field and the photon helicity. The observed photocurrent provides evidence for the reduction of the point-group symmetry in low twist-angle tBLG to the lowest possible one. The developed theory shows that the current is formed by asymmetric scattering in gyrotropic tBLG. We also detected the photogalvanic current formed in the vicinity of the edges. For both bulk and edge photocurrents, we demonstrate the emergence of pronounced oscillations upon variation of the gate voltage. The gate voltages associated with the oscillations correlate with peaks in resistance measurements. These are well explained by interband transitions between a multitude of isolated bands in tBLG.
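    For orientation, the dependence on field orientation and helicity follows the standard phenomenological expansion of a photogalvanic current; the form below is the generic symmetry argument, not this paper's specific microscopic result.

```latex
% Generic phenomenological photogalvanic current (standard symmetry expansion):
j_\lambda \;=\; \chi_{\lambda\mu\nu}\,
      \tfrac{1}{2}\bigl(E_\mu E_\nu^{*} + E_\nu E_\mu^{*}\bigr)
  \;+\; \gamma_{\lambda\mu}\, i\bigl(\mathbf{E}\times\mathbf{E}^{*}\bigr)_\mu .
% The first (linear) term tracks the orientation of the radiation electric field,
% the second (circular) term reverses with photon helicity; which components of
% \chi and \gamma survive is dictated by the point-group symmetry, which is why
% such photocurrents probe symmetry reduction in low-angle tBLG.
```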

    Terahertz spin ratchet effect in magnetic metamaterials

    Get PDF
    We report on spin ratchet currents driven by terahertz radiation electric fields in a Co/Pt magnetic metamaterial patterned with triangle-shaped holes that form an antidot lattice, subjected to an external magnetic field applied perpendicular to the metal film plane. We show that, for a radiation wavelength substantially larger than the period of the antidot array, the radiation causes a polarization-independent spin-polarized ratchet current. The current is generated by the periodic asymmetric radiation intensity distribution caused by near-field diffraction at the edges of the antidots, which induces spatially inhomogeneous periodic electron gas heating, together with a phase-shifted periodic asymmetric electrostatic force. The developed microscopic theory shows that the magnetization of the Co/Pt film results in a spin ratchet current caused by both the anomalous Hall and the anomalous Nernst effects. Additionally, we observed a polarization-dependent trigonal spin photocurrent, which is caused by the scattering of electrons at the antidot boundaries, resulting in a spin-polarized current due to the magnetization. The microscopic theory of these effects reveals that the trigonal photocurrent is generated at the boundaries of the triangular antidots, whereas the spin ratchet current is generated by the spatially periodic temperature gradient over the whole film. This difference causes substantially different hysteresis widths for the two currents.
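    As a generic illustration of the ratchet mechanism invoked here (the textbook lateral-asymmetry argument, not this paper's full microscopic theory), the dc ratchet current scales with the spatial average of the near-field intensity modulation times the gradient of the periodic electrostatic potential, and it vanishes unless the two profiles are phase-shifted.

```latex
% Generic electronic-ratchet asymmetry parameter (illustrative): with
% |E(x)|^2 = |E_0|^2 \bigl[1 + h\cos(qx + \varphi_E)\bigr] and
% V(x) = V_1 \cos(qx + \varphi_V),
j_{\mathrm{ratchet}} \;\propto\; \Xi
   \;=\; \Bigl\langle |E(x)|^{2}\,\frac{dV(x)}{dx} \Bigr\rangle_{x}
   \;=\; \tfrac{1}{2}\,|E_0|^{2}\, h\, V_1\, q\,\sin(\varphi_E - \varphi_V),
% which is nonzero only when the intensity modulation and the electrostatic
% force are phase-shifted, as the abstract describes.
```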