FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores
Convolution models with long filters have demonstrated state-of-the-art
reasoning abilities in many long-sequence tasks but lag behind the most
optimized Transformers in wall-clock time. A major bottleneck is the Fast
Fourier Transform (FFT)--which allows long convolutions to run in
$O(N \log N)$ time in sequence length but has poor hardware utilization. In this paper,
we study how to optimize the FFT convolution. We find two key bottlenecks: the
FFT does not effectively use specialized matrix multiply units, and it incurs
expensive I/O between layers of the memory hierarchy. In response, we propose
FlashFFTConv. FlashFFTConv uses a matrix decomposition that computes the FFT
using matrix multiply units and enables kernel fusion for long sequences,
reducing I/O. We also present two sparse convolution algorithms--1) partial
convolutions and 2) frequency-sparse convolutions--which can be implemented
simply by skipping blocks in the matrix decomposition, enabling further
opportunities for memory and compute savings. FlashFFTConv speeds up exact FFT
convolutions by up to 7.93× over PyTorch and achieves up to 4.4×
speedup end-to-end. Given the same compute budget, FlashFFTConv allows
Hyena-GPT-s to achieve 2.3 points better perplexity on the PILE and
M2-BERT-base to achieve 3.3 points higher GLUE score--matching models with
twice the parameter count. FlashFFTConv also achieves 96.1% accuracy on
Path-512, a high-resolution vision task where no model had previously achieved
better than 50%. Furthermore, partial convolutions enable longer-sequence
models--yielding the first DNA model that can process the longest human genes
(2.3M base pairs)--and frequency-sparse convolutions speed up pretrained models
while maintaining or improving model quality.
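The FFT convolution these kernels accelerate is easy to state in code. Below is a minimal PyTorch sketch of an $O(N \log N)$ long convolution, a reference baseline rather than the fused FlashFFTConv kernel itself; the function and variable names are illustrative.

```python
import torch

def fft_conv(u: torch.Tensor, k: torch.Tensor) -> torch.Tensor:
    """Causal long convolution of input u (B, L) with filter k (L,)
    via the FFT: O(L log L) instead of O(L^2) for direct convolution.
    A reference sketch only; FlashFFTConv fuses and decomposes these steps."""
    L = u.shape[-1]
    # Zero-pad to 2L so the circular FFT convolution equals linear convolution.
    n = 2 * L
    u_f = torch.fft.rfft(u, n=n)          # (B, n//2 + 1), complex
    k_f = torch.fft.rfft(k, n=n)          # (n//2 + 1,), complex
    y = torch.fft.irfft(u_f * k_f, n=n)   # pointwise multiply in frequency
    return y[..., :L]                     # keep the causal part

# Usage: a batch of 4 sequences of length 1024 with a learned long filter.
u = torch.randn(4, 1024)
k = torch.randn(1024)
y = fft_conv(u, k)
print(y.shape)  # torch.Size([4, 1024])
```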
Mapping the unconventional orbital texture in topological crystalline insulators
The newly discovered topological crystalline insulators (TCIs) harbor a
complex band structure involving multiple Dirac cones. These materials are
potentially highly tunable by external electric field, temperature or strain
and could find future applications in field-effect transistors, photodetectors,
and nano-mechanical systems. Theoretically, it has been predicted that
different Dirac cones, offset in energy and momentum-space, might harbor vastly
different orbital character, a unique property which, if experimentally
realized, would present an ideal platform for building new spintronic
devices. However, the orbital texture of the Dirac cones, which is of immense
importance in determining a variety of materials properties, still remains
elusive in TCIs. Here, we unveil the orbital texture in a prototypical TCI
Pb$_{1-x}$Sn$_x$Se. By using Fourier-transform (FT) scanning tunneling
spectroscopy (STS) we measure the interference patterns produced by the
scattering of surface state electrons. We discover that the intensity and
energy dependences of the FTs show distinct characteristics, which can be directly
attributed to orbital effects. Our experiments reveal the complex band topology
involving two Lifshitz transitions and establish the orbital nature of the
Dirac bands in this new class of topological materials, which could provide a
different pathway towards future quantum applications.
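The FT-STS analysis step itself reduces to taking the 2D Fourier magnitude of real-space conductance maps at each energy. A minimal NumPy sketch of that step, on synthetic data and with illustrative names, assuming only the standard quasiparticle-interference workflow:

```python
import numpy as np

def ft_sts(didv_map: np.ndarray) -> np.ndarray:
    """Return the Fourier-transform magnitude of a real-space dI/dV map.
    Peaks in |FT| correspond to quasiparticle-interference wavevectors
    connecting points on the constant-energy contours."""
    # Subtract the mean so the q = 0 peak does not dominate the transform.
    centered = didv_map - didv_map.mean()
    ft = np.fft.fftshift(np.fft.fft2(centered))
    return np.abs(ft)

# Illustrative synthetic map: standing waves from a single scattering vector.
x, y = np.meshgrid(np.arange(256), np.arange(256))
didv = np.cos(2 * np.pi * 0.1 * x) + 0.05 * np.random.randn(256, 256)
q_space = ft_sts(didv)  # two symmetric peaks at q = (+-0.1, 0)
```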
Semi-Supervised Learning for Sparsely-Labeled Sequential Data: Application to Healthcare Video Processing
Labeled data is a critical resource for training and evaluating machine
learning models. However, many real-life datasets are only partially labeled.
We propose a semi-supervised machine learning training strategy to improve
event detection performance on sequential data, such as video recordings, when
only sparse labels are available, such as event start times without their
corresponding end times. Our method uses noisy guesses of the events' end times
to train event detection models. Depending on how conservative these guesses
are, mislabeled false positives may be introduced into the training set (i.e.,
negative sequences mislabeled as positives). We further propose a mathematical
model for estimating how many inaccurate labels a model is exposed to, based on
how noisy the end time guesses are. Finally, we show that neural networks can
improve their detection performance by leveraging more training data with less
conservative approximations despite the higher proportion of incorrect labels.
We adapt sequential versions of MNIST and CIFAR-10 to empirically evaluate our
method, and find that our risk-tolerant strategy outperforms conservative
estimates by 12 points of mean average precision for MNIST, and 3.5 points for
CIFAR. Then, we leverage the proposed training strategy to tackle a real-life
application: processing continuous video recordings of epilepsy patients to
improve seizure detection, and show that our method outperforms baseline
labeling methods by 10 points of average precision.
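The labeling step can be sketched simply. Below is a minimal Python illustration assuming a fixed guessed event duration; the function name and windowing scheme are hypothetical stand-ins for the authors' pipeline.

```python
import numpy as np

def pseudo_label_frames(n_frames: int, start_times: list[int],
                        guessed_duration: int) -> np.ndarray:
    """Label each frame of a recording given only event start times.

    Frames within [start, start + guessed_duration) are marked positive.
    A longer guessed_duration is less conservative: it yields more positive
    training frames but risks mislabeling post-event frames as positives.
    """
    labels = np.zeros(n_frames, dtype=np.int64)
    for start in start_times:
        end_guess = min(start + guessed_duration, n_frames)
        labels[start:end_guess] = 1
    return labels

# Usage: a 1000-frame video with events starting at frames 120 and 640.
labels = pseudo_label_frames(1000, [120, 640], guessed_duration=80)
print(labels.sum())  # 160 frames pseudo-labeled positive
```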
Singularity in the boundary resistance between superfluid $^4$He and a solid surface
We report new measurements in four cells of the thermal boundary resistance
$R_b$ between copper and superfluid $^4$He below but near the
superfluid-transition temperature $T_\lambda$. Fits of a power law
$R_b = R_0 t^{x_b}$ in the reduced temperature $t \equiv 1 - T/T_\lambda$ to
the data yielded an exponent $x_b$ that differed from the value obtained by
fitting the same form to theoretical values based on the
renormalization-group theory. Alternatively, a good fit of the theory to the
data could be obtained if the {\it amplitude} of the prediction was reduced
by a factor close to two. The results raise the question whether the boundary
conditions used in the theory should be modified.
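Assuming the fit takes the singular form $R_b = R_0 t^{x_b}$ above, the analysis can be sketched with SciPy on synthetic data; all numeric values below are illustrative stand-ins, not the paper's results.

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law(t, R0, xb):
    """Assumed singular form of the boundary resistance, R_b = R0 * t**xb,
    in the reduced temperature t = 1 - T/T_lambda."""
    return R0 * t**xb

# Synthetic data standing in for measured R_b(t); values are illustrative.
t = np.logspace(-5, -2, 40)
Rb = 2.0 * t**0.23 * (1 + 0.02 * np.random.randn(t.size))

(R0_fit, xb_fit), _ = curve_fit(power_law, t, Rb, p0=(1.0, 0.3))
print(f"R0 = {R0_fit:.3f}, x_b = {xb_fit:.3f}")
```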
Shoring Up the Foundations: Fusing Model Embeddings and Weak Supervision
Foundation models offer an exciting new paradigm for constructing models with
out-of-the-box embeddings and a few labeled examples. However, it is not clear
how to best apply foundation models without labeled data. A potential approach
is to fuse foundation models with weak supervision frameworks, which use weak
label sources -- pre-trained models, heuristics, crowd-workers -- to construct
pseudolabels. The challenge is building a combination that best exploits the
signal available in both foundation models and weak sources. We propose Liger,
a combination that uses foundation model embeddings to improve two crucial
elements of existing weak supervision techniques. First, we produce finer
estimates of weak source quality by partitioning the embedding space and
learning per-part source accuracies. Second, we improve source coverage by
extending source votes in embedding space. Despite the black-box nature of
foundation models, we prove results characterizing how our approach improves
performance and show that lift scales with the smoothness of label
distributions in embedding space. On six benchmark NLP and video tasks, Liger
outperforms vanilla weak supervision by 14.1 points, weakly-supervised kNN and
adapters by 11.8 points, and kNN and adapters supervised by traditional hand
labels by 7.2 points.
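The two embedding-based components can be sketched in simplified form. The sketch below uses k-means partitions and a majority-vote stand-in for Liger's latent-variable accuracy estimates; all names, and the estimator itself, are illustrative simplifications of the actual method.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import NearestNeighbors

# votes[i, j] in {-1, 0, +1}: source j's label for point i (0 = abstain).
def liger_sketch(embeddings: np.ndarray, votes: np.ndarray, n_parts: int = 4):
    parts = KMeans(n_clusters=n_parts, n_init=10).fit_predict(embeddings)

    # (1) Per-partition source quality: agreement with the majority vote,
    # a crude stand-in for Liger's latent-variable accuracy estimates.
    majority = np.sign(votes.sum(axis=1))
    acc = np.full((n_parts, votes.shape[1]), 0.5)
    for p in range(n_parts):
        for j in range(votes.shape[1]):
            m = (parts == p) & (votes[:, j] != 0) & (majority != 0)
            if m.any():
                acc[p, j] = (votes[m, j] == majority[m]).mean()

    # (2) Extend coverage: where a source abstains, copy its vote from the
    # nearest point (in embedding space) where it did not abstain.
    extended = votes.copy()
    for j in range(votes.shape[1]):
        has = votes[:, j] != 0
        if has.any() and (~has).any():
            nn = NearestNeighbors(n_neighbors=1).fit(embeddings[has])
            idx = nn.kneighbors(embeddings[~has], return_distance=False)[:, 0]
            extended[~has, j] = votes[has, j][idx]
    return parts, acc, extended
```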
Perfectly Balanced: Improving Transfer and Robustness of Supervised Contrastive Learning
An ideal learned representation should display transferability and
robustness. Supervised contrastive learning (SupCon) is a promising method for
training accurate models, but produces representations that do not capture
these properties due to class collapse -- when all points in a class map to the
same representation. Recent work suggests that "spreading out" these
representations improves them, but the precise mechanism is poorly understood.
We argue that creating spread alone is insufficient for better representations,
since spread is invariant to permutations within classes. Instead, both the
correct degree of spread and a mechanism for breaking this invariance are
necessary. We first prove that adding a weighted class-conditional InfoNCE loss
to SupCon controls the degree of spread. Next, we study three mechanisms to
break permutation invariance: using a constrained encoder, adding a
class-conditional autoencoder, and using data augmentation. We show that the
latter two encourage clustering of latent subclasses under more realistic
conditions than the former. Using these insights, we show that adding a
properly-weighted class-conditional InfoNCE loss and a class-conditional
autoencoder to SupCon achieves 11.1 points of lift on coarse-to-fine transfer
across 5 standard datasets and 4.7 points on worst-group robustness on 3
datasets, setting state-of-the-art on CelebA by 11.5 points.
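One reading of the weighted class-conditional InfoNCE term can be sketched in PyTorch: each anchor's positive is its own augmented view, and its negatives are restricted to points of the same class, so minimizing the loss spreads same-class points apart. The implementation below is an illustrative interpretation, not the authors' code.

```python
import torch
import torch.nn.functional as F

def class_conditional_infonce(z: torch.Tensor, z_aug: torch.Tensor,
                              labels: torch.Tensor, tau: float = 0.1):
    """Class-conditional InfoNCE: positives are augmented views; negatives
    are drawn only from the SAME class, which creates intra-class spread."""
    z = F.normalize(z, dim=1)
    z_aug = F.normalize(z_aug, dim=1)
    sims = z @ torch.cat([z_aug, z]).T / tau      # (B, 2B) similarities
    B = z.shape[0]
    idx = torch.arange(B)
    pos = sims[idx, idx]                          # sim(z_i, z_i') / tau
    # Keep only same-class entries; drop each anchor's self-similarity.
    all_labels = torch.cat([labels, labels])
    same_class = labels[:, None] == all_labels[None, :]
    same_class[idx, idx + B] = False              # exclude z_i vs z_i
    logits = sims.masked_fill(~same_class, float('-inf'))
    return (torch.logsumexp(logits, dim=1) - pos).mean()

# Usage: add to SupCon with a weight w that controls the degree of spread,
# e.g. loss = supcon_loss + w * class_conditional_infonce(z, z_aug, labels).
z, z_aug = torch.randn(8, 128), torch.randn(8, 128)
labels = torch.randint(0, 2, (8,))
print(class_conditional_infonce(z, z_aug, labels))
```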