9 research outputs found

Convolutional Recurrent Neural Networks for Small-Footprint Keyword Spotting

Author: Arik Sercan O.
Child Rewon
Coates Adam
Fougner Chris
Gibiansky Andrew
Hestness Joel
Kliegl Markus
Prenger Ryan
Publication date: 04/07/2017

Keyword spotting (KWS) constitutes a major component of human-technology interfaces. Maximizing the detection accuracy at a low false alarm (FA) rate, while minimizing the footprint size, latency and complexity are the goals for KWS. Towards achieving them, we study Convolutional Recurrent Neural Networks (CRNNs). Inspired by large-scale state-of-the-art speech recognition systems, we combine the strengths of convolutional layers and recurrent layers to exploit local structure and long-range context. We analyze the effect of architecture parameters, and propose training strategies to improve performance. With only ~230k parameters, our CRNN model yields acceptably low latency, and achieves 97.71% accuracy at 0.5 FA/hour for 5 dB signal-to-noise ratio. Comment: Accepted to Interspeech 201

arXiv.org e-Print Archive

Crossref

PaLM: Scaling Language Modeling with Pathways

Author: Agrawal Shivani
Austin Jacob
Barham Paul
Barnes Parker
Bosma Maarten
Bradbury James
Catasta Michele
Child Rewon
Chowdhery Aakanksha
Chung Hyung Won
Dai Andrew M.
Dean Jeff
Dev Sunipa
Devlin Jacob
Diaz Mark
Dohan David
Du Nan
Duke Toju
Eck Douglas
Fedus Liam
Fiedel Noah
Firat Orhan
Garcia Xavier
Gehrmann Sebastian
Ghemawat Sanjay
Gur-Ari Guy
Hutchinson Ben
Ippolito Daphne
Isard Michael
Lee Katherine
Levskaya Anselm
Lewkowycz Aitor
Lim Hyeontaek
Luan David
Maynez Joshua
Meier-Hellstern Kathy
Michalewski Henryk
Mishra Gaurav
Misra Vedant
Moreira Erica
Narang Sharan
Omernick Mark
Pellat Marie
Petrov Slav
Pillai Thanumalayan Sankaranarayana
Polozov Oleksandr
Pope Reiner
Prabhakaran Vinodkumar
Rao Abhishek
Reif Emily
Roberts Adam
Robinson Kevin
Saeta Brennan
Schuh Parker
Sepassi Ryan
Shazeer Noam
Shi Kensen
Spiridonov Alexander
Sutton Charles
Tay Yi
Tsvyashchenko Sasha
Wang Xuezhi
Wei Jason
Yin Pengcheng
Zhou Denny
Zhou Zongwei
Zoph Barret
Publication date: 19/04/2022

Large language models have been shown to achieve remarkable performance across a variety of natural language tasks using few-shot learning, which drastically reduces the number of task-specific training examples needed to adapt the model to a particular application. To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion parameter, densely activated, Transformer language model, which we call Pathways Language Model (PaLM). We trained PaLM on 6144 TPU v4 chips using Pathways, a new ML system which enables highly efficient training across multiple TPU Pods. We demonstrate continued benefits of scaling by achieving state-of-the-art few-shot learning results on hundreds of language understanding and generation benchmarks. On a number of these tasks, PaLM 540B achieves breakthrough performance, outperforming the finetuned state-of-the-art on a suite of multi-step reasoning tasks, and outperforming average human performance on the recently released BIG-bench benchmark. A significant number of BIG-bench tasks showed discontinuous improvements from model scale, meaning that performance steeply increased as we scaled to our largest model. PaLM also has strong capabilities in multilingual tasks and source code generation, which we demonstrate on a wide array of benchmarks. We additionally provide a comprehensive analysis on bias and toxicity, and study the extent of training data memorization with respect to model scale. Finally, we discuss the ethical considerations related to large language models and discuss potential mitigation strategies.

arXiv.org e-Print Archive

Theoretical Limitations of Self-Attention in Neural Sequence Models

Author: Barrington David A. Mix
Bengio Yoshua
Bernardy Jean-Philippe
Bo Cartling
Boppana Ravi B.
Chen Mia Xu
Cheng Jianpeng
Child Rewon
Chomsky Noam
Clark Kevin
Dai Zihang
Dehghani Mostafa
Devlin Jacob
Everaert Martin B. H.
Furst Merrick
Gers Felix A.
Gibson Edward
Gopalan Parikshit
Grüning André
Gulordava Kristina
Hao Jie
Hastad Johan
Hsieh Yu-Lun
Kalinke Yvonne
Ke Tran
Kirov Christo
Korsky Samuel A.
Kuncoro Adhiguna
Lewis Richard L.
Lin Yongjie
Lin Zhouhan
Marvin Rebecca
McNaughton Robert
Merrill William
Miller George A.
Miller John
Mitzenmacher Michael
Palma Giacomo De
Parikh Ankur
Parker Dan
Paulus Romain
Pérez Jorge
Sennhauser Luzi
Shen Tao
Shieber Stuart M.
Siegelman Hava
Skachkova Natalia
Suzgun Mirac
Tabor Whitney
Tenney Ian
Thomas McCoy R.
Vaswani Ashish
Voita Elena
Weiss Gail
Yang Baosong
Publication venue: 'MIT Press - Journals'

Crossref