9 research outputs found
Convolutional Recurrent Neural Networks for Small-Footprint Keyword Spotting
Keyword spotting (KWS) constitutes a major component of human-technology
interfaces. Maximizing the detection accuracy at a low false alarm (FA) rate,
while minimizing the footprint size, latency and complexity are the goals for
KWS. Towards achieving them, we study Convolutional Recurrent Neural Networks
(CRNNs). Inspired by large-scale state-of-the-art speech recognition systems,
we combine the strengths of convolutional layers and recurrent layers to
exploit local structure and long-range context. We analyze the effect of
architecture parameters, and propose training strategies to improve
performance. With only ~230k parameters, our CRNN model yields acceptably low
latency, and achieves 97.71% accuracy at 0.5 FA/hour for 5 dB signal-to-noise
ratio.Comment: Accepted to Interspeech 201
PaLM: Scaling Language Modeling with Pathways
Large language models have been shown to achieve remarkable performance
across a variety of natural language tasks using few-shot learning, which
drastically reduces the number of task-specific training examples needed to
adapt the model to a particular application. To further our understanding of
the impact of scale on few-shot learning, we trained a 540-billion parameter,
densely activated, Transformer language model, which we call Pathways Language
Model PaLM. We trained PaLM on 6144 TPU v4 chips using Pathways, a new ML
system which enables highly efficient training across multiple TPU Pods. We
demonstrate continued benefits of scaling by achieving state-of-the-art
few-shot learning results on hundreds of language understanding and generation
benchmarks. On a number of these tasks, PaLM 540B achieves breakthrough
performance, outperforming the finetuned state-of-the-art on a suite of
multi-step reasoning tasks, and outperforming average human performance on the
recently released BIG-bench benchmark. A significant number of BIG-bench tasks
showed discontinuous improvements from model scale, meaning that performance
steeply increased as we scaled to our largest model. PaLM also has strong
capabilities in multilingual tasks and source code generation, which we
demonstrate on a wide array of benchmarks. We additionally provide a
comprehensive analysis on bias and toxicity, and study the extent of training
data memorization with respect to model scale. Finally, we discuss the ethical
considerations related to large language models and discuss potential
mitigation strategies