407 research outputs found
Programming Not Only by Example
In recent years, there has been tremendous progress in automated synthesis
techniques that are able to automatically generate code based on some intent
expressed by the programmer. A major challenge for the adoption of synthesis
remains having the programmer communicate their intent. When the expressed
intent is coarse-grained (for example, a restriction on the expected type of an
expression), the synthesizer often produces a long list of results for the
programmer to choose from, shifting the heavy lifting to the user. An
alternative approach, successfully used in end-user synthesis, is programming by
example (PBE), where the user leverages examples to interactively and
iteratively refine the intent. However, using only examples is not expressive
enough for programmers, who can observe the generated program and refine the
intent by directly relating to parts of the generated program.
We present a novel approach to interacting with a synthesizer using a
granular interaction model. Our approach employs a rich interaction model where
(i) the synthesizer decorates a candidate program with debug information that
assists in understanding the program and identifying good or bad parts, and
(ii) the user is allowed to provide feedback not only on the expected output of
a program, but also on the underlying program itself. That is, when the user
identifies a program as (partially) correct or incorrect, they can also
explicitly indicate the good or bad parts, to allow the synthesizer to accept
or discard parts of the program instead of discarding the program as a whole.
We show the value of our approach in a controlled user study. Our study shows
that participants have a strong preference for granular feedback over
examples, and are able to provide granular feedback much faster.
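The core idea of granular feedback can be illustrated with a toy enumerative synthesizer (all names and the component library here are hypothetical illustrations, not the paper's actual system): when the user marks a part of a candidate program as bad, the synthesizer prunes every future candidate that reuses that part, instead of rejecting whole programs one at a time.

```python
from itertools import product

# Toy component library: programs are pipelines of string transformations.
COMPONENTS = {
    "upper": str.upper,
    "strip": str.strip,
    "reverse": lambda s: s[::-1],
}

def run(prog, s):
    """Apply a pipeline of named components left to right."""
    for name in prog:
        s = COMPONENTS[name](s)
    return s

def synthesize(examples, banned=frozenset(), max_len=3):
    """Enumerate candidate pipelines, discarding any candidate that
    contains a part the user explicitly marked as bad (granular
    feedback), and return the first one consistent with all examples."""
    for length in range(1, max_len + 1):
        for prog in product(COMPONENTS, repeat=length):
            if banned & set(prog):
                continue  # prune programs containing rejected parts
            if all(run(prog, inp) == out for inp, out in examples):
                return prog
    return None
```

For the example `("  abc ", "CBA")`, the synthesizer finds a `upper`/`strip`/`reverse` pipeline; banning a single component then excludes the entire family of programs that contain it from the search space.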
Are item-level strategy shifts abrupt and collective? Age differences in cognitive skill acquisition
Item-level analysis allows for the examination of qualitative age and individual differences in skill acquisition, which are obscured when aggregating data across items. In the present study, item-level strategy shifts were generally gradual and variable, rather than abrupt and collective. Strategy shift reversions were frequent, and the total transition space was extensive, for both younger and older adults. Shift indices were highly variable between items for both younger and older adults. Age differences in item-level shift patterns suggest that older adults’ greater conservatism in strategy selection leads to more gradual strategy shift transitions for individual items as well as to more collective strategy shifts
Root Mean Square Layer Normalization
Layer normalization (LayerNorm) has been successfully applied to various deep
neural networks to help stabilize training and boost model convergence because
of its capability in handling re-centering and re-scaling of both inputs and
weight matrix. However, the computational overhead introduced by LayerNorm
makes these improvements expensive and significantly slows the underlying
network, RNNs in particular. In this paper, we hypothesize that
re-centering invariance in LayerNorm is dispensable and propose root mean
square layer normalization, or RMSNorm. RMSNorm regularizes the summed inputs
to a neuron in one layer according to root mean square (RMS), giving the model
re-scaling invariance property and implicit learning rate adaptation ability.
RMSNorm is computationally simpler and thus more efficient than LayerNorm. We
also present partial RMSNorm, or pRMSNorm, where the RMS is estimated from p% of
the summed inputs without breaking the above properties. Extensive experiments
on several tasks using diverse network architectures show that RMSNorm achieves
comparable performance against LayerNorm but reduces the running time by 7%–64%
on different models. Source code is available at
https://github.com/bzhangGo/rmsnorm. Published at NeurIPS 2019.
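The difference between LayerNorm and RMSNorm can be sketched in a few lines of NumPy (a minimal sketch of the formulas described in the abstract, not the authors' released implementation; the fraction `p` in `prms_norm` is an illustrative parameter):

```python
import numpy as np

def layer_norm(x, gain, bias, eps=1e-8):
    # LayerNorm: re-center (subtract mean) and re-scale (divide by std).
    mu = x.mean(axis=-1, keepdims=True)
    sigma = x.std(axis=-1, keepdims=True)
    return gain * (x - mu) / (sigma + eps) + bias

def rms_norm(x, gain, eps=1e-8):
    # RMSNorm: drop re-centering; normalize by the root mean square only.
    rms = np.sqrt(np.mean(x ** 2, axis=-1, keepdims=True) + eps)
    return gain * x / rms

def prms_norm(x, gain, p=0.25, eps=1e-8):
    # pRMSNorm: estimate the RMS from the first fraction p of the inputs.
    n = x.shape[-1]
    k = max(1, int(n * p))
    rms = np.sqrt(np.mean(x[..., :k] ** 2, axis=-1, keepdims=True) + eps)
    return gain * x / rms
```

Dropping the mean subtraction is what makes RMSNorm cheaper: it preserves the re-scaling invariance (`rms_norm(c * x, g)` equals `rms_norm(x, g)` up to the epsilon) while skipping one reduction pass over the inputs.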
A 16-nm SoC for Noise-Robust Speech and NLP Edge AI Inference With Bayesian Sound Source Separation and Attention-Based DNNs
The proliferation of personal artificial intelligence (AI)-assistant technologies with speech-based conversational AI interfaces is driving the exponential growth in the consumer Internet of Things (IoT) market. As these technologies are being applied to keyword spotting (KWS), automatic speech recognition (ASR), natural language processing (NLP), and text-to-speech (TTS) applications, it is of paramount importance that they provide uncompromising performance for context learning in long sequences, which is a key benefit of the attention mechanism, and that they work seamlessly in polyphonic environments. In this work, we present a 25-mm² system-on-chip (SoC) in 16-nm FinFET technology, codenamed SM6, which executes end-to-end speech-enhancing attention-based ASR and NLP workloads. The SoC includes: 1) FlexASR, a highly reconfigurable NLP inference processor optimized for whole-model acceleration of bidirectional attention-based sequence-to-sequence (seq2seq) deep neural networks (DNNs); 2) a Markov random field source separation engine (MSSE), a probabilistic graphical model accelerator for unsupervised inference via Gibbs sampling, used for sound source separation; 3) a dual-core Arm Cortex-A53 CPU cluster, which provides on-demand single instruction/multiple data (SIMD) fast Fourier transform (FFT) processing and performs various application logic (e.g., the expectation–maximization (EM) algorithm and 8-bit floating-point (FP8) quantization); and 4) an always-on M0 subsystem for audio detection and power management. Measurement results demonstrate efficiency ranges of 2.6–7.8 TFLOPs/W and 4.33–17.6 Gsamples/s/W for FlexASR and MSSE, respectively; MSSE denoising performance allowing a 6× smaller ASR model to be stored on-chip with negligible accuracy loss; and 2.24-mJ energy consumption while achieving real-time throughput, with end-to-end per-frame ASR latencies of 18 ms.
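The style of MRF inference the MSSE accelerates can be illustrated on a toy problem (a hypothetical sketch of Gibbs sampling on an Ising-type Markov random field, not SM6's actual engine or its audio model): infer a clean binary mask x ∈ {−1, +1} from a noisy observation y, combining a smoothness prior over neighbors with a data term, by repeatedly resampling each site from its conditional distribution.

```python
import numpy as np

def gibbs_denoise(noisy, iters=30, beta=2.0, eta=1.5, rng=None):
    """Gibbs sampling on an Ising MRF: energy
    E(x) = -beta * sum_neighbors x_i x_j - eta * sum_i x_i y_i,
    so P(x_ij = +1 | rest) = sigmoid(2 * (beta * s_ij + eta * y_ij)),
    where s_ij is the sum of the 4-neighborhood of site (i, j)."""
    if rng is None:
        rng = np.random.default_rng(0)
    x = noisy.copy()
    H, W = x.shape
    for _ in range(iters):
        for i in range(H):
            for j in range(W):
                # Smoothness term: sum over in-bounds 4-neighbors.
                s = sum(x[a, b]
                        for a, b in ((i - 1, j), (i + 1, j),
                                     (i, j - 1), (i, j + 1))
                        if 0 <= a < H and 0 <= b < W)
                logit = 2 * (beta * s + eta * noisy[i, j])
                p = 1.0 / (1.0 + np.exp(-logit))  # P(x_ij = +1)
                x[i, j] = 1 if rng.random() < p else -1
    return x
```

A hardware engine like the MSSE exploits the locality of these conditional updates: each resampling step touches only a site's neighbors, which is what makes the sampler amenable to parallel, low-precision acceleration.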
- …