Incorporating Class-based Language Model for Named Entity Recognition in Factorized Neural Transducer
In spite of the excellent strides made by end-to-end (E2E) models in speech
recognition in recent years, named entity recognition is still challenging but
critical for semantic understanding. In order to enhance the ability to
recognize named entities in E2E models, previous studies mainly focus on
various rule-based or attention-based contextual biasing algorithms. However,
their performance might be sensitive to the biasing weight or degraded by
excessive attention to the named entity list, along with a risk of false
triggering. Inspired by the success of the class-based language model (LM) in
named entity recognition in conventional hybrid systems and the effective
decoupling of acoustic and linguistic information in the factorized neural
Transducer (FNT), we propose a novel E2E model to incorporate class-based LMs
into FNT, which is referred to as C-FNT. In C-FNT, the language model score of
named entities can be associated with the name class instead of its surface
form. The experimental results show that our proposed C-FNT presents
significant error reduction in named entities without hurting performance in
general word recognition.
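The idea of tying a named entity's language model score to its class rather than its surface form can be illustrated with the standard class-based factorization P(word | history) = P(class | history) * P(word | class). The sketch below is a toy illustration only: the `<NAME>` token, the tables, and every probability value are hypothetical, and it is not the C-FNT scoring procedure, which the abstract does not detail.

```python
import math

# Hypothetical toy tables; all probabilities are made-up illustration values.
# Word-level LM whose vocabulary contains the class token "<NAME>".
P_NEXT = {
    ("call",): {"<NAME>": 0.6, "me": 0.3, "back": 0.1},
}
# Within-class distribution P(word | class).
P_IN_CLASS = {
    "<NAME>": {"alice": 0.5, "bob": 0.5},
}
# Reverse lookup from a surface form to its class.
WORD_TO_CLASS = {w: c for c, members in P_IN_CLASS.items() for w in members}


def log_prob(history, word):
    """Return log P(word | history) under the class-based factorization."""
    dist = P_NEXT[tuple(history)]
    cls = WORD_TO_CLASS.get(word)
    if cls is not None:
        # Named entity: score the class in context, then the word within the class.
        return math.log(dist[cls]) + math.log(P_IN_CLASS[cls][word])
    # Ordinary word: scored directly by the word-level LM.
    return math.log(dist[word])


print(math.exp(log_prob(["call"], "alice")))
print(math.exp(log_prob(["call"], "me")))
```

In this toy setup, changing the list of names only touches `P_IN_CLASS`; the contextual table `P_NEXT` stays as it is.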
Unified Normalization for Accelerating and Stabilizing Transformers
Solid results from Transformers have made them prevailing architectures in
various natural language and vision tasks. As a default component in
Transformers, Layer Normalization (LN) normalizes activations within each token
to boost the robustness. However, LN requires on-the-fly statistics calculation
in inference as well as division and square root operations, leading to
inefficiency on hardware. What is more, replacing LN with other
hardware-efficient normalization schemes (e.g., Batch Normalization) results in
inferior performance, even collapse in training. We find that this dilemma is
caused by abnormal behaviors of activation statistics, including large
fluctuations over iterations and extreme outliers across layers. To tackle
these issues, we propose Unified Normalization (UN), which can speed up the
inference by being fused with other linear operations and achieve
performance on par with LN. UN strives to boost performance by calibrating the
activation and gradient statistics with a tailored fluctuation smoothing
strategy. Meanwhile, an adaptive outlier filtration strategy is applied to
avoid collapse in training, and its effectiveness is theoretically proved and
experimentally verified in this paper. We demonstrate that UN can be an
efficient drop-in alternative to LN by conducting extensive experiments on
language and vision tasks. Besides, we evaluate the efficiency of our method on
GPU. Transformers equipped with UN enjoy about 31% inference speedup and nearly
18% memory reduction. Code will be released at
https://github.com/hikvision-research/Unified-Normalization.
Comment: ACM MM'2
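Why a normalization layer with fixed statistics can be fused with an adjacent linear operation may be easier to see in code. The sketch below assumes a normalization that, at inference, applies stored per-feature `mean` and `var` followed by an affine `gamma` and `beta`, as Batch Normalization does; the function name and shapes are hypothetical, and this is the generic folding trick rather than the released UN implementation.

```python
import numpy as np


def fold_norm_into_linear(mean, var, gamma, beta, W, b, eps=1e-5):
    """Fold y = W @ norm(x) + b into y = W_folded @ x + b_folded.

    Assumes norm(x) = (x - mean) / sqrt(var + eps) * gamma + beta with
    fixed per-feature statistics. W has shape (out_features, in_features).
    """
    scale = gamma / np.sqrt(var + eps)  # computed once, offline
    shift = beta - mean * scale
    W_folded = W * scale                # rescales each input column of W
    b_folded = W @ shift + b
    return W_folded, b_folded


# Check on random data that the folded layer matches the two-step computation.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
mean, var = rng.normal(size=8), rng.uniform(0.5, 2.0, size=8)
gamma, beta = rng.normal(size=8), rng.normal(size=8)
W, b = rng.normal(size=(3, 8)), rng.normal(size=3)

reference = ((x - mean) / np.sqrt(var + 1e-5) * gamma + beta) @ W.T + b
W_f, b_f = fold_norm_into_linear(mean, var, gamma, beta, W, b)
fused = x @ W_f.T + b_f
print(np.allclose(reference, fused))
```

After folding, inference needs no division or square root at run time. Layer Normalization cannot be folded this way because its statistics depend on each input token.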
A re-assessment of the isosteric heat for CCl4 adsorption on graphite
We have carried out molecular simulations of carbon tetrachloride adsorption on graphite, in order to investigate the role of the octopole in potential models for the CCl4/graphite system, and the temperature dependence of the first-order gas-liquid transition in the first adsorbate layer. Two classes of potential model for carbon tetrachloride were considered: the first has five LJ sites and the second includes five partial charges to model the leading octopole. Both models are adequate to represent the vapour-liquid equilibrium, suggesting that the octopole makes an insignificant contribution to the properties of the bulk phase. Both models show that adsorbed CCl4 molecules are delocalized on a graphite surface because of the strong intermolecular interactions. It is found that the LJ sites on the chlorine atoms, not the octopole, play the most important role in matching the experimental isotherm and isosteric heat data with simulation. The heat is constant across the first-order transition of the first adsorbate layer. The simulation results show that both the magnitude of the density jump and the isosteric heat across the first-order transition decrease as the temperature increases. This is in qualitative agreement with the 1972 experimental data of Avgul and Kiselev, but these experimental data exhibit an unusually strong decrease in the isosteric heat, and the coexistence region between the two phases displays an unusual asymmetrical shape. Detailed analysis of our simulation results, together with the calculated isosteric heat from the experimental isotherms of Machin and Ross, shows that there may be errors associated with the heat data of Avgul and Kiselev at high temperatures.
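The abstract does not say how the isosteric heat was calculated from the experimental isotherms. One common route, assumed here, is the Clausius-Clapeyron relation at constant loading, q_st = -R d(ln P)/d(1/T), which treats the gas as ideal and neglects the adsorbed-phase volume. The sketch below applies its two-isotherm form; the function name and all numbers are hypothetical, and the pressures are synthetic, not measured data.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol K)


def isosteric_heat(T1, P1, T2, P2):
    """Two-point Clausius-Clapeyron estimate of the isosteric heat, in J/mol.

    P1 and P2 are the equilibrium pressures at temperatures T1 and T2 (in K),
    read from two isotherms at the SAME adsorbed amount. Only the ratio P2/P1
    matters, so any consistent pressure unit works.
    """
    return R * math.log(P2 / P1) / (1.0 / T1 - 1.0 / T2)


# Synthetic check: build pressures from an assumed heat of 40 kJ/mol
# and confirm that the estimate recovers it.
q_assumed = 40_000.0  # J/mol, made-up value for illustration
pressure = lambda T: math.exp(-q_assumed / (R * T))
q_est = isosteric_heat(273.15, pressure(273.15), 293.15, pressure(293.15))
print(q_est)
```

With only two temperatures the estimate is an average over that interval, so a heat that varies with temperature, as discussed above, needs several closely spaced isotherms.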