262 research outputs found
A Yaw Stability Control Algorithm for Four-Wheel Independently Actuated Electric Ground Vehicles considering Control Boundaries
A hierarchical direct yaw moment control algorithm for four-wheel independently actuated (FWIA) electric ground vehicles is presented. In the higher layer, sliding mode control is adopted to yield the desired yaw moment because it tolerates modeling inaccuracies and parametric uncertainties. The conditional integrator approach is employed to overcome the chattering issue; it enables a smooth transition to a proportional + integral-like controller, with antiwindup, when the system enters the boundary layer. The lower layer allocates the desired yaw moment to the four wheels by means of slip ratio distribution and control, for a better grasp of control boundaries. Simulation results, obtained with the vehicle dynamics simulator CarSim and Matlab/Simulink, show the effectiveness of the control algorithm.
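The transition from sliding mode control to PI-like behavior inside the boundary layer can be illustrated with a minimal sketch. It is not the paper's controller: the first-order yaw-rate plant `r' = -a*r + b*u + d`, every gain and parameter value, and the function names are assumptions chosen for illustration, and the integrator law shown is one common form of the conditional integrator.

```python
def sat(x):
    """Saturation function: clip x to [-1, 1]."""
    return max(-1.0, min(1.0, x))


def simulate(r_ref=0.3, d=0.5, a=2.0, b=1.0,
             k=5.0, k0=1.0, mu=0.1, dt=1e-3, t_end=10.0):
    """Track yaw rate r_ref on the toy plant r' = -a*r + b*u + d.

    All parameter values are illustrative assumptions, not vehicle data.
    Returns the final tracking error e = r - r_ref.
    """
    r = 0.0      # yaw rate
    sigma = 0.0  # conditional integrator state
    for _ in range(int(t_end / dt)):
        e = r - r_ref
        s = k0 * sigma + e    # sliding variable
        u = -k * sat(s / mu)  # yaw moment command (continuous SMC)
        # Conditional integrator: inside the boundary layer (|s| <= mu)
        # this reduces to sigma' = e, i.e. plain integral action. Outside
        # it, sigma stays bounded by mu / k0, which acts as antiwindup.
        sigma += dt * (-k0 * sigma + mu * sat(s / mu))
        r += dt * (-a * r + b * u + d)  # forward Euler plant update
    return r - r_ref
```

With a constant disturbance `d`, the integral action inside the boundary layer is what drives the steady-state error toward zero; replacing `sat` with a sign function would recover the chattering controller the abstract sets out to avoid.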
Improving the Performance of Online Neural Transducer Models
Having a sequence-to-sequence model which can operate in an online fashion is
important for streaming applications such as Voice Search. The neural transducer (NT) is
a streaming sequence-to-sequence model, but has shown a significant degradation
in performance compared to non-streaming models such as Listen, Attend and
Spell (LAS). In this paper, we present various improvements to NT.
Specifically, we look at increasing the window over which NT computes
attention, mainly by looking backwards in time so the model still remains
online. In addition, we explore initializing a NT model from a LAS-trained
model so that it is guided with a better alignment. Finally, we explore
incorporating stronger language models, for example by using wordpiece models and applying
an external LM during the beam search. On a Voice Search task, we find that with
these improvements NT can match the performance of LAS.
Carrying capacity analysis and optimizing of hydrostatic slider bearings under inertial force and vibration impact using finite difference method (FDM)
The accuracy of a machine tool with a gantry frame is reduced by vibration caused by inertial force impact. Hydrostatic slider bearings are key structures of machine tools and play an important role in improving impact and vibration resistance, and hence dynamic performance. In this work, an incline model that combines bending deformation with linear displacement is simulated, using the working conditions of the straddle carrier under inertial force impact as an imitation of vibration amplitude. Using the finite difference method (FDM), the pressure distribution in the oil pad is determined numerically by solving the Reynolds equation. The relationship between carrying capability and incline extent is then determined by analyzing the solution of the Reynolds equation. A new oil pad size optimization process, based on these analysis results, is proposed to enhance inertial force resistance. Finally, the impact resistance capacity of the machine tool can be improved by sacrificing oil film thickness.
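The FDM step described above can be sketched on a deliberately reduced problem. The example below assumes a one-dimensional, steady, purely pressure-driven film, d/dx(h^3 dp/dx) = 0, with fixed inlet and outlet pressures and constant viscosity; the paper's actual geometry, boundary conditions, and incline model are not given in the abstract, so the function names and all values here are hypothetical.

```python
def solve_pressure(h, p_in, p_out=0.0):
    """Solve d/dx(h^3 dp/dx) = 0 on a uniform grid by finite differences.

    h is the film thickness at each grid node; p_in and p_out are the
    assumed boundary pressures. Returns the pressure at every node.
    """
    n = len(h)
    if n < 3:
        raise ValueError("need at least three grid nodes")
    # Flow conductance h^3 evaluated midway between neighboring nodes.
    k = [((h[i] + h[i + 1]) / 2.0) ** 3 for i in range(n - 1)]
    m = n - 2  # number of interior unknowns
    a = [k[j] for j in range(m)]                # sub-diagonal
    b = [-(k[j] + k[j + 1]) for j in range(m)]  # main diagonal
    c = [k[j + 1] for j in range(m)]            # super-diagonal
    d = [0.0] * m
    d[0] -= k[0] * p_in     # known inlet pressure moved to the right side
    d[-1] -= k[-1] * p_out  # known outlet pressure moved to the right side
    # Thomas algorithm: forward elimination, then back substitution.
    for j in range(1, m):
        w = a[j] / b[j - 1]
        b[j] -= w * c[j - 1]
        d[j] -= w * d[j - 1]
    x = [0.0] * m
    x[-1] = d[-1] / b[-1]
    for j in range(m - 2, -1, -1):
        x[j] = (d[j] - c[j] * x[j + 1]) / b[j]
    return [p_in] + x + [p_out]


def load_per_width(p, length):
    """Carrying load per unit width: trapezoidal integral of p over x."""
    dx = length / (len(p) - 1)
    return dx * (sum(p) - 0.5 * (p[0] + p[-1]))


flat = solve_pressure([1.0] * 11, p_in=1.0)
tilted = solve_pressure([1.0 - 0.05 * i for i in range(11)], p_in=1.0)
print(load_per_width(flat, 1.0), load_per_width(tilted, 1.0))
```

In this toy setting, a film that narrows toward the outlet carries more load than a parallel one at the same inlet pressure, which is the kind of load-versus-incline relationship the abstract refers to; a two-dimensional pad would need the full Reynolds equation and an iterative or sparse solver.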
Experimental investigations of quasi-coherent micro-instabilities in Ohmic plasmas
The ITG and TEM instabilities with quasi-coherent spectra have been
identified experimentally, by the newly developed far-forward collective
scattering measurements in J-TEXT tokamak Ohmic plasmas. The ITG mode has
characteristic frequencies in the range of 30-100 kHz and a wavenumber of
$k_\theta\rho_s<0.3$. After the plasma density exceeds a critical value, the ITG
mode shows a bifurcation behavior, featured by frequency decrease and amplitude
enhancement. Meanwhile, ion energy loss enhancement and confinement
degradation are also observed. This gives direct experimental evidence for
ion thermal transport caused by the ITG instability.
No Need for a Lexicon? Evaluating the Value of the Pronunciation Lexica in End-to-End Models
For decades, context-dependent phonemes have been the dominant sub-word unit
for conventional acoustic modeling systems. This status quo has begun to be
challenged recently by end-to-end models which seek to combine acoustic,
pronunciation, and language model components into a single neural network. Such
systems, which typically predict graphemes or words, simplify the recognition
process since they remove the need for a separate expert-curated pronunciation
lexicon to map from phoneme-based units to words. However, there has been
little previous work comparing phoneme-based versus grapheme-based sub-word
units in the end-to-end modeling framework, to determine whether the gains from
such approaches are primarily due to the new probabilistic model, or from the
joint learning of the various components with grapheme-based units.
In this work, we conduct detailed experiments which are aimed at quantifying
the value of phoneme-based pronunciation lexica in the context of end-to-end
models. We examine phoneme-based end-to-end models, which are contrasted
against grapheme-based ones on a large vocabulary English Voice-search task,
where we find that graphemes do indeed outperform phonemes. We also compare
grapheme-based and phoneme-based approaches on a multi-dialect English task, which
once again confirms the superiority of graphemes, greatly simplifying the system
for recognizing multiple dialects.
State-of-the-art Speech Recognition With Sequence-to-Sequence Models
Attention-based encoder-decoder architectures such as Listen, Attend, and
Spell (LAS), subsume the acoustic, pronunciation and language model components
of a traditional automatic speech recognition (ASR) system into a single neural
network. In previous work, we have shown that such architectures are comparable
to state-of-the-art ASR systems on dictation tasks, but it was not clear if such
architectures would be practical for more challenging tasks such as voice
search. In this work, we explore a variety of structural and optimization
improvements to our LAS model which significantly improve performance. On the
structural side, we show that word piece models can be used instead of
graphemes. We also introduce a multi-head attention architecture, which offers
improvements over the commonly-used single-head attention. On the optimization
side, we explore synchronous training, scheduled sampling, label smoothing, and
minimum word error rate optimization, which are all shown to improve accuracy.
We present results with a unidirectional LSTM encoder for streaming
recognition. On a 12,500 hour voice search task, we find that the proposed
changes improve the WER from 9.2% to 5.6%, while the best conventional system
achieves 6.7%; on a dictation task our model achieves a WER of 4.1% compared to
5% for the conventional system.
Comment: ICASSP camera-ready version
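The WER figures quoted above are easier to compare as relative reductions. The short calculation below uses only the numbers stated in the abstract; the helper name `relative_reduction` is introduced here for illustration.

```python
def relative_reduction(baseline, improved):
    """Fractional WER reduction of `improved` relative to `baseline`."""
    return (baseline - improved) / baseline


# WER values (in percent) as quoted in the abstract.
comparisons = {
    "voice search, LAS before vs. after": (9.2, 5.6),
    "voice search, conventional vs. LAS": (6.7, 5.6),
    "dictation, conventional vs. LAS": (5.0, 4.1),
}
for name, (baseline, improved) in comparisons.items():
    print(f"{name}: {100 * relative_reduction(baseline, improved):.1f}% relative")
```

This works out to roughly a 39% relative improvement from the proposed changes, and about 16% (voice search) and 18% (dictation) relative to the conventional systems.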