Relative Positional Encoding for Speech Recognition and Direct Translation
Transformer models are powerful sequence-to-sequence architectures that are
capable of directly mapping speech inputs to transcriptions or translations.
However, the mechanism for modeling positions in this model was tailored for
text modeling, and thus is less ideal for acoustic inputs. In this work, we
adapt the relative position encoding scheme to the Speech Transformer, where
the key addition is relative distance between input states in the
self-attention network. As a result, the network can better adapt to the
variable distributions present in speech data. Our experiments show that our
resulting model achieves the best recognition result on the Switchboard
benchmark in the non-augmentation condition, and the best published result in
the MuST-C speech translation benchmark. We also show that this model is able
to better utilize synthetic data than the Transformer, and adapts better to
variable sentence segmentation quality for speech translation.
Comment: Submitted to Interspeech 202
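The key addition described above, a relative distance term inside self-attention, can be illustrated with a minimal sketch in the style of Shaw et al.'s relative position encoding. This is a hedged illustration with toy random weights, not the paper's exact Speech Transformer implementation; all names and sizes here are assumptions for the example.

```python
# Minimal sketch of self-attention with relative position encoding
# (Shaw et al. style): each attention logit adds a term coupling the
# query with an embedding of the clipped relative distance j - i.
# NOT the paper's exact implementation; weights are random toy values.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def relative_self_attention(x, Wq, Wk, Wv, rel_emb, max_dist):
    """x: (T, d) input states; rel_emb: (2*max_dist+1, d) relative
    position embeddings indexed by the clipped distance j - i."""
    T, d = x.shape
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    # content-content term: q_i . k_j
    logits = q @ k.T
    # content-position term: q_i . r_{clip(j - i)}
    idx = np.clip(np.arange(T)[None, :] - np.arange(T)[:, None],
                  -max_dist, max_dist) + max_dist      # (T, T) index grid
    logits = logits + np.einsum('id,ijd->ij', q, rel_emb[idx])
    attn = softmax(logits / np.sqrt(d), axis=-1)
    return attn @ v

# Toy usage: 6 input frames, model dimension 8, distances clipped at 3.
rng = np.random.default_rng(0)
T, d, max_dist = 6, 8, 3
x = rng.standard_normal((T, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
rel = rng.standard_normal((2 * max_dist + 1, d)) * 0.1
out = relative_self_attention(x, Wq, Wk, Wv, rel, max_dist)
print(out.shape)  # (6, 8)
```

Because the logits depend on distances j - i rather than absolute indices, the same learned embeddings apply at every position, which is what lets the model adapt to the variable-length patterns of acoustic input.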
KIT’s IWSLT 2020 SLT Translation System
This paper describes KIT’s submissions to the IWSLT 2020 Speech Translation evaluation campaign. We first participate in the simultaneous translation task, where our Transformer-based simultaneous models can be trained efficiently to obtain low latency with minimal compromise in quality. On the offline speech translation task, we applied our new Speech Transformer architecture to end-to-end speech translation. The resulting model provides translation quality competitive with a complicated cascade. The latter still has the upper hand, thanks to its ability to transparently access the transcription and resegment the inputs to avoid fragmentation.
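A common way to realize the low-latency decoding described above is a wait-k policy: the decoder emits one target token per incoming source token after an initial delay of k tokens. The sketch below is a generic illustration of that policy, not necessarily the submission's exact approach; `translate_prefix` is a hypothetical stand-in for a real incremental NMT decoder.

```python
# Generic wait-k simultaneous decoding loop (hedged sketch; not
# necessarily KIT's exact policy). Smaller k lowers latency at the
# cost of translating from shorter source prefixes.
def wait_k_decode(source_stream, translate_prefix, k=3):
    """source_stream: iterable of source tokens arriving over time.
    translate_prefix(src_prefix, tgt_prefix) -> next target token,
    or None when the model decides to stop (hypothetical interface)."""
    src, tgt = [], []
    for token in source_stream:
        src.append(token)
        if len(src) >= k:          # wait for k source tokens, then
            nxt = translate_prefix(src, tgt)  # emit one token per step
            if nxt is not None:
                tgt.append(nxt)
    # flush: finish decoding against the complete source
    while (nxt := translate_prefix(src, tgt)) is not None:
        tgt.append(nxt)
    return tgt
```

With a toy "copy and uppercase" model that stops once the target matches the source length, `wait_k_decode(["a", "b", "c", "d"], toy, k=3)` yields `["A", "B", "C", "D"]`: the first two source tokens produce no output (the wait), and the remaining tokens are emitted one per step plus a final flush.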
KIT’s IWSLT 2021 Offline Speech Translation System
This paper describes KIT’s submission to the IWSLT 2021 Offline Speech Translation Task. We describe systems for both the cascaded condition and the end-to-end condition. In the cascaded condition, we investigated different end-to-end architectures for the speech recognition module. For the text segmentation module, we trained a small Transformer-based model on high-quality monolingual data. For the translation module, we reused last year’s neural machine translation model. In the end-to-end condition, we improved our Speech Relative Transformer architecture to reach or even surpass the results of the cascaded system.
Functional-Antioxidant Food
Nowadays, people face many different dangers, such as stress, unsafe food, and environmental pollution, but not everyone suffers from them. Free radicals are among the biggest threats to human health because they contribute to over 80 different diseases, including aging. Free radicals can only be eliminated or minimized with antioxidant foods or antioxidants. This chapter on functional-antioxidant food presents the concept of antioxidant functional food and the classification, structure, and extraction processes of antioxidant ingredients. Various antioxidant substances are also presented, such as proteins (collagen), polysaccharides (fucoidans, alginates, glucosamines, inulins, laminarins, ulvans, and pectins), and secondary metabolites (polyphenols such as phlorotannins and lignins, alkaloids, and flavonoids). The production technology, mechanisms, opportunities, and challenges of antioxidant functional food are also discussed, together with the production processes for functional-antioxidant food in forms including capsules, tablets, tubes, pills, powders, and effervescent tablets.
The IWSLT 2019 KIT Speech Translation System
This paper describes KIT’s submission to the IWSLT 2019 Speech Translation task on two sub-tasks corresponding to two different datasets. We investigate different end-to-end architectures for the speech recognition module, including our new Transformer-based architectures. Overall, the modules in our pipeline are based on the Transformer architecture, which has recently achieved great results in various fields. In our systems, using the Transformer is also advantageous compared to traditional hybrid systems in terms of simplicity, while still yielding competitive results.