
    Relative Positional Encoding for Speech Recognition and Direct Translation

    Transformer models are powerful sequence-to-sequence architectures that can directly map speech inputs to transcriptions or translations. However, the mechanism for modeling positions in this model was tailored for text, and is thus less suited to acoustic inputs. In this work, we adapt the relative position encoding scheme to the Speech Transformer, where the key addition is the relative distance between input states in the self-attention network. As a result, the network can better adapt to the variable distributions present in speech data. Our experiments show that the resulting model achieves the best recognition result on the Switchboard benchmark in the non-augmentation condition, and the best published result on the MuST-C speech translation benchmark. We also show that this model is able to better utilize synthetic data than the Transformer, and adapts better to variable sentence segmentation quality for speech translation. (Comment: Submitted to Interspeech 2020.)
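    The key mechanism named in this abstract, relative position encoding inside self-attention, can be illustrated with a short sketch. The snippet below adds one learned bias per clipped relative distance to the attention logits; this is a minimal single-head illustration of the general idea under assumed shapes and names, not the paper's exact formulation.

    ```python
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class RelativeSelfAttention(nn.Module):
        """Single-head self-attention with a learned relative-position bias.

        Illustrative sketch only: it adds a scalar bias per clipped relative
        distance to the attention logits, in the spirit of relative position
        encodings; the paper's exact scheme may differ.
        """

        def __init__(self, d_model: int, max_dist: int = 128):
            super().__init__()
            self.q = nn.Linear(d_model, d_model)
            self.k = nn.Linear(d_model, d_model)
            self.v = nn.Linear(d_model, d_model)
            self.max_dist = max_dist
            # One learned bias per relative distance in [-max_dist, max_dist].
            self.rel_bias = nn.Parameter(torch.zeros(2 * max_dist + 1))
            self.scale = d_model ** -0.5

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (batch, time, d_model)
            t = x.size(1)
            logits = torch.einsum("btd,bsd->bts", self.q(x), self.k(x)) * self.scale
            # Relative-distance matrix, clipped so distant frames share a bias.
            pos = torch.arange(t, device=x.device)
            rel = (pos[None, :] - pos[:, None]).clamp(-self.max_dist, self.max_dist)
            logits = logits + self.rel_bias[rel + self.max_dist]
            attn = F.softmax(logits, dim=-1)
            return torch.einsum("bts,bsd->btd", attn, self.v(x))
    ```

    Clipping the distance means frames far apart share parameters, which is one reason relative schemes cope better with the highly variable sequence lengths of acoustic input than absolute positions do.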

    KIT’s IWSLT 2020 SLT Translation System

    This paper describes KIT’s submissions to the IWSLT 2020 Speech Translation evaluation campaign. We first participate in the simultaneous translation task, in which our simultaneous models are Transformer-based and can be efficiently trained to obtain low latency with minimal compromise in quality. On the offline speech translation task, we applied our new Speech Transformer architecture to end-to-end speech translation. The resulting model provides translation quality competitive with a complicated cascade. The latter still has the upper hand, thanks to its ability to transparently access the transcription and to resegment the inputs to avoid fragmentation.
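    For context on the latency/quality trade-off mentioned above: a common policy for simultaneous translation is wait-k decoding (Ma et al., 2019), sketched below. This is a generic illustration, not necessarily the policy of the KIT submission, and encoder_step / decoder_step are hypothetical callables standing in for unspecified model components.

    ```python
    def wait_k_decode(encoder_step, decoder_step, source_stream, k=3, max_len=100):
        """Sketch of a wait-k simultaneous decoding loop.

        Read k source steps before emitting the first target token, then
        alternate one read with one write; once the source is exhausted,
        finish decoding on the full input.
        """
        source_states, target = [], []
        for i, frame in enumerate(source_stream):
            source_states.append(encoder_step(frame, source_states))
            if i + 1 >= k and len(target) < max_len:
                token = decoder_step(source_states, target)
                if token == "<eos>":
                    return target
                target.append(token)
        while len(target) < max_len:
            token = decoder_step(source_states, target)
            if token == "<eos>":
                break
            target.append(token)
        return target
    ```

    Larger k lowers latency pressure on the model (more context before each emission) at the cost of a longer delay, which is the trade-off the abstract alludes to.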

    KIT’s IWSLT 2021 Offline Speech Translation System

    This paper describes KIT’s submission to the IWSLT 2021 Offline Speech Translation Task. We describe systems for both the cascaded and the end-to-end conditions. In the cascaded condition, we investigated different end-to-end architectures for the speech recognition module. For the text segmentation module, we trained a small Transformer-based model on high-quality monolingual data. For the translation module, we reused last year’s neural machine translation model. In the end-to-end condition, we improved our Speech Relative Transformer architecture to reach, or even surpass, the results of the cascaded system.
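    The structural difference between the two conditions can be summarized in a short sketch. All module interfaces below (asr_model, segmenter, mt_model, speech_translator) are hypothetical stand-ins; the abstract does not specify the actual APIs.

    ```python
    def cascade_translate(audio, asr_model, segmenter, mt_model):
        """Cascaded condition: ASR -> text segmentation -> MT."""
        transcript = asr_model.transcribe(audio)          # end-to-end ASR module
        segments = segmenter.resegment(transcript)        # small Transformer segmenter
        return [mt_model.translate(s) for s in segments]  # reused NMT model

    def end_to_end_translate(audio, speech_translator):
        """End-to-end condition: a single speech-to-translation model."""
        return speech_translator.translate(audio)
    ```

    The cascade exposes an intermediate transcript that can be resegmented before translation, which is exactly the advantage the IWSLT 2020 abstract above attributes to cascaded systems.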

    Functional-Antioxidant Food

    Nowadays, people face many different dangers, such as stress, unsafe food, and environmental pollution, though not everyone suffers from them. Free radicals are among the biggest threats to human health, since they are implicated in over 80 different diseases, including aging. Free radicals can only be eliminated or minimized with antioxidant foods or antioxidants. This chapter on functional-antioxidant food presents the concept of antioxidant functional food, together with the classification, structure, and extraction processes of antioxidant ingredients. Various antioxidant substances are presented, such as proteins (collagen), polysaccharides (fucoidans, alginates, glucosamines, inulins, laminarins, ulvans, and pectins), and secondary metabolites (polyphenols such as phlorotannins and lignins, alkaloids, and flavonoids). The production technology, mechanisms, opportunities, and challenges of antioxidant functional food are also discussed, along with the production processes for functional-antioxidant food in the form of capsules, tablets, tubes, pills, powders, and effervescent tablets.

    The IWSLT 2019 KIT Speech Translation System

    This paper describes KIT’s submission to the IWSLT 2019 Speech Translation task on two sub-tasks corresponding to two different datasets. We investigate different end-to-end architectures for the speech recognition module, including our new Transformer-based architectures. Overall, the modules in our pipeline are based on the Transformer architecture, which has recently achieved great results in various fields. In our systems, using the Transformer is also advantageous compared to traditional hybrid systems in terms of simplicity, while still achieving competitive results.