992 research outputs found
Kosp2e: Korean Speech to English Translation Corpus
Most speech-to-text (S2T) translation studies use English speech as a source,
which makes it difficult for non-English speakers to take advantage of the S2T
technologies. For some languages, this problem has been tackled through corpus
construction, but the farther a language is linguistically from English, or the
more under-resourced it is, the more significant this deficiency and
underrepresentation become. In this paper, we introduce kosp2e (read as `kospi'), a corpus
that allows Korean speech to be translated into English text in an end-to-end
manner. We adopt open-license speech recognition, translation, and spoken
language corpora to make our dataset freely available to the public, and
evaluate performance with both pipeline and training-based approaches.
Using pipeline and various end-to-end schemes, we obtain highest BLEU scores of
21.3 and 18.0, respectively, based on the English hypotheses, validating the
feasibility of our data. We plan to supplement annotations for other target
languages through community contributions in the future.
Comment: Interspeech 2021 Camera-read
Investigating an Effective Character-level Embedding in Korean Sentence Classification
conference pape
Fully Unsupervised Training of Few-shot Keyword Spotting
For training a few-shot keyword spotting (FS-KWS) model, a large labeled
dataset containing a large number of target keywords has been known to be essential to
generalize to arbitrary target keywords with only a few enrollment samples. To
alleviate the expense of data collection and labeling, in this paper, we
propose a novel FS-KWS system trained only on synthetic data. The proposed
system is based on metric learning enabling target keywords to be detected
using distance metrics. Exploiting the speech synthesis model that generates
speech with pseudo phonemes instead of texts, we easily obtain a large
collection of multi-view samples with the same semantics. These samples are
sufficient for training, considering metric learning does not intrinsically
necessitate labeled data. None of the components in our framework requires
any supervision, making our method unsupervised. Experimental results on real
datasets show that our proposed method is competitive even without any labeled
or real datasets.
Comment: Accepted by IEEE SLT 202
EM-Network: Oracle Guided Self-distillation for Sequence Learning
We introduce EM-Network, a novel self-distillation approach that effectively
leverages target information for supervised sequence-to-sequence (seq2seq)
learning. In contrast to conventional methods, it is trained with oracle
guidance, which is derived from the target sequence. Since the oracle guidance
compactly represents the target-side context that can assist the sequence model
in solving the task, the EM-Network achieves a better prediction compared to
using only the source input. To allow the sequence model to inherit the
promising capability of the EM-Network, we propose a new self-distillation
strategy, where the original sequence model can benefit from the knowledge of
the EM-Network in a one-stage manner. We conduct comprehensive experiments on
two types of seq2seq models: connectionist temporal classification (CTC) for
speech recognition and attention-based encoder-decoder (AED) for machine
translation. Experimental results demonstrate that the EM-Network significantly
advances the current state-of-the-art approaches, improving over the best prior
work on speech recognition and establishing state-of-the-art performance on
WMT'14 and IWSLT'14.
Comment: ICML 202
Proteomic and biochemical analyses reveal the activation of unfolded protein response, ERK-1/2 and ribosomal protein S6 signaling in experimental autoimmune myocarditis rat model
Background: To investigate the molecular and cellular pathogenesis underlying myocarditis, we used an experimental autoimmune myocarditis (EAM)-induced heart failure rat model that represents T cell mediated postinflammatory heart disorders.
Results: By performing unbiased 2-dimensional electrophoresis of protein extracts from control rat heart tissues and EAM rat heart tissues, followed by nano-HPLC-ESI-QIT-MS, 67 proteins were identified from 71 spots that exhibited significantly altered expression levels. The majority of up-regulated proteins were confidently associated with unfolded protein responses (UPR), while the majority of down-regulated proteins were involved with the generation of precursor metabolites and energy metabolism in mitochondria. Although there was no difference in AKT signaling between EAM rat heart tissues and control rat heart tissues, the amounts and activities of extracellular signal-regulated kinase (ERK)-1/2 and ribosomal protein S6 (rpS6) were significantly increased. By comparing our data with the previously reported myocardial proteome of the Coxsackie viruses of group B (CVB)-mediated myocarditis model, we found that UPR-related proteins were commonly up-regulated in two murine myocarditis models. Even though only two out of 29 down-regulated proteins in EAM rat heart tissues were also dysregulated in CVB-infected rat heart tissues, other proteins known to be involved with the generation of precursor metabolites and energy metabolism in mitochondria were also dysregulated in CVB-mediated myocarditis rat heart tissues, suggesting that impairment of mitochondrial functions may be a common underlying mechanism of the two murine myocarditis models.
Conclusions: UPR, ERK-1/2 and S6RP signaling were activated in both EAM- and CVB-induced myocarditis murine models. Thus, the conserved components of signaling pathways in two murine models of acute myocarditis could be targets for developing new therapeutic drugs or methods aimed at treating enigmatic myocarditis.
Biological Toxicity and Inflammatory Response of Semi-Single-Walled Carbon Nanotubes
Toxicological studies on carbon nanotubes (CNTs) are urgently needed given the emerging diverse applications of CNTs. Physicochemical properties of CNTs acquired during manufacturing, such as shape, diameter, conductance, surface charge and surface chemistry, play a key role in their toxicity. In this study, we separated the semi-conductive components of SWCNTs (semi-SWCNTs) and evaluated their toxicity on days 1, 7, 14 and 28 after intratracheal instillation in order to determine the role of conductance. Exposure to semi-SWCNTs significantly increased the growth of mice and significantly decreased the relative ratio of brain weight to body weight. Recruitment of monocytes into the bloodstream increased in a time-dependent manner, and significant hematological changes were observed 28 days after exposure. In the bronchoalveolar lavage (BAL) fluid, secretion of Th2-type cytokines, particularly IL-10, was more predominant than that of Th1-type cytokines, and expression of regulated on activation normal T cell expressed and secreted (RANTES), p53, transforming growth factor (TGF)-β, and inducible nitric oxide synthase (iNOS) increased in a time-dependent manner. Fibrotic histopathological changes peaked on day 7 and decreased 14 days after exposure. Expression of cyclooxygenase-2 (COX-2), mesothelin, and phosphorylated signal transducer and activator of transcription 3 (pSTAT3) also peaked on day 7, while that of TGF-β peaked on days 7 and 14. Secretion of histamine in BAL fluid decreased in a time-dependent manner. Consequently, we suggest that the brain is the target organ of semi-SWCNTs brought into the lung, and that conductance as well as length may be critical factors affecting the intensity and duration of the inflammatory response following SWCNT exposure.