173 research outputs found
TRAINING OF INTERPRETERS AND TRANSLATORS IN CHINA: CURRENT STATE, CHALLENGES AND SOLUTIONS
Based on the Guideline for the MTI Training Program established by the China National Committee for MTI (Master of Translation and Interpreting) Education, this research surveys the state of master's-level education of legal translators and interpreters from 2014 to 2016, tracing the changes and problems revealed at the five major universities of political science and law in China. By comparing and analyzing the information collected on training targets, curriculum design, teaching staff, platform construction, practical training, and graduate employment at those five universities, the authors identify the distinct advantages and features of the five law schools and examine the problems and difficulties they share. Drawing on the surveys and interviews conducted in recent years, the authors put forward solutions and suggestions for the improvement and future development of Chinese MTI education.
Improving BERT with Hybrid Pooling Network and Drop Mask
Transformer-based pre-trained language models, such as BERT, achieve great
success in various natural language understanding tasks. Prior research found
that BERT captures a rich hierarchy of linguistic information at different
layers. However, vanilla BERT uses the same self-attention mechanism in
each layer to model different contextual features. In this paper, we
propose a HybridBERT model which combines self-attention and pooling networks
to encode different contextual features in each layer. Additionally, we propose
a simple DropMask method to address the mismatch between pre-training and
fine-tuning caused by excessive use of special mask tokens during Masked
Language Modeling pre-training. Experiments show that HybridBERT outperforms
BERT in pre-training with lower loss, faster training speed (8% relative),
lower memory cost (13% relative), and in transfer learning with 1.5% relatively
higher accuracy on downstream tasks. Additionally, DropMask improves the
accuracy of BERT on downstream tasks across various masking rates.
Comment: 7 pages, 2 figures
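As an illustration of the kind of hybrid layer described above, the sketch below shows a Transformer-style block whose token-mixing step is a pooling network rather than self-attention. The hidden size, pooling window, and the way HybridBERT actually combines the two branches are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch (not the HybridBERT code): a Transformer-style block that mixes
# context with local average pooling instead of self-attention.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PoolingMixerBlock(nn.Module):
    def __init__(self, hidden: int = 768, window: int = 3, ffn_mult: int = 4):
        super().__init__()
        self.window = window
        self.norm1 = nn.LayerNorm(hidden)
        self.norm2 = nn.LayerNorm(hidden)
        self.ffn = nn.Sequential(
            nn.Linear(hidden, ffn_mult * hidden),
            nn.GELU(),
            nn.Linear(ffn_mult * hidden, hidden),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, hidden)
        h = self.norm1(x)
        # Local average pooling over neighbouring tokens replaces self-attention
        # as a cheaper context-mixing operation.
        pooled = F.avg_pool1d(
            h.transpose(1, 2), kernel_size=self.window,
            stride=1, padding=self.window // 2,
        ).transpose(1, 2)
        x = x + pooled                   # residual connection around token mixing
        x = x + self.ffn(self.norm2(x))  # position-wise feed-forward
        return x

x = torch.randn(2, 16, 768)
print(PoolingMixerBlock()(x).shape)  # torch.Size([2, 16, 768])
```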
Unbiased Delayed Feedback Label Correction for Conversion Rate Prediction
Conversion rate prediction is critical to many online applications such as
digital display advertising. To capture dynamic data distribution, industrial
systems often require retraining models on recent data daily or weekly.
However, delayed conversion behavior often leads to incorrect labeling, which is
known as the delayed feedback problem. Existing work may fail to introduce
the correct information about false negative samples due to data sparsity and
dynamic data distribution. To directly introduce the correct feedback label
information, we propose an Unbiased delayed feedback Label Correction framework
(ULC), which uses an auxiliary model to correct labels for observed negative
feedback samples. Firstly, we theoretically prove that the label-corrected loss
is an unbiased estimate of the oracle loss computed with true labels. Then, as
no ready-made training data exist for label correction, counterfactual labeling is
used to construct artificial training data. Furthermore, since counterfactual
labeling utilizes only partial training data, we design an embedding-based
alternative training method to enhance performance. Comparative experiments on
both public and private datasets and detailed analyses show that our proposed
approach effectively alleviates the delayed feedback problem and consistently
outperforms the previous state-of-the-art methods.
Comment: accepted by KDD 202
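A minimal sketch of the label-correction idea described above, under the assumption that an auxiliary model supplies, for each observed negative, the probability of a delayed conversion, which is then used as a soft target. The function and tensor names are illustrative and not the ULC implementation.

```python
# Hypothetical label-corrected CVR loss: observed positives keep their hard label,
# observed negatives get a soft label from an auxiliary correction model.
import torch
import torch.nn.functional as F

def label_corrected_bce(cvr_logits: torch.Tensor,
                        observed_labels: torch.Tensor,
                        aux_pos_prob: torch.Tensor) -> torch.Tensor:
    """cvr_logits: raw CVR model scores, shape (batch,)
    observed_labels: 1 for observed conversions, 0 for (possibly delayed) negatives
    aux_pos_prob: auxiliary model's estimate that an observed negative converts later
    """
    corrected = torch.where(observed_labels > 0.5,
                            observed_labels,   # keep observed positives
                            aux_pos_prob)      # soft-correct observed negatives
    return F.binary_cross_entropy_with_logits(cvr_logits, corrected)

logits = torch.randn(4)
labels = torch.tensor([1., 0., 0., 1.])
aux = torch.tensor([0.0, 0.3, 0.05, 0.0])
print(label_corrected_bce(logits, labels, aux))
```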
Searching for the earliest use of limestone as a flux in Chinese high-fired ceramic glazes—evidence from Sr isotopic analysis of Chinese northern porcelain
Samples of northern porcelain wares dating to between the 6th and 13th centuries, from the three most important northern Chinese ceramic kiln sites, Gongyi, Xing and Ding, have been studied in this work. The Sr isotope and chemical compositions of the ceramic glazes of these wares have been determined, and based on these results we have been able to suggest the raw materials used to make the glazes. Using strontium isotopic analysis, we show that the earliest use of limestone as a glaze flux identified so far dates to the period from the Sui to the mid-Tang Dynasty (late 6th to early 9th century), when it was used to produce white slip-glazed ware in the Xing kilns, so it may have been 'invented' there.
Ditto: A Simple and Efficient Approach to Improve Sentence Embeddings
Prior studies diagnose the anisotropy problem in sentence representations
from pre-trained language models, e.g., BERT, without fine-tuning. Our analysis
reveals that the sentence embeddings from BERT suffer from a bias towards
uninformative words, limiting the performance in semantic textual similarity
(STS) tasks. To address this bias, we propose a simple and efficient
unsupervised approach, Diagonal Attention Pooling (Ditto), which weights words
with model-based importance estimations and computes the weighted average of
word representations from pre-trained models as sentence embeddings. Ditto can
be easily applied to any pre-trained language model as a postprocessing
operation. Compared to prior sentence embedding approaches, Ditto adds no
parameters and requires no learning. Empirical evaluations demonstrate that
our proposed Ditto can alleviate the anisotropy problem and improve various
pre-trained models on STS tasks.
Comment: 8 pages, accepted by EMNLP 2023 as a short paper; the source code can be
found at https://github.com/alibaba-damo-academy/SpokenNLP/tree/main/ditt
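A minimal sketch of diagonal-attention-style pooling with Hugging Face Transformers, assuming the diagonal of one self-attention map serves as the word-importance weight. The particular layer and head below are arbitrary choices for illustration, not the configuration reported in the paper.

```python
# Sketch: weight each token's hidden state by the diagonal of a self-attention
# map and average, producing a sentence embedding without any training.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

def diagonal_pooling_embedding(text: str, layer: int = 1, head: int = 10) -> torch.Tensor:
    enc = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc)
    hidden = out.last_hidden_state[0]          # (seq, hidden)
    attn = out.attentions[layer][0, head]      # (seq, seq) attention map
    weights = attn.diagonal()                  # token-to-itself attention as importance
    weights = weights * enc["attention_mask"][0]
    weights = weights / weights.sum()
    return (weights.unsqueeze(-1) * hidden).sum(dim=0)  # weighted average -> (hidden,)

emb = diagonal_pooling_embedding("A simple sentence to embed.")
print(emb.shape)  # torch.Size([768])
```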
Loss Masking Is Not Needed in Decoder-only Transformer for Discrete-token-based ASR
Recently, unified speech-text models, such as SpeechGPT, VioLA, and
AudioPaLM, have achieved remarkable performance on various speech tasks. These
models discretize speech signals into tokens (speech discretization) and use a
shared vocabulary for both text and speech tokens. Then they train a single
decoder-only Transformer on a mixture of speech tasks. However, these models
rely on the Loss Masking strategy for the ASR task, which ignores the
dependency among speech tokens. In this paper, we propose to model speech
tokens in an autoregressive way, similar to text. We find that applying the
conventional cross-entropy loss on input speech tokens does not consistently
improve ASR performance over the Loss Masking approach. To address this
issue, we propose a novel approach, termed Smoothed Label Distillation (SLD),
which applies a KL divergence loss with smoothed labels on speech tokens. Our
experiments show that SLD effectively models speech tokens and outperforms Loss
Masking for decoder-only Transformers in ASR tasks with different speech
discretization methods. The source code can be found here:
https://github.com/alibaba-damo-academy/SpokenNLP/tree/main/sld
Comment: 5 pages, accepted by ICASSP 202
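A minimal sketch of a smoothed-label KL loss on speech-token positions, assuming a one-hot target softened by a smoothing factor over the speech-token vocabulary; the smoothing value and codebook size below are illustrative and not taken from the paper.

```python
# Sketch: KL divergence between a label-smoothed target distribution and the
# model's predicted distribution at speech-token positions.
import torch
import torch.nn.functional as F

def smoothed_label_kl(logits: torch.Tensor, targets: torch.Tensor,
                      vocab_size: int, eps: float = 0.1) -> torch.Tensor:
    """logits: (num_speech_tokens, vocab_size) predictions at speech positions.
    targets: (num_speech_tokens,) ground-truth speech token ids."""
    # Smoothed one-hot target: 1 - eps on the true token, eps spread elsewhere.
    smoothed = torch.full_like(logits, eps / (vocab_size - 1))
    smoothed.scatter_(1, targets.unsqueeze(1), 1.0 - eps)
    log_probs = F.log_softmax(logits, dim=-1)
    # KL(smoothed || model), averaged over positions.
    return F.kl_div(log_probs, smoothed, reduction="batchmean")

logits = torch.randn(8, 1024)            # 8 speech-token positions, 1024-entry codebook
targets = torch.randint(0, 1024, (8,))
print(smoothed_label_kl(logits, targets, vocab_size=1024))
```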