    Automatic Translating Between Ancient Chinese and Contemporary Chinese with Limited Aligned Corpora

    The Chinese language has evolved considerably over its long history, so native speakers today have trouble reading sentences written in ancient Chinese. In this paper, we propose an end-to-end neural model that automatically translates between ancient and contemporary Chinese. However, most existing ancient-contemporary Chinese parallel corpora are not aligned at the sentence level, and sentence-aligned corpora are limited, which makes the model difficult to train. To build sentence-level parallel training data, we propose an unsupervised algorithm that constructs sentence-aligned ancient-contemporary pairs by exploiting the fact that an aligned sentence pair shares many tokens. Based on the aligned corpus, we propose an end-to-end neural model with a copying mechanism and local attention to translate between ancient and contemporary Chinese. Experiments show that the proposed unsupervised algorithm achieves a 99.4% F1 score for sentence alignment, and that the translation model achieves 26.95 BLEU from ancient to contemporary and 36.34 BLEU from contemporary to ancient. Comment: Accepted by NLPCC 201
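
    The idea of aligning sentences by shared tokens can be illustrated with a minimal sketch. This is not the paper's algorithm, which the abstract does not specify; it assumes character-level tokens, a Dice-style overlap score, a monotonic one-to-one alignment found by dynamic programming, and a hypothetical `threshold` parameter. The function names `overlap_score` and `align` are made up for illustration.

```python
from collections import Counter


def overlap_score(a: str, b: str) -> float:
    """Dice-style overlap between the character multisets of a and b, in [0, 1]."""
    shared = sum((Counter(a) & Counter(b)).values())
    total = len(a) + len(b)
    return 2 * shared / total if total else 0.0


def align(ancient, modern, threshold=0.3):
    """Return index pairs (i, j) of a monotonic one-to-one alignment.

    Assumption: both sentence lists are in the same order, and a pair is
    only accepted when its overlap score reaches `threshold`.
    """
    n, m = len(ancient), len(modern)
    # best[i][j] = best total score using the first i ancient and j modern sentences
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = overlap_score(ancient[i - 1], modern[j - 1])
            match = best[i - 1][j - 1] + (s if s >= threshold else 0.0)
            best[i][j] = max(match, best[i - 1][j], best[i][j - 1])

    # Walk back through the table to recover the chosen pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        s = overlap_score(ancient[i - 1], modern[j - 1])
        if s >= threshold and best[i][j] == best[i - 1][j - 1] + s:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif best[i][j] == best[i - 1][j]:
            i -= 1  # skip an ancient sentence with no good partner
        else:
            j -= 1  # skip a modern sentence with no good partner
    return pairs[::-1]
```

    Calling `align(ancient_sentences, modern_sentences)` on two lists of sentences from the same passage returns the index pairs judged to be translations of each other; sentences without a sufficiently overlapping partner are left unaligned.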

    Functional divergence of the rapidly evolving miR-513 subfamily in primates

    BACKGROUND: The miR-513 subfamily belongs to an X-linked, primate-specific miR506-514 cluster. Across primate species there have been several duplication events, and different species each possess a variety of miR-513 copies, indicating that the subfamily underwent rapid evolution. Evidence suggests that this subfamily is preferentially expressed in the testis, but the evolutionary history and functional significance of this miRNA subfamily have so far remained largely unexplored. RESULTS: We analyzed the evolutionary pattern of gene duplications and their functional consequences for the miR-513 subfamily in primates. Sequence comparisons showed that the duplicated copies of miR-513 were derived from a transposable element (MER91C). Moreover, duplication events of the miR-513 subfamily seem to have occurred independently in Platyrrhini (New World monkeys) and Catarrhini (Old World monkeys, apes and humans) after they diverged. Different copies of the miR-513 subfamily (miR-513a/b/c) have different seed sequences owing to sequence divergence after duplication, which eventually led to functional divergence. Functional assays indicated that miR-513b could inhibit the expression of its target gene, the down-regulator of transcription 1 (DR1), at both the mRNA and protein levels. In the developing testis of rhesus macaques, we observed a temporal coupling of expression levels between miR-513b and DR1, suggesting that miR-513b could affect male sexual maturation by negatively regulating the development-stage-related functioning of DR1. CONCLUSIONS: The miR-513 subfamily underwent multiple independent gene duplications among five different lineages of primates. Sequence divergence among the different copies of miR-513 after duplication led to functional divergence of these copies in primates.