39 research outputs found

    Optimizing Deep Transformers for Chinese-Thai Low-Resource Translation

    In this paper, we study the use of a deep Transformer translation model for the CCMT 2022 Chinese-Thai low-resource machine translation task. We first explore the experiment settings (including the number of BPE merge operations, dropout probability, embedding size, etc.) for the low-resource scenario with the 6-layer Transformer. Considering that increasing the number of layers also increases the regularization on new model parameters (dropout modules are also introduced when using more layers), we adopt the highest-performing setting but increase the depth of the Transformer to 24 layers to obtain improved translation quality. Our work obtains SOTA performance in Chinese-to-Thai translation in the constrained evaluation.

    cDNA sequences reveal considerable gene prediction inaccuracy in the Plasmodium falciparum genome

    Background: The completion of the Plasmodium falciparum genome represents a milestone in malaria research. The genome sequence allows for the development of genome-wide approaches such as microarray and proteomics that will greatly facilitate our understanding of the parasite biology and accelerate new drug and vaccine development. Designing and applying these genome-wide assays, however, requires accurate information on gene prediction and genome annotation. Unfortunately, the genes in the parasite genome databases were mostly identified using computer software that could make some erroneous predictions. Results: We aimed to obtain cDNA sequences to examine the accuracy of gene prediction in silico. We constructed cDNA libraries from mixed blood stages of the P. falciparum parasite using the SMART cDNA library construction technique and generated 17332 high-quality expressed sequence tags (EST), including 2198 from primer-walking experiments. Assembly of our sequence tags produced 2548 contigs and 2671 singletons, versus 5220 contigs and 5910 singletons when our EST were assembled with EST in public databases. Comparison of all the assembled EST/contigs with predicted CDS and genomic sequences in the PlasmoDB database identified 356 genes with predicted coding sequences fully covered by EST, including 85 genes (23.6%) with introns incorrectly predicted. Careful automatic software and manual alignments found an additional 308 genes that have introns different from those predicted, with 152 new introns discovered and 182 introns with sizes or locations different from those predicted. Alternatively spliced and antisense transcripts were also detected. Matching cDNA to predicted genes also revealed silent chromosomal regions, mostly at subtelomere regions. Conclusion: Our data indicated that approximately 24% of the genes in the current databases were predicted incorrectly, although some of these inaccuracies could represent alternatively spliced transcripts, and that more genes than currently predicted have one or more additional introns. It is therefore necessary to annotate the parasite genome with experimental data, although obtaining complete cDNA sequences from this parasite will be a formidable task due to the high AT content of the genome. This study provides valuable information for genome annotation that will be critical for functional analyses.

    Detection of genome-wide polymorphisms in the AT-rich Plasmodium falciparum genome using a high-density microarray

    Background: Genetic mapping is a powerful method to identify mutations that cause drug resistance and other phenotypic changes in the human malaria parasite Plasmodium falciparum. For efficient mapping of a target gene, it is often necessary to genotype a large number of polymorphic markers. Currently, a community effort is underway to collect single nucleotide polymorphisms (SNP) from the parasite genome. Here we evaluate the polymorphism detection accuracy of a high-density 'tiling' microarray with 2.56 million probes by comparing single feature polymorphism (SFP) calls from the microarray with known SNP among parasite isolates. Results: We found that probe GC content, SNP position in a probe, probe coverage, and signal ratio cutoff values were important factors for accurate detection of SFP in the parasite genome. We established a set of SFP calling parameters that could predict mSFP (SFP called by multiple overlapping probes) with high accuracy (≥ 94%) and identified 121,087 mSFP genome-wide from five parasite isolates, including 40,354 unique mSFP (excluding those from multi-gene families) and ~18,000 new mSFP, producing a genetic map with an average of one unique mSFP per 570 bp. Genomic copy number variation (CNV) among the parasites was also cataloged and compared. Conclusion: A large number of mSFP were discovered from the P. falciparum genome using a high-density microarray, most of which were in clusters of highly polymorphic genes at chromosome ends. Our method for accurate mSFP detection and the mSFP identified will greatly facilitate large-scale studies of genome variation in the P. falciparum parasite and provide useful resources for mapping important parasite traits.

    High recombination rates and hotspots in a Plasmodium falciparum genetic cross

    Using the universal P2/P8 primers, we were able to obtain the gene segments of chromo-helicase-DNA binding protein (CHD)-Z and CHD-W from ten species of ardeid birds including Chinese egret (Egretta eulophotes), little egret (E. garzetta), eastern reef egret (E. sacra), great egret (Ardea alba), grey heron (A. cinerea), Chinese pond-heron (Ardeola bacchus), cattle egret (Bubulcus ibis), black-crowned night-heron (Nycticorax nycticorax), cinnamon bittern (Ixobrychus cinnamomeus) and yellow bittern (I. sinensis). Based on conserved regions inside the P2/P8-derived sequences, we designed new PCR primers for sex identification in these ardeid species. Using agarose gel electrophoresis, the PCR products showed two bands for females (one of 140 bp derived from CHD-W and the other of 250 bp from CHD-ZW), whereas the males showed only the 250 bp band. The results indicated that our new primers could be used for accurate and convenient sex identification in ardeid species. Funded by the National Natural Science Foundation of China [30970380, 40876077] and the Fujian Natural Science Foundation of China [2008S0007, 2009J01195].

    Population Genetic Analysis of Plasmodium falciparum Parasites Using a Customized Illumina GoldenGate Genotyping Assay

    The diversity in the Plasmodium falciparum genome can be used to explore parasite population dynamics, with practical applications to malaria control. The ability to identify the geographic origin and trace the migratory patterns of parasites with clinically important phenotypes such as drug resistance is particularly relevant. With increasing single-nucleotide polymorphism (SNP) discovery from ongoing Plasmodium genome sequencing projects, there is a demand for genotyping platforms with high SNP and sample throughput for large-scale population genetic studies. Low parasitaemias and multiple clone infections present a number of challenges to genotyping P. falciparum. We addressed some of these issues using a custom 384-SNP Illumina GoldenGate assay on P. falciparum DNA from laboratory clones (long-term culture-adapted parasite clones), short-term cultured parasite isolates and clinical (non-cultured isolates) samples from East and West Africa, Southeast Asia and Oceania. Eighty percent of the SNPs (n = 306) produced reliable genotype calls on samples containing as little as 2 ng of total genomic DNA and on whole genome amplified DNA. Analysis of artificial mixtures of laboratory clones demonstrated high genotype calling specificity and moderate sensitivity to call minor frequency alleles. Clear resolution of geographically distinct populations was demonstrated using Principal Components Analysis (PCA), and global patterns of population genetic diversity were consistent with previous reports. These results validate the utility of the platform in performing population genetic studies of P. falciparum.

    Genome-Wide Compensatory Changes Accompany Drug-Selected Mutations in the Plasmodium falciparum crt Gene

    Mutations in PfCRT (Plasmodium falciparum chloroquine-resistant transporter), particularly the substitution at amino acid position 76, confer chloroquine (CQ) resistance in P. falciparum. Point mutations in the homolog of the mammalian multidrug resistance gene (pfmdr1) can also modulate the levels of CQ response. Moreover, parasites with the same pfcrt and pfmdr1 alleles exhibit a wide range of drug sensitivity, suggesting that additional genes contribute to levels of CQ resistance (CQR). Reemergence of CQ-sensitive parasites after cessation of CQ use indicates that changes in PfCRT are deleterious to the parasite. Some CQR parasites, however, persist in the field and grow well in culture, which may reflect adaptive changes in the parasite genome to compensate for the mutations in PfCRT. Using three isogenic clones that have different drug resistance profiles corresponding to unique mutations in the pfcrt gene (106/1K76, 106/176I, and 106/76I-352K), we investigated changes in gene expression in these parasites grown with and without CQ. We also conducted hybridizations of genomic DNA to identify copy number (CN) changes in parasite genes. RNA transcript levels from 45 genes were significantly altered in one or both mutants relative to the parent line, 106/1K76. Most of the up-regulated genes are involved in invasion, cell growth and development, signal transduction, and transport activities. Of particular interest are genes encoding proteins involved in transport and/or regulation of cytoplasmic or compartmental pH such as the V-type H+ pumping pyrophosphatase 2 (PfVP2), Ca2+/H+ antiporter VCX1, a putative drug transporter and CN changes in pfmdr1. These changes may represent adaptations to altered functionality of PfCRT, a predicted member of the drug/metabolite transporter superfamily found on the parasite food vacuole (FV) membrane. Further investigation of these genes may shed light on how the parasite compensates for functional changes accompanying drug resistance mutations in a gene coding for a membrane/drug transporter.

    Interpolation algorithm considering simultaneous solution and instantaneous solution for power electronics electromagnetic transient simulation

    In order to correctly simulate the simultaneous switching of power electronics circuits and to solve the problem of virtual power loss in the traditional interpolation algorithm, this article proposes an interpolation algorithm that considers both simultaneous solution and instantaneous solution. After each integration step, it searches for switching events and determines which of them are simultaneous. The simultaneous switching events are solved simultaneously. An instantaneous solution is carried out at the forced commutation switching instant. The method for calculating the historical terms in the instantaneous solution under different conditions is given, and the exact power loss of the forced commutation switch is obtained. After processing all switching events during one time-step, half-step interpolation is performed to eliminate numerical oscillations. The proposed algorithm is applied to the electromagnetic transient programme of the advanced digital power system simulator (ADPSS), and the correctness and effectiveness of the algorithm are verified by simulation tests for typical power electronics circuits. The simulation results show that the proposed algorithm has high simulation accuracy and can satisfy the requirements of power electronics simulation.
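As a rough illustration of the interpolation idea that the abstract above builds on, the sketch below shows generic linear interpolation of a switching instant inside one time-step, followed by a half-step average. It is not the algorithm proposed in the article (which additionally handles simultaneous switching and instantaneous solution); the function names `locate_switch_instant`, `interpolate_state`, and `half_step_average` are hypothetical and chosen only for this example.

```python
def locate_switch_instant(t0, i0, t1, i1):
    """Estimate when a switch current crosses zero between t0 and t1.

    Assumes the current varies linearly within the step (an assumption
    of this sketch, not a statement about the article's method).
    """
    if i0 == i1 or i0 * i1 > 0:
        raise ValueError("no zero crossing inside this time-step")
    frac = i0 / (i0 - i1)          # fraction of the step before the crossing
    return t0 + frac * (t1 - t0)


def interpolate_state(x0, x1, frac):
    """Linearly interpolate any state variable back to the switching instant."""
    return x0 + frac * (x1 - x0)


def half_step_average(x_prev, x_next):
    """Midpoint of two consecutive solutions.

    Averaging two successive points is a common way to damp the
    alternating numerical oscillation that can follow a switching event.
    """
    return 0.5 * (x_prev + x_next)


# Hypothetical example: a diode current falls from 2.0 A to -1.0 A over a 50 us step.
t_switch = locate_switch_instant(0.0, 2.0, 50e-6, -1.0)
v_at_switch = interpolate_state(10.0, 4.0, 2.0 / 3.0)
```

In this hypothetical step the crossing falls two thirds of the way through the interval, so the other state variables would be interpolated back with the same fraction before the network is re-solved with the new switch state.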

    RRBP1 overexpression is associated with progression and prognosis in endometrial endometrioid adenocarcinoma

    Background: Currently, ribosome-binding protein 1 (RRBP1) is considered to be a novel oncogene that is overexpressed in colorectal cancer, lung cancer, mammary cancer, esophageal cancer and other carcinomas. However, the relationship between RRBP1 and endometrioid-type endometrial carcinoma (EC) remains unknown. Our purpose is to explore the function of RRBP1 in endometrioid-type endometrial carcinoma. Methods: We investigated the expression of RRBP1 protein by immunohistochemistry on paraffin-embedded surgical specimens from one hundred thirty patients with endometrioid-type endometrial carcinoma. We also evaluated the differences in RRBP1 expression between endometrial cancer samples (n = 35) and normal endometrial tissues (n = 19) by western blotting. Results: RRBP1 was more highly expressed in endometrial cancer samples than in normal samples (P < 0.05). High levels of expression of RRBP1 were strongly correlated with pathological features, such as the International Federation of Gynecology and Obstetrics (FIGO) stage, histological grade, depth of myometrial invasion and lymph node metastasis (P < 0.05). Furthermore, RRBP1 expression was an independent prognostic factor for overall survival (OS) and disease-free survival (DFS) in patients with EC (both P < 0.05). Conclusion: This experiment identifies the utility of RRBP1 in predicting EC prognosis, revealing that it may be a potential therapeutic target for EC.

    CSK Controls Retinoic Acid Receptor (RAR) Signaling: a RAR-c-SRC Signaling Axis Is Required for Neuritogenic Differentiation

    Herein, we report the first evidence that c-SRC is required for retinoic acid (RA) receptor (RAR) signaling, an observation that suggests a new paradigm for this family of nuclear hormone receptors. We observed that CSK negatively regulates RAR functions required for neuritogenic differentiation. CSK overexpression inhibited RA-mediated neurite outgrowth, a result which correlated with the inhibition of the SFK c-SRC. Consistent with an extranuclear effect of CSK on RAR signaling and neurite outgrowth, CSK overexpression blocked the downstream activation of RAC1. The conversion of GDP-RAC1 to GTP-RAC1 parallels the activation of c-SRC as early as 15 min following all-trans-retinoic acid treatment in LA-N-5 cells. The cytoplasmic colocalization of c-SRC and RARγ was confirmed by immunofluorescence staining and confocal microscopy. A direct and ligand-dependent binding of RAR with SRC was observed by surface plasmon resonance, and coimmunoprecipitation studies confirmed the in vivo binding of RARγ to c-SRC. Deletion of a proline-rich domain within RARγ abrogated this interaction in vivo. CSK blocked the RAR-RA-dependent activation of SRC and neurite outgrowth in LA-N-5 cells. The results suggest that transcriptional signaling events mediated by RA-RAR are necessary but not sufficient to mediate complex differentiation in neuronal cells. We have elucidated a nongenomic extranuclear signal mediated by the RAR-SRC interaction that is negatively regulated by CSK and is required for RA-induced neuronal differentiation.