6,429 research outputs found

    The key role for local base order in the generation of multiple forms of China HIV-1 B'/C intersubtype recombinants

    BACKGROUND: HIV-1 is a retrovirus with a high rate of recombination. A growing number of in vitro studies indicate that local RNA hairpin structure is associated with recombination by favoring reverse transcriptase (RT) pausing and promoting strand transfer. A method that estimates the potential to form stem-loop structure by calculating the folding of randomized sequence difference (FORS-D) has been used to investigate the relationship between secondary structure and evolutionary pressure in some genomes. It showed that gene regions under strong positive "Darwinian" selection were associated with positive FORS-D values. In the present study, the sequences of HIV-1 subtypes B' and C, both of which represent the parent strains of CRF07_BC, CRF08_BC and China URFs, were selected to investigate the relationship between natural recombination and secondary structure by calculating the FORS-D values. RESULTS: A region of markedly higher negative FORS-D values appeared in the gag-pol gene region (nucleotides 0–3000) of HIV-1 subtypes B' and C. Thirteen (86.7%) of 15 mosaic fragments and 17 (81%) of 21 recombination breakpoints occurred in this higher negative FORS-D region. This strongly suggested that natural recombination did not occur randomly throughout the HIV genome, and that there might be preferred (or hot) regions or sites for recombination. The FORS-D analysis of breakpoints showed that most breakpoints of recombinants were located in regions with higher negative FORS-D values (P = 0.0053), and that these regions had a higher negative average FORS-D value than the whole genome (P = 0.0007). The regression analysis also indicated that FORS-D values correlated negatively with breakpoint overlap. CONCLUSION: High negative FORS-D values represent high, base order-determined stem-loop potentials and influence mainly the formation of stem-loop structures. Therefore, the present results suggested for the first time that the occurrence of natural recombination was associated with high base order-determined stem-loop potential, and that local base order might play a key role in the initiation of natural recombination by favoring the formation of stable stem-loop structures.
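
    The FORS-D calculation described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the usual definition in which FORS-D is the folding energy of the natural sequence (FONS) minus the mean folding energy of shuffled copies of it (FORS-M), so that a negative value means base order itself contributes to stem-loop potential. The names `fors_d` and `toy_energy` are hypothetical, and `toy_energy` is only a stand-in that counts fold-back base pairs; a real analysis would plug in a thermodynamic folding program.

```python
import random
from typing import Callable

# Watson-Crick pairs used by the toy energy model (an assumption for illustration).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def toy_energy(seq: str) -> float:
    """Hypothetical stand-in for a folding energy: fold the sequence back on
    itself at its midpoint and score -1 per complementary pair."""
    n = len(seq)
    return -float(sum((seq[i], seq[n - 1 - i]) in PAIRS for i in range(n // 2)))


def fors_d(seq: str, fold_energy: Callable[[str], float],
           n_shuffles: int = 10, seed: int = 0) -> float:
    """Return FONS - FORS-M for one sequence window.

    FONS   : energy of the natural sequence.
    FORS-M : mean energy over shuffled sequences with the same base composition.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    fons = fold_energy(seq)
    total = 0.0
    for _ in range(n_shuffles):
        bases = list(seq)
        rng.shuffle(bases)  # destroys base order, keeps base composition
        total += fold_energy("".join(bases))
    fors_m = total / n_shuffles
    return fons - fors_m


if __name__ == "__main__":
    window = "GGGGAAAACCCC"  # a perfect toy hairpin
    print(fors_d(window, toy_energy))
```

    Under these assumptions, sliding such a window along a genome and plotting the result would give the kind of FORS-D profile the abstract compares against breakpoint positions.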

    Adaptive evolution of the spike gene of SARS coronavirus: changes in positively selected sites in different epidemic groups

    BACKGROUND: It is believed that animal-to-human transmission of severe acute respiratory syndrome (SARS) coronavirus (CoV) is the cause of the SARS outbreak worldwide. The spike (S) protein is one of the best characterized proteins of SARS-CoV. It plays a key role in SARS-CoV overcoming the species barrier and accomplishing interspecies transmission from animals to humans, suggesting that it may be the major target of selective pressure. However, the process of adaptive evolution of the S protein and the exact positively selected sites associated with this process remain unknown. RESULTS: By investigating the adaptive evolution of the S protein, we identified twelve amino acid sites (75, 239, 244, 311, 479, 609, 613, 743, 765, 778, 1148, and 1163) in the S protein under positive selective pressure. Based on the phylogenetic tree and epidemiological investigation, the SARS outbreak was divided into three epidemic groups in the present study: 02–04 interspecies, 03-early-mid, and 03-late. Positive selection was detected in the first two groups, which represent the course of SARS-CoV interspecies transmission and of viral adaptation to the human host, respectively. In contrast, purifying selection was detected in the 03-late group. These results indicate that the S protein experiences variable positive selective pressures before reaching stabilization. A total of 25 sites in the 02–04 interspecies epidemic group and 16 sites in the 03-early-mid epidemic group were identified as under positive selection. The identified sites differed between these two groups except for site 239, which suggests that positively selected sites are changeable between groups. Moreover, it was shown that a larger proportion (24%) of positively selected sites was located in the receptor-binding domain (RBD) than in the heptad repeat (HR)1–HR2 region in the 02–04 interspecies epidemic group (p = 0.0208), and a greater percentage (25%) of these sites occurred in the HR1–HR2 region than in the RBD in the 03-early-mid epidemic group (p = 0.0721). These results suggest that functionally different domains of the S protein may not experience the same positive selection in each epidemic group. In addition, three specific replacements (F360S, T487S and L665S) were found only between 03-human SARS-CoVs and strains from the 02–04 interspecies epidemic group, which reveals that selective sweep may also force the evolution of S genes before the jump of SARS-CoVs into human hosts. Since certain residues at these positively selected sites are associated with receptor recognition and/or membrane fusion, they are likely to be the crucial residues for animal-to-human transmission of SARS-CoVs and subsequent adaptation to human hosts. CONCLUSION: The variation of positive selective pressures and positively selected sites is likely to contribute to the adaptive evolution of the S protein from animals to humans.

    Light-Metal-Based Nanostructures for Energy and Biomedical Applications


    Enhanced sequence labeling based on latent variable conditional random fields

    Natural language processing is a useful technique for processing language data, such as text and speech. Sequence labeling is an upstream task for many natural language processing tasks, such as machine translation, text classification, and sentiment classification. In this paper, the focus is on the sequence labeling task, in which semantic labels are assigned to each unit of a given input sequence. Two frameworks of latent variable conditional random fields (CRF) models (called LVCRF-I and LVCRF-II) are proposed, which use the encoding schema as a latent variable to capture the latent structure of the hidden variables and the observed data. Of the two designed models, LVCRF-I works at the sentence level, while LVCRF-II works at the word level, to choose the best encoding schema for a given input sequence automatically without handcrafted features. In the experiments, the two proposed models are evaluated on four sequence prediction tasks: named entity recognition (NER), chunking, reference parsing, and POS tagging. The proposed frameworks achieve better performance than the conventional CRF model without using other handcrafted features. Moreover, these frameworks can be viewed as a substitute for conventional CRF models. In the commonly used LSTM-CRF models, the CRF layer can be replaced with the proposed framework, as they use the same training and inference procedure. The experimental results show that the proposed models exhibit latent-variable behavior and provide competitive and robust performance on all three sequence prediction tasks.
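
    To make "encoding schema" concrete, here is a small sketch of two tagging schemas for the same labeled span. The abstract does not name the schemas the models choose between; BIO and BIOES are assumed here only because they are common in NER and chunking, and the function name `bio_to_bioes` is hypothetical.

```python
from typing import List


def bio_to_bioes(tags: List[str]) -> List[str]:
    """Re-encode a BIO tag sequence as BIOES.

    BIO marks the Beginning and Inside of each span; BIOES additionally marks
    the End of a multi-token span and Single-token spans.
    """
    out: List[str] = []
    for i, tag in enumerate(tags):
        if tag == "O":
            out.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        nxt = tags[i + 1] if i + 1 < len(tags) else "O"
        continues = nxt == "I-" + label  # does the same span carry on?
        if prefix == "B":
            out.append(("B-" if continues else "S-") + label)
        else:  # prefix == "I"
            out.append(("I-" if continues else "E-") + label)
    return out


if __name__ == "__main__":
    print(bio_to_bioes(["B-PER", "I-PER", "O", "B-LOC"]))
```

    Both encodings describe the same spans, which is why the choice between them can plausibly be treated as a latent variable rather than fixed by hand.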

    Counterfactual Conservative Q Learning for Offline Multi-agent Reinforcement Learning

    Offline multi-agent reinforcement learning is challenging because it couples the distribution shift issue common in the offline setting with the high-dimension issue common in the multi-agent setting, making the out-of-distribution (OOD) action and value overestimation phenomena excessively severe. To mitigate this problem, we propose a novel multi-agent offline RL algorithm, named CounterFactual Conservative Q-Learning (CFCQL), to conduct conservative value estimation. Rather than regarding all the agents as a single high-dimensional agent and directly applying single-agent methods to it, CFCQL calculates conservative regularization for each agent separately in a counterfactual way and then linearly combines them to realize an overall conservative value estimation. We prove that it still enjoys the underestimation property and the performance guarantee of single-agent conservative methods, but the induced regularization and safe policy improvement bound are independent of the agent number, which makes it theoretically superior to the direct treatment referred to above, especially when the agent number is large. We further conduct experiments on four environments, including both discrete and continuous action settings, on both existing and our man-made datasets, demonstrating that CFCQL outperforms existing methods on most datasets, and by a remarkable margin on some of them.
    Comment: 37th Conference on Neural Information Processing Systems (NeurIPS 2023)
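
    The per-agent regularization described in this abstract can be illustrated with a small sketch. It is one plausible reading of the abstract, not the paper's implementation: it assumes a CQL-style penalty (expected Q under a sampling policy minus expected Q under the dataset), computed for one agent at a time while the other agents' actions stay as logged, and then combined with linear weights. The function name, the discrete action setting, and all parameters are hypothetical.

```python
from typing import Callable, Sequence, Tuple

JointAction = Tuple[int, ...]


def counterfactual_penalty(
    q: Callable[[JointAction], float],
    batch_actions: Sequence[JointAction],
    policies: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> float:
    """Weighted sum of per-agent conservative penalties for one state.

    q             : joint-action value function (hypothetical, state fixed).
    batch_actions : joint actions logged in the offline dataset.
    policies      : policies[i][a] is the probability that agent i picks action a.
    weights       : linear combination weights, one per agent.
    """
    # Expected Q under the dataset's joint actions.
    data_term = sum(q(a) for a in batch_actions) / len(batch_actions)
    total = 0.0
    for i, policy in enumerate(policies):
        # Counterfactual term: only agent i deviates from the logged action.
        cf = 0.0
        for a in batch_actions:
            for alt, prob in enumerate(policy):
                joint = tuple(a[:i]) + (alt,) + tuple(a[i + 1:])
                cf += prob * q(joint)
        cf /= len(batch_actions)
        total += weights[i] * (cf - data_term)
    return total


if __name__ == "__main__":
    q = lambda a: float(sum(a))  # toy value function
    batch = [(0, 0), (1, 1)]
    print(counterfactual_penalty(q, batch, [[0.0, 1.0], [0.5, 0.5]], [0.5, 0.5]))
```

    Because each term varies one agent's action at a time, the cost of the penalty in this sketch grows with the number of agents rather than with the size of the joint action space, which matches the motivation given in the abstract.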