8 research outputs found

    Tfold: efficient in silico prediction of non-coding RNA secondary structures

    Predicting RNA secondary structures is an important task and remains a challenging problem, even though several methods and algorithms have been proposed in the literature. In this article, we propose an algorithm called Tfold for predicting non-coding RNA secondary structures. Tfold takes as input an RNA sequence whose secondary structure is sought and a set of aligned homologous sequences. It combines criteria of stability, conservation and covariation to search for stems and pseudoknots of any type. Stems are searched recursively, from the most to the least stable. Tfold uses an algorithm called SSCA to select, from a large set of homologous sequences (taken from a database, for example), the sequences most appropriate for the prediction. Tfold can take into account one or several stems that the user considers to belong to the secondary structure. When ‘rival’ stems are found, Tfold can return several structures if the user requests them. Tfold has a complexity of O(n^2), where n is the sequence length. The software, which offers several different uses, is available at: http://tfold.ibisc.univ-evry.fr/TFold
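
    The recursive, most-stable-first stem search described in this abstract can be illustrated with a minimal sketch. It is not Tfold: it uses a single sequence, a crude pair-weight score as a stand-in for stability (an assumption), no conservation or covariation criteria, no pseudoknots, and it makes no attempt to reach the O(n^2) complexity. The names best_stem, fold and dot_bracket are hypothetical.

```python
# Illustrative pair weights standing in for thermodynamic stability (assumption).
PAIR_SCORE = {("G", "C"): 3, ("C", "G"): 3, ("A", "U"): 2,
              ("U", "A"): 2, ("G", "U"): 1, ("U", "G"): 1}
MIN_LOOP = 3  # minimum number of unpaired bases enclosed by a stem
MIN_STEM = 3  # minimum number of stacked pairs to accept a stem


def best_stem(seq, lo, hi):
    """Return (score, i, j, length) of the highest-scoring stem in seq[lo:hi], or None."""
    best = None
    for i in range(lo, hi):
        for j in range(i + MIN_LOOP + 1, hi):
            score, k = 0, 0
            # Extend the stem inward from the outermost pair (i, j).
            while i + k < j - k - MIN_LOOP and (seq[i + k], seq[j - k]) in PAIR_SCORE:
                score += PAIR_SCORE[(seq[i + k], seq[j - k])]
                k += 1
            if k >= MIN_STEM and (best is None or score > best[0]):
                best = (score, i, j, k)
    return best


def fold(seq, lo=0, hi=None, pairs=None):
    """Pick the best stem in the interval, then recurse on the three remaining intervals."""
    if hi is None:
        hi = len(seq)
    if pairs is None:
        pairs = []
    stem = best_stem(seq, lo, hi)
    if stem is None:
        return pairs
    _, i, j, k = stem
    pairs.extend((i + t, j - t) for t in range(k))
    fold(seq, lo, i, pairs)             # region to the left of the stem
    fold(seq, i + k, j - k + 1, pairs)  # region enclosed by the stem
    fold(seq, j + 1, hi, pairs)         # region to the right of the stem
    return pairs


def dot_bracket(n, pairs):
    """Render base pairs as a dot-bracket string of length n."""
    out = ["."] * n
    for i, j in pairs:
        out[i], out[j] = "(", ")"
    return "".join(out)


seq = "GGGGAAAACCCC"
print(dot_bracket(len(seq), fold(seq)))  # ((((....))))
```

    Because each recursive call works on an interval disjoint from the chosen stem, the stems found later are never more stable than the one that bounded them and can never cross it, which is why this sketch cannot produce pseudoknots.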

    VarGoats project : a dataset of 1159 whole-genome sequences to dissect Capra hircus global diversity

    Since their domestication 10,500 years ago, goat populations with distinctive genetic backgrounds have adapted to a broad variety of environments and breeding conditions. The VarGoats project is an international 1000-genome resequencing program designed to understand the consequences of domestication and breeding on the genetic diversity of domestic goats and to elucidate how speciation and hybridization have shaped the genomes of a set of species representative of the genus Capra. A dataset comprising 652 sequenced goats and 507 public goat sequences, including 35 animals representing eight wild species, has been collected worldwide. We identified 74,274,427 single nucleotide polymorphisms (SNPs) and 13,607,850 insertion-deletions (InDels) by aligning these sequences to the latest version of the goat reference genome (ARS1). A neighbor-joining tree based on Reynolds genetic distances showed that goats from Africa, Asia and Europe tend to group into independent clusters. Because goat breeds from Oceania and the Caribbean (Creole) all derive from imported animals, they are distributed along the tree according to their ancestral geographic origin. We report on an unprecedented international effort to characterize the genome-wide diversity of domestic goats. This large range of sequenced individuals represents a unique opportunity to ascertain how the demographic and selection processes associated with post-domestication history have shaped the diversity of this species. Data generated for the project will also be extremely useful for identifying deleterious mutations and polymorphisms with causal effects on complex traits, and thus will contribute to new knowledge that could be used in genomic prediction and genome-wide association studies.
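
    As a rough illustration of the distance underlying such a tree, the sketch below computes a Reynolds-type distance between two populations from per-locus allele frequencies, using the form often quoted in population-genetics references: the square root of the summed squared frequency differences over twice the summed (1 - sum of frequency products). Whether the VarGoats analysis used exactly this estimator is not stated in the abstract, and the function name reynolds_distance and the toy frequencies are hypothetical.

```python
import math


def reynolds_distance(freqs_x, freqs_y):
    """Reynolds-type distance between two populations.

    freqs_x, freqs_y: one list per locus, each holding the allele
    frequencies at that locus (summing to 1) in the same allele order.
    """
    num = 0.0
    den = 0.0
    for x, y in zip(freqs_x, freqs_y):
        num += sum((xu - yu) ** 2 for xu, yu in zip(x, y))
        den += 1.0 - sum(xu * yu for xu, yu in zip(x, y))
    den *= 2.0
    # If both populations are fixed for the same allele everywhere, the
    # denominator is zero; treat them as identical (assumption).
    return math.sqrt(num / den) if den > 0 else 0.0


# Toy example: one biallelic locus.
print(reynolds_distance([[1.0, 0.0]], [[0.0, 1.0]]))  # 1.0
print(reynolds_distance([[0.5, 0.5]], [[0.5, 0.5]]))  # 0.0
```

    A matrix of such pairwise distances between breeds is the usual input to a neighbor-joining tree builder.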

    Predicting RNA secondary structure by the comparative approach: how to select the homologous sequences

    Background: The secondary structure of an RNA must be known before the relationship between its structure and function can be determined. One way to predict the secondary structure of an RNA is to identify covarying residues that maintain the pairings (Watson-Crick, wobble and non-canonical pairings). This "comparative approach" consists of identifying mutations from homologous sequence alignments. The sequences must covary enough for compensatory mutations to be revealed, but comparison is difficult if they are too different. Thus the choice of homologous sequences is critical. While many possible combinations of homologous sequences may be used for prediction, only a few will give good structure predictions. This can be due to poor-quality alignment in stems or to the variability of certain sequences. This problem of sequence selection is currently unsolved. Results: This paper describes an algorithm, SSCA, which measures the suitability of sequences for the comparative approach. It is based on evolutionary models with structure constraints, particularly those on sequence variations and stem alignment. We propose three models, based on different constraints on sequence alignments. We show the results of the SSCA algorithm for predicting the secondary structure of several RNAs. SSCA enabled us to choose sets of homologous sequences that gave better predictions than arbitrarily chosen sets of homologous sequences. Conclusion: SSCA is an algorithm for selecting combinations of RNA homologous sequences suitable for secondary structure predictions with the comparative approach.
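
    The idea of covarying residues that maintain a pairing can be made concrete with a small sketch. It is not SSCA, whose evolutionary models are not detailed in this abstract; it only measures, for two alignment columns, how often the residues can pair and how many distinct pair types occur (more than one type hints at compensatory mutations). The function name column_pair_support, the restriction to Watson-Crick and wobble pairs, and the toy alignment are assumptions.

```python
# Watson-Crick and G-U wobble pairs; non-canonical pairs are ignored here (assumption).
PAIRS = {"AU", "UA", "GC", "CG", "GU", "UG"}


def column_pair_support(alignment, i, j):
    """Return (fraction of sequences whose columns i and j can pair,
    number of distinct pair types observed among them)."""
    observed = [s[i] + s[j] for s in alignment if s[i] != "-" and s[j] != "-"]
    if not observed:
        return 0.0, 0
    paired = [p for p in observed if p in PAIRS]
    return len(paired) / len(observed), len(set(paired))


alignment = [
    "GAAAC",
    "CAAAG",
    "AAAAU",
    "GAAAU",
]
print(column_pair_support(alignment, 0, 4))  # (1.0, 4)
print(column_pair_support(alignment, 1, 3))  # (0.0, 0)
```

    Columns 0 and 4 change in every sequence yet always remain pairable, which is the covariation signal the comparative approach looks for; columns that are perfectly conserved or poorly aligned give no such evidence.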

    Algorithmes pour la prédiction de structures secondaires d'ARN (Algorithms for the prediction of RNA secondary structures)

    Knowledge of RNA secondary structure is important for understanding the relationship between the structure and function of RNAs. The secondary structure is made up of a set of helices formed by runs of complementary base pairs. Existing algorithms have complexities of at least O(n^3). This thesis presents an algorithm called P-DCfold, based on the comparative approach, for predicting RNA secondary structures with a complexity of O(n^2). In this algorithm, helices are searched recursively using the "divide and conquer" approach. The selection of helices is based on thermodynamic and covariation criteria. The main problem of the comparative approach is the low quality of the alignments used. P-DCfold therefore uses evolutionary models under structural constraints to select correctly aligned sequences. P-DCfold predicted the secondary structure of several RNAs with a sensitivity of 0.85 and a selectivity of 0.95.

    Genomic reconstruction of the successful establishment of a feralized bovine population on the subantarctic island of Amsterdam

    The feral cattle of the subantarctic island of Amsterdam provide an outstanding case study of a large mammalian population that was established by a handful of founders and thrived within a few generations in a seemingly inhospitable environment. Here, we investigated the genetic history and composition of this population using genotyping and sequencing data. Our inference showed an intense but brief founding bottleneck around the late 19th century and revealed contributions from European taurine and Indian Ocean zebu in the founder ancestry. Comparative analysis of whole genome sequences further revealed a moderate reduction in genetic diversity despite high levels of inbreeding. The brief and intense bottleneck was associated with high levels of drift, a flattening of the site frequency spectrum and a slight relaxation of purifying selection on mildly deleterious variants. Unlike some populations that have experienced prolonged reductions in effective population size, we did not observe any significant purging of highly deleterious variants. Interestingly, the population's success in the harsh environment can be attributed to pre-adaptation from their European taurine ancestry, suggesting no strong bioclimatic challenge, and also contradicting evidence for insular dwarfism. A genome scan for footprints of selection uncovered a majority of candidate genes related to nervous system function, likely reflecting rapid feralization driven by behavioral changes and complex social restructuring. The Amsterdam Island cattle offer valuable insights into rapid population establishment, feralization, and genetic adaptation in challenging environments. They also shed light on the unique genetic legacies of feral populations, raising ethical questions regarding conservation efforts.
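
    For readers unfamiliar with the site frequency spectrum mentioned in this abstract, the sketch below computes a folded spectrum from a toy matrix of biallelic calls: for each minor-allele count k, it reports how many variable sites have that count. A "flattened" spectrum is one in which rare variants (small k) are less dominant than usual. The input layout, the function name folded_sfs and the toy data are hypothetical and unrelated to the study's actual pipeline.

```python
from collections import Counter


def folded_sfs(sites):
    """Folded site frequency spectrum.

    sites: list of sites, each a list of 0/1 allele calls, one per sampled
    haplotype (all sites must have the same number of calls).
    Returns a list whose entry k-1 is the number of sites with minor-allele count k.
    """
    n = len(sites[0])
    counts = Counter()
    for site in sites:
        alt = sum(site)
        minor = min(alt, n - alt)
        if minor > 0:  # skip sites that do not vary in the sample
            counts[minor] += 1
    return [counts[k] for k in range(1, n // 2 + 1)]


sites = [
    [1, 0, 0, 0],  # singleton
    [1, 1, 0, 0],  # doubleton
    [1, 1, 1, 0],  # minor allele seen once
    [0, 0, 0, 0],  # monomorphic, ignored
]
print(folded_sfs(sites))  # [2, 1]
```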

    Nanopore adaptive sampling to identify the NLR-gene family in melon (Cucumis melo L.)

    Nanopore Adaptive Sampling (NAS) offers a promising approach for assessing genetic diversity in targeted genomic regions. Herein, we design and validate an experiment to enrich a set of resistance genes in several melon cultivars as a proof of concept. We showed that each of the 15 regions we identified in two newly assembled melon genomes (subspecies melo) was successfully and accurately reconstructed, as well as in a third cultivar from the agrestis subspecies. We obtained a fourfold enrichment, independently of the samples, but with some variation among the enriched regions. In the agrestis cultivar, we further confirmed our assembly by PCR. We discuss parameters that can influence the enrichment and the accuracy of assemblies generated through NAS. Altogether, we demonstrate that NAS is a simple and efficient approach for exploring complex genomic regions. This approach unlocks the characterization of resistance genes for a large number of individuals, as required for breeding new cultivars suited to the agroecological transition.

    VarGoats project: a dataset of 1159 whole-genome sequences to dissect Capra hircus global diversity

    Background: Since their domestication 10,500 years ago, goat populations with distinctive genetic backgrounds have adapted to a broad variety of environments and breeding conditions. The VarGoats project is an international 1000-genome resequencing program designed to understand the consequences of domestication and breeding on the genetic diversity of domestic goats and to elucidate how speciation and hybridization have modeled the genomes of a set of species representative of the genus Capra. Findings: A dataset comprising 652 sequenced goats and 507 public goat sequences, including 35 animals representing eight wild species, has been collected worldwide. We identified 74,274,427 single nucleotide polymorphisms (SNPs) and 13,607,850 insertion-deletions (InDels) by aligning these sequences to the latest version of the goat reference genome (ARS1). A Neighbor-joining tree based on Reynolds genetic distances showed that goats from Africa, Asia and Europe tend to group into independent clusters. Because goat breeds from Oceania and Caribbean (Creole) all derive from imported animals, they are distributed along the tree according to their ancestral geographic origin. Conclusions: We report on an unprecedented international effort to characterize the genome-wide diversity of domestic goats. This large range of sequenced individuals represents a unique opportunity to ascertain how the demographic and selection processes associated with post-domestication history have shaped the diversity of this species. Data generated for the project will also be extremely useful to identify deleterious mutations and polymorphisms with causal effects on complex traits, and thus will contribute to new knowledge that could be used in genomic prediction and genome-wide association studies.