68 research outputs found

    TomExpress, a unified tomato RNA-Seq platform for visualization of expression data, clustering and correlation networks

    The TomExpress platform was developed to provide the tomato research community with a browser and integrated web tools for public RNA-Seq data visualization and data mining. To avoid major biases that can result from the use of different mapping and statistical processing methods, RNA-Seq raw sequence data available in public databases were mapped de novo on a unique tomato reference genome sequence and post-processed using the same pipeline with accurate parameters. Following the calculation of the number of counts per gene in each RNA-Seq sample, a communal global normalization method was applied to all expression values. This unifies the whole set of expression data and makes them comparable. A database was designed where each expression value is associated with corresponding experimental annotations. Sample details were manually curated to be easily understandable by biologists. To make the data easily searchable, a user-friendly web interface was developed that provides versatile data mining web tools via on-the-fly generation of output graphics, such as expression bar plots, comprehensive in planta representations and heatmaps of hierarchically clustered expression data. In addition, it allows for the identification of co-expressed genes and the visualization of correlation networks of co-regulated gene groups. TomExpress provides one of the most complete free resources of publicly available tomato RNA-Seq data, and allows for the immediate interrogation of transcriptional programs that regulate vegetative and reproductive development in tomato under diverse conditions. The design of the pipeline developed in this project enables easy updating of the database with newly published RNA-Seq data, thereby allowing for continuous enrichment of the resource
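
    The identification of co-expressed genes described above can be illustrated with a minimal sketch. The abstract does not say which correlation measure or cutoff TomExpress uses, so the Pearson coefficient, the 0.9 threshold, the function names and the expression values below are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def coexpression_edges(expression, threshold=0.9):
    """Return (gene_a, gene_b, r) for every gene pair with |r| >= threshold."""
    edges = []
    for gene_a, gene_b in combinations(sorted(expression), 2):
        r = pearson(expression[gene_a], expression[gene_b])
        if abs(r) >= threshold:
            edges.append((gene_a, gene_b, round(r, 3)))
    return edges


# Made-up normalized expression values across four samples (not TomExpress data).
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 5.0, 1.0, 5.0],
}
edges = coexpression_edges(expression)
```

    Each returned edge links two genes whose profiles are strongly correlated or anti-correlated across samples. Comparing profiles across studies in this way is only meaningful because all values share one normalization, which is the point the abstract makes.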

    Interaction of two MADS-box genes leads to growth phenotype divergence of all-flesh type of tomatoes

    All-flesh tomato cultivars are devoid of locular gel and exhibit enhanced firmness and improved postharvest storage. Here, we show that SlMBP3 is a master regulator of locular tissue in tomato fruit and that a deletion at the gene locus underpins the All-flesh trait. Intriguingly, All-flesh varieties lack the deleterious phenotypes previously reported for SlMBP3 under-expressing lines, which preclude any potential commercial use. We resolve the causal factor for this phenotypic divergence through the discovery of a natural mutation at the SlAGL11 locus, a close homolog of SlMBP3. Misexpressing SlMBP3 impairs locular gel formation through massive transcriptomic reprogramming at initial phases of fruit development. SlMBP3 influences locule gel formation by controlling cell cycle and cell expansion genes, indicating that important components of fruit softening are determined at early pre-ripening stages. Our findings define potential breeding targets for improved texture in tomato and possibly other fleshy fruits. The all-flesh type of tomato fruits is caused by mutation of the MBP3 gene; however, knocking down MBP3 in certain genotypes also affects plant and fruit development. Here, the authors show that a natural mutation of AGL11, a close homolog of MBP3, is responsible for the phenotypic divergence. The authors are grateful to L. Lemonnier and D. Saint-Martin for transformation and cultivation of tomato plants and to the GeT-PlaGe core facility (INRAE Toulouse) for ChIP deep sequencing. The authors also thank Dr. Christian Chevalier (INRAE and Université de Bordeaux) for help in analyzing genes related to cell cycle, cell division, and endoreduplication in tomato. This research was supported by the EU H2020 TomGEM 679796 and HARNESSTOM 101000716 projects. Huang, B.; Hu, G.; Wang, K.; Frasse, P.; Maza, E.; Djari, A.; Deng, W.... (2021). Interaction of two MADS-box genes leads to growth phenotype divergence of all-flesh type of tomatoes. Nature Communications. 12(1):1-14. https://doi.org/10.1038/s41467-021-27117-7

    Novel Insights into the Bovine Polled Phenotype and Horn Ontogenesis in Bovidae

    Despite massive research efforts, the molecular etiology of bovine polledness and the developmental pathways involved in horn ontogenesis are still poorly understood. In a recent article, we provided evidence for the existence of at least two different alleles at the Polled locus and identified candidate mutations for each of them. None of these mutations was located in known coding or regulatory regions, thus adding to the complexity of understanding the molecular basis of polledness. We confirm previous results here and exhaustively identify the causative mutation for the Celtic allele (PC) and four candidate mutations for the Friesian allele (PF). We describe a previously unreported eyelash-and-eyelid phenotype associated with regular polledness, and present unique histological and gene expression data on bovine horn bud differentiation in fetuses affected by three different horn defect syndromes, as well as in wild-type controls. We propose the ectopic expression of a lincRNA in PC/p horn buds as a probable cause of horn bud agenesis. In addition, we provide evidence for an involvement of OLIG2, FOXL2 and RXFP2 in horn bud differentiation, and draw a first link between bovine, ovine and caprine Polled loci. Our results represent a first and important step in understanding the genetic pathways and key processes involved in horn bud differentiation in Bovidae.

    The BioMart community portal: an innovative alternative to large, centralized data repositories.

    The BioMart Community Portal (www.biomart.org) is a community-driven effort to provide a unified interface to biomedical databases that are distributed worldwide. The portal provides access to numerous database projects supported by 30 scientific organizations. It includes over 800 different biological datasets spanning genomics, proteomics, model organisms, cancer data, ontology information and more. All resources available through the portal are independently administered and funded by their host organizations. The BioMart data federation technology provides a unified interface to all the available data. The latest version of the portal comes with many new databases that have been created by our ever-growing community. It also comes with better support and extensibility for data analysis and visualization tools. A new addition to our toolbox, the enrichment analysis tool, is now accessible through graphical and web service interfaces. The BioMart Community Portal averages over one million requests per day. Building on this level of service and the wealth of information that has become available, the BioMart Community Portal has introduced a new, more scalable and cheaper alternative to the large data stores maintained by specialized organizations.

    Compacting and correcting Trinity and Oases RNA-Seq de novo assemblies

    De novo transcriptome assembly of short reads is now a common step in expression analysis of organisms lacking a reference genome sequence. Several software packages are available to perform this task. Even if their results are of good quality, it is still possible to improve them in several ways, including redundancy reduction and error correction. Trinity and Oases are two commonly used de novo transcriptome assemblers. The contig sets they produce are of good quality. Still, their compaction (number of contigs needed to represent the transcriptome) and their quality (chimera and nucleotide error rates) can be improved. We built a de novo RNA-Seq Assembly Pipeline (DRAP) which wraps these two assemblers (Trinity and Oases) in order to improve their results regarding the above-mentioned criteria. DRAP reduces the number of resulting contigs 1.3- to 15-fold, depending on the read set and the assembler used. This article presents seven assembly comparisons showing, in some cases, drastic improvements when using DRAP. DRAP does not significantly impair assembly quality metrics such as read realignment rate or protein reconstruction counts. Transcriptome assembly is a challenging computational task. Even though good solutions are already available to end-users, these solutions can still be improved while conserving the overall representation and quality of the assembly. The de novo RNA-Seq Assembly Pipeline (DRAP) is an easy-to-use software package that produces a compact and corrected transcript set.
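
    The compaction figure quoted above is a simple ratio of contig counts before and after processing. The sketch below shows the arithmetic only. The function name and the contig counts are hypothetical and are not taken from the DRAP article or its software.

```python
def compaction_fold(contigs_before, contigs_after):
    """Fold reduction in contig count: raw assembly size over compacted size."""
    if contigs_before <= 0 or contigs_after <= 0:
        raise ValueError("contig counts must be positive")
    return contigs_before / contigs_after


# Hypothetical counts, chosen only to illustrate the calculation.
raw_contigs = 150_000        # contigs in the raw assembler output
compacted_contigs = 30_000   # contigs left after compaction and correction
fold = compaction_fold(raw_contigs, compacted_contigs)
```

    A value of 5.0 here would fall inside the 1.3- to 15-fold range that the abstract reports across read sets and assemblers.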

    DRAP: de novo RNA-Seq Assembly Pipeline

    Trinity and Oases are two commonly used de novo transcriptome assemblers. The contig sets they produce are of good quality. Still, their compaction (number of contigs needed to represent the transcriptome) and their quality (chimera and nucleotide error rates) can be improved. We built a de novo RNA-Seq Assembly Pipeline (DRAP) which wraps these two assemblers (Trinity and Oases) in order to improve their results regarding the above-mentioned criteria. DRAP reduces the number of resulting contigs 1.3- to 15-fold, depending on the read set and the assembler used. This article presents seven assembly comparisons showing, in some cases, drastic improvements when using DRAP. DRAP does not significantly impair assembly quality metrics such as read realignment rate or protein reconstruction counts. Transcriptome assembly is a challenging computational task. Even though good solutions are already available to end-users, these solutions can still be improved while conserving the overall representation and quality of the assembly. The de novo RNA-Seq Assembly Pipeline (DRAP) is an easy-to-use software package that produces a compact and corrected transcript set.

    RNAbrowse: RNA-Seq de novo assembly results browser.

    Transcriptome analysis based on a de novo assembly of next-generation RNA sequences is now performed routinely in many laboratories. The generated results, including contig sequences, quantification figures, functional annotations and variation discovery outputs, are usually bulky and quite diverse. This article presents a user-oriented storage and visualisation environment that lets users explore the data in a top-down manner, going from general graphical views to all possible details. The software package is based on BioMart and is easy to install and populate with local data. It is available under the GNU General Public License (GPL) at http://bioinfo.genotoul.fr/RNAbrowse

    Identification of large intergenic non-coding RNAs in bovine muscle using next-generation transcriptomic sequencing.

    BACKGROUND: The advent of large-scale gene expression technologies has helped to reveal the existence, in eukaryotic cells, of thousands of non-coding transcripts whose function and significance remain poorly understood. Among these non-coding transcripts, long non-coding RNAs (lncRNAs) are the least well-studied but are emerging as key regulators of diverse cellular processes. In the present study, we performed a survey in bovine Longissimus thoraci of lincRNAs (long intergenic non-coding RNAs not overlapping protein-coding transcripts). To our knowledge, this represents the first such study in bovine muscle. RESULTS: To identify lincRNAs, we used paired-end RNA sequencing (RNA-Seq) to explore the transcriptomes of Longissimus thoraci from nine Limousin bull calves. Approximately 14-45 million paired-end reads were obtained per library. A total of 30,548 different transcripts were identified. Using a computational pipeline, we defined a stringent set of 584 different lincRNAs, with 418 lincRNAs found in all nine muscle samples. Bovine lincRNAs share characteristics seen in their mammalian counterparts: relatively short transcript and gene lengths, low exon number and significantly lower expression compared to protein-encoding genes. Because our study is the first to identify lincRNAs in nine different samples from the same tissue, it is possible to analyse the inter-individual variability of the expression level of the identified lincRNAs. Interestingly, there was a significant difference when we compared the expression variation of the 418 lincRNAs with that of the 10,775 known selected protein-encoding genes found in all muscle samples. In addition, we found 2,083 pairs of lincRNA/protein-encoding genes showing highly significant correlated expression. Fourteen lincRNAs were selected and 13 were validated by RT-PCR. Some of the lincRNAs expressed in muscle are located within quantitative trait loci for meat quality traits. CONCLUSIONS: Our study provides a glimpse into the lincRNA content of bovine muscle and will facilitate future experimental studies to unravel the function of these molecules. It may prove useful to elucidate their effect on mechanisms underlying the genetic variability of meat quality traits. This catalog will complement the list of lincRNAs already discovered in cattle and will therefore help to better annotate the bovine genome.

    Comparison of whole-genome (13X) and capture (87X) resequencing methods for SNP and genotype callings

    The number of polymorphisms identified with next-generation sequencing approaches depends directly on the sequencing depth and therefore on the experimental cost. Although higher levels of depth ensure more sensitive and more specific SNP calls, economic constraints limit the increase of depth for whole-genome resequencing (WGS). For this reason, capture resequencing is used for studies focusing on only some specific regions of the genome. However, several biases in capture resequencing are known to have a negative impact on the sensitivity of SNP detection. Within this framework, the aim of this study was to compare the accuracy of WGS and capture resequencing, which differ in terms of both sequencing depth and biases, for SNP detection and genotype calling. We evaluated the SNP calling and genotyping accuracy in a WGS dataset (13X) and in a capture resequencing dataset (87X) performed on 11 individuals. The percentage of SNPs not identified due to a sevenfold decrease in sequencing depth was estimated at 7.8% using a down-sampling procedure on the capture sequencing dataset. A comparison of the 87X capture sequencing dataset with the WGS dataset revealed that capture-related biases led to the loss of 5.2% of the SNPs detected with WGS. Nevertheless, when considering the SNPs detected by both approaches, capture sequencing appears to achieve far better SNP genotyping: about 4.4% of the WGS genotypes can be considered erroneous, and as much as 10% when only heterozygous genotypes are considered. In conclusion, WGS and capture deep sequencing can be considered equivalent strategies for SNP detection, as the rate of SNPs not identified because of the low sequencing depth of the former is quite similar to the rate of SNPs missed because of the method biases of the latter. On the other hand, capture deep sequencing clearly appears better suited to studies requiring great accuracy in genotyping.
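
    The two comparisons reported above (SNPs detected by WGS but lost in capture, and genotype disagreement at SNPs found by both) can be sketched with set operations. This is an illustrative assumption about how such rates might be computed, not the authors' pipeline. The function name, the site keys and the genotypes are invented.

```python
def compare_calls(wgs, capture):
    """Compare two call sets given as {(chrom, pos): genotype} dictionaries.

    Returns (missed, discordance):
      missed       fraction of WGS SNP sites absent from the capture calls
      discordance  fraction of shared sites whose genotypes differ
    """
    shared = wgs.keys() & capture.keys()
    missed = len(wgs.keys() - capture.keys()) / len(wgs)
    differing = sum(1 for site in shared if wgs[site] != capture[site])
    discordance = differing / len(shared) if shared else 0.0
    return missed, discordance


# Invented toy calls restricted to the captured regions (not the study's data).
wgs_calls = {
    ("chr1", 100): "A/G",
    ("chr1", 250): "C/C",
    ("chr1", 300): "T/T",
    ("chr2", 50): "G/T",
    ("chr2", 75): "A/A",
}
capture_calls = {
    ("chr1", 100): "A/G",
    ("chr1", 250): "C/T",  # disagrees with the WGS genotype
    ("chr1", 300): "T/T",
    ("chr2", 50): "G/T",
    ("chr2", 90): "C/C",   # site seen only by capture
}
missed, discordance = compare_calls(wgs_calls, capture_calls)
```

    A disagreement counted this way does not say which call is wrong. Attributing the errors to WGS, as the abstract does, needs a further argument or reference genotypes, for example the higher depth of the capture data.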