119 research outputs found

    Assessment of replicate bias in 454 pyrosequencing and a multi-purpose read-filtering tool

    Background: The Roche 454 pyrosequencing platform is often considered the most versatile of the Next Generation Sequencing platforms, permitting the sequencing of large genomes, the analysis of variations, and the study of transcriptomes. A recently reported bias leads to the production of multiple reads for a unique DNA fragment in a random manner within a run. This bias directly affects how accurately read counts measure the representation of the fragments. Other cleaning steps are usually performed on the reads before assembly or alignment. Findings: PyroCleaner is a software module intended to clean 454 pyrosequencing reads in order to ease the assembly process. The program is free software distributed under the terms of the GNU General Public License as published by the Free Software Foundation. It implements several filters using criteria such as read duplication, length, complexity, base-pair quality, and number of undetermined bases. It can also clean flowgram files (.sff) of paired-end sequences, generating a file of validated paired-end reads on the one hand and a file of single reads on the other. Conclusions: Read cleaning has always been an important step in sequence analysis. The pyrocleaner Python module is a Swiss Army knife dedicated to cleaning 454 reads. It includes commonly used filters as well as specialised ones such as duplicated-read removal and paired-end read verification.
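    The filter criteria listed in the abstract above (duplication, length, undetermined bases, base quality, complexity) can be illustrated with a minimal Python sketch. This is not PyroCleaner's code or API: the function names, the thresholds, the k-mer based complexity score, and the treatment of duplicates as exact sequence matches are all assumptions made for illustration only.

```python
def kmer_complexity(seq, k=3):
    """Distinct k-mers divided by the maximum possible number for this read."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not kmers:
        return 0.0
    return len(set(kmers)) / min(len(kmers), 4 ** k)


def filter_reads(reads, min_len=40, max_n_frac=0.04,
                 min_mean_qual=20, min_complexity=0.25):
    """Keep reads passing every filter; thresholds are hypothetical.

    `reads` is an iterable of (name, sequence, qualities) tuples, where
    qualities is a list of per-base scores as long as the sequence.
    """
    kept, seen = [], set()
    for name, seq, quals in reads:
        seq = seq.upper()
        if len(seq) < min_len:                       # length filter
            continue
        if seq.count("N") / len(seq) > max_n_frac:   # undetermined bases
            continue
        if sum(quals) / len(quals) < min_mean_qual:  # mean base quality
            continue
        if kmer_complexity(seq) < min_complexity:    # low-complexity reads
            continue
        if seq in seen:                              # exact duplicates only
            continue
        seen.add(seq)
        kept.append((name, seq, quals))
    return kept
```

    Only reads that pass the quality filters are recorded as seen, so a low-quality copy of a read does not cause a later good copy to be discarded.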

    Whole-exome sequencing in osteosarcoma reveals important heterogeneity of genetic alterations

    BACKGROUND: Whole-genome sequencing studies have recently shown that osteosarcomas (OSs) display high rates of structural variation, i.e. they contain many somatic mutations and copy number alterations. TP53 and RB1 show recurrent somatic alterations in concordant studies, suggesting that they could be key players in bone oncogenesis. PATIENTS AND METHODS: We carried out whole-exome sequencing of DNA from seven high-grade OS samples matched with normal tissue from the same patients. RESULTS: We confirmed the presence of genetic alterations of the TP53 (including novel unreported mutations) and RB1 genes. Most interestingly, we identified a total of 84 point mutations and 4 deletions related to 82 different genes in OS samples, of which only 15 have been previously reported. The number of mutated genes was lower in TP53mut cases (ranging from 4 to 8) than in TP53wt cases (ranging from 14 to 45). This was also true for the mutated RB1 case. We also observed that a dedifferentiated OS harboring MDM2 amplification did not carry any other mutations. CONCLUSION: This study suggests that bone oncogenesis driven by TP53 or RB1 mutations occurs on a background of relative genetic stability and that the dedifferentiated OS subtype represents a clinico-pathological entity with distinct oncogenic mechanisms and thus requires different therapeutic management.

    Daily transcriptomes of the copepod Calanus finmarchicus during the summer solstice at high Arctic latitudes

    The zooplankter Calanus finmarchicus is a member of the so-called “Calanus Complex”, a group of copepods that constitutes a key element of the Arctic polar marine ecosystem, providing a crucial link between primary production and higher trophic levels. Climate change is shifting C. finmarchicus to higher latitudes, with currently unknown impacts on its endogenous timing. Here we generated a daily transcriptome of C. finmarchicus at two high Arctic stations during the most extreme period of the Midnight Sun, the summer solstice. While the southern station (74.5 °N) was sea ice-free, the northern one (82.5 °N) was sea ice-covered. The mRNAs of the 42 samples were sequenced with an average of 126 ± 5 million reads (mean ± SE) per sample and aligned to the reference transcriptome. We detail the quality assessment of the datasets and the complete annotation procedure, making it possible to investigate daily gene expression of this ecologically important species at high Arctic latitudes and to compare gene expression according to latitude and sea-ice coverage.

    Widely rhythmic transcriptome in Calanus finmarchicus during the high Arctic summer solstice period

    Solar light/dark cycles and seasonal photoperiods underpin daily and annual rhythms of life on Earth. Yet the Arctic is characterized by several months of permanent illumination (“midnight sun”). To determine the persistence of 24 h rhythms during the midnight sun, we investigated transcriptomic dynamics in the copepod Calanus finmarchicus during the summer solstice period in the Arctic, when the diel oscillation is lowest and the altitude of the sun's position is highest. Here we reveal that in these extreme photic conditions a widely rhythmic daily transcriptome exists, showing that very weak solar cues are sufficient to entrain organisms. Furthermore, at extremely high latitudes and under sea ice, gene oscillations become re-organized to include <24 h rhythms. Environmental synchronization may therefore be modulated to include non-photic signals (i.e. tidal cycles). The ability of zooplankton to be synchronized by extremely weak diel and potentially tidal cycles may confer an adaptive temporal reorganization of biological processes at high latitudes.

    A search for small noncoding RNAs in Staphylococcus aureus reveals a conserved sequence motif for regulation

    Bioinformatic analysis of the intergenic regions of Staphylococcus aureus predicted multiple regulatory regions. From this analysis, we characterized 11 novel noncoding RNAs (RsaA-K) that are expressed in several S. aureus strains under different experimental conditions. Many of them accumulate in the late-exponential phase of growth. All ncRNAs are stable and their expression is Hfq-independent. The transcription of several of them is regulated by the alternative sigma B factor (RsaA, D and F) while the expression of RsaE is agrA-dependent. Six of these ncRNAs are specific to S. aureus, four are conserved in other Staphylococci, and RsaE is also present in Bacillaceae. Transcriptomic and proteomic analysis indicated that RsaE regulates the synthesis of proteins involved in various metabolic pathways. Phylogenetic analysis combined with RNA structure probing, searches for RsaE-mRNA base pairing, and toeprinting assays indicate that a conserved and unpaired UCCC sequence motif of RsaE binds to target mRNAs and prevents the formation of the ribosomal initiation complex. This study unexpectedly shows that most of the novel ncRNAs carry the conserved C-rich motif, suggesting that they are members of a class of ncRNAs that target mRNAs by a shared mechanism.

    Determination of the pressure in micrometric bubbles in irradiated nuclear fuels

    In oxide nuclear fuels, at high burn-up or during high-temperature periods such as ramp tests, out-of-pile heating tests, or any irradiations at high linear heat rates, fission gases can form micrometric or quasi-micrometric bubbles. During nominal operations, these bubbles contribute to pellet swelling and to the decrease of the fuel thermal conductivity, and are involved in the mechanisms leading to fission gas release. During events involving a temperature increase, the resulting increase in the internal pressure of the bubbles might play a role in fuel fragmentation and in the opening of grain boundaries. The gas densities inside these bubbles are therefore useful experimental information for understanding fuel behaviour and for developing and validating fuel behaviour codes. Two methods were developed to evaluate the gas density in the quasi-micrometric bubbles, using an electron probe micro analyser, secondary ion mass spectrometry and a focused ion beam scanning electron microscope together. The first method provides a mean gas density for all quasi-micrometric bubbles in a given area. The second method provides a gas density in a single selected bubble. In addition to the gas density, the 3D size and shape of the selected bubble are measured and can be related to the gas density result. In this work, these methods were applied to the bubbles formed in the centre of a PWR Cr-doped UO2 fuel at 38.8 GWd/tU after a ramp test in the Osiris reactor, with a 12 h plateau at 470 W/cm, and to the bubbles formed in a PWR Cr-doped UO2 fuel at 62.8 GWd/tU, both in the centre of the pellet and in the high burn-up structure on the rim. Both show the high pressures reached in these bubbles. Funding: CEA-DES, EDF and Framatome.

    Bioinformatic analysis of ESTs collected by Sanger and pyrosequencing methods for a keystone forest tree species: oak

    Background: The Fagaceae family comprises about 1,000 woody species worldwide. About half belong to the genus Quercus. These oaks are often a source of raw material for biomass, wood and fiber. Pedunculate and sessile oaks are among the most important deciduous forest tree species in Europe. Despite their ecological and economic importance, very few genomic resources have yet been generated for these species. Here, we describe the development of an EST catalogue that will support ecosystem genomics studies, where geneticists, ecophysiologists, molecular biologists and ecologists join their efforts for understanding, monitoring and predicting functional genetic diversity. Results: We generated 145,827 sequence reads from 20 cDNA libraries using the Sanger method. Unexploitable chromatograms and quality checking led us to eliminate 19,941 sequences. Finally, a total of 125,925 ESTs were retained from 111,361 cDNA clones. Pyrosequencing was also conducted for 14 libraries, generating 1,948,579 reads, from which 370,566 sequences (19.0%) were eliminated, resulting in 1,578,192 sequences. Following clustering and assembly using the TGICL pipeline, 1,704,117 EST sequences collapsed into 69,154 tentative contigs and 153,517 singletons, providing 222,671 non-redundant sequences (including alternative transcripts). We also assembled the sequences using MIRA and PartiGene software and compared the three unigene sets. Gene ontology annotation was then assigned to 29,303 unigene elements. A Blast search against the SWISS-PROT database revealed putative homologs for 32,810 (14.7%) unigene elements, but a more extensive search with Pfam, Refseq_protein, Refseq_RNA and eight gene indices revealed homology for 67.4% of them. The EST catalogue was examined for putative homologs of candidate genes involved in bud phenology, cuticle formation, phenylpropanoid biosynthesis and cell wall formation. Our results suggest a good coverage of genes involved in these traits. Comparative orthologous sequences (COS) with other plant gene models were identified and make it possible to unravel the oak paleo-history. Simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs) were searched, resulting in 52,834 SSRs and 36,411 SNPs. All of these are available through the Oak Contig Browser http://genotoul-contigbrowser.toulouse.inra.fr:9092/Quercus_robur/index.html. Conclusions: This genomic resource provides a unique tool to discover genes of interest, study the oak transcriptome, and develop new markers to investigate functional diversity in natural populations.
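    Several of the totals and percentages reported in the oak EST abstract above follow from simple arithmetic on the other reported counts. The short Python sketch below reproduces them; the variable names are illustrative, and every number is copied from the abstract.

```python
# Counts copied from the abstract; variable names are illustrative.
sanger_ests = 125_925          # ESTs retained after Sanger sequencing
pyro_reads_raw = 1_948_579     # raw pyrosequencing reads
pyro_reads_removed = 370_566   # pyrosequencing reads eliminated
pyro_reads_kept = 1_578_192    # pyrosequencing sequences retained
contigs = 69_154               # tentative contigs from TGICL
singletons = 153_517           # singletons from TGICL
swissprot_hits = 32_810        # unigene elements with SWISS-PROT homologs

# Sequences entering clustering and assembly.
total_ests = sanger_ests + pyro_reads_kept

# Non-redundant unigene elements.
unigenes = contigs + singletons

# Percentages, rounded to one decimal place as in the abstract.
pct_pyro_removed = round(100 * pyro_reads_removed / pyro_reads_raw, 1)
pct_swissprot = round(100 * swissprot_hits / unigenes, 1)

print(total_ests, unigenes, pct_pyro_removed, pct_swissprot)
```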

    Non PCR-amplified Transcripts and AFLP fragments as reduced representations of the quail genome for 454 Titanium sequencing

    Background: SNP (Single Nucleotide Polymorphism) discovery is now routinely performed using high-throughput sequencing of reduced representation libraries. Our objective was to adapt 454 GS FLX based sequencing methodologies in order to obtain the largest possible dataset from two reduced representation libraries, produced by AFLP (Amplified Fragment Length Polymorphism) for genomic DNA, and EST (Expressed Sequence Tag) for the transcribed fraction of the genome. Findings: The expressed fraction was obtained by preparing cDNA libraries without PCR amplification from quail embryo and brain. To optimize the information content for SNP analyses, libraries were prepared from individuals selected in three quail lines, and each individual in the AFLP library was tagged. Sequencing runs produced 399,189 sequence reads from cDNA and 373,484 from genomic fragments, covering close to 250 Mb of sequence in total. Conclusions: Both methods used to obtain reduced representations for high-throughput sequencing were successful after several improvements. The protocols may be used for several sequencing applications, such as de novo sequencing, tagged PCR fragments or long fragment sequencing of cDNA.