
    Novel deletions causing pseudoxanthoma elasticum underscore the genomic instability of the ABCC6 region

    Mutations in ABCC6 cause pseudoxanthoma elasticum (PXE), a heritable disease that affects elastic fibers. Thus far, >200 mutations have been characterized by various PCR-based techniques (primarily direct sequencing), identifying up to 90% of PXE-causing alleles. This study aimed to assess the importance of deletions and insertions in the ABCC6 genomic region, which is known to have a high recombinational potential. To detect ABCC6 deletions/insertions, which can be missed by direct sequencing, multiplex ligation-dependent probe amplification (MLPA) was applied in PXE patients with an incomplete genotype. MLPA was performed in 35 PXE patients with at least one unidentified mutant allele after exonic sequencing and exclusion of the recurrent exon 23-29 deletion. Six multi-exon deletions and four single-exon deletions were detected. Using MLPA in addition to sequencing, we expanded the ABCC6 mutation spectrum with 9 novel deletions and characterized 25% of unidentified disease alleles. Our results further illustrate the instability of the ABCC6 genomic region and stress the importance of screening for deletions in the molecular diagnosis of PXE. Journal of Human Genetics (2010) 55, 112-117; doi: 10.1038/jhg.2009.132; published online 15 January 201

    Analysis of nucleosome positioning landscapes enables gene discovery in the human malaria parasite Plasmodium falciparum.

    Background: Plasmodium falciparum, the deadliest malaria-causing parasite, has an extremely AT-rich (80.7%) genome. Because of high AT-content, sequence-based annotation of genes and functional elements remains challenging. In order to better understand the regulatory network controlling gene expression in the parasite, a more complete genome annotation as well as analysis tools adapted for AT-rich genomes are needed. Recent studies on genome-wide nucleosome positioning in eukaryotes have shown that nucleosome landscapes exhibit regular characteristic patterns at the 5'- and 3'-end of protein and non-protein coding genes. In addition, nucleosome depleted regions can be found near transcription start sites. These unique nucleosome landscape patterns may be exploited for the identification of novel genes. In this paper, we propose a computational approach to discover novel putative genes based exclusively on nucleosome positioning data in the AT-rich genome of P. falciparum. Results: Using binary classifiers trained on nucleosome landscapes at the gene boundaries from two independent nucleosome positioning data sets, we were able to detect a total of 231 regions containing putative genes in the genome of Plasmodium falciparum, of which 67 highly confident genes were found in both data sets. Eighty-eight of these 231 newly predicted genes exhibited transcription signal in RNA-Seq data, indicative of active transcription. In addition, 20 out of 21 selected gene candidates were further validated by RT-PCR, and 28 out of the 231 genes showed significant matches using BLASTN against an expressed sequence tag (EST) database. Furthermore, 108 (47%) out of the 231 putative novel genes overlapped with previously identified but unannotated long non-coding RNAs. Collectively, these results provide experimental validation for 163 predicted genes (70.6%). Finally, 73 out of 231 genes were found to be potentially translated based on their signal in polysome-associated RNA-Seq representing transcripts that are actively being translated. Conclusion: Our results clearly indicate that nucleosome positioning data contains sufficient information for novel gene discovery. As distinct nucleosome landscapes around genes are found in many other eukaryotic organisms, this methodology could be used to characterize the transcriptome of any organism, especially when coupled with other DNA-based gene finding and experimental methods (e.g., RNA-Seq)

    Embryonic stem cell-specific signatures in cancer: insights into genomic regulatory networks and implications for medicine

    Embryonic stem (ES) cells are of great interest as a model system for studying early developmental processes and because of their potential therapeutic applications in regenerative medicine. Obtaining a systematic understanding of the mechanisms that control the 'stemness' - self-renewal and pluripotency - of ES cells relies on high-throughput tools to define gene expression and regulatory networks at the genome level. Such recently developed systems biology approaches have revealed highly interconnected networks in which multiple regulatory factors act in combination. Interestingly, stem cells and cancer cells share some properties, notably self-renewal and a block in differentiation. Recently, several groups reported that expression signatures that are specific to ES cells are also found in many human cancers and in mouse cancer models, suggesting that these shared features might inform new approaches for cancer therapy. Here, we briefly summarize the key transcriptional regulators that contribute to the pluripotency of ES cells, the factors that account for the common gene expression patterns of ES and cancer cells, and the implications of these observations for future clinical applications.

    Re-annotation and re-analysis of the Campylobacter jejuni NCTC11168 genome sequence

    BACKGROUND: Campylobacter jejuni is the leading bacterial cause of human gastroenteritis in the developed world. To improve our understanding of this important human pathogen, the C. jejuni NCTC11168 genome was sequenced and published in 2000. The original annotation was a milestone in Campylobacter research, but is outdated. We now describe the complete re-annotation and re-analysis of the C. jejuni NCTC11168 genome using current database information, novel tools and annotation techniques not used during the original annotation. RESULTS: Re-annotation was carried out using sequence database searches such as FASTA, along with programs such as TMHMM for additional support. The re-annotation also utilises sequence data from additional Campylobacter strains and species not available during the original annotation. Re-annotation was accompanied by a full literature search that was incorporated into the updated EMBL file [EMBL: AL111168]. The C. jejuni NCTC11168 re-annotation reduced the total number of coding sequences from 1654 to 1643, of which 90.0% have additional information regarding the identification of new motifs and/or relevant literature. Re-annotation has led to 18.2% of coding sequence product functions being revised. CONCLUSIONS: Major updates were made to genes involved in the biosynthesis of important surface structures such as lipooligosaccharide, capsule and both O- and N-linked glycosylation. This re-annotation will be a key resource for Campylobacter research and will also provide a prototype for the re-annotation and re-interpretation of other bacterial genomes

    N-terminal proteomics assisted profiling of the unexplored translation initiation landscape in Arabidopsis thaliana

    Proteogenomics is an emerging research field that still lacks a uniform method of analysis. Proteogenomic studies in which N-terminal proteomics and ribosome profiling are combined suggest that a high number of protein start sites are currently missing in genome annotations. We constructed a proteogenomic pipeline specific for the analysis of N-terminal proteomics data, with the aim of discovering novel translational start sites outside annotated protein coding regions. In summary, unidentified MS/MS spectra were matched to a specific N-terminal peptide library encompassing protein N termini encoded in the Arabidopsis thaliana genome. After a stringent false discovery rate filtering, 117 protein N termini compliant with N-terminal methionine excision specificity and indicative of translation initiation were found. These include N-terminal protein extensions and translation from transposable elements and pseudogenes. Gene prediction provided supporting protein-coding models for approximately half of the protein N termini. Besides the prediction of functional domains (partially) contained within the newly predicted ORFs, further supporting evidence of translation was found in the recently released Araport11 genome re-annotation of Arabidopsis and computational translations of sequences stored in public repositories. Most interestingly, complementary evidence by ribosome profiling was found for 23 protein N termini. Finally, by analyzing protein N-terminal peptides, an in silico analysis demonstrates the applicability of our N-terminal proteogenomics strategy in revealing protein-coding potential in species with well- and poorly annotated genomes

    Spatiotemporal Expression of Pregnancy-Specific Glycoprotein Gene rnCGM1 in Rat Placenta

    As a basis towards a better understanding of the role of the pregnancy-specific glycoprotein (PSG) family in the maintenance of pregnancy, detailed investigations are described on the expression of a recently identified rat PSG gene (rnCGM1) at the mRNA and protein levels. Using specific oligonucleotide primers, rnCGM1 transcripts were identified after reverse transcription, polymerase chain reaction, and hybridization with a radiolabelled, internal oligonucleotide. Transcripts were only found in significant amounts in placenta. In situ hybridization visualized rnCGM1 transcripts at day 14 post coitum (p.c.), in secondary trophoblast giant cells and in the spongiotrophoblast. Only those secondary giant cells lining the maternal decidua were positive. In contrast, primary giant cells did not contain rnCGM1 mRNA. At day 18 p.c., rnCGM1 transcripts were almost exclusively detectable in the spongiotrophoblast. No rnCGM1 transcripts were found in rat embryos of these two developmental stages. Rabbit antisera were generated against the amino-terminal immunoglobulin variable-like domain and against a synthetic peptide containing the last 13 carboxy-terminal amino acids of rnCGM1. Both antisera recognized a 124 kDa protein in day 18 rat placental extracts as identified by Western blot analysis. The anti-peptide antiserum recognized a 116 kDa protein in the serum of a 14 day p.c. pregnant rat that is absent from the sera of non-pregnant females. Taken together, these results confirm exclusive expression of rnCGM1 in the rat trophoblast, but unlike human PSG, negligible or no expression is found in other organs, such as fetal liver or salivary glands, indicating a more specialized function of rnCGM1. Its spatiotemporal expression pattern is consistent with a potential role of PSG in protecting the fetus against the maternal immune system and/or in regulating the invasive growth of trophoblast cells

    Assessing the Gene Content of the Megagenome: Sugar Pine (Pinus lambertiana).

    Sugar pine (Pinus lambertiana Douglas) is within the subgenus Strobus with an estimated genome size of 31 Gbp. Transcriptomic resources are of particular interest in conifers due to the challenges presented in their megagenomes for gene identification. In this study, we present the first comprehensive survey of the P. lambertiana transcriptome through deep sequencing of a variety of tissue types to generate more than 2.5 billion short reads. Third generation, long reads generated through PacBio Iso-Seq have been included for the first time in conifers to combat the challenges associated with de novo transcriptome assembly. A technology comparison is provided here to contribute to the otherwise scarce comparisons of second and third generation transcriptome sequencing approaches in plant species. In addition, the transcriptome reference was essential for gene model identification and quality assessment in the parallel project responsible for sequencing and assembly of the entire genome. In this study, the transcriptomic data were also used to address questions surrounding lineage-specific Dicer-like proteins in conifers. These proteins play a role in the control of transposable element proliferation and the related genome expansion in conifers

    Whole Genome Sequences of Three Treponema pallidum ssp. pertenue Strains: Yaws and Syphilis Treponemes Differ in Less than 0.2% of the Genome Sequence

    The spirochete Treponema pallidum ssp. pertenue (TPE) is the causative agent of yaws, while strains of Treponema pallidum ssp. pallidum (TPA) cause syphilis. Yaws and syphilis are distinguished on the basis of epidemiological characteristics and clinical symptoms. Neither treponeme can reproduce outside the host organism, which precludes the use of standard molecular biology techniques used to study cultivable pathogens. In this study, we determined high quality whole genome sequences of TPE strains and compared them to known genetic information for T. pallidum ssp. pallidum strains. The genome structure was identical in all three TPE strains and also between TPA and TPE strains. The TPE genome length ranged between 1,139,330 bp and 1,139,744 bp. The overall sequence identity between TPA and TPE genomes was 99.8%, indicating that the two pathogens are extremely closely related. A set of 34 TPE genes (3.5%) encoded proteins containing six or more amino acid replacements or other major sequence changes. These genes more often belonged to the group of genes with predicted virulence and unknown functions, suggesting their involvement in infection differences between yaws and syphilis