103 research outputs found

    Interpretation of multiple probe sets mapping to the same gene in Affymetrix GeneChips

    BACKGROUND: Affymetrix GeneChip technology enables the parallel observation of tens of thousands of genes. It is important that the probe set annotations are reliable so that biological inferences can be made about genes which undergo differential expression. Probe sets representing the same gene might be expected to show similar fold changes/z-scores; however, this is in fact not the case. RESULTS: We have made a case study of mouse Surf4, chosen because it is a gene that was reported to be represented by the same eight probe sets on the MOE430A array by both Affymetrix and Bioconductor in early 2004. Only five of the probe sets actually detect Surf4 transcripts. Two of the probe sets detect splice variants of Surf2. We have also studied the expression changes of the eight probe sets in a public-domain microarray experiment. The transcripts for Surf4 are correlated in time, and similarly the transcripts for Surf2 are also correlated in time. However, the transcripts for Surf4 and Surf2 are not correlated with each other. This proof of principle shows that observations of expression can be used to confirm, or otherwise, annotation discrepancies. We have also investigated groups of probe sets on the RAE230A array that are assigned to the same LocusID, but which show large variances in differential expression in any one of three different experiments on rat. The probe set groups with high variances are found to represent cases of alternative splicing, use of alternative poly(A) signals, or incorrect annotations. CONCLUSION: Our results indicate that some probe sets should not be considered as unique measures of transcription, because the individual probes map to more than one transcript dependent upon the biological condition. Our results highlight the need for care when assessing whether groups of probe sets all measure the same transcript.

    Megabase deletions of gene deserts result in viable mice

    The functional importance of the approximately 98 percent of mammalian genomes not corresponding to protein coding sequences remains largely unscrutinized [1]. To test experimentally whether some extensive regions of non-coding DNA, referred to as gene deserts [2-4], contain critical functions essential for the viability of the organism, we deleted two large non-coding intervals, 1,511 kb and 845 kb in length, from the mouse genome. Viable mice homozygous for the deletions were generated and were indistinguishable from wild-type littermates with regard to morphology, reproductive fitness, growth, longevity and a variety of parameters assaying general homeostasis. Further in-depth analysis of the expression of genes bracketing the deletions revealed similar expression characteristics in homozygous deletion and wild-type mice. Together, the two deleted segments harbour 1,243 non-coding sequences conserved between humans and rodents (>100 bp, 70 percent identity). These studies demonstrate that some large-scale deletions of non-coding DNA can be well tolerated by an organism, bringing into question the role of many human-mouse conserved sequences [5,6], and further support the existence of potentially "disposable DNA" in the genomes of mammals.

    A whole-genome shotgun approach for assembling and anchoring the hexaploid bread wheat genome

    Citation: Chapman, J. A., Mascher, M., Buluç, A., Barry, K., Georganas, E., Session, A., . . . Rokhsar, D. S. (2015). A whole-genome shotgun approach for assembling and anchoring the hexaploid bread wheat genome. Genome Biology, 16(1). doi:10.1186/s13059-015-0582-8
    Polyploid species have long been thought to be recalcitrant to whole-genome assembly. By combining high-throughput sequencing, recent developments in parallel computing, and genetic mapping, we derive, de novo, a sequence assembly representing 9.1 Gbp of the highly repetitive 16 Gbp genome of hexaploid wheat, Triticum aestivum, and assign 7.1 Gb of this assembly to chromosomal locations. The genome representation and accuracy of our assembly is comparable to or even exceeds that of a chromosome-by-chromosome shotgun assembly. Our assembly and mapping strategy uses only short read sequencing technology and is applicable to any species where it is possible to construct a mapping population.
    © 2015 Chapman et al. licensee BioMed Central.
    Additional Authors: Muehlbauer, G. J.; Stein, N.; Rokhsar, D. S.

    Ancestral bias in the Hras1 gene and distal Chromosome 7 among inbred mice

    Inbred strains of mice vary in their frequency of liver tumors initiated by a mutation in the Hras1 (H-ras) proto-oncogene. We sequenced 4.5 kb of the Hras1 gene on distal Chr 7 in a diverse set of 12 commonly used laboratory inbred strains of mice and detected no sequence variation to account for strain-specific differences in Hras1 mutation prevalence. Furthermore, the Hras1 sequence is essentially monoallelic for an ancestral gene derived from the M. m. domesticus species. To determine if the monoallelism and associated low rate of polymorphism are unique to Hras1 or representative of the general chromosomal locale, we extended the sequence analysis to 12 genes in the final 8 Mb of distal Chr 7. A region of at least 2.5 Mb that encompasses several genes, including Hras1 and the H19/Igf2 loci, demonstrates virtually no sequence variation. The 12 inbred strains share one dominant haplotype derived from the M. m. domesticus allele. Chromosomal regions flanking the monoallelic segment exhibit a significantly higher rate of variation and multiple haplotypes, a majority of which are attributed to M. m. domesticus or M. m. musculus ancestry.

    Microarray-Based Sketches of the HERV Transcriptome Landscape

    Human endogenous retroviruses (HERVs) are spread throughout the genome and their long terminal repeats (LTRs) constitute a wide collection of putative regulatory sequences. Phylogenetic similarities and the profusion of integration sites, two inherent characteristics of transposable elements, make it difficult to study individual locus expression in a large-scale approach, and historically, apart from some placental and testis-regulated elements, it was generally accepted that HERVs are silent due to epigenetic control. Herein, we have introduced a generic method aiming to optimally characterize individual loci associated with 25-mer probes by minimizing cross-hybridization risks. We therefore set up a microarray dedicated to a collection of 5,573 HERVs that can reasonably be assigned to a unique genomic position. We obtained a first view of the HERV transcriptome by using a composite panel of 40 normal and 39 tumor samples. The experiment showed that almost one third of the HERV repertoire is indeed transcribed. The HERV transcriptome follows tropism rules, is sensitive to the state of differentiation and, unexpectedly, seems not to correlate with the age of the HERV families. The probeset definition within the U3 and U5 regions was used to assign a function to some LTRs (i.e. promoter or polyA) and revealed that (i) autonomous active LTRs are broadly subjected to operational determinism, (ii) the cellular gene density is substantially higher in the surrounding environment of active LTRs compared to silent LTRs, and (iii) the configuration of neighboring cellular genes differs between active and silent LTRs, showing an approximately 8 kb zone upstream of promoter LTRs characterized by a drastic reduction in sense cellular genes. These gathered observations are discussed in terms of virus/host adaptive strategies, and together with the methods and tools developed for this purpose, this work paves the way for further HERV transcriptome projects.

    Transcriptional activity and strain-specific history of mouse pseudogenes

    Abstract: Pseudogenes are ideal markers of genome remodelling. In turn, the mouse is an ideal platform for studying them, particularly with the recent availability of strain-sequencing and transcriptional data. Here, combining both manual curation and automatic pipelines, we present a genome-wide annotation of the pseudogenes in the mouse reference genome and 18 inbred mouse strains (available via the mouse.pseudogene.org resource). We also annotate 165 unitary pseudogenes in mouse and 303 in human. The overall pseudogene repertoire in mouse is similar to that in human in terms of size, biotype distribution, and family composition (e.g. with GAPDH and ribosomal proteins being the largest families). Notable differences arise in the pseudogene age distribution, with multiple retro-transpositional bursts in mouse evolutionary history and only one in human. Furthermore, in each strain about a fifth of all pseudogenes are unique, reflecting strain-specific evolution. Finally, we find that ~15% of the mouse pseudogenes are transcribed, and that highly transcribed parent genes tend to give rise to many processed pseudogenes.