424 research outputs found

    Expedited batch processing and analysis of transposon insertions

    BACKGROUND: With advances in sequencing technology, ever larger amounts of eukaryotic genome data are becoming available. Often, large portions of these genomes consist of transposable elements, frequently accounting for 50% or more in vertebrates. Each transposable element family may have thousands or tens of thousands of individual copies within a given genome, and therefore it can take an exorbitant amount of time and effort to process data in a meaningful fashion. FINDINGS: In order to combat this problem, we developed a set of bioinformatics techniques and programs to streamline the analysis. This includes a Perl script that automates the process of taking BLAST, RepeatMasker and similar data to extract and manipulate the hit sequences from the genome. This script, called Process_hits, uses an object-oriented methodology to compile all hit locations from a given file for processing, organize this data into usable categories, and output it in multiple formats. CONCLUSIONS: The program proved capable of handling large amounts of transposon data in an efficient fashion. It is equipped with a number of useful sub-functions, each of which is contained within its own sub-module to allow for greater expandability and as a foundation for future program design.
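
    The hit-extraction step described above can be illustrated with a minimal Python sketch. It is not the Process_hits script (which is written in Perl and is not shown in this listing); the `Hit` class, the function names, and the `flank` parameter are hypothetical, and the input is assumed to be BLAST tabular output with the default twelve columns, where a subject start greater than the subject end marks a minus-strand hit.

```python
from dataclasses import dataclass

# Translation table for complementing DNA (assumption: plain ACGT alphabet).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


@dataclass(frozen=True)
class Hit:
    """One hit location on a subject sequence (hypothetical container)."""
    query: str
    subject: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def parse_tabular_hits(lines):
    """Parse BLAST tabular lines with the default 12 columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore.
    """
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        sstart, send = int(fields[8]), int(fields[9])
        strand = "+" if sstart <= send else "-"
        hits.append(Hit(fields[0], fields[1],
                        min(sstart, send), max(sstart, send), strand))
    return hits


def extract(hit, genome, flank=0):
    """Return the hit sequence (plus optional flanks), reverse-complemented
    for minus-strand hits. `genome` maps sequence names to strings."""
    seq = genome[hit.subject]
    lo = max(hit.start - 1 - flank, 0)
    hi = min(hit.end + flank, len(seq))
    piece = seq[lo:hi]
    if hit.strand == "-":
        piece = piece.translate(COMPLEMENT)[::-1]
    return piece


genome = {"chr1": "AACCGGTTAC"}
rows = [
    "TE1\tchr1\t100.0\t4\t0\t0\t1\t4\t3\t6\t1e-5\t8.0",  # plus-strand hit
    "TE1\tchr1\t100.0\t4\t0\t0\t1\t4\t8\t5\t1e-5\t8.0",  # minus-strand hit
]
hits = parse_tabular_hits(rows)
sequences = [extract(h, genome) for h in hits]
```

    Keeping parsing and extraction in separate functions mirrors the modular design the abstract mentions, though the actual module layout of the program is not described here.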

    GeneWaltz--A new method for reducing the false positives of gene finding

    BACKGROUND: Identifying protein-coding regions in genomic sequences is an essential step in genome analysis. It is well known that the proportion of false positives among genes predicted by current methods is high, especially when the exons are short. These false positives are problematic because they waste the time and resources of experimental studies. METHODS: We developed GeneWaltz, a new filtering method that reduces the risk of false positives in gene finding. GeneWaltz utilizes a codon-to-codon substitution matrix that was constructed by comparing protein-coding regions from orthologous gene pairs between mouse and human genomes. Using this matrix, a scoring scheme was developed; it assigned higher scores to coding regions and lower scores to non-coding regions. The regions with high scores were considered candidate coding regions. One-dimensional Karlin-Altschul statistics were used to test the significance of the coding regions identified by GeneWaltz. RESULTS: The proportion of false positives among genes predicted by GENSCAN and Twinscan was high, especially when the exons were short. GeneWaltz significantly reduced the ratio of false positives to all positives predicted by GENSCAN and Twinscan, especially when the exons were short. CONCLUSIONS: GeneWaltz will be helpful in experimental genomic studies. GeneWaltz binaries and the matrix are available online at http://en.sourceforge.jp/projects/genewaltz/.
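
    The general idea of scoring aligned codons and keeping the highest-scoring region can be sketched in Python as below. This is not GeneWaltz: the real method uses a codon-to-codon substitution matrix derived from mouse and human orthologs, whereas `toy_score` here is an invented stand-in with made-up values, and the function names are hypothetical. The segment search is the standard maximal-scoring-segment scan, the kind of score to which Karlin-Altschul statistics are applied.

```python
def codons(seq):
    """Split a sequence into consecutive codons, dropping any trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def toy_score(a, b):
    """Hypothetical stand-in for a codon-to-codon substitution matrix.
    Values are invented for illustration only."""
    if a == b:
        return 1   # identical codons
    if a[:2] == b[:2]:
        return 2   # differ only at the third position, a coding-like pattern
    return -3      # any other substitution


def best_segment(seq_a, seq_b, score=toy_score):
    """Return (score, start, end) of the maximal-scoring run of aligned codons.
    start and end are codon indices, with end exclusive. Assumes the two
    sequences are already aligned codon by codon without gaps."""
    best = (0, 0, 0)
    run, run_start = 0, 0
    for i, (a, b) in enumerate(zip(codons(seq_a), codons(seq_b))):
        if run <= 0:
            run, run_start = 0, i  # restart the running segment here
        run += score(a, b)
        if run > best[0]:
            best = (run, run_start, i + 1)
    return best


seq_a = "AAATTT" + "GCTGCAGCG" + "CCCAAA"
seq_b = "CCCGGG" + "GCCGCAGCT" + "TTTGGG"
result = best_segment(seq_a, seq_b)
```

    With these toy values the middle three codons form the best segment; a significance test on the segment score would be a separate step that this sketch does not attempt.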

    Characteristics of transposable element exonization within human and mouse

    Insertion of transposed elements within mammalian genes is thought to be an important contributor to mammalian evolution and speciation. Insertion of transposed elements into introns can lead to their activation as alternatively spliced cassette exons, an event called exonization. Elucidation of the evolutionary constraints that have shaped fixation of transposed elements within human and mouse protein coding genes and subsequent exonization is important for understanding of how the exonization process has affected transcriptome and proteome complexities. Here we show that exonization of transposed elements is biased towards the beginning of the coding sequence in both human and mouse genes. Analysis of single nucleotide polymorphisms (SNPs) revealed that exonization of transposed elements can be population-specific, implying that exonizations may enhance divergence and lead to speciation. SNP density analysis revealed differences between Alu and other transposed elements. Finally, we identified cases of primate-specific Alu elements that depend on RNA editing for their exonization. These results shed light on TE fixation and the exonization process within human and mouse genes. Comment: 11 pages, 4 figures.

    Dictyostelium discoideum as a Model to Study Inositol Polyphosphates and Inorganic Polyphosphate

    The yeast Saccharomyces cerevisiae has given us much information on the metabolism and function of inositol polyphosphates and inorganic polyphosphate. To expand our knowledge of the metabolic as well as functional connections between inositol polyphosphates and inorganic polyphosphate, we have refined and developed techniques to extract and analyze these molecules in a second eukaryotic experimental model, the amoeba Dictyostelium discoideum. This amoeba, possessing a well-defined developmental program, is ideal to study physiological changes in the levels of inositol polyphosphates and inorganic polyphosphate, since levels of both molecules increase at late stages of development. We detail here the methods used to extract inositol polyphosphates using perchloric acid and inorganic polyphosphate using acidic phenol. We also present the postextraction procedures to visualize and quantify these molecules by polyacrylamide gel electrophoresis and by malachite green assay.

    SNPs Occur in Regions with Less Genomic Sequence Conservation

    Rates of SNPs (single nucleotide polymorphisms) and cross-species genomic sequence conservation reflect intra- and inter-species variation, respectively. Here, I report SNP rates and genomic sequence conservation adjacent to mRNA processing regions and show that, as expected, more SNPs occur in less conserved regions and that functional regions have fewer SNPs. Results are confirmed using both mouse and human data. Regions include protein start codons, 3′ splice sites, 5′ splice sites, protein stop codons, predicted miRNA binding sites, and polyadenylation sites. Throughout, SNP rates are lower and conservation is higher at regulatory sites. Within coding regions, SNP rates are highest and conservation is lowest at codon position three and the fewest SNPs are found at codon position two, reflecting codon degeneracy for amino acid encoding. Exon splice sites show high conservation and very low SNP rates, reflecting both splicing signals and protein coding. Relaxed constraint on the codon third position is dramatically seen when separating exonic SNP rates based on intron phase. At polyadenylation sites, a peak of conservation and low SNP rate occurs from 30 to 17 nt preceding the site. This region is highly enriched for the sequence AAUAAA, reflecting the location of the conserved polyA signal. miRNA 3′ UTR target sites are predicted incorporating interspecies genomic sequence conservation; SNP rates are low in these sites, again showing fewer SNPs in conserved regions. Together, these results confirm that SNPs, reflecting recent genetic variation, occur more frequently in regions with less evolutionary conservation.

    Transcriptional Profiling Uncovers a Network of Cholesterol-Responsive Atherosclerosis Target Genes

    Despite the well-documented effects of plasma lipid lowering regimes halting atherosclerosis lesion development and reducing morbidity and mortality of coronary artery disease and stroke, the transcriptional response in the atherosclerotic lesion mediating these beneficial effects has not yet been carefully investigated. We performed transcriptional profiling at 10-week intervals in atherosclerosis-prone mice with human-like hypercholesterolemia and a genetic switch to lower plasma lipoproteins (Ldlr−/−Apo100/100 Mttpflox/flox Mx1-Cre). Atherosclerotic lesions progressed slowly at first, then expanded rapidly, and plateaued after advanced lesions formed. Analysis of lesion expression profiles indicated that accumulation of lipid-poor macrophages reached a point that led to the rapid expansion phase with accelerated foam-cell formation and inflammation, an interpretation supported by lesion histology. Genetic lowering of plasma cholesterol (e.g., lipoproteins) at this point altogether prevented the formation of advanced plaques, and parallel transcriptional profiling of the atherosclerotic arterial wall identified 37 cholesterol-responsive genes mediating this effect. Validation by siRNA-inhibition in macrophages incubated with acetylated-LDL revealed a network of eight cholesterol-responsive atherosclerosis genes regulating cholesterol-ester accumulation. Taken together, we have identified a network of atherosclerosis genes that in response to plasma cholesterol-lowering prevents the formation of advanced plaques. This network should be of interest for the development of novel atherosclerosis therapies.

    Defending the genome from the enemy within: mechanisms of retrotransposon suppression in the mouse germline

    The viability of any species requires that the genome is kept stable as it is transmitted from generation to generation by the germ cells. One of the challenges to transgenerational genome stability is the potential mutagenic activity of transposable genetic elements, particularly retrotransposons. There are many different types of retrotransposon in mammalian genomes, and these target different points in germline development to amplify and integrate into new genomic locations. Germ cells, and their pluripotent developmental precursors, have evolved a variety of genome defence mechanisms that suppress retrotransposon activity and maintain genome stability across the generations. Here, we review recent advances in understanding how retrotransposon activity is suppressed in the mammalian germline, how genes involved in germline genome defence mechanisms are regulated, and the consequences of mutating these genome defence genes for the developing germline.

    Diffractive Dijet Production at sqrt(s)=630 and 1800 GeV at the Fermilab Tevatron

    We report a measurement of the diffractive structure function $F_{jj}^D$ of the antiproton obtained from a study of dijet events produced in association with a leading antiproton in $\bar pp$ collisions at $\sqrt s=630$ GeV at the Fermilab Tevatron. The ratio of $F_{jj}^D$ at $\sqrt s=630$ GeV to $F_{jj}^D$ obtained from a similar measurement at $\sqrt s=1800$ GeV is compared with expectations from QCD factorization and with theoretical predictions. We also report a measurement of the $\xi$ ($x$-Pomeron) and $\beta$ ($x$ of parton in Pomeron) dependence of $F_{jj}^D$ at $\sqrt s=1800$ GeV. In the region $0.035<\xi<0.095$, $|t|<1$ GeV$^2$ and $\beta<0.5$, $F_{jj}^D(\beta,\xi)$ is found to be of the form $\beta^{-1.0\pm 0.1} \xi^{-0.9\pm 0.1}$, which obeys $\beta$-$\xi$ factorization. Comment: LaTeX, 9 pages, Submitted to Phys. Rev. Letters.
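
    Written out as a formula, the quoted result can be sketched as follows. The normalization constant $C$ and the exponent names $n$ and $m$ are labels introduced here for illustration and do not appear in the abstract.

```latex
F_{jj}^{D}(\beta,\xi) \;\approx\; C\,\beta^{-n}\,\xi^{-m},
\qquad n = 1.0 \pm 0.1,\quad m = 0.9 \pm 0.1,
\qquad 0.035<\xi<0.095,\quad |t|<1~\mathrm{GeV}^2,\quad \beta<0.5 .
```

    Because the right-hand side is a product of a function of $\beta$ alone and a function of $\xi$ alone, the $\beta$ dependence has the same shape at every $\xi$ in the measured region, which is the sense of $\beta$-$\xi$ factorization here.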

    Multi-species sequence comparison reveals dynamic evolution of the elastin gene that has involved purifying selection and lineage-specific insertions/deletions

    BACKGROUND: The elastin gene (ELN) is implicated as a factor in both supravalvular aortic stenosis (SVAS) and Williams Beuren Syndrome (WBS), two diseases involving pronounced complications in mental or physical development. Although the complete spectrum of functional roles of the processed gene product remains to be established, these roles are inferred to be analogous in human and mouse. This view is supported by genomic sequence comparison, in which there are no large-scale differences in the ~1.8 Mb sequence block encompassing the common region deleted in WBS, with the exception of an overall reversed physical orientation between human and mouse. RESULTS: Conserved synteny around ELN does not translate to a high level of conservation in the gene itself. In fact, ELN orthologs in mammals show more sequence divergence than expected for a gene with a critical role in development. The pattern of divergence is non-conventional due to an unusually high ratio of gaps to substitutions. Specifically, multi-sequence alignments of eight mammalian sequences reveal numerous non-aligning regions caused by species-specific insertions and deletions, in spite of the fact that the vast majority of aligning sites appear to be conserved and undergoing purifying selection. CONCLUSIONS: The pattern of lineage-specific, in-frame insertions/deletions in the coding exons of ELN orthologous genes is unusual and has led to unique features of the gene in each lineage. These differences may indicate that the gene has a slightly different functional mechanism in mammalian lineages, or that the corresponding regions are functionally inert. Identified regions that undergo purifying selection reflect a functional importance associated with evolutionary pressure to retain those features.