    RIPCAL: a tool for alignment-based analysis of repeat-induced point mutations in fungal genomic sequences

    Background: Repeat-induced point mutation (RIP) is a fungal-specific genome defence mechanism that alters the sequences of repetitive DNA, thereby inactivating coding genes. Repeated DNA sequences align between mating and meiosis and both sequences undergo C:G to T:A transitions. In most fungi these transitions preferentially affect CpA di-nucleotides, thus altering the frequency of certain di-nucleotides in the affected sequences. The majority of previously published in silico analyses were limited to the comparison of ratios of pre- and post-RIP di-nucleotides in putatively RIP-affected sequences – so-called RIP indices. The analysis of RIP is significantly more informative when comparing sequence alignments of repeated sequences. There is, however, a dearth of bioinformatics tools available to the fungal research community for alignment-based RIP analysis of repeat families. Results: We present RIPCAL (http://www.sourceforge.net/projects/ripcal), a software tool for the automated analysis of RIP in fungal genomic DNA repeats, which performs both RIP index and alignment-based analyses. We demonstrate the ability of RIPCAL to detect RIP within known RIP-affected sequences of Neurospora crassa and other fungi. We also predict and delineate the presence of RIP in the genome of Stagonospora nodorum – a Dothideomycete pathogen of wheat. We show that RIP has affected different members of the S. nodorum rDNA tandem repeat to different extents depending on their genomic contexts. Conclusion: The RIPCAL alignment-based method has considerable advantages over RIP indices for the analysis of whole genomes. We demonstrate its application to the recently published genome assembly of S. nodorum.
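
    The "RIP indices" mentioned in the abstract are ratios of di-nucleotide counts. As a rough illustration only, the sketch below computes two ratios that are commonly used in the RIP literature: TpA/ApT (a product index) and (CpA+TpG)/(ApC+GpT) (a substrate index). The abstract does not say which indices RIPCAL computes, so the choice of these two ratios, and the function names `dinucleotide_counts` and `rip_indices`, are assumptions made for this example and do not reflect RIPCAL's actual interface.

```python
from collections import Counter


def dinucleotide_counts(seq):
    """Count overlapping di-nucleotides in a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def rip_indices(seq):
    """Return (product_index, substrate_index) for one sequence.

    Assumed definitions (not taken from the abstract):
      product index   = TpA / ApT
      substrate index = (CpA + TpG) / (ApC + GpT)
    A ratio with a zero denominator is reported as NaN.
    """
    counts = dinucleotide_counts(seq)

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    product = ratio(counts["TA"], counts["AT"])
    substrate = ratio(counts["CA"] + counts["TG"], counts["AC"] + counts["GT"])
    return product, substrate


if __name__ == "__main__":
    # Hypothetical toy sequence, used only to show the call.
    print(rip_indices("CATGACGTTATAAT"))
```

    Because C:G to T:A transitions at CpA sites deplete CpA/TpG and enrich TpA, a RIP-affected sequence would be expected to show a raised product index and a lowered substrate index. The abstract's point is that such per-sequence ratios are less informative than comparing aligned repeat copies.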

    Causal Compressive Sensing for Gene Network Inference

    We propose a novel framework for studying causal inference of gene interactions using a combination of compressive sensing and Granger causality techniques. The gist of the approach is to discover sparse linear dependencies between time series of gene expressions via a Granger-type elimination method. The method is tested on the Gardner dataset for the SOS network in E. coli, for which both known and unknown causal relationships are discovered.
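
    To make "sparse linear dependencies between time series" concrete, here is a minimal sketch that regresses one gene's expression at time t on all genes' expression at time t-1 under an L1 penalty, solved by iterative soft-thresholding (ISTA). This is a generic lasso on lagged values, swapped in for illustration. It is not the Granger-type elimination method the abstract refers to, whose details are not given here. The function names, the single-lag model, the penalty `lam`, and the synthetic data are all assumptions.

```python
import random


def soft_threshold(z, t):
    """Shrink z toward zero by t (the proximal step for an L1 penalty)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lagged_lasso(series, target, lam=0.05, n_iter=300):
    """Sparse lag-1 regression of gene `target` on all genes.

    series: list of time points, each a list of expression values per gene.
    Minimises (1/2n) * ||y - Xw||^2 + lam * ||w||_1 with ISTA.
    """
    X = series[:-1]                               # predictors at time t-1
    y = [row[target] for row in series[1:]]       # response at time t
    n, g = len(X), len(X[0])
    # trace(X^T X) / n is an upper bound on the largest eigenvalue of
    # X^T X / n, so 1 / trace is a safe (conservative) step size.
    lipschitz = sum(v * v for row in X for v in row) / n
    step = 1.0 / lipschitz
    w = [0.0] * g
    for _ in range(n_iter):
        resid = [sum(w[j] * row[j] for j in range(g)) - yi
                 for row, yi in zip(X, y)]
        grad = [sum(r * row[j] for r, row in zip(resid, X)) / n
                for j in range(g)]
        w = [soft_threshold(w[j] - step * grad[j], lam * step)
             for j in range(g)]
    return w


# Synthetic example (hypothetical data): gene 1 follows gene 0 with a lag of
# one time step; gene 2 is unrelated noise.
random.seed(0)
T = 300
x0 = [random.gauss(0, 1) for _ in range(T)]
x2 = [random.gauss(0, 1) for _ in range(T)]
x1 = [0.0] + [0.8 * x0[t - 1] + random.gauss(0, 0.1) for t in range(1, T)]
series = [[x0[t], x1[t], x2[t]] for t in range(T)]
weights = lagged_lasso(series, target=1)
```

    In this toy setting the weight on gene 0 stays clearly non-zero while the other weights are shrunk to (or near) zero, which is the sense in which a sparse fit nominates candidate causal regulators. Real expression data would need more lags, normalisation, and a principled choice of `lam`.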

    Model based dynamics analysis in live cell microtubule images

    Background: The dynamic growing and shortening behaviors of microtubules are central to the fundamental roles played by microtubules in essentially all eukaryotic cells. Traditionally, microtubule behavior is quantified by manually tracking individual microtubules in time-lapse images under various experimental conditions. Manual analysis is laborious, approximate, and often offers limited analytical capability in extracting potentially valuable information from the data. Results: In this work, we present computer vision and machine-learning based methods for extracting novel dynamics information from time-lapse images. Using actual microtubule data, we estimate statistical models of microtubule behavior that are highly effective in identifying common and distinct characteristics of microtubule dynamic behavior. Conclusion: Computational methods provide powerful analytical capabilities in addition to traditional analysis methods for studying microtubule dynamic behavior. Novel capabilities, such as building and querying microtubule image databases, are introduced to quantify and analyze microtubule dynamic behavior

    Query Large Scale Microarray Compendium Datasets Using a Model-Based Bayesian Approach with Variable Selection

    In microarray gene expression data analysis, it is often of interest to identify genes that share similar expression profiles with a particular gene such as a key regulatory protein. Multiple studies have been conducted using various correlation measures to identify co-expressed genes. While these measures work well for small datasets, the heterogeneity introduced by increased sample size inevitably reduces their sensitivity and specificity, because most co-expression relationships do not extend to all experimental conditions. With the rapid increase in the size of microarray datasets, identifying functionally related genes from large and diverse microarray gene expression datasets is a key challenge. We develop a model-based gene expression query algorithm built under the Bayesian model selection framework. It is capable of detecting co-expression profiles under a subset of samples/experimental conditions. In addition, it allows linearly transformed expression patterns to be recognized and is robust against sporadic outliers in the data. Both features are critically important for increasing the power of identifying co-expressed genes in large scale gene expression datasets. Our simulation studies suggest that this method outperforms existing correlation coefficients or mutual information-based query tools. When we apply this new method to the Escherichia coli microarray compendium data, it identifies a majority of known regulons as well as novel potential target genes of numerous key transcription factors.

    RegNetB: Predicting Relevant Regulator-Gene Relationships in Localized Prostate Tumor Samples

    Background: A central question in cancer biology is what changes cause a healthy cell to form a tumor. Gene expression data could provide insight into this question, but it is difficult to distinguish a gene that causes a change in gene expression from a gene that is affected by this change. Furthermore, the proteins that regulate gene expression are often themselves not regulated at the transcriptional level. Here we propose a Bayesian modeling framework we term RegNetB that uses mechanistic information about the gene regulatory network to distinguish between factors that cause a change in expression and genes that are affected by the change. We test this framework using human gene expression data describing localized prostate cancer progression. Results: The top regulatory relationships identified by RegNetB include the regulation of RLN1 and RLN2 by PAX4, the regulation of ACPP (PAP) by JUN, BACH1 and BACH2, and the co-regulation of PGC and GDF15 by MAZ and TAF8. These target genes are known to participate in tumor progression, but the suggested regulatory roles of PAX4, BACH1, BACH2, MAZ and TAF8 in the process are new. Conclusion: Integrating gene expression data and regulatory topologies can aid in identifying potentially causal mechanisms for observed changes in gene expression.

    The Complete Genome Sequence of ‘Candidatus Liberibacter solanacearum’, the Bacterium Associated with Potato Zebra Chip Disease

    Zebra Chip (ZC) is an emerging plant disease that causes aboveground decline of potato shoots and generally results in unusable tubers. This disease has led to multi-million dollar losses for growers in the central and western United States over the past decade and impacts the livelihood of potato farmers in Mexico and New Zealand. ZC is associated with ‘Candidatus Liberibacter solanacearum’, a fastidious alpha-proteobacterium that is transmitted by a phloem-feeding psyllid vector, Bactericera cockerelli Sulc. Research on this disease has been hampered by a lack of robust culture methods and a paucity of genome sequence information for ‘Ca. L. solanacearum’. Here we present the sequence of the 1.26 Mbp metagenome of ‘Ca. L. solanacearum’, based on DNA isolated from potato psyllids. The coding inventory of the ‘Ca. L. solanacearum’ genome was analyzed and compared to related Rhizobiaceae to better understand ‘Ca. L. solanacearum’ physiology and identify potential targets to develop improved treatment strategies. This analysis revealed a number of unique transporters and pathways, all potentially contributing to ZC pathogenesis. Some of these factors may have been acquired through horizontal gene transfer. Taxonomically, ‘Ca. L. solanacearum’ is related to ‘Ca. L. asiaticus’, a suspected causative agent of citrus huanglongbing, yet many genome rearrangements and several gene gains/losses are evident when comparing these two Liberibacter species. Relative to ‘Ca. L. asiaticus’, ‘Ca. L. solanacearum’ probably has reduced capacity for nucleic acid modification, increased amino acid and vitamin biosynthesis functionalities, and gained a high-affinity iron transport system characteristic of several pathogenic microbes.

    DNA Methylation and Normal Chromosome Behavior in Neurospora Depend on Five Components of a Histone Methyltransferase Complex, DCDC

    Methylation of DNA and of Lysine 9 on histone H3 (H3K9) is associated with gene silencing in many animals, plants, and fungi. In Neurospora crassa, methylation of H3K9 by DIM-5 directs cytosine methylation by recruiting a complex containing Heterochromatin Protein-1 (HP1) and the DIM-2 DNA methyltransferase. We report genetic, proteomic, and biochemical investigations into how DIM-5 is controlled. These studies revealed DCDC, a previously unknown protein complex including DIM-5, DIM-7, DIM-9, CUL4, and DDB1. Components of DCDC are required for H3K9me3, proper chromosome segregation, and DNA methylation. DCDC-defective strains, but not HP1-defective strains, are hypersensitive to MMS, revealing an HP1-independent function of H3K9 methylation. In addition to DDB1, DIM-7, and the WD40 domain protein DIM-9, other presumptive DCAFs (DDB1/CUL4 associated factors) co-purified with CUL4, suggesting that CUL4/DDB1 forms multiple complexes with distinct functions. This conclusion was supported by results of drug sensitivity tests. CUL4, DDB1, and DIM-9 are not required for localization of DIM-5 to incipient heterochromatin domains, indicating that recruitment of DIM-5 to chromatin is not sufficient to direct H3K9me3. DIM-7 is required for DIM-5 localization and mediates interaction of DIM-5 with DDB1/CUL4 through DIM-9. These data support a two-step mechanism for H3K9 methylation in Neurospora

    Inactivation of the dnaK gene in Clostridium difficile 630 Δerm yields a temperature-sensitive phenotype and increases biofilm-forming ability

    Clostridium difficile infection is a growing problem in healthcare settings worldwide and results in a considerable socioeconomic impact. New hypervirulent strains and acquisition of antibiotic resistance exacerbate pathogenesis; however, the survival strategy of C. difficile in the challenging gut environment remains incompletely understood. We previously reported that clinically relevant heat-stress (37–41 °C) resulted in a classical heat-stress response with up-regulation of cellular chaperones. We used ClosTron to construct an insertional mutation in the dnaK gene of C. difficile 630 Δerm. The dnaK mutant exhibited temperature sensitivity, grew more slowly than C. difficile 630 Δerm and was less thermotolerant. Furthermore, the mutant was non-motile, had 4-fold lower expression of the fliC gene and lacked flagella on the cell surface. Mutant cells were some 50% longer than parental strain cells, and at optimal growth temperatures, they exhibited a 4-fold increase in the expression of class I chaperone genes including GroEL and GroES. Increased chaperone expression, in addition to the non-flagellated phenotype of the mutant, may account for the increased biofilm formation observed. Overall, the phenotype resulting from dnaK disruption is more akin to that observed in Escherichia coli dnaK mutants, rather than those in the Gram-positive model organism Bacillus subtilis.

    Timing Constraints of In Vivo Gag Mutations during Primary HIV-1 Subtype C Infection

    Background: Aiming to answer the broad question “When does mutation occur?” this study examined the time of appearance, dominance, and completeness of in vivo Gag mutations in primary HIV-1 subtype C infection. Methods: A primary HIV-1C infection cohort comprising 8 acutely and 34 recently infected subjects was followed frequently up to 500 days post-seroconversion (p/s). Gag mutations were analyzed by employing single-genome amplification and direct sequencing. Gag mutations were determined in relation to the estimated time of seroconversion. Time of appearance, dominance, and completeness were compared for different types of in vivo Gag mutations. Results: Reverse mutations to the wild type appeared at a median (IQR) of 62 (44;139) days p/s, while escape mutations from the wild type appeared at 234 (169;326) days p/s (p < 0.001). Within the subset of mutations that became dominant, reverse and escape mutations appeared at 54 (30;78) days p/s and 104 (47;198) days p/s, respectively (p < 0.001). Among the mutations that reached completeness, reverse and escape mutations appeared at 54 (30;78) days p/s and 90 (44;196) days p/s, respectively (p = 0.006). Time of dominance for reverse mutations to and escape mutations from the wild type was 58 (44;105) days p/s and 219 (90;326) days p/s, respectively (p < 0.001). Time of completeness for reverse and escape mutations was 152 (100;176) days p/s and 243 (101;370) days p/s, respectively (p = 0.001). Fitting a Cox proportional hazards model with frailties confirmed a significantly earlier time of appearance (hazard ratio (HR): 2.6; 95% CI: 2.3–3.0), dominance (4.8 (3.4–6.8)), and completeness (3.6 (2.3–5.5)) of reverse mutations to the wild type Gag than escape mutations from the wild type. Some complex mutational pathways in Gag included sequential series of reversions and escapes. Conclusions: The study identified the timing of different types of in vivo Gag mutations in primary HIV-1 subtype C infection in relation to the estimated time of seroconversion. Overall, the in vivo reverse mutations to the wild type occurred significantly earlier than escape mutations from the wild type. This shorter time to incidence of reverse mutations remained in the subsets of in vivo Gag mutations that reached dominance or completeness.