50 research outputs found

    Post hoc pattern matching: assigning significance to statistically defined expression patterns in single channel microarray data

    Background: Researchers using RNA expression microarrays in experimental designs with more than two treatment groups often identify statistically significant genes with ANOVA approaches. However, the ANOVA test does not discriminate which of the multiple treatment groups differ from one another. Thus, post hoc tests, such as linear contrasts, template correlations, and pairwise comparisons, are used. Linear contrasts and template correlations work extremely well, especially when the researcher has a priori information pointing to a particular pattern/template among the different treatment groups. Further, all pairwise comparisons can be used to identify particular, treatment group-dependent patterns of gene expression. However, these approaches are biased by the researcher's assumptions, and some treatment-based patterns may fail to be detected using these approaches. Finally, different patterns may have different probabilities of occurring by chance, importantly influencing researchers' conclusions about a pattern and its constituent genes. Results: We developed a four-step post hoc pattern matching (PPM) algorithm to automate single channel gene expression pattern identification/significance. First, 1-Way Analysis of Variance (ANOVA), coupled with post hoc 'all pairwise' comparisons, is calculated for all genes. Second, for each ANOVA-significant gene, all pairwise contrast results are encoded to create unique pattern ID numbers. The number of genes found in each pattern in the data is identified as that pattern's 'actual' frequency. Third, using Monte Carlo simulations, those patterns' frequencies are estimated in random data ('random' gene pattern frequency). Fourth, a Z-score for overrepresentation of the pattern is calculated ('actual' against 'random' gene pattern frequencies). We wrote a Visual Basic program (StatiGen) that automates the PPM procedure, constructs an Excel workbook with standardized graphs of overrepresented patterns, and lists of the genes comprising each pattern. The Visual Basic code, installation files for StatiGen, and sample data are available as supplementary material. Conclusion: The PPM procedure is designed to augment current microarray analysis procedures by allowing researchers to incorporate all of the information from post hoc tests to establish unique, overarching gene expression patterns in which there is no overlap in gene membership. In our hands, PPM works well for studies using from three to six treatment groups in which the researcher is interested in treatment-related patterns of gene expression. Hardware/software limitations and the extreme number of theoretical expression patterns limit utility for larger numbers of treatment groups. Applied to a published microarray experiment, the StatiGen program successfully flagged patterns that had been manually assigned in prior work, and further identified other gene expression patterns that may be of interest. Thus, over a moderate range of treatment groups, PPM appears to work well. It allows researchers to assign statistical probabilities to patterns of gene expression that fit a priori expectations/hypotheses, it preserves the data's ability to show the researcher interesting, yet unanticipated gene expression patterns, and it assigns the majority of ANOVA-significant genes to non-overlapping patterns.
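The four PPM steps described in this abstract can be illustrated with a minimal Python sketch. It is not the StatiGen implementation, and the abstract does not specify most of the details used here. The following are all assumptions made for illustration: a design of 3 groups with 4 replicates each, hard-coded 5% critical values (`F_CRIT`, `T_CRIT`), pairwise comparisons based on the pooled ANOVA error term, a base-3 encoding of the pairwise results, Gaussian noise as the "random" data, and a Z-score computed as (actual frequency minus mean random frequency) divided by the standard deviation of the random frequencies. All function names are hypothetical.

```python
import random
import statistics
from collections import Counter
from itertools import combinations

# Assumed critical values for 3 groups x 4 replicates (9 error df), alpha = 0.05.
F_CRIT = 4.26   # approx. upper 5% point of F(2, 9)
T_CRIT = 2.262  # approx. two-sided 5% point of t with 9 df


def pattern_id(groups):
    """Steps 1-2: ANOVA gate, then encode all pairwise results as a base-3 ID."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    grand = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_within = ss_within / (n_total - k)
    if ms_within == 0:
        return None  # degenerate gene; skipped in this sketch
    if (ss_between / (k - 1)) / ms_within < F_CRIT:
        return None  # not ANOVA-significant
    code = 0
    for i, j in combinations(range(k), 2):
        se = (ms_within * (1 / len(groups[i]) + 1 / len(groups[j]))) ** 0.5
        t = (means[i] - means[j]) / se
        # digit 0: no difference, 1: group i higher, 2: group i lower
        digit = 0 if abs(t) < T_CRIT else (1 if t > 0 else 2)
        code = code * 3 + digit
    return code


def pattern_counts(genes):
    """Count how many ANOVA-significant genes fall into each pattern ID."""
    ids = (pattern_id(g) for g in genes)
    return Counter(p for p in ids if p is not None)


def random_genes(n_genes, k, n_rep, rng):
    """Null data: every group drawn from the same standard normal distribution."""
    return [[[rng.gauss(0, 1) for _ in range(n_rep)] for _ in range(k)]
            for _ in range(n_genes)]


def ppm_z_scores(genes, n_sim=100, seed=0):
    """Steps 3-4: Monte Carlo 'random' frequencies and a Z-score per pattern."""
    rng = random.Random(seed)
    k, n_rep = len(genes[0]), len(genes[0][0])
    actual = pattern_counts(genes)
    sims = [pattern_counts(random_genes(len(genes), k, n_rep, rng))
            for _ in range(n_sim)]
    z = {}
    for pid, count in actual.items():
        freqs = [s[pid] for s in sims]
        mu, sd = statistics.mean(freqs), statistics.pstdev(freqs)
        if sd > 0:
            z[pid] = (count - mu) / sd
        else:
            z[pid] = float("inf") if count > mu else 0.0
    return actual, z


def make_demo_data(n_null=260, n_planted=40, seed=1):
    """Hypothetical data: noise genes plus genes whose third group is raised by 5 SD."""
    rng = random.Random(seed)
    genes = random_genes(n_null, 3, 4, rng)
    for _ in range(n_planted):
        genes.append([[rng.gauss(mu, 1) for _ in range(4)] for mu in (0, 0, 5)])
    return genes


if __name__ == "__main__":
    actual, z = ppm_z_scores(make_demo_data())
    for pid in sorted(actual):
        print(f"pattern {pid}: actual={actual[pid]}, Z={z[pid]:.1f}")
```

In this encoding, a gene whose third group is higher than the other two (with no difference between the first two) receives pattern ID 8, so the planted genes in the demo data should concentrate in that pattern. Because each gene receives exactly one ID, the resulting patterns do not overlap in gene membership, which matches the property the abstract emphasizes.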

    Interaction between maternal caffeine intake during pregnancy and CYP1A2 C164A polymorphism affects infant birth size in the Hokkaido study

    BACKGROUND: Caffeine, 1,3,7-trimethylxanthine, is widely consumed by women of reproductive age. Although caffeine has been proposed to inhibit fetal growth, previous studies on the effects of caffeine on infant birth size have yielded inconsistent findings. This inconsistency may result from failure to account for individual differences in caffeine metabolism related to polymorphisms in the gene for CYP1A2, the major caffeine-metabolizing enzyme. METHODS: Five hundred fourteen Japanese women participated in a prospective cohort study in Sapporo, Japan, from 2002 to 2005, and 476 mother-child pairs were included for final analysis. RESULTS: Caffeine intake was not significantly associated with mean infant birth size. When caffeine intake and CYP1A2 C164A genotype were considered together, women with the AA genotype and caffeine intake of >= 300 mg per day had a mean reduction in infant birth head circumference of 0.8 cm relative to the reference group after adjusting for confounding factors. In a subgroup analysis, only nonsmokers with the AA genotype and caffeine intake of >= 300 mg per day had infants with decreased birth weight (mean reduction, 277 g) and birth head circumference (mean reduction, 1.0 cm). CONCLUSION: Nonsmokers who rapidly metabolize caffeine may be at increased risk for having infants with decreased birth size when consuming >= 300 mg of caffeine per day. This is the author's accepted version of their manuscript of the following article: Sasaki, et al. Pediatric Research (2017) 82, 19–28. The final publication is available at: http://dx.doi.org/10.1038/pr.2017.7

    Contrasting patterns of evolutionary constraint and novelty revealed by comparative sperm proteomic analysis in Lepidoptera

    Background: Rapid evolution is a hallmark of reproductive genetic systems and arises through the combined processes of sequence divergence, gene gain and loss, and changes in gene and protein expression. While studies aiming to disentangle the molecular ramifications of these processes are progressing, we still know little about the genetic basis of evolutionary transitions in reproductive systems. Here we conduct the first comparative analysis of sperm proteomes in Lepidoptera, a group that exhibits dichotomous spermatogenesis, in which males produce a functional fertilization-competent sperm (eupyrene) and an incompetent sperm morph lacking nuclear DNA (apyrene). Through the integrated application of evolutionary proteomics and genomics, we characterize the genomic patterns potentially associated with the origination and evolution of this unique spermatogenic process and assess the importance of genetic novelty in Lepidopteran sperm biology. Results: Comparison of the newly characterized Monarch butterfly (Danaus plexippus) sperm proteome to those of the Carolina sphinx moth (Manduca sexta) and the fruit fly (Drosophila melanogaster) demonstrated conservation at the level of protein abundance and post-translational modification within Lepidoptera. In contrast, comparative genomic analyses across insects reveal significant divergence at two levels that differentiate the genetic architecture of sperm in Lepidoptera from other insects. First, a significant reduction in orthology among Monarch sperm genes relative to the remainder of the genome in non-Lepidopteran insect species was observed. Second, a substantial number of sperm proteins were found to be specific to Lepidoptera, in that they lack detectable homology to the genomes of more distantly related insects. Lastly, the functional importance of Lepidoptera-specific sperm proteins is broadly supported by their increased abundance relative to proteins conserved across insects.
Conclusions: Our results identify a burst of genetic novelty amongst sperm proteins that may be associated with the origin of heteromorphic spermatogenesis in ancestral Lepidoptera and/or the subsequent evolution of this system. This pattern of genomic diversification is distinct from the remainder of the genome and thus suggests that this transition has had a marked impact on lepidopteran genome evolution. The identification of abundant sperm proteins unique to Lepidoptera, including proteins distinct between specific lineages, will accelerate future functional studies aiming to understand the developmental origin of dichotomous spermatogenesis and the functional diversification of the fertilization-incompetent apyrene sperm morph.

    Drivers of genetic diversity in secondary metabolic gene clusters within a fungal species

    Filamentous fungi produce a diverse array of secondary metabolites (SMs) critical for defense, virulence, and communication. The metabolic pathways that produce SMs are found in contiguous gene clusters in fungal genomes, an atypical arrangement for metabolic pathways in other eukaryotes. Comparative studies of filamentous fungal species have shown that SM gene clusters are often either highly divergent or uniquely present in one or a handful of species, hampering efforts to determine the genetic basis and evolutionary drivers of SM gene cluster divergence. Here, we examined SM variation in 66 cosmopolitan strains of a single species, the opportunistic human pathogen Aspergillus fumigatus. Investigation of genome-wide within-species variation revealed 5 general types of variation in SM gene clusters: nonfunctional gene polymorphisms; gene gain and loss polymorphisms; whole cluster gain and loss polymorphisms; allelic polymorphisms, in which different alleles corresponded to distinct, nonhomologous clusters; and location polymorphisms, in which a cluster was found to differ in its genomic location across strains. These polymorphisms affect the function of representative A. fumigatus SM gene clusters, such as those involved in the production of gliotoxin, fumigaclavine, and helvolic acid as well as the function of clusters with undefined products. In addition to enabling the identification of polymorphisms, the detection of which requires extensive genome-wide synteny conservation (e.g., mobile gene clusters and nonhomologous cluster alleles), our approach also implicated multiple underlying genetic drivers, including point mutations, recombination, and genomic deletion and insertion events as well as horizontal gene transfer from distant fungi. Finally, most of the variants that we uncover within A. fumigatus have been previously hypothesized to contribute to SM gene cluster diversity across entire fungal classes and phyla. We suggest that the drivers of genetic diversity operating within a fungal species shown here are sufficient to explain SM cluster macroevolutionary patterns. Funding: National Science Foundation (grant number DEB-1442113). Received by AR. U.S. National Library of Medicine training grant (grant number 2T15LM007450). Received by ALL. Conselho Nacional de Desenvolvimento Científico e Tecnológico. Northern Portugal Regional Operational Programme (grant number NORTE-01-0145-FEDER-000013). Received by FR. Fundação de Amparo à Pesquisa do Estado de São Paulo. Received by GHG. National Institutes of Health (grant number R01 AI065728-01). Received by NPK. National Science Foundation (grant number IOS-1401682). Received by JHW. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

    Whole-genome sequences of Malawi cichlids reveal multiple radiations interconnected by gene flow.

    The hundreds of cichlid fish species in Lake Malawi constitute the most extensive recent vertebrate adaptive radiation. Here we characterize its genomic diversity by sequencing 134 individuals covering 73 species across all major lineages. The average sequence divergence between species pairs is only 0.1-0.25%. These divergence values overlap diversity within species, with 82% of heterozygosity shared between species. Phylogenetic analyses suggest that diversification initially proceeded by serial branching from a generalist Astatotilapia-like ancestor. However, no single species tree adequately represents all species relationships, with evidence for substantial gene flow at multiple times. Common signatures of selection on visual and oxygen transport genes shared by distantly related deep-water species point to both adaptive introgression and independent selection. These findings enhance our understanding of genomic processes underlying rapid species diversification, and provide a platform for future genetic analysis of the Malawi radiation.

    Low Prevalence of Conjunctival Infection with Chlamydia trachomatis in a Treatment-Naïve Trachoma-Endemic Region of the Solomon Islands

    Trachoma is endemic in several Pacific Island states. Recent surveys across the Solomon Islands indicated that whilst trachomatous inflammation-follicular (TF) was present at levels warranting intervention, the prevalence of trachomatous trichiasis (TT) was low. We set out to determine the relationship between chlamydial infection and trachoma in this population. We conducted a population-based trachoma prevalence survey of 3674 individuals from two Solomon Islands provinces. Participants were examined for clinical signs of trachoma. Conjunctival swabs were collected from all children aged 1-9 years. We tested swabs for Chlamydia trachomatis (Ct) DNA using droplet digital PCR. Chlamydial DNA from positive swabs was enriched and sequenced for use in phylogenetic analysis. We observed a moderate prevalence of TF in children aged 1-9 years (n = 296/1135, 26.1%) but low prevalence of trachomatous inflammation-intense (TI) (n = 2/1135, 0.2%) and current Ct infection (n = 13/1002, 1.3%) in children aged 1-9 years, and TT in those aged 15+ years (n = 2/2061, 0.1%). Ten of 13 (76.9%) cases of infection were in persons with TF or TI (p = 0.0005). Sequence analysis of the Ct-positive samples yielded 5/13 (38%) complete (>95% coverage of reference) genome sequences, and 8/13 complete plasmid sequences. Complete sequences all aligned most closely to ocular serovar reference strains. The low prevalence of TT, TI and Ct infection that we observed is incongruent with the high proportion of children exhibiting signs of TF. TF is present at levels that apparently warrant intervention, but the scarcity of other signs of trachoma indicates the phenotype is mild and may not pose a significant public health threat. Our data suggest that, whilst conjunctival Ct infection appears to be present in the region, it is present at levels that are unlikely to be the dominant driving force for TF in the population. This could be one reason for the low prevalence of TT observed during the study.
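The prevalence figures quoted in this abstract can be rechecked from the reported counts. The short Python sketch below recomputes each percentage from its numerator and denominator as given in the abstract; the labels and the `percent` helper are illustrative names, not anything taken from the study.

```python
# (cases, number examined) exactly as reported in the abstract
reported = {
    "TF, ages 1-9": (296, 1135),
    "TI, ages 1-9": (2, 1135),
    "Ct infection, ages 1-9": (13, 1002),
    "TT, ages 15+": (2, 2061),
    "Ct-positive with TF or TI": (10, 13),
}


def percent(cases, examined):
    """Return the proportion as a percentage rounded to one decimal place."""
    return round(100 * cases / examined, 1)


for label, (cases, examined) in reported.items():
    print(f"{label}: {cases}/{examined} = {percent(cases, examined)}%")
```

The recomputed values agree with the percentages stated in the abstract (26.1%, 0.2%, 1.3%, 0.1% and 76.9%).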