
    Comparative analysis and visualization of multiple collinear genomes

    Background: Genome browsers are a common tool used by biologists to visualize genomic features including genes, polymorphisms, and many others. However, existing genome browsers and visualization tools are not well-suited to perform meaningful comparative analysis among a large number of genomes. With the increasing quantity and availability of genomic data, there is an increased burden to provide useful visualization and analysis tools for comparison of multiple collinear genomes such as the large panels of model organisms which are the basis for much of the current genetic research.
    Results: We have developed a novel web-based tool for visualizing and analyzing multiple collinear genomes. Our tool illustrates genome-sequence similarity through a mosaic of intervals representing local phylogeny, subspecific origin, and haplotype identity. Comparative analysis is facilitated through reordering and clustering of tracks, which can vary throughout the genome. In addition, we provide local phylogenetic trees as an alternate visualization to assess local variations.
    Conclusions: Unlike previous genome browsers and viewers, ours allows for simultaneous and comparative analysis. Our browser provides intuitive selection and interactive navigation about features of interest. Dynamic visualizations adjust to scale and data content making analysis at variable resolutions and of multiple data sets more informative. We demonstrate our genome browser for an extensive set of genomic data sets composed of almost 200 distinct mouse laboratory strains.

    From Classical Genetics to Quantitative Genetics to Systems Biology: Modeling Epistasis

    Gene expression data has been used in lieu of phenotype in both classical and quantitative genetic settings. These two disciplines have separate approaches to measuring and interpreting epistasis, which is the interaction between alleles at different loci. We propose a framework for estimating and interpreting epistasis from a classical experiment that combines the strengths of each approach. A regression analysis step accommodates the quantitative nature of expression measurements by estimating the effect of gene deletions plus any interaction. Effects are selected by significance such that a reduced model describes each expression trait. We show how the resulting models correspond to specific hierarchical relationships between two regulator genes and a target gene. These relationships are the basic units of genetic pathways and genomic system diagrams. Our approach can be extended to analyze data from a variety of experiments, multiple loci, and multiple environments.
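The regression step described in this abstract can be illustrated with a minimal sketch. It assumes a two-gene deletion experiment with replicate expression measurements for four genotypes (wild type, each single deletion, and the double deletion) and fits y = b0 + bA*a + bB*b + bAB*a*b, where a and b are 0/1 deletion indicators. The function name, argument names, and data values are hypothetical, and the significance-based selection of effects mentioned in the abstract is not shown.

```python
from statistics import mean


def fit_two_locus_model(wt, del_a, del_b, del_ab):
    """Estimate main effects and interaction for two gene deletions.

    For this saturated 2x2 design, the least-squares coefficients of
    y = b0 + bA*a + bB*b + bAB*a*b are contrasts of the genotype means.
    """
    m_wt, m_a, m_b, m_ab = mean(wt), mean(del_a), mean(del_b), mean(del_ab)
    return {
        "baseline": m_wt,                        # b0: wild-type expression
        "effect_a": m_a - m_wt,                  # bA: effect of deleting gene A
        "effect_b": m_b - m_wt,                  # bB: effect of deleting gene B
        "interaction": m_ab - m_a - m_b + m_wt,  # bAB: departure from additivity
    }


# Hypothetical replicate expression values, for illustration only.
estimates = fit_two_locus_model(
    wt=[10.0, 11.0, 9.0],
    del_a=[8.0, 9.0, 7.0],
    del_b=[9.0, 10.0, 8.0],
    del_ab=[8.0, 9.0, 7.0],
)
print(estimates)
```

In this made-up example the double deletion has the same mean as the deletion of gene A alone, so the interaction term is nonzero (8 - 8 - 9 + 10 = 1) and the two deletions do not combine additively.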

    Evidence that the Human Pathogenic Fungus Cryptococcus neoformans var. grubii May Have Evolved in Africa

    Most of the species of fungi that cause disease in mammals, including Cryptococcus neoformans var. grubii (serotype A), are exogenous and non-contagious. Cryptococcus neoformans var. grubii is associated worldwide with avian and arboreal habitats. This airborne, opportunistic pathogen is profoundly neurotropic and the leading cause of fungal meningitis. Patients with HIV/AIDS have been ravaged by cryptococcosis: an estimated one million new cases occur each year, and mortality approaches 50%. Using phylogenetic and population genetic analyses, we present evidence that C. neoformans var. grubii may have evolved from a diverse population in southern Africa. Our ecological studies support the hypothesis that a few of these strains acquired a new environmental reservoir, the excreta of feral pigeons (Columba livia), and were globally dispersed by the migration of birds and humans. This investigation also discovered a novel arboreal reservoir, the mopane tree (Colophospermum mopane), for highly diverse strains of C. neoformans var. grubii that are restricted to southern Africa. This finding may have significant public health implications because these primal strains have optimal potential for evolution and because mopane trees contribute to the local economy as a source of timber, folkloric remedies and the edible mopane worm.

    Dissecting Genetic Networks Underlying Complex Phenotypes: The Theoretical Framework

    Great progress has been made in genetic dissection of quantitative trait variation during the past two decades, but many studies still reveal only a small fraction of quantitative trait loci (QTLs), and epistasis remains elusive. We integrate contemporary knowledge of signal transduction pathways with principles of quantitative and population genetics to characterize genetic networks underlying complex traits, using a model founded upon one-way functional dependency of downstream genes on upstream regulators (the principle of hierarchy) and mutual functional dependency among related genes (functional genetic units, FGU). Both simulated and real data suggest that complementary epistasis contributes greatly to quantitative trait variation, and obscures the phenotypic effects of many ‘downstream’ loci in pathways. The mathematical relationships between the main effects and epistatic effects of genes acting at different levels of signaling pathways were established using the quantitative and population genetic parameters. Both loss of function and “co-adapted” gene complexes formed by multiple alleles with differentiated functions (effects) are predicted to be frequent types of allelic diversity at loci that contribute to the genetic variation of complex traits in populations. Downstream FGUs appear to be more vulnerable to loss of function than their upstream regulators, but this vulnerability is apparently compensated by different FGUs of similar functions. Other predictions from the model may account for puzzling results regarding responses to selection, genotype by environment interaction, and the genetic basis of heterosis.

    Allelic Variation and Differential Expression of the mSIN3A Histone Deacetylase Complex Gene Arid4b Promote Mammary Tumor Growth and Metastasis

    Accumulating evidence suggests that breast cancer metastatic progression is modified by germline polymorphism, although specific modifier genes have remained largely undefined. In the current study, we employ the MMTV-PyMT transgenic mouse model and the AKXD panel of recombinant inbred mice to identify AT-rich interactive domain 4B (Arid4b; NM_194262) as a breast cancer progression modifier gene. Ectopic expression of Arid4b promoted primary tumor growth in vivo as well as increased migration and invasion in vitro, and the phenotype was associated with polymorphisms identified between the AKR/J and DBA/2J alleles as predicted by our genetic analyses. Stable shRNA-mediated knockdown of Arid4b caused a significant reduction in pulmonary metastases, validating a role for Arid4b as a metastasis modifier gene. ARID4B physically interacts with the breast cancer metastasis suppressor BRMS1, and we detected differential binding of the Arid4b alleles to histone deacetylase complex members mSIN3A and mSDS3, suggesting that the mechanism of Arid4b action likely involves interactions with chromatin modifying complexes. Downregulation of the conserved Tpx2 gene network, which is composed of many factors regulating cell cycle and mitotic spindle biology, was observed concomitant with loss of metastatic efficiency in Arid4b knockdown cells. Consistent with our genetic analysis and in vivo experiments in our mouse model system, ARID4B expression was also an independent predictor of distant metastasis-free survival in breast cancer patients with ER+ tumors. These studies support a causative role of ARID4B in metastatic progression of breast cancer.

    Tracking the evolutionary history of Cortinarius species in section Calochroi, with transoceanic disjunct distributions

    Background: Cortinarius species in section Calochroi display local, clinal and circumboreal patterns of distribution across the Northern Hemisphere where these ectomycorrhizal fungi occur with host trees throughout their geographical range within a continent, or have disjunct intercontinental distributions, the origins of which are not understood. We inferred evolutionary histories of four species, 1) C. arcuatorum, 2) C. aureofulvus, 3) C. elegantior and 4) C. napus, from populations distributed throughout the Old World, and portions of the New World (Central and North America) based on genetic variation of 154 haplotype internal transcribed spacer (ITS) sequences from 83 population samples. By describing the population structure of these species across their geographical distribution, we attempt to identify their historical migration and patterns of diversification.
    Results: Models of population structure from nested clade, demographic and coalescent-based analyses revealed genetically differentiated and geographically structured haplotypes in C. arcuatorum and C. elegantior, while C. aureofulvus showed considerably less population structure and C. napus lacked sufficient genetic differentiation to resolve any population structure. Disjunct populations within C. arcuatorum, C. aureofulvus and C. elegantior show little or no morphological differentiation, whereas in C. napus there is a high level of homoplasy and phenotypic plasticity for veil and lamellae colour. The ITS sequences of the type specimens of C. albobrunnoides and C. albobrunnoides var. violaceovelatus were identical to one another and are treated as one species with a wider range of geographic distribution under C. napus.
    Conclusions: Our results indicate that each of the Calochroi species has undergone a relatively independent evolutionary history, hypothesised as follows: 1) a widely distributed ancestral population of C. arcuatorum diverged into distinctive sympatric populations in the New World; 2) two divergent lineages in C. elegantior gave rise to the New World and Old World haplotypes, respectively; and 3) the low levels of genetic divergence within C. aureofulvus and C. napus may be the result of more recent demographic population expansions. The scenario of migration via the Bering Land Bridge provides the most probable explanation for contemporaneous disjunct geographic distributions of these species, but it does not offer an explanation for the low degree of genetic divergence between populations of C. aureofulvus and C. napus. Our findings are mostly consistent with the designation of New World allopatric populations as separate species from the European counterpart species C. arcuatorum and C. elegantior. We propose the synonymy of C. albobrunnoides, C. albobrunnoides var. violaceovelatus and C. subpurpureophyllus var. sulphureovelatus with C. napus. The results also reinforce previous observations that linked C. arcuatorum and C. aureofulvus displaying distributions in parts of North America and Europe. Interpretations of the population structure of these fungi suggest that host tree history has heavily influenced their modern distributions; however, the complex issues related to co-migration of these fungi with their tree hosts remain unclear at this time.

    Quantitative Epistasis Analysis and Pathway Inference from Genetic Interaction Data

    Inferring regulatory and metabolic network models from quantitative genetic interaction data remains a major challenge in systems biology. Here, we present a novel quantitative model for interpreting epistasis within pathways responding to an external signal. The model provides the basis of an experimental method to determine the architecture of such pathways, and establishes a new set of rules to infer the order of genes within them. The method also allows the extraction of quantitative parameters enabling a new level of information to be added to genetic network models. It is applicable to any system where the impact of combinatorial loss-of-function mutations can be quantified with sufficient accuracy. We test the method by conducting a systematic analysis of a thoroughly characterized eukaryotic gene network, the galactose utilization pathway in Saccharomyces cerevisiae. For this purpose, we quantify the effects of single and double gene deletions on two phenotypic traits, fitness and reporter gene expression. We show that applying our method to fitness traits reveals the order of metabolic enzymes and the effects of accumulating metabolic intermediates. Conversely, the analysis of expression traits reveals the order of transcriptional regulatory genes, secondary regulatory signals and their relative strength. Strikingly, when the analyses of the two traits are combined, the method correctly infers ∼80% of the known relationships without any false positives.

    Finding the sources of missing heritability in a yeast cross

    For many traits, including susceptibility to common diseases in humans, causal loci uncovered by genetic mapping studies explain only a minority of the heritable contribution to trait variation. Multiple explanations for this "missing heritability" have been proposed. Here we use a large cross between two yeast strains to accurately estimate different sources of heritable variation for 46 quantitative traits and to detect underlying loci with high statistical power. We find that the detected loci explain nearly the entire additive contribution to heritable variation for the traits studied. We also show that the contribution to heritability of gene-gene interactions varies among traits, from near zero to 50%. Detected two-locus interactions explain only a minority of this contribution. These results substantially advance our understanding of the missing heritability problem and have important implications for future studies of complex and quantitative traits.

    Histoplasma capsulatum proteome response to decreased iron availability

    Background: A fundamental pathogenic feature of the fungus Histoplasma capsulatum is its ability to evade innate and adaptive immune defenses. Once ingested by macrophages the organism is faced with several hostile environmental conditions including iron limitation. H. capsulatum can establish a persistent state within the macrophage. A gap in knowledge exists because the identities and number of proteins regulated by the organism under host conditions have yet to be defined. Lack of such knowledge is an important problem because until these proteins are identified it is unlikely that they can be targeted for new and innovative treatment of histoplasmosis.
    Results: To investigate the proteomic response by H. capsulatum to decreasing iron availability we have created H. capsulatum protein/genomic databases compatible with current mass spectrometric (MS) search engines. Databases were assembled from the H. capsulatum G217B strain genome using gene prediction programs and expressed sequence tag (EST) libraries. Searching these databases with MS data generated from two-dimensional (2D) in-gel digestions of proteins resulted in over 50% more proteins identified compared to searching the publicly available fungal databases alone. Using 2D gel electrophoresis combined with statistical analysis we discovered 42 H. capsulatum proteins whose abundance was significantly modulated when iron concentrations were lowered. Altered proteins were identified by mass spectrometry and database searching to be involved in glycolysis, the tricarboxylic acid cycle, lysine metabolism, and protein synthesis, and included one protein sequence whose function was unknown.
    Conclusion: We have created a bioinformatics platform for H. capsulatum and demonstrated the utility of a proteomic approach by identifying a shift in metabolism the organism utilizes to cope with the hostile conditions provided by the host. We have shown that enzyme transcripts regulated by other fungal pathogens in response to lowering iron availability are also regulated in H. capsulatum at the protein level. We also identified H. capsulatum proteins sensitive to iron level reductions which have yet to be connected to iron availability in other pathogens. These data also indicate the complexity of the response by H. capsulatum to nutritional deprivation. Finally, we demonstrate the importance of a strain-specific gene/protein database for H. capsulatum proteomic analysis.

    PICS-Ord: unlimited coding of ambiguous regions by pairwise identity and cost scores ordination

    Background: We present a novel method to encode ambiguously aligned regions in fixed multiple sequence alignments by 'Pairwise Identity and Cost Scores Ordination' (PICS-Ord). The method works via ordination of sequence identity or cost scores matrices by means of Principal Coordinates Analysis (PCoA). After identification of ambiguous regions, the method computes pairwise distances as sequence identities or cost scores, ordinates the resulting distance matrix by means of PCoA, and encodes the principal coordinates as ordered integers. Three biological and 100 simulated datasets were used to assess the performance of the new method.
    Results: Including ambiguous regions coded by means of PICS-Ord increased topological accuracy, resolution, and bootstrap support in real biological and simulated datasets compared to the alternative of excluding such regions from the analysis a priori. In terms of accuracy, PICS-Ord performs as well as or better than previously available methods of ambiguous region coding (e.g., INAASE), with the advantage of a practically unlimited alignment size, increased analytical speed, and the possibility of analyzing PICS-Ord scores together with DNA data in a partitioned maximum likelihood model.
    Conclusions: Advantages of PICS-Ord over step matrix-based ambiguous region coding with INAASE include a practically unlimited number of OTUs and seamless integration of PICS-Ord codes into phylogenetic datasets, as well as the increased speed of phylogenetic analysis. Contrary to word- and frequency-based methods, PICS-Ord maintains the advantage of pairwise sequence alignment to derive distances, and the method is flexible with respect to the calculation of distance scores. In addition to distance and maximum parsimony, PICS-Ord codes can be analyzed in a Bayesian or maximum likelihood framework. RAxML (version 7.2.6 or higher, developed for this study) allows up to 32-state ordered or unordered characters. A GTR, MK, or ORDERED model can be applied to analyse the PICS-Ord codes partition, with GTR performing slightly better than MK and ORDERED.
    Availability: An implementation of the PICS-Ord algorithm is available from http://scit.us/projects/ngila/wiki/PICS-Ord. It requires both the statistical software R (http://www.r-project.org) and the alignment software Ngila (http://scit.us/projects/ngila).
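The pipeline this abstract outlines (pairwise distances, ordination by PCoA, encoding of coordinates as ordered integers) can be sketched roughly as follows. This is not the published implementation, which relies on R and Ngila. It makes several simplifying assumptions that are not taken from the source: distances are one minus the fraction of identical positions between equal-length strings (no pairwise alignment), only the leading principal coordinate is kept and is found by power iteration, and coordinates are discretized into equal-width bins. All function names are hypothetical.

```python
import math


def identity_distance(a, b):
    """Distance = 1 - fraction of identical positions (equal-length strings only)."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal-length sequences")
    matches = sum(x == y for x, y in zip(a, b))
    return 1.0 - matches / len(a)


def double_center(dist):
    """Gower centering used by PCoA: B = -1/2 * J * D^2 * J."""
    n = len(dist)
    half_sq = [[-0.5 * d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in half_sq]  # equals column means (symmetric)
    grand = sum(row_mean) / n
    return [[half_sq[i][j] - row_mean[i] - row_mean[j] + grand
             for j in range(n)] for i in range(n)]


def leading_coordinate(b, iterations=200):
    """First principal coordinate via power iteration (assumed simplification)."""
    n = len(b)
    v = [float(i + 1) for i in range(n)]  # deterministic, non-constant start
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return [0.0] * n
        v = [x / norm for x in w]
    bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(x * y for x, y in zip(v, bv))
    scale = math.sqrt(max(eigenvalue, 0.0))
    return [x * scale for x in v]


def to_ordered_states(coords, n_states=32):
    """Bin coordinates into ordered integer states 0..n_states-1 (equal width)."""
    lo, hi = min(coords), max(coords)
    if hi == lo:
        return [0] * len(coords)
    return [min(int((x - lo) / (hi - lo) * n_states), n_states - 1)
            for x in coords]


def pics_ord_sketch(seqs, n_states=32):
    """Distances -> PCoA (first axis only) -> ordered integer codes."""
    dist = [[identity_distance(a, b) for b in seqs] for a in seqs]
    return to_ordered_states(leading_coordinate(double_center(dist)), n_states)


# Made-up ambiguous region for four taxa, for illustration only.
region = ["AAAA", "AAAT", "TTTT", "TTTA"]
codes = pics_ord_sketch(region, n_states=4)
print(codes)
```

In this toy input the first two and the last two sequences form two similar pairs, so each pair receives the same code and the two pairs receive different codes. The orientation of a principal coordinate is arbitrary, so which pair gets the lower code carries no meaning. The default of 32 states mirrors the RAxML limit quoted in the abstract.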