60 research outputs found

    A Genome-Wide Study of DNA Methylation Patterns and Gene Expression Levels in Multiple Human and Chimpanzee Tissues

    The modification of DNA by methylation is an important epigenetic mechanism that affects the spatial and temporal regulation of gene expression. Methylation patterns have been described in many contexts within and across a range of species. However, the extent to which changes in methylation might underlie inter-species differences in gene regulation, in particular between humans and other primates, has not yet been studied. To this end, we studied DNA methylation patterns in livers, hearts, and kidneys from multiple humans and chimpanzees, using tissue samples for which genome-wide gene expression data were also available. Using the multi-species gene expression and methylation data for 7,723 genes, we were able to study the role of promoter DNA methylation in the evolution of gene regulation across tissues and species. We found that inter-tissue methylation patterns are often conserved between humans and chimpanzees. However, we also found a large number of gene expression differences between species that might be explained, at least in part, by corresponding differences in methylation levels. In particular, we estimate that, in the tissues we studied, inter-species differences in promoter methylation might underlie as much as 12%–18% of differences in gene expression levels between humans and chimpanzees

    Extreme Evolutionary Disparities Seen in Positive Selection across Seven Complex Diseases

    Positive selection is known to occur when the environment that an organism inhabits is suddenly altered, as is the case across recent human history. Genome-wide association studies (GWASs) have successfully illuminated disease-associated variation. However, whether human evolution is heading towards or away from disease susceptibility in general remains an open question. The genetic basis of common complex disease may partially be caused by positive selection events, which simultaneously increased fitness and susceptibility to disease. We analyze seven diseases studied by the Wellcome Trust Case Control Consortium to compare evidence for selection at every locus associated with disease. We take a large set of the most strongly associated SNPs in each GWA study in order to capture more hidden associations at the cost of introducing false positives into our analysis. We then search for signs of positive selection in this inclusive set of SNPs. There are striking differences between the seven studied diseases. We find alleles increasing susceptibility to Type 1 Diabetes (T1D), Rheumatoid Arthritis (RA), and Crohn's Disease (CD) underwent recent positive selection. There is more selection in alleles increasing, rather than decreasing, susceptibility to T1D. In the 80 SNPs most associated with T1D (p-value < 7.01×10⁻⁵) showing strong signs of positive selection, 58 alleles associated with disease susceptibility show signs of positive selection, while only 22 associated with disease protection show signs of positive selection. Alleles increasing susceptibility to RA are under selection as well. In contrast, selection in SNPs associated with CD favors protective alleles. These results inform the current understanding of disease etiology, shed light on potential benefits associated with the genetic basis of disease, and aid in the efforts to identify causal genetic factors underlying complex disease.

    A critical discussion of the physics of wood–water interactions

    The LabelHash algorithm for substructure matching

    Background: There is an increasing number of proteins with known structure but unknown function. Determining their function would have a significant impact on understanding diseases and designing new therapeutics. However, experimental protein function determination is expensive and very time-consuming. Computational methods can facilitate function determination by identifying proteins that have high structural and chemical similarity. Results: We present LabelHash, a novel algorithm for matching substructural motifs to large collections of protein structures. The algorithm consists of two phases. In the first phase the proteins are preprocessed in a fashion that allows for instant lookup of partial matches to any motif. In the second phase, partial matches for a given motif are expanded to complete matches. The general applicability of the algorithm is demonstrated with three different case studies. First, we show that we can accurately identify members of the enolase superfamily with a single motif. Next, we demonstrate how LabelHash can complement SOIPPA, an algorithm for motif identification and pairwise substructure alignment. Finally, a large collection of Catalytic Site Atlas motifs is used to benchmark the performance of the algorithm. LabelHash runs very efficiently in parallel; matching a motif against all proteins in the 95 % sequence identity filtered non-redundant Protein Data Bank typically takes no more than a few minutes. The LabelHash algorithm is available through a web server and as a suite of standalone programs a

    clusterMaker: a multi-algorithm clustering plugin for Cytoscape

    Background: In the post-genomic era, the rapid increase in high-throughput data calls for computational tools capable of integrating data of diverse types and facilitating recognition of biologically meaningful patterns within them. For example, protein-protein interaction data sets have been clustered to identify stable complexes, but scientists lack easily accessible tools to facilitate combined analyses of multiple data sets from different types of experiments. Here we present clusterMaker, a Cytoscape plugin that implements several clustering algorithms and provides network, dendrogram, and heat map views of the results. The Cytoscape network is linked to all of the other views, so that a selection in one is immediately reflected in the others. clusterMaker is the first Cytoscape plugin to implement such a wide variety of clustering algorithms and visualizations, including the only implementations of hierarchical clustering, dendrogram plus heat map visualization (tree view), k-means, k-medoid, SCPS, AutoSOME, and native (Java) MCL. Results: Results are presented in the form of three scenarios of use: analysis of protein expression data using a recently published mouse interactome and a mouse microarray data set of nearly one hundred diverse cell/tissue types; the identification of protein complexes in the yeast Saccharomyces cerevisiae; and the cluster analysis of the vicinal oxygen chelate (VOC) enzyme superfamily. For scenario one, we explore functionally enriched mouse interactomes specific to particular cellular phenotypes and apply fuzzy clustering. For scenario two, we explore the prefoldin complex in detail using both physical and genetic interaction clusters. For scenario three, we explore the possible annotation of a protein as a methylmalonyl-CoA epimerase within the VOC superfamily. Cytoscape session files for all three scenarios are provided in the Additional Files section. Conclusions: The Cytoscape plugin clusterMaker provides a number of clustering algorithms and visualizations that can be used independently or in combination for analysis and visualization of biological data sets, and for confirming or generating hypotheses about biological function. Several of these visualizations and algorithms are only available to Cytoscape users through the clusterMaker plugin. clusterMaker is available via the Cytoscape plugin manager.

    SmCL3, a Gastrodermal Cysteine Protease of the Human Blood Fluke Schistosoma mansoni

    Parasitic infection caused by blood flukes of the genus Schistosoma is a major global health problem. More than 200 million people are infected. Identifying and characterizing the constituent enzymes of the parasite's biochemical pathways should reveal opportunities for developing new therapies (i.e., vaccines, drugs). Schistosomes feed on host blood, and a number of proteolytic enzymes (proteases) contribute to this process. We have identified and characterized a new protease, SmCL3 (for Schistosoma mansoni cathepsin L3), that is found within the gut tissue of the parasite. We have employed various biochemical and molecular biological methods and sequence similarity analyses to characterize SmCL3 and obtain insights into its possible functions in the parasite, as well as its evolutionary position among cathepsin L proteases in general. SmCL3 hydrolyzes major host blood proteins (serum albumin and hemoglobin) and is expressed in parasite life stages infecting the mammalian host. Enzyme substrate specificity detected by positional scanning-synthetic combinatorial library was confirmed by molecular modeling. A sequence analysis placed SmCL3 to the cluster of other cathepsins L in accordance with previous phylogenetic analyses

    Covalent Docking Predicts Substrates for Haloalkanoate Dehalogenase Superfamily Phosphatases

    Enzyme function prediction remains an important open problem. Though structure-based modeling, such as metabolite docking, can identify substrates of some enzymes, it is ill-suited to reactions that progress through a covalent intermediate. Here we investigated the ability of covalent docking to identify substrates that pass through such a covalent intermediate, focusing particularly on the haloalkanoate dehalogenase superfamily. In retrospective assessments, covalent docking recapitulated substrate binding modes of known cocrystal structures and identified experimental substrates from a set of putative phosphorylated metabolites. In comparison, noncovalent docking of high-energy intermediates yielded nonproductive poses. In prospective predictions against seven enzymes, a substrate was identified for five. For one of those cases, a covalent docking prediction, confirmed by empirical screening, and combined with genomic context analysis, suggested the identity of the enzyme that catalyzes the orphan phosphatase reaction in the riboflavin biosynthetic pathway of Bacteroides