
    Analysis of Genetic Interaction Maps Reveals Functional Pleiotropy

    Epistatic or genetic interactions, representing the effects of mutations on the phenotypes caused by other mutations, can be very helpful for uncovering functional relationships between genes. Recently, the Epistasis Miniarray Profile (E-MAP) method has emerged as a powerful approach for identifying such interactions systematically. As part of this approach, hierarchical clustering is used to partition genes into groups on the basis of the similarity between their global interaction profiles. Here we present an original biclustering algorithm for identifying groups of functionally related genes from E-MAP data in a manner that allows individual genes to be assigned to more than one functional group. This enables investigation of the pleiotropic nature of gene function, a goal that cannot be achieved with hierarchical clustering. The performance of our algorithm is illustrated by applying it to two E-MAP datasets and an E-MAP-like in silico dataset for the yeast S. cerevisiae. In addition to identifying the majority of the functional modules reported in these studies, our algorithm uncovers many recently documented and novel multi-functional relationships between genes and gene groups.

    Expanding the Landscape of Chromatin Modification (CM)-Related Functional Domains and Genes in Human

    Chromatin modification (CM) plays a key role in regulating transcription, DNA replication, repair and recombination. However, our knowledge of these processes in humans remains very limited. Here we use computational approaches to study proteins and functional domains involved in CM in humans. We analyze the abundance and the pair-wise domain-domain co-occurrences of 25 well-documented CM domains in 5 model organisms: yeast, worm, fly, mouse and human. Results show that domains involved in histone methylation, DNA methylation, and histone variants are remarkably expanded in metazoans, reflecting the increased demand for cell type-specific gene regulation. We find that CM domains tend to co-occur with a limited number of partner domains and are hence not promiscuous. This property is exploited to identify 47 potentially novel CM domains, including 24 DNA-binding domains, whose role in CM has received little attention so far. Lastly, we use a consensus machine learning approach to predict 379 novel CM genes (coding for 329 proteins) in humans based on domain compositions. Several of these predictions are supported by very recent experimental studies and others are slated for experimental verification. Identification of novel CM genes and domains in humans will aid our understanding of fundamental epigenetic processes that are important for stem cell differentiation and cancer biology. Information on all the candidate CM domains and genes reported here is publicly available.

    Markov clustering versus affinity propagation for the partitioning of protein interaction graphs

    Background: Genome scale data on protein interactions are generally represented as large networks, or graphs, where hundreds or thousands of proteins are linked to one another. Since proteins tend to function in groups, or complexes, an important goal has been to reliably identify protein complexes from these graphs. This task is commonly executed using clustering procedures, which aim at detecting densely connected regions within the interaction graphs. There exists a wealth of clustering algorithms, some of which have been applied to this problem. One of the most successful clustering procedures in this context has been the Markov Cluster algorithm (MCL), which was recently shown to outperform a number of other procedures, some of which were specifically designed for partitioning protein interaction graphs. A novel promising clustering procedure termed Affinity Propagation (AP) was recently shown to be particularly effective, and much faster than other methods for a variety of problems, but has not yet been applied to partition protein interaction graphs. Results: In this work we compare the performance of the Affinity Propagation (AP) and Markov Clustering (MCL) procedures. To this end we derive an unweighted network of protein-protein interactions from a set of 408 protein complexes from S. cerevisiae hand-curated in-house, and evaluate the performance of the two clustering algorithms in recalling the annotated complexes. In doing so the parameter space of each algorithm is sampled in order to select optimal values for these parameters, and the robustness of the algorithms is assessed by quantifying the level of complex recall as interactions are randomly added to or removed from the network to simulate noise. To evaluate the performance on a weighted protein interaction graph, we also apply the two algorithms to the consolidated protein interaction network of S. cerevisiae, derived from genome scale purification experiments, and to versions of this network in which varying proportions of the links have been randomly shuffled. Conclusion: Our analysis shows that the MCL procedure is significantly more tolerant to noise and behaves more robustly than the AP algorithm. The advantage of MCL over AP is dramatic for unweighted protein interaction graphs, as AP displays severe convergence problems on the majority of the unweighted graph versions that we tested, whereas MCL continues to identify meaningful clusters, albeit fewer of them, as the level of noise in the graph increases. MCL thus remains the method of choice for identifying protein complexes from binary interaction networks.
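
The expansion and inflation steps at the core of MCL, together with the edge-noise idea described in the abstract above, can be illustrated with a minimal sketch. This is not the authors' code or the reference MCL implementation: the function names `markov_cluster` and `perturb_edges`, the default inflation of 2.0, the tolerance, and the toy six-node graph are all assumptions chosen for illustration.

```python
import numpy as np


def markov_cluster(adjacency, inflation=2.0, max_iter=100, tol=1e-6):
    """Toy MCL: alternate expansion and inflation on a column-stochastic matrix."""
    a = np.asarray(adjacency, dtype=float)
    a = a + np.eye(len(a))                    # self-loops (a common MCL convention)
    m = a / a.sum(axis=0, keepdims=True)      # make every column sum to 1
    for _ in range(max_iter):
        expanded = m @ m                      # expansion: two-step random walk
        inflated = expanded ** inflation      # inflation: strengthen strong flows
        new_m = inflated / inflated.sum(axis=0, keepdims=True)
        converged = np.abs(new_m - m).max() < tol
        m = new_m
        if converged:
            break
    # Each row that still carries flow lists the members of one cluster.
    clusters = set()
    for row in m:
        members = frozenset(np.flatnonzero(row > 1e-6).tolist())
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


def perturb_edges(adjacency, n_flips, seed=0):
    """Simulate noise on an unweighted graph by toggling randomly chosen node pairs."""
    rng = np.random.default_rng(seed)
    a = np.array(adjacency, dtype=float)
    for _ in range(n_flips):
        i, j = rng.choice(len(a), size=2, replace=False)
        a[i, j] = a[j, i] = 1.0 - a[i, j]     # add the edge if absent, remove if present
    return a


if __name__ == "__main__":
    # Hypothetical toy network: two disconnected triangles.
    graph = np.zeros((6, 6))
    for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
        graph[i, j] = graph[j, i] = 1.0
    print(markov_cluster(graph))
    print(markov_cluster(perturb_edges(graph, n_flips=2)))
```

Comparing the clusters recovered from the original and perturbed graphs against a set of known complexes is one simple way to picture the robustness assessment the abstract describes; the recall measure used in the study itself is not specified here.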

    Genetic Interaction Maps in Escherichia coli Reveal Functional Crosstalk among Cell Envelope Biogenesis Pathways

    As the interface between a microbe and its environment, the bacterial cell envelope has broad biological and clinical significance. While numerous biosynthesis genes and pathways have been identified and studied in isolation, how these intersect functionally to ensure envelope integrity during adaptive responses to environmental challenge remains unclear. To this end, we performed high-density synthetic genetic screens to generate quantitative functional association maps encompassing virtually the entire cell envelope biosynthetic machinery of Escherichia coli under both auxotrophic (rich medium) and prototrophic (minimal medium) culture conditions. The differential patterns of genetic interactions detected among >235,000 digenic mutant combinations tested reveal unexpected condition-specific functional crosstalk and genetic backup mechanisms that ensure stress-resistant envelope assembly and maintenance. These networks also provide insights into the global systems connectivity and dynamic functional reorganization of a universal bacterial structure that is both broadly conserved among eubacteria (including pathogens) and an important target

    Increased peri-ductal collagen micro-organization may contribute to raised mammographic density

    BACKGROUND: High mammographic density is a therapeutically modifiable risk factor for breast cancer. Although mammographic density is correlated with the relative abundance of collagen-rich fibroglandular tissue, the causative mechanisms, associated structural remodelling and mechanical consequences remain poorly defined. In this study we have developed a new collaborative bedside-to-bench workflow to determine the relationship between mammographic density, collagen abundance and alignment, tissue stiffness and the expression of extracellular matrix organising proteins. METHODS: Mammographic density was assessed in 22 post-menopausal women (aged 54–66 y). A radiologist and a pathologist identified and excised regions of elevated non-cancerous X-ray density prior to laboratory characterization. Collagen abundance was determined by both Masson’s trichrome and Picrosirius red staining (which enhances collagen birefringence when viewed under polarised light). The structural specificity of these collagen visualisation methods was determined by comparing the relative birefringence and ultrastructure (visualised by atomic force microscopy) of unaligned collagen I fibrils in reconstituted gels with the highly aligned collagen fibrils in rat tail tendon. Localised collagen fibril organisation and stiffness were also evaluated in tissue sections by atomic force microscopy/spectroscopy and the abundance of key extracellular proteins was assessed using mass spectrometry. RESULTS: Mammographic density was positively correlated with the abundance of aligned periductal fibrils rather than with the abundance of amorphous collagen. Compared with matched tissue resected from the breasts of low mammographic density patients, the highly birefringent tissue in mammographically dense breasts was both significantly stiffer and characterised by large (>80 μm long) fibrillar collagen bundles. Subsequent proteomic analyses not only confirmed the absence of collagen fibrosis in high mammographic density tissue, but additionally identified the up-regulation of periostin and collagen XVI (regulators of collagen fibril structure and architecture) as potential mediators of localised mechanical stiffness. CONCLUSIONS: These preliminary data suggest that remodelling, and hence stiffening, of the existing stromal collagen microarchitecture promotes high mammographic density within the breast. In turn, this aberrant mechanical environment may trigger neoplasia-associated mechanotransduction pathways within the epithelial cell population. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13058-015-0664-2) contains supplementary material, which is available to authorized users.

    Development and Validation of a Risk Score for Chronic Kidney Disease in HIV Infection Using Prospective Cohort Data from the D:A:D Study

    Ristola M. is a member of the following working groups: DAD Study Grp; Royal Free Hosp Clin Cohort; INSIGHT Study Grp; SMART Study Grp; ESPRIT Study Grp. Background: Chronic kidney disease (CKD) is a major health issue for HIV-positive individuals, associated with increased morbidity and mortality. Development and implementation of a risk score model for CKD would allow comparison of the risks and benefits of adding potentially nephrotoxic antiretrovirals to a treatment regimen and would identify those at greatest risk of CKD. The aims of this study were to develop a simple, externally validated, and widely applicable long-term risk score model for CKD in HIV-positive individuals that can guide decision making in clinical practice. Methods and Findings: A total of 17,954 HIV-positive individuals from the Data Collection on Adverse Events of Anti-HIV Drugs (D:A:D) study with >= 3 estimated glomerular filtration rate (eGFR) values after 1 January 2004 were included. Baseline was defined as the first eGFR > 60 ml/min/1.73 m2 after 1 January 2004; individuals with exposure to tenofovir, atazanavir, atazanavir/ritonavir, lopinavir/ritonavir, or other boosted protease inhibitors before baseline were excluded. CKD was defined as confirmed (>3 mo apart) eGFR [...] In the D:A:D study, 641 individuals developed CKD during 103,185 person-years of follow-up (PYFU; incidence 6.2/1,000 PYFU, 95% CI 5.7-6.7; median follow-up 6.1 y, range 0.3-9.1 y). Older age, intravenous drug use, hepatitis C coinfection, lower baseline eGFR, female gender, lower CD4 count nadir, hypertension, diabetes, and cardiovascular disease (CVD) predicted CKD. The adjusted incidence rate ratios of these nine categorical variables were scaled and summed to create the risk score. The median risk score at baseline was -2 (interquartile range -4 to 2). There was a 1:393 chance of developing CKD in the next 5 y in the low risk group (risk score [...] = 5, 505 events), respectively. Number needed to harm (NNTH) at 5 y when starting unboosted atazanavir or lopinavir/ritonavir among those with a low risk score was 1,702 (95% CI 1,166-3,367); NNTH was 202 (95% CI 159-278) and 21 (95% CI 19-23), respectively, for those with a medium and high risk score. NNTH was 739 (95% CI 506-1462), 88 (95% CI 69-121), and 9 (95% CI 8-10) for those with a low, medium, and high risk score, respectively, starting tenofovir, atazanavir/ritonavir, or another boosted protease inhibitor. The Royal Free Hospital Clinic Cohort included 2,548 individuals, of whom 94 individuals developed CKD (3.7%) during 18,376 PYFU (median follow-up 7.4 y, range 0.3-12.7 y). Of 2,013 individuals included from the SMART/ESPRIT control arms, 32 individuals developed CKD (1.6%) during 8,452 PYFU (median follow-up 4.1 y, range 0.6-8.1 y). External validation showed that the risk score predicted well in these cohorts. Limitations of this study included limited data on race and no information on proteinuria. Conclusions: Both traditional and HIV-related risk factors were predictive of CKD. These factors were used to develop a risk score for CKD in HIV infection, externally validated, that has direct clinical relevance for patients and clinicians to weigh the benefits of certain antiretrovirals against the risk of CKD and to identify those at greatest risk of CKD. Peer reviewed.
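
The incidence and percentage figures quoted in the abstract above follow from simple ratios of events to person-years or to cohort size. The short sketch below reproduces that arithmetic from the numbers stated in the abstract; the helper names `incidence_per_1000` and `percent_with_event` are hypothetical, and the per-1,000 rates for the two validation cohorts are derived here rather than reported in the abstract.

```python
def incidence_per_1000(events, person_years):
    """Crude incidence rate per 1,000 person-years of follow-up (PYFU)."""
    return 1000.0 * events / person_years


def percent_with_event(events, individuals):
    """Share of a cohort that experienced the event, as a percentage."""
    return 100.0 * events / individuals


# D:A:D: 641 CKD events during 103,185 PYFU (the abstract reports 6.2/1,000 PYFU).
dad_rate = incidence_per_1000(641, 103_185)

# Royal Free Hospital Clinic Cohort: 94 of 2,548 individuals, 18,376 PYFU.
royal_free_pct = percent_with_event(94, 2_548)
royal_free_rate = incidence_per_1000(94, 18_376)    # derived, not in the abstract

# SMART/ESPRIT control arms: 32 of 2,013 individuals, 8,452 PYFU.
smart_esprit_pct = percent_with_event(32, 2_013)
smart_esprit_rate = incidence_per_1000(32, 8_452)   # derived, not in the abstract

print(f"D:A:D: {dad_rate:.1f} per 1,000 PYFU")
print(f"Royal Free: {royal_free_pct:.1f}% ({royal_free_rate:.1f} per 1,000 PYFU)")
print(f"SMART/ESPRIT: {smart_esprit_pct:.1f}% ({smart_esprit_rate:.1f} per 1,000 PYFU)")
```

These are crude rates only; the confidence intervals and the scaled risk score in the study come from regression modelling that this sketch does not attempt to reproduce.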

    A Global Mapping of Protein Complexes in S. cerevisiae

    Systematic identification of protein-protein interactions (PPIs) on a genome scale has become an important focus of biology, as the majority of cellular functions are mediated by these interactions. Several high throughput experimental techniques have emerged as effective tools for querying the protein-protein interactome and can be broadly categorized into those that detect direct, physical protein-protein interactions and those that yield information on the composition of protein complexes. Tandem affinity purification followed by mass spectrometry (TAP/MS) is an example of the latter that identifies proteins that co-purify with a given tagged query (bait) protein. Though TAP/MS enables these co-complexed associations to be identified on a proteome scale, the amount of data generated by the systematic querying of thousands of proteins can be extremely large. Data from multiple purifications are combined to form a very large network of proteins linked by edges whenever the corresponding pairs might form an association. Only a fraction of these pairwise associations correspond to physical interactions, however, and further computational analysis is necessary to filter out non-specific associations. This thesis examines how differing computational procedures for the analysis of TAP/MS data can affect the final PPI network, and outlines a procedure to accurately identify protein complexes from data consolidated from multiple proteome-scale TAP/MS experiments in the budding yeast Saccharomyces cerevisiae. In collaboration with the Greenblatt and Emili laboratories at the University of Toronto, this methodology was extended to yeast membrane proteins to derive a comprehensive network of 13,343 PPIs and 720 protein complexes spanning both membrane and non-membrane proteins. Ph