Domain-mediated interactions for protein subfamily identification
Within a protein family, proteins with the same domain often exhibit different cellular functions, despite the shared evolutionary history and molecular function of the domain. We hypothesized that domain-mediated interactions (DMIs) may categorize a protein family into subfamilies because the diversified functions of a single domain often depend on interacting partners of domains. Here we systematically identified DMI subfamilies, in which proteins share domains with DMI partners, as well as with various functional and physical interaction networks in individual species. In humans, DMI subfamily members are associated with similar diseases, including cancers, and are frequently co-associated with the same diseases. DMI information relates to the functional and evolutionary subdivisions of human kinases. In yeast, DMI subfamilies contain proteins with similar phenotypic outcomes from specific chemical treatments. Therefore, the systematic investigation here provides insights into the diverse functions of subfamilies derived from a protein family with a link-centric approach and suggests a useful resource for annotating the functions and phenotypic outcomes of proteins.
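As a rough illustration of the link-centric idea, the sketch below groups members of one protein family by the (domain, partner domain) interactions they share. The protein names, the domain labels, and the grouping rule itself are hypothetical; the abstract does not specify the actual procedure.

```python
from collections import defaultdict


def dmi_subfamilies(protein_dmis):
    """Group proteins that share the same (domain, partner_domain) pair.

    protein_dmis maps a protein name to a set of (domain, partner_domain)
    tuples. A protein with several DMIs can appear in several subfamilies.
    This grouping rule is an assumption made for illustration only.
    """
    groups = defaultdict(set)
    for protein, dmis in protein_dmis.items():
        for dmi in dmis:
            groups[dmi].add(protein)
    # Keep only interactions shared by at least two proteins.
    return {dmi: members for dmi, members in groups.items() if len(members) >= 2}


# Toy family: every protein carries the same domain but different partners.
family = {
    "protA": {("kinase", "SH2")},
    "protB": {("kinase", "SH2"), ("kinase", "SH3")},
    "protC": {("kinase", "SH3")},
    "protD": {("kinase", "PDZ")},
}
subfamilies = dmi_subfamilies(family)
```

In this toy input, `protD` falls into no subfamily because no other protein shares its interaction partner.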
Combined population dynamics and entropy modelling supports patient stratification in chronic myeloid leukemia
Modelling the parameters of multistep carcinogenesis is key for a better understanding of cancer
progression, biomarker identification and the design of individualized therapies. Using chronic
myeloid leukemia (CML) as a paradigm for hierarchical disease evolution we show that combined
population dynamic modelling and CML patient biopsy genomic analysis enables patient stratification
at unprecedented resolution. Linking CD34+ similarity as a disease progression marker to patient-derived
gene expression entropy separated established CML progression stages and uncovered
additional heterogeneity within disease stages. Importantly, our patient data informed model enables
quantitative approximation of individual patients’ disease history within chronic phase (CP) and
significantly separates “early” from “late” CP. Our findings provide a novel rationale for personalized
and genome-informed disease progression risk assessment that is independent of and complementary to
conventional measures of CML disease burden and prognosis.
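The abstract does not say how gene expression entropy is defined. One common choice, assumed here purely for illustration, is the Shannon entropy of an expression profile normalized to sum to one. The function name and the toy profiles below are hypothetical.

```python
import math


def expression_entropy(expression):
    """Shannon entropy (in bits) of a non-negative expression profile.

    The profile is normalized to a probability distribution first.
    Zero entries contribute nothing to the sum. This definition is an
    assumption, not necessarily the one used in the paper.
    """
    total = sum(expression)
    if total <= 0:
        raise ValueError("expression profile must have a positive sum")
    probs = [x / total for x in expression if x > 0]
    return -sum(p * math.log2(p) for p in probs)


# A flat profile has maximal entropy; a profile dominated by one gene has less.
uniform = expression_entropy([5.0, 5.0, 5.0, 5.0])
skewed = expression_entropy([17.0, 1.0, 1.0, 1.0])
```

Under this definition, a sample whose expression is spread evenly over many genes scores higher than one dominated by a few transcripts.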
Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?
The organization and mining of malaria genomic and post-genomic data is
highly motivated by the necessity to predict and characterize new biological
targets and new drugs. Biological targets are sought in a biological space
designed from the genomic data of Plasmodium falciparum, but also using the
millions of genomic data from other species. Drug candidates are sought in a
chemical space containing the millions of small molecules stored in public and
private chemolibraries. Data management should therefore be as reliable and
versatile as possible. In this context, we examined five aspects of the
organization and mining of malaria genomic and post-genomic data: 1) the
comparison of protein sequences including compositionally atypical malaria
sequences, 2) the high throughput reconstruction of molecular phylogenies, 3)
the representation of biological processes particularly metabolic pathways, 4)
the versatile methods to integrate genomic data, biological representations and
functional profiling obtained from X-omic experiments after drug treatments and
5) the determination and prediction of protein structures and their molecular
docking with drug candidate structures. Progress toward a grid-enabled
chemogenomic knowledge space is discussed.
Comment: 43 pages, 4 figures, to appear in Malaria Journa
Gene Expression Meta-Analysis Reveals Concordance in Gene Activation, Pathway, and Cell-Type Enrichment in Dermatomyositis Target Tissues.
Objective: We conducted a comprehensive gene expression meta-analysis in dermatomyositis (DM) muscle and skin tissues to identify shared disease-relevant genes and pathways across tissues.
Methods: Six publicly available data sets from DM muscle and two from skin were identified. Meta-analysis was performed by first processing data sets individually, then applying cross-study normalization and merging to create tissue-specific gene expression matrices for subsequent analysis. Complementary single-gene and network analyses using Significance Analysis of Microarrays (SAM) and Weighted Gene Co-expression Network Analysis (WGCNA) were conducted to identify genes significantly associated with DM. Cell-type enrichment was performed using xCell.
Results: There were 544 differentially expressed genes (FC ≥ 1.3, q < 0.05) in muscle and 300 in skin. There were 94 shared upregulated genes across tissues enriched in type I and II interferon (IFN) signaling and major histocompatibility complex (MHC) class I antigen-processing pathways. In a network analysis, we identified eight significant gene modules in muscle and seven in skin. The most highly correlated modules were enriched in pathways consistent with the single-gene analysis. Additional pathways uncovered by WGCNA included T-cell activation and T-cell receptor signaling. In the cell-type enrichment analysis, both tissues were highly enriched in activated dendritic cells and M1 macrophages.
Conclusion: There is striking similarity in gene expression across DM target tissues with enrichment of type I and II IFN pathways, MHC class I antigen-processing, T-cell activation, and antigen-presenting cells. These results suggest IFN-γ may contribute to the global IFN signature in DM, and altered auto-antigen presentation through the class I MHC pathway may be important in disease pathogenesis.
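The thresholding and cross-tissue overlap reported above can be sketched as follows. Only the cutoffs (FC ≥ 1.3, q < 0.05) come from the abstract. The gene names and statistics are invented, and the code filters precomputed values rather than running SAM.

```python
def upregulated(stats, fc_min=1.3, q_max=0.05):
    """Return the genes with fold change >= fc_min and q-value < q_max.

    stats maps a gene name to a (fold_change, q_value) tuple. The values
    are assumed to come from an upstream analysis not reproduced here.
    """
    return {gene for gene, (fc, q) in stats.items() if fc >= fc_min and q < q_max}


# Hypothetical per-tissue statistics: gene -> (fold change, q-value).
muscle = {
    "GENE_A": (2.1, 0.01),
    "GENE_B": (1.2, 0.01),  # fails the fold-change cutoff
    "GENE_C": (1.5, 0.20),  # fails the q-value cutoff
    "GENE_D": (1.8, 0.03),
}
skin = {
    "GENE_A": (1.6, 0.02),
    "GENE_D": (1.1, 0.01),  # fails the fold-change cutoff in skin
    "GENE_E": (3.0, 0.001),
}

# Genes passing both cutoffs in both tissues.
shared = upregulated(muscle) & upregulated(skin)
```

Here only `GENE_A` passes both cutoffs in both tissues, which mirrors how a shared upregulated set would be obtained as a set intersection.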
Gene expression in large pedigrees: analytic approaches.
Background: We currently have the ability to quantify transcript abundance of messenger RNA (mRNA), genome-wide, using microarray technologies. Analyzing genotype, phenotype and expression data from 20 pedigrees, the members of our Genetic Analysis Workshop (GAW) 19 gene expression group published 9 papers, tackling some timely and important problems and questions. To study the complexity and interrelationships of genetics and gene expression, we used established statistical tools, developed newer statistical tools, and developed and applied extensions to these tools.
Methods: To study gene expression correlations in the pedigree members (without incorporating genotype or trait data into the analysis), 2 papers used principal components analysis, weighted gene coexpression network analysis, meta-analyses, gene enrichment analyses, and linear mixed models. To explore the relationship between genetics and gene expression, 2 papers studied expression quantitative trait locus allelic heterogeneity through conditional association analyses, and epistasis through interaction analyses. A third paper assessed the feasibility of applying allele-specific binding to filter potential regulatory single-nucleotide polymorphisms (SNPs). Analytic approaches included linear mixed models based on measured genotypes in pedigrees, permutation tests, and covariance kernels. To incorporate both genotype and phenotype data with gene expression, 4 groups employed linear mixed models, nonparametric weighted U statistics, structural equation modeling, Bayesian unified frameworks, and multiple regression.
Results and discussion: Regarding the analysis of pedigree data, we found that gene expression is familial, indicating that at least 1 factor for pedigree membership or multiple factors for the degree of relationship should be included in analyses, and we developed a method to adjust for familiality prior to conducting weighted co-expression gene network analysis. For SNP association and conditional analyses, we found FaST-LMM (Factored Spectrally Transformed Linear Mixed Model) and SOLAR-MGA (Sequential Oligogenic Linkage Analysis Routines - Major Gene Analysis) have similar type 1 and type 2 errors and can be used almost interchangeably. To improve the power and precision of association tests, prior knowledge of DNase-I hypersensitivity sites or other relevant biological annotations can be incorporated into the analyses. On a biological level, eQTL (expression quantitative trait loci) are genetically complex, exhibiting both allelic heterogeneity and epistasis. Including both genotype and phenotype data together with measurements of gene expression was found to be generally advantageous in terms of generating improved levels of significance and in providing more interpretable biological models.
Conclusions: Pedigrees can be used to conduct analyses of and enhance gene expression studies.