
    Semantic Similarity for Automatic Classification of Chemical Compounds

    With the increasing amount of data made available in the chemical field, there is a strong need for systems capable of comparing and classifying chemical compounds efficiently and effectively. The best existing approaches are based on the structure-activity relationship premise, which states that the biological activity of a molecule is strongly related to its structural or physicochemical properties. This work presents a novel approach to the automatic classification of chemical compounds by integrating semantic similarity with existing structural comparison methods. Our approach was assessed using the Matthews Correlation Coefficient of its predictions, and achieved values of 0.810 for blood-brain barrier permeability, 0.694 for P-glycoprotein substrate, and 0.673 for estrogen receptor binding activity. These results show a significant improvement over the currently existing methods, whose best performances were 0.628, 0.591, and 0.647 respectively. It was demonstrated that the integration of semantic similarity is a feasible and effective way to improve existing chemical compound classification systems. Among other possible uses, this tool helps the study of the evolution of metabolic pathways, the study of the correlation of metabolic networks with properties of those networks, or the improvement of ontologies that represent chemical information.
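For readers unfamiliar with the metric quoted in this abstract, the following is a minimal sketch of how the Matthews Correlation Coefficient is computed for a binary classifier. It is not the authors' code: the function name `matthews_corrcoef` and the labels are hypothetical, chosen only for illustration, and returning 0.0 when the denominator is zero is a common convention assumed here.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """Matthews Correlation Coefficient for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: report 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical labels, e.g. 1 = "permeable", 0 = "not permeable".
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
print(matthews_corrcoef(y_true, y_pred))
```

The coefficient ranges from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect prediction), which is why it is often preferred over plain accuracy when the two classes are imbalanced.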

    Identification of Stage-Specific Breast Markers using Quantitative Proteomics

    Matched healthy and diseased tissues from breast cancer patients were analyzed by quantitative proteomics. By comparing proteomic profiles of fibroadenoma (benign tumors, three patients), DCIS (noninvasive cancer, three patients), and invasive ductal carcinoma (four patients), we identified protein alterations that correlated with breast cancer progression. Three 8-plex iTRAQ experiments generated an average of 826 protein identifications, of which 402 were common. After excluding those originating from blood, 59 proteins were significantly changed in tumor compared with normal tissues, with the majority associated with invasive carcinomas. Bioinformatics analysis identified relationships between proteins in this subset including roles in redox regulation, lipid transport, protein folding, and proteasomal degradation, with a substantial number increased in expression due to Myc oncogene activation. Three target proteins, cofilin-1 and p23 (increased in invasive carcinoma) and membrane copper amine oxidase 3 (decreased in invasive carcinoma), were subjected to further validation. All three were observed in phenotype-specific breast cancer cell lines, normal (nontransformed) breast cell lines, and primary breast epithelial cells by Western blotting, but only cofilin-1 and p23 were detected by multiple reaction monitoring mass spectrometry analysis. All three proteins were detected by both analytical approaches in matched tissue biopsies emulating the response observed with proteomics analysis. Tissue microarray analysis (361 patients) indicated cofilin-1 staining positively correlating with tumor grade and p23 staining with ER positive status; both therefore merit further investigation as potential biomarkers. Cyprus Research Promotion Foundation, Yorkshire Cancer Researc

    Breast tumors from CHEK2 1100delC-mutation carriers: genomic landscape and clinical implications

    Introduction: Checkpoint kinase 2 (CHEK2) is a moderate penetrance breast cancer risk gene, whose truncating mutation 1100delC increases the risk about twofold. We investigated gene copy-number aberrations and gene-expression profiles that are typical for breast tumors of CHEK2 1100delC-mutation carriers. Methods: In total, 126 breast tumor tissue specimens including 32 samples from patients carrying CHEK2 1100delC were studied in array-comparative genomic hybridization (aCGH) and gene-expression (GEX) experiments. After dimensionality reduction with the CGHregions R package, CHEK2 1100delC-associated regions in the aCGH data were detected by the Wilcoxon rank-sum test. A linear model was fitted to the GEX data with the R package limma. Genes whose expression levels were associated with the CHEK2 1100delC mutation were detected by a Bayesian method. Results: We discovered four lost and three gained CHEK2 1100delC-related loci. These include losses of 1p13.3-31.3, 8p21.1-2, 8p23.1-2, and 17p12-13.1 as well as gains of 12q13.11-3, 16p13.3, and 19p13.3. Twenty-eight genes located in these regions showed differential expression between CHEK2 1100delC and other tumors, nominating them as candidates for CHEK2 1100delC-associated tumor-progression drivers. These included CLCA1 on 1p22 as well as CALCOCO1, SBEM, and LRP1 on 12q13. Altogether, 188 genes were differentially expressed between CHEK2 1100delC and other tumors. Of these, 144 had elevated and 44 had reduced expression levels. Our results suggest the WNT pathway as a driver of tumorigenesis in breast tumors of CHEK2 1100delC-mutation carriers and a role for the olfactory receptor protein family in cancer progression. Differences in the expression of the 188 CHEK2 1100delC-associated genes divided breast tumor samples from three independent datasets into two groups that differed in their relapse-free survival time.
Conclusions: We have shown that copy-number aberrations of certain genomic regions are associated with CHEK2 mutation 1100delC. In these regions, we identified potential drivers of CHEK2 1100delC-associated tumorigenesis, whose role in cancer progression is worth investigating. Furthermore, poorer survival related to the CHEK2 1100delC gene-expression signature highlights pathways that are likely to have a role in the development of metastatic disease in carriers of the CHEK2 1100delC mutation.