1,250 research outputs found

    Gene ontology based transfer learning for protein subcellular localization

    Background: Prediction of protein subcellular localization generally involves many complex factors, and using only one or two aspects of the data may not tell the true story. For this reason, some recent predictive models are deliberately designed to integrate multiple heterogeneous data sources to exploit multi-aspect protein feature information. Gene ontology, hereinafter referred to as GO, uses a controlled vocabulary to describe biological molecules or gene products in terms of biological process, molecular function and cellular component. With the rapid expansion of annotated protein sequences, gene ontology has become a general protein feature that can be used to construct predictive models in computational biology. Existing models generally either concatenate the GO terms into a flat binary vector or apply majority-vote based ensemble learning for protein subcellular localization; neither approach can estimate the individual discriminative abilities of the three aspects of gene ontology.
    Results: In this paper, we propose a Gene Ontology Based Transfer Learning Model (GO-TLM) for large-scale protein subcellular localization. The model transfers the signature-based homologous GO terms to the target proteins, and further constructs a reliable learning system to reduce the adverse effect of potential false GO terms that result from evolutionary divergence. We derive three GO kernels from the three aspects of gene ontology to measure the GO similarity of two proteins, and derive two other spectrum kernels to measure the similarity of two protein sequences. We use simple non-parametric cross validation to explicitly weigh the discriminative abilities of the five kernels, such that the time and space complexities are greatly reduced compared to the complicated semi-definite programming and semi-indefinite linear programming. The five kernels are then linearly merged into one single kernel for protein subcellular localization. We evaluate GO-TLM performance against three baseline models, MultiLoc, MultiLoc-GO and Euk-mPLoc, on the benchmark datasets the baseline models adopted. 5-fold cross validation experiments show that GO-TLM achieves substantial accuracy improvement against the baseline models: 80.38% against Euk-mPLoc's 67.40%, a 12.98% increase; 96.65% and 96.27% against MultiLoc-GO's 89.60% and 89.60%, increases of 7.05% and 6.67%, on datasets MultiLoc plant and MultiLoc animal, respectively; and 97.14%, 95.90% and 96.85% against MultiLoc-GO's 83.70%, 90.10% and 85.70%, increases of 13.44%, 5.80% and 11.15%, on datasets BaCelLoc plant, BaCelLoc fungi and BaCelLoc animal, respectively. For the BaCelLoc independent sets, GO-TLM achieves 81.25%, 80.45% and 79.46% on datasets BaCelLoc plant holdout, BaCelLoc fungi holdout and BaCelLoc animal holdout, respectively, compared against baseline model MultiLoc-GO's 76%, 60.00% and 73.00%, increases of 5.25%, 20.45% and 6.46%, respectively.
    Conclusions: Since direct homology-based GO term transfer may introduce noise and outliers to the target protein, we design an explicitly weighted kernel learning system (the Gene Ontology Based Transfer Learning Model, GO-TLM) that transfers to the target protein the known knowledge about related homologous proteins. This reduces the risk of outliers and shares knowledge between homologous proteins, and thus achieves better predictive performance for protein subcellular localization. Cross validation and independent test results show that homology-based GO term transfer and explicit weighting of the GO kernels substantially improve prediction performance.
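
The linear merging of weighted kernels described in the abstract above can be illustrated with a minimal sketch. The abstract does not state the exact weighting rule, so this example assumes, purely for illustration, that each kernel's weight is proportional to its cross-validation accuracy. The function names, toy matrices and accuracy values are hypothetical and are not taken from GO-TLM.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def normalize_weights(cv_scores: Sequence[float]) -> List[float]:
    """Turn per-kernel cross-validation scores into weights that sum to 1.

    Assumption: weights are proportional to CV accuracy. The abstract only
    says the kernels are explicitly weighted via cross validation.
    """
    total = sum(cv_scores)
    if total <= 0:
        raise ValueError("cross-validation scores must sum to a positive value")
    return [s / total for s in cv_scores]


def combine_kernels(kernels: Sequence[Matrix], weights: Sequence[float]) -> Matrix:
    """Return the weighted sum of same-shaped kernel matrices."""
    if len(kernels) != len(weights):
        raise ValueError("need exactly one weight per kernel")
    n_rows, n_cols = len(kernels[0]), len(kernels[0][0])
    combined = [[0.0] * n_cols for _ in range(n_rows)]
    for kernel, w in zip(kernels, weights):
        for i in range(n_rows):
            for j in range(n_cols):
                combined[i][j] += w * kernel[i][j]
    return combined


# Toy 2x2 kernels standing in for one GO kernel and one spectrum kernel.
k_go = [[1.0, 0.2], [0.2, 1.0]]
k_seq = [[1.0, 0.6], [0.6, 1.0]]
weights = normalize_weights([0.9, 0.6])  # hypothetical CV accuracies
merged = combine_kernels([k_go, k_seq], weights)
```

A non-negative weighted sum of valid (positive semi-definite) kernels is itself a valid kernel, so a matrix like `merged` could be passed to any kernel classifier that accepts a precomputed kernel.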

    Shared probe design and existing microarray reanalysis using PICKY

    Background: Large genomes contain families of highly similar genes that cannot be individually identified by microarray probes. This limitation is due to thermodynamic restrictions and cannot be resolved by any computational method. Since gene annotations are updated more frequently than microarrays, another common issue facing microarray users is that existing microarrays must be routinely reanalyzed to determine which probes are still useful with respect to the updated annotations.
    Results: PICKY 2.0 can design shared probes for sets of genes that cannot be individually identified using unique probes. PICKY 2.0 uses novel algorithms to track sharable regions among genes and to strictly distinguish them from other highly similar but nontarget regions during thermodynamic comparisons. Therefore, PICKY does not sacrifice the quality of shared probes when choosing them. The latest PICKY 2.1 includes the new capability to reanalyze existing microarray probes against updated gene sets to determine which probes are still valid to use. In addition, more precise nonlinear salt effect estimates and other improvements have been added, making PICKY 2.1 more versatile for microarray users.
    Conclusions: Shared probes allow expressed gene family members to be detected; this capability is generally more desirable than not knowing anything about these genes. Shared probes also enable the design of cross-genome microarrays, which facilitate multiple species identification in environmental samples. The new nonlinear salt effect calculation significantly increases the precision of probes at lower buffer salt concentrations, and the probe reanalysis function improves the interpretation of existing microarray results.

    Classification and Analysis of Regulatory Pathways Using Graph Property, Biochemical and Physicochemical Property, and Functional Property

    Given a regulatory pathway system consisting of a set of proteins, can we predict which pathway class it belongs to? Such a problem is closely related to the biological function of the pathway in cells and hence is quite fundamental and essential in systems biology and proteomics. This is also an extremely difficult and challenging problem due to its complexity. To address this problem, a novel approach was developed that can be used to predict query pathways among the following six functional categories: (i) “Metabolism”, (ii) “Genetic Information Processing”, (iii) “Environmental Information Processing”, (iv) “Cellular Processes”, (v) “Organismal Systems”, and (vi) “Human Diseases”. The prediction method was established through the following procedures: (i) according to the general form of pseudo amino acid composition (PseAAC), each of the pathways concerned is formulated as a 5570-D (dimensional) vector; (ii) each of the components in the 5570-D vector was derived by a series of feature extractions from the pathway system according to its graphic property, biochemical and physicochemical property, as well as functional property; (iii) the minimum redundancy maximum relevance (mRMR) method was adopted to operate the prediction. A cross-validation by the jackknife test on a benchmark dataset consisting of 146 regulatory pathways indicated that an overall success rate of 78.8% was achieved by our method in identifying query pathways among the above six classes, indicating the outcome is quite promising and encouraging. To the best of our knowledge, the current study represents the first effort in attempting to identify the type of a pathway system or its biological function. It is anticipated that our report may stimulate a series of follow-up investigations in this new and challenging area.

    Modular prediction of protein structural classes from sequences of twilight-zone identity with predicting sequences

    Background: Knowledge of structural class is used by numerous methods for identification of structural/functional characteristics of proteins and could be used for the detection of remote homologues, particularly for chains that share twilight-zone similarity. In contrast to existing sequence-based structural class predictors, which target four major classes and which are designed for high identity sequences, we predict seven classes from sequences that share twilight-zone identity with the training sequences.
    Results: The proposed MODular Approach to Structural class prediction (MODAS) method is unique as it allows for selection of any subset of the classes. MODAS is also the first to utilize a novel, custom-built feature-based sequence representation that combines evolutionary profiles and predicted secondary structure. The features quantify information relevant to the definition of the classes, including conservation of residues and the arrangement and number of helix/strand segments. Our comprehensive design considers 8 feature selection methods and 4 classifiers to develop Support Vector Machine-based classifiers that are tailored for each of the seven classes. Tests on 5 twilight-zone and 1 high-similarity benchmark datasets and comparison with over two dozen modern competing predictors show that MODAS provides the best overall accuracy, which ranges between 80% and 96.7% (83.5% for the twilight-zone datasets), depending on the dataset. This translates into 19% and 8% error rate reductions when compared against the best performing competing method on the two largest datasets. The proposed predictor achieves 58% accuracy for the membrane proteins class, which is not considered by the majority of existing methods, even though this class accounts for only 2% of the data. Our predictive model is analyzed to demonstrate how and why the input features are associated with the corresponding classes.
    Conclusions: The improved predictions stem from the novel features that express collocation of the secondary structure segments in the protein sequence and that combine evolutionary and secondary structure information. Our work demonstrates that conservation and arrangement of the secondary structure segments predicted along the protein chain can successfully predict structural classes, which are defined based on the spatial arrangement of the secondary structures. A web server is available at http://biomine.ece.ualberta.ca/MODAS/.

    Synergistic Anti-Tumor Effects of Combination of Photodynamic Therapy and Arsenic Compound in Cervical Cancer Cells: In Vivo and In Vitro Studies

    The effects of As4O6 as an adjuvant to photodynamic therapy (PDT) were studied. As4O6 is considered to have anticancer activity via several biological actions, such as free radical production and inhibition of VEGF expression. In an MTT assay, PDT or As4O6 alone significantly inhibited TC-1 cell proliferation in a dose-dependent manner (P<0.05). The anti-proliferative effect of the combination treatment was significantly higher than that of either PDT or As4O6 alone (62.4% and 52.5% decrease compared to vehicle-only treated TC-1 cells, respectively, P<0.05). With the combination of PDT and As4O6, cell proliferation decreased significantly, by 77.4% (P<0.05). The cell survival pathway (Naip1, Tert and Aip1) and the p53-dependent pathway (Bax, p21Cip1, Fas, Gadd45, IGFBP-3 and Mdm-2) were markedly increased by combination treatment with PDT and As4O6. In addition, the immune response in the NEAT pathway (Ly-12, CD178 and IL-2) was also modulated after combination treatment, suggesting improved antitumor effects by controlling unwanted growth-stimulatory pathways. In vivo, the combination treatment restricted tumor growth in concordance with the in vitro data, and it involved some of the same signaling pathways observed in vitro. These findings suggest the benefit of combinatory treatment with PDT and As4O6 for inhibition of cervical cancer cell growth.

    HDAC Inhibition Decreases the Expression of EGFR in Colorectal Cancer Cells

    Epidermal growth factor receptor (EGFR), a receptor tyrosine kinase which promotes cell proliferation and survival, is abnormally overexpressed in numerous tumors of epithelial origin, including colorectal cancer (CRC). EGFR monoclonal antibodies have been shown to increase median survival and are approved for the treatment of colorectal cancer. Histone deacetylases (HDACs), frequently overexpressed in colorectal cancer and several other malignancies, are another attractive target for cancer therapy. Several inhibitors of HDACs (HDACi) have been developed and exhibit powerful antitumor activity. In this study, human colorectal cancer cells treated with HDACi exhibited reduced EGFR expression, thereby disrupting EGF-induced ERK and Akt phosphorylation. HDACi also decreased the expression of SGLT1, an active glucose transporter found to be stabilized by EGFR, and suppressed the glucose uptake of cancer cells. HDACi suppressed the transcription of EGFR, and class I HDACs were shown to be involved in this event. Chromatin immunoprecipitation analysis showed that HDACi caused the dissociation of SP1, HDAC3 and CBP from the EGFR promoter. Our data suggest that HDACi could serve as a single agent to block both EGFR and HDAC, and may bring more benefits to the development of CRC therapy.

    The Use of Nanoscale Visible Light-Responsive Photocatalyst TiO2-Pt for the Elimination of Soil-Borne Pathogens

    Exposure to the soil-borne pathogens Burkholderia pseudomallei and Burkholderia cenocepacia can lead to severe infections and even mortality. These pathogens exhibit a high resistance to antibiotic treatments. In addition, no licensed vaccine is currently available. A nanoscale platinum-containing titania photocatalyst (TiO2-Pt) has been shown to have a superior visible light-responsive photocatalytic ability to degrade chemical contaminants like nitrogen oxides. The antibacterial activity of the catalyst and its potential use in soil pathogen control were evaluated. Using the plating method, we found that TiO2-Pt exerts superior antibacterial performance against Escherichia coli compared to other commercially available and laboratory prepared ultraviolet/visible light-responsive titania photocatalysts. TiO2-Pt-mediated photocatalysis also effectively eliminates the soil-borne bacteria B. pseudomallei and B. cenocepacia. An air pouch infection mouse model further revealed that TiO2-Pt-mediated photocatalysis could reduce the pathogenicity of both strains of bacteria. Unexpectedly, water containing up to 10% w/v dissolved soil particles did not reduce the antibacterial potency of TiO2-Pt, suggesting that the TiO2-Pt photocatalyst is suitable for use in soil-contaminated environments. The TiO2-Pt photocatalyst exerted superior antibacterial activity against a broad spectrum of human pathogens, including B. pseudomallei and B. cenocepacia. Soil particles (<10% w/v) did not significantly reduce the antibacterial activity of TiO2-Pt in water. These findings suggest that the TiO2-Pt photocatalyst may have potential applications in the development of bactericides for soil-borne pathogens.

    Search for Pair Production of Scalar Top Quarks Decaying to a tau Lepton and a b Quark in ppbar Collisions at sqrt{s}=1.96 TeV

    We search for pair production of supersymmetric top quarks (~t_1), followed by R-parity violating decay ~t_1 -> tau b with a branching ratio beta, using 322 pb^-1 of ppbar collisions at sqrt{s}=1.96 TeV collected by the CDF II detector at Fermilab. Two candidate events pass our final selection criteria, consistent with the standard model expectation. We set upper limits on the cross section sigma(~t_1 ~tbar_1)*beta^2 as a function of the stop mass m(~t_1). Assuming beta=1, we set a 95% confidence level limit m(~t_1)>153 GeV/c^2. The limits are also applicable to the case of a third generation scalar leptoquark (LQ_3) decaying LQ_3 -> tau b. Comment: 7 pages, 2 eps figures

    Measurement of the Dipion Mass Spectrum in X(3872) -> J/Psi Pi+ Pi- Decays

    We measure the dipion mass spectrum in X(3872) -> J/Psi Pi+ Pi- decays using 360 pb^-1 of pbar-p collisions at 1.96 TeV collected with the CDF II detector. The spectrum is fit with predictions for odd C-parity (3S1, 1P1, and 3DJ) charmonia decaying to J/Psi Pi+ Pi-, as well as even C-parity states in which the pions are from Rho0 decay. The latter case also encompasses exotic interpretations, such as a D0-D*0Bar molecule. Only the 3S1 and J/Psi Rho hypotheses are compatible with our data. Since 3S1 is untenable on other grounds, decay via J/Psi Rho is favored, which implies C=+1 for the X(3872). Models for different J/Psi-Rho angular momenta L are considered. Flexibility in the models, especially the introduction of Rho-Omega interference, enables good descriptions of our data for both L=0 and 1. Comment: 7 pages, 4 figures -- Submitted to Phys. Rev. Lett.

    Performance of CMS muon reconstruction in pp collision events at sqrt(s) = 7 TeV

    The performance of muon reconstruction, identification, and triggering in CMS has been studied using 40 inverse picobarns of data collected in pp collisions at sqrt(s) = 7 TeV at the LHC in 2010. A few benchmark sets of selection criteria covering a wide range of physics analysis needs have been examined. For all considered selections, the efficiency to reconstruct and identify a muon with a transverse momentum pT larger than a few GeV is above 95% over the whole region of pseudorapidity covered by the CMS muon system, abs(eta) < 2.4, while the probability to misidentify a hadron as a muon is well below 1%. The efficiency to trigger on single muons with pT above a few GeV is higher than 90% over the full eta range, and typically substantially better. The overall momentum scale is measured to a precision of 0.2% with muons from Z decays. The transverse momentum resolution varies from 1% to 6% depending on pseudorapidity for muons with pT below 100 GeV and, using cosmic rays, it is shown to be better than 10% in the central region up to pT = 1 TeV. Observed distributions of all quantities are well reproduced by the Monte Carlo simulation. Comment: Replaced with published version. Added journal reference and DOI