102 research outputs found
Multi-task Deep Neural Networks in Automated Protein Function Prediction
In recent years, deep learning algorithms have outperformed state-of-the-art
methods in several areas thanks to efficient methods for training and for
preventing overfitting, advances in computer hardware, and the availability
of vast amounts of data. The high performance of multi-task deep neural networks
in drug discovery has drawn attention to deep learning algorithms in
bioinformatics. Here, we proposed a hierarchical multi-task deep neural
network architecture based on Gene Ontology (GO) terms as a solution to the protein
function prediction problem and investigated various aspects of the proposed
architecture by performing several experiments. First, we showed that there is
a positive correlation between the performance of the system and the size of
the training datasets. Second, we investigated whether the level of GO terms in
the GO hierarchy is related to their performance. We showed that there is no
relation between the depth of GO terms in the GO hierarchy and their
performance. In addition, we included all annotations in the training of a set
of GO terms to investigate whether including noisy data in the training
datasets changes the performance of the system. The results showed that
including less reliable annotations in the training of deep neural networks
significantly increased the performance of poorly performing GO terms. We
evaluated the performance of the system using a hierarchical evaluation method.
The Matthews correlation coefficient
was calculated as 0.75, 0.49 and 0.63 for molecular function, biological
process and cellular component categories, respectively. We showed that deep
learning algorithms have great potential in the protein function prediction
area. We plan to further improve DEEPred by including other types of
annotations from various biological data sources. We plan to release DEEPred as
an open access online tool.
Comment: 19 pages, 4 figures, 4 tables
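The Matthews correlation coefficient quoted in the abstract above can be illustrated with a minimal sketch. This is the standard binary MCC computed from confusion-matrix counts; it does not reproduce the hierarchical evaluation method the authors used, and the function name and the zero-denominator convention are assumptions made for illustration.

```python
import math


def matthews_corrcoef(tp, fp, tn, fn):
    """Standard binary MCC from confusion-matrix counts.

    Returns 0.0 when any marginal sum is zero (a common convention,
    assumed here; the abstract does not say how this case was handled).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Perfect agreement, perfect disagreement, and chance-level prediction.
print(matthews_corrcoef(50, 0, 50, 0))
print(matthews_corrcoef(0, 50, 0, 50))
print(matthews_corrcoef(25, 25, 25, 25))
```

MCC ranges from -1 to 1 and remains informative when positive and negative classes are imbalanced, which is typical for GO term annotation.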
A signal transduction score flow algorithm for cyclic cellular pathway analysis, which combines transcriptome and ChIP-seq data
Determination of cell signalling behaviour is crucial for understanding the physiological response to a specific stimulus or drug treatment. Current approaches for large-scale data analysis do not effectively incorporate critical topological information provided by the signalling network. We herein describe a novel model- and data-driven hybrid approach, the signal transduction score flow algorithm, which allows quantitative visualization of cyclic cell signalling pathways that lead to ultimate cell responses such as survival, migration or death. This score flow algorithm represents signalling pathways as a directed graph and maps experimental data, including negative and positive feedbacks, onto gene nodes as scores, which then computationally traverse the signalling pathway until a pre-defined biological target response is attained. Initially, experimental data-driven enrichment scores of the genes in a pathway were computed; then a heuristic approach was applied using the gene score partition as a solution for protein node stoichiometry during dynamic scoring of the pathway of interest. Incorporation of a score partition during the signal flow and of cyclic feedback loops in the signalling pathway significantly improves the usefulness of this model compared to other approaches. Evaluation of the score flow algorithm using both transcriptome and ChIP-seq data-generated signalling pathways showed good correlation with expected cellular behaviour on both KEGG and manually generated pathways. Implementation of the algorithm as a Cytoscape plug-in allows interactive visualization and analysis of KEGG pathways as well as user-generated and curated Cytoscape pathways. Moreover, the algorithm accurately predicts gene-level and global impacts of single or multiple in silico gene knockouts. This article is freely accessible with the permission of the rights holder under a (DFG-funded) Alliance or National Licence.
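The general idea of scores traversing a directed, possibly cyclic pathway can be sketched as follows. This is a toy illustration and not the authors' algorithm: the equal split of a node's signal among its outgoing edges is only a stand-in for the score partition heuristic, which the abstract does not specify, and the function name, edge format, and round limit are all assumptions.

```python
from collections import defaultdict


def score_flow(edges, gene_scores, sources, target, max_rounds=10):
    """Propagate scores from source nodes until the target is reached.

    edges: iterable of (src, dst, sign), sign = +1 (activation) or -1 (inhibition).
    gene_scores: per-gene starting scores (e.g. enrichment scores).
    The signal leaving a node is split equally among its outgoing edges
    (an assumed placeholder for the paper's score partition).
    max_rounds bounds the traversal so feedback loops cannot run forever.
    """
    out_edges = defaultdict(list)
    for src, dst, sign in edges:
        out_edges[src].append((dst, sign))

    totals = defaultdict(float, gene_scores)
    flow = {s: gene_scores.get(s, 0.0) for s in sources}

    for _ in range(max_rounds):
        next_flow = defaultdict(float)
        for node, signal in flow.items():
            targets = out_edges.get(node, [])
            for dst, sign in targets:
                next_flow[dst] += sign * signal / len(targets)
        if not next_flow:
            break  # the signal reached nodes with no outgoing edges
        for node, signal in next_flow.items():
            totals[node] += signal
        if target in next_flow:
            break  # the pre-defined target response node received a score
        flow = next_flow
    return dict(totals)


# Hypothetical pathway with a negative feedback edge C -| A.
edges = [("A", "B", 1), ("A", "C", 1), ("B", "T", 1), ("C", "A", -1)]
scores = {"A": 4.0, "B": 1.0, "C": 1.0, "T": 0.0}
result = score_flow(edges, scores, sources=["A"], target="T")
print(result)
```

In this example the feedback edge lowers the accumulated score of A in the same round in which the target T receives its signal.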
Synthesis of novel indole-isoxazole hybrids and evaluation of their cytotoxic activities on hepatocellular carcinoma cell lines
Background: Liver cancer is predicted to be the sixth most diagnosed cancer globally and the fourth leading cause of cancer deaths. In this study, a series of indole-3-isoxazole-5-carboxamide derivatives were designed, synthesized, and evaluated for their anticancer activities. The chemical structures of the final compounds and intermediates were characterized by IR, HRMS, ¹H NMR and ¹³C NMR spectroscopy and elemental analysis. Results: Cytotoxic activity was evaluated against Huh7, MCF7 and HCT116 cancer cell lines using the sulforhodamine B (SRB) assay. Some compounds showed potent anticancer activities, and three of them were chosen for further evaluation on liver cancer cell lines based on the SRB assay and real-time cell growth tracking analysis. The compounds were shown to cause arrest in the G0/G1 phase in Huh7 cells and a significant decrease in CDK4 levels. A good correlation was obtained between the theoretical predictions of bioavailability using Molinspiration calculation, Lipinski's rule of five, and experimental verification. These investigations reveal that indole-isoxazole hybrid systems have the potential for the development of novel anticancer agents. Conclusions: This study has provided data that will form the basis of further studies that aim to optimize both the design and synthesis of novel compounds that have higher anticancer activities.
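Lipinski's rule of five, mentioned in the abstract above, can be expressed as a short check on four molecular descriptors. This is a minimal sketch that assumes the descriptors have already been computed by some other tool; the function names and the example descriptor values are hypothetical and are not taken from the study.

```python
def lipinski_violations(mol_weight, log_p, h_donors, h_acceptors):
    """Count violations of Lipinski's rule of five.

    Thresholds: molecular weight <= 500 Da, logP <= 5,
    hydrogen-bond donors <= 5, hydrogen-bond acceptors <= 10.
    """
    checks = [
        mol_weight > 500,
        log_p > 5,
        h_donors > 5,
        h_acceptors > 10,
    ]
    return sum(checks)


def passes_rule_of_five(mol_weight, log_p, h_donors, h_acceptors, max_violations=1):
    """A compound is commonly accepted if it has at most one violation."""
    return lipinski_violations(mol_weight, log_p, h_donors, h_acceptors) <= max_violations


# Hypothetical descriptor values, for illustration only.
print(passes_rule_of_five(350.4, 3.2, 2, 5))
print(passes_rule_of_five(650.0, 6.1, 2, 5))
```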
Identification of relative protein bands in Polyacrylamide Gel Electrophoresis (PAGE) using multiresolution snake algorithm
Polyacrylamide Gel Electrophoresis (PAGE) is one of the most widely used techniques in protein research. In the protein purification process, it is important to determine the efficiency of each purification step in terms of the percentage of the protein of interest found in the protein mixture. This study provides a rapid and reliable way to determine this percentage. The region of interest containing the protein is detected using the snake algorithm. The iterative snake algorithm is implemented in a multiresolution framework. The snake is initialized on a low-resolution image. Then, the final position of the snake at low resolution is used as the initial position in the higher-resolution image. Finally, the area of the protein is estimated as the area enclosed by the final position of the snake.
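Two steps described above lend themselves to a short sketch: carrying a contour from a low-resolution image to a higher-resolution one, and measuring the area the final contour encloses. The snake evolution itself is not shown. The function names are hypothetical, the resolution factor of 2 is an assumption, and the shoelace formula is one standard way to compute a polygon's area, not necessarily the one used in the study.

```python
def upscale_contour(points, factor=2):
    """Map contour points from a coarse image onto a finer one.

    Assumes the finer image is `factor` times larger along each axis.
    """
    return [(x * factor, y * factor) for x, y in points]


def polygon_area(points):
    """Area enclosed by a closed contour, using the shoelace formula."""
    twice_area = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the contour
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# A square contour found at low resolution, then used to initialize
# the snake at the next resolution level.
coarse = [(0, 0), (2, 0), (2, 2), (0, 2)]
fine = upscale_contour(coarse, factor=2)
print(polygon_area(coarse), polygon_area(fine))
```

Doubling the resolution quadruples the enclosed area in pixels, so areas from different resolution levels must be compared at a common scale.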
Protein domain-based prediction of drug/compound–target interactions and experimental validation on LIM kinases
Predictive approaches such as virtual screening have been used in drug discovery with the objective of reducing developmental time and costs. Current machine learning and network-based approaches have issues related to generalization, usability, or model interpretability, especially due to the complexity of target proteins’ structure/function, and bias in system training datasets. Here, we propose a new method “DRUIDom” (DRUg Interacting Domain prediction) to identify bio-interactions between drug candidate compounds and targets by utilizing the domain modularity of proteins, to overcome problems associated with current approaches. DRUIDom is composed of two methodological steps. First, ligands/compounds are statistically mapped to structural domains of their target proteins, with the aim of identifying their interactions. As such, other proteins containing the same mapped domain or domain pair become new candidate targets for the corresponding compounds. Next, a million-scale dataset of small molecule compounds, including those mapped to domains in the previous step, is clustered based on molecular similarity, and the domain associations are propagated to other compounds within the same clusters. Experimentally verified bioactivity data points, obtained from public databases, are meticulously filtered to construct datasets of active/interacting and inactive/non-interacting drug/compound–target pairs (~2.9M data points), and used as training data for calculating parameters of compound–domain mappings, which led to 27,032 high-confidence associations between 250 domains and 8,165 compounds, and a finalized output of ~5 million new compound–protein interactions. DRUIDom is experimentally validated by syntheses and bioactivity analyses of compounds predicted to target LIM-kinase proteins, which play critical roles in the regulation of cell motility, cell cycle progression, and differentiation through actin filament dynamics.
We showed that LIMK-inhibitor-2 and its derivatives significantly block cancer cell migration through inhibition of LIMK phosphorylation and the downstream protein cofilin. One of the derivative compounds (LIMKi-2d) was identified as a promising candidate due to its action on resistant Mahlavu liver cancer cells. The results demonstrated that DRUIDom can be exploited to identify drug candidate compounds for intended targets and to predict new target proteins based on the defined compound–domain relationships. Datasets, results, and the source code of DRUIDom are fully available at: https://github.com/cansyl/DRUIDom
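The first step described above, in which proteins that share a mapped domain become new candidate targets, can be sketched with plain dictionaries. This is a toy illustration under stated assumptions, not the DRUIDom implementation: the statistical mapping, the domain-pair handling, and the cluster-based propagation are omitted, and every identifier and data value below is hypothetical.

```python
def propagate_by_domain(compound_to_domains, protein_to_domains, known_pairs):
    """Predict new compound-protein pairs through shared domains.

    compound_to_domains: compound id -> set of domains it was mapped to.
    protein_to_domains: protein id -> set of domains the protein contains.
    known_pairs: set of (compound, protein) pairs already known, excluded
    so that only new candidate interactions are returned.
    """
    predictions = set()
    for compound, domains in compound_to_domains.items():
        for protein, protein_domains in protein_to_domains.items():
            shares_domain = bool(domains & protein_domains)
            if shares_domain and (compound, protein) not in known_pairs:
                predictions.add((compound, protein))
    return predictions


# Hypothetical identifiers, for illustration only.
compound_to_domains = {"cmpd1": {"domA"}}
protein_to_domains = {"P1": {"domA", "domB"}, "P2": {"domA"}, "P3": {"domC"}}
known_pairs = {("cmpd1", "P1")}

new_pairs = propagate_by_domain(compound_to_domains, protein_to_domains, known_pairs)
print(new_pairs)
```

The actual source code is available at the repository linked in the abstract.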
CROssBAR: comprehensive resource of biomedical relations with knowledge graph representations
Systemic analysis of available large-scale biological/biomedical data is critical for studying biological mechanisms, and developing novel and effective treatment approaches against diseases. However, different layers of the available data are produced using different technologies and scattered across individual computational resources without any explicit connections to each other, which hinders extensive and integrative multi-omics-based analysis. We aimed to address this issue by developing a new data integration/representation methodology and its application by constructing a biological data resource. CROssBAR is a comprehensive system that integrates large-scale biological/biomedical data from various resources and stores them in a NoSQL database. CROssBAR is enriched with the deep-learning-based prediction of relationships between numerous data entries, which is followed by the rigorous analysis of the enriched data to obtain biologically meaningful modules. These complex sets of entities and relationships are displayed to users via easy-to-interpret, interactive knowledge graphs within an open-access service. CROssBAR knowledge graphs incorporate relevant genes-proteins, molecular interactions, pathways, phenotypes, diseases, as well as known/predicted drugs and bioactive compounds, and they are constructed on-the-fly based on simple non-programmatic user queries. These intensely processed heterogeneous networks are expected to aid systems-level research, especially to infer biological mechanisms in relation to genes, proteins, their ligands, and diseases.
Identification of an mRNA isoform switch for HNRNPA1 in breast cancers.
Roles of HNRNPA1 are beginning to emerge in cancers; however, mechanisms causing deregulation of HNRNPA1 function remain elusive. Here, we describe an isoform switch between the 3′-UTR isoforms of HNRNPA1 in breast cancers. We show that the dominantly expressed isoform in mammary tissue has a short half-life. In breast cancers, this isoform is downregulated in favor of a stable isoform. The stable isoform is expressed more in breast cancers, and more HNRNPA1 protein is synthesized from this isoform. High HNRNPA1 protein levels correlate with poor survival in patients. In support of this, silencing of HNRNPA1 causes a reversal in neoplastic phenotypes, including proliferation, clonogenic potential, migration, and invasion. In addition, silencing of HNRNPA1 results in the downregulation of microRNAs that map to intragenic regions. Among these miRNAs, miR-21 is known for its transcriptional upregulation in breast and numerous other cancers. Altogether, the cancer-specific isoform switch we describe here for HNRNPA1 emphasizes the need to study gene expression at the isoform level in cancers to identify novel cases of oncogene activation.
Loss of heme oxygenase 2 causes reduced expression of genes in cardiac muscle development and contractility and leads to cardiomyopathy in mice
Obstructive sleep apnea (OSA) is a common breathing disorder that affects a significant portion of the adult population. In addition to causing excessive daytime sleepiness and neurocognitive effects, OSA is an independent risk factor for cardiovascular disease; however, the underlying mechanisms are not completely understood. Using exposure to intermittent hypoxia (IH) to mimic OSA, we have recently reported that mice exposed to IH exhibit endothelial cell (EC) activation, which is an early process preceding the development of cardiovascular disease. Although widely used, IH models have several limitations such as the severity of hypoxia, which does not occur in most patients with OSA. Recent studies reported that mice with deletion of heme oxygenase 2 (Hmox2-/-), which plays a key role in oxygen sensing in the carotid body, exhibit spontaneous apneas during sleep and elevated levels of catecholamines. Here, using RNA sequencing, we investigated the transcriptomic changes in aortic ECs and heart tissue to understand the changes that occur in Hmox2-/- mice. In addition, we evaluated cardiac structure, function, and electrical properties in these mice using echocardiography and electrocardiography. We found that Hmox2-/- mice exhibited aortic EC activation. Transcriptomic analysis in aortic ECs showed differentially expressed genes enriched in blood coagulation, cell adhesion, cellular respiration, and cardiac muscle development and contraction. Similarly, transcriptomic analysis in heart tissue showed a differentially expressed gene set enriched in mitochondrial translation, oxidative phosphorylation and cardiac muscle development. Analysis of transcriptomic data from aortic ECs and heart tissue indicated that loss of the Hmox2 gene might have common cellular network footprints on aortic endothelial cells and heart tissue. Echocardiographic evaluation showed that Hmox2-/- mice develop progressive dilated cardiomyopathy and conduction abnormalities compared to Hmox2+/+ mice.
In conclusion, we found that Hmox2-/- mice, which spontaneously develop apneas, exhibit EC activation and transcriptomic and functional changes consistent with heart failure.
- …