35 research outputs found
Prediction of Kinase-Substrate Associations Using The Functional Landscape of Kinases and Phosphorylation Sites
Protein phosphorylation is a key post-translational modification that plays a central role in many cellular processes. With recent advances in biotechnology, thousands of phosphorylated sites can be identified and quantified in a given sample, enabling proteome-wide screening of cellular signaling. However, for most (> 90%) of the phosphorylation sites that are identified in these experiments, the kinase(s) that target these sites are unknown. To broadly utilize available structural, functional, evolutionary, and contextual information in predicting kinase-substrate associations (KSAs), we develop a network-based machine learning framework. Our framework integrates a multitude of data sources to characterize the landscape of functional relationships and associations among phosphosites and kinases. To construct a phosphosite-phosphosite association network, we use sequence similarity, shared biological pathways, co-evolution, co-occurrence, and co-phosphorylation of phosphosites across different biological states. To construct a kinase-kinase association network, we integrate protein-protein interactions, shared biological pathways, and membership in common kinase families. We use node embeddings computed from these heterogeneous networks to train machine learning models for predicting kinase-substrate associations. Our systematic computational experiments using the PhosphositePLUS database show that the resulting algorithm, NetKSA, outperforms two state-of-the-art algorithms, KinomeXplorer and LinkPhinder, in overall KSA prediction. By stratifying the ranking of kinases, NetKSA also enables annotation of phosphosites that are targeted by relatively less-studied kinases. Availability: The code and data are available at compbio.case.edu/NetKSA/
Divergent Directionality of Immune Cell-Specific Protein Expression between Bipolar Lithium Responders and Non-Responders Revealed by Enhanced Flow Cytometry
Background and Objectives: There is no biomarker to predict lithium response. This study used CellPrint™ enhanced flow cytometry to study 28 proteins representing a spectrum of cellular pathways in monocytes and CD4+ lymphocytes before and after lithium treatment in patients with bipolar disorder (BD). Materials and Methods: Symptomatic patients with BD type I or II received lithium (serum level ≥ 0.6 mEq/L) for 16 weeks. Patients were assessed with standard rating scales and divided into two groups, responders (≥50% improvement from baseline) and non-responders. Twenty-eight intracellular proteins in CD4+ lymphocytes and monocytes were analyzed with CellPrint™, an enhanced flow cytometry procedure. Data were analyzed for differences in protein expression levels. Results: The intent-to-treat sample included 13 lithium-responders (12 blood samples before treatment and 9 after treatment) and 11 lithium-non-responders (11 blood samples before treatment and 4 after treatment). No significant differences in expression between the groups were observed prior to lithium treatment. After treatment, the majority of analytes increased expression in responders and decreased expression in non-responders. Significant increases were seen for PDEB4 and NR3C1 in responders. A significant decrease was seen for NR3C1 in non-responders. Conclusions: Lithium induced divergent directionality of protein expression depending on whether the patient was a responder or non-responder, elucidating molecular characteristics of lithium responsiveness. A subsequent study with a larger sample size is warranted.
Identification of relative protein bands in Polyacrylamide Gel Electrophoresis (PAGE) using multiresolution snake algorithm
Polyacrylamide Gel Electrophoresis (PAGE) is one of the most widely used techniques in protein research. In the protein purification process, it is important to determine the efficiency of each purification step in terms of the percentage of the protein of interest found in the protein mixture. This study provides a rapid and reliable way to determine this percentage. The region of interest containing the protein is detected using the snake algorithm. The iterative snake algorithm is implemented in a multiresolution framework. The snake is initialized on a low-resolution image. Then, the final position of the snake at low resolution is used as the initial position in the higher-resolution image. Finally, the area of the protein is estimated as the area enclosed by the final position of the snake.
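The coarse-to-fine hand-off and the final area estimate described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the snake is stored as a closed polygon of (x, y) vertices and that each resolution level doubles the image size, and the function names `upscale_contour` and `enclosed_area` are hypothetical. The snake evolution step itself (energy minimization at each level) is not shown.

```python
def upscale_contour(points, factor=2):
    """Map a contour found on a low-resolution image to the next finer level.

    Assumption: each level enlarges the image by `factor` along both axes,
    so the converged coarse contour becomes the fine-level starting contour.
    """
    return [(x * factor, y * factor) for x, y in points]


def enclosed_area(points):
    """Area enclosed by a closed polygon, computed with the shoelace formula."""
    twice_area = 0.0
    for i in range(len(points)):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % len(points)]  # wrap around to close the contour
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0
```

For a 2-by-2 square contour at the coarse level, `enclosed_area` gives 4.0, and the same contour passed through `upscale_contour` encloses 16.0 pixels at the finer level, which is the quantity that would be compared against the total protein area in the lane.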
ELF3 is an antagonist of oncogenic-signalling-induced expression of EMT-TF ZEB1
Background: Epithelial-to-mesenchymal transition (EMT) is a key step in the transformation of epithelial cells into migratory and invasive tumour cells. EMT is governed by intricate positive and negative regulatory processes. Many oncogenic signalling pathways can induce EMT, but the specific mechanisms by which this occurs, and how the process is controlled, are not fully understood.
Methods: RNA-Seq analysis, computational analysis of protein networks and large-scale cancer genomics datasets were used to identify ELF3 as a negative regulator of the expression of EMT markers. Western blotting coupled to siRNA, as well as analysis of tumour/normal colorectal cancer panels, were used to investigate the expression and function of ELF3.
Results: RNA-Seq analysis of colorectal cancer cells expressing mutant and wild-type β-catenin and analysis of colorectal cancer cells expressing inducible mutant RAS showed that ELF3 expression is reduced in response to oncogenic signalling and antagonizes Wnt and RAS oncogenic signalling pathways. Analysis of gene-expression patterns across The Cancer Genome Atlas (TCGA) and protein localization in colorectal cancer tumour panels showed that ELF3 expression is anti-correlated with β-catenin and markers of EMT and correlates with better clinical prognosis.
Conclusions: ELF3 is a negative regulator of the EMT transcription factor (EMT-TF) ZEB1 through its function as an antagonist of oncogenic signalling.
Circulating microbial content in myeloid malignancy patients is associated with disease subtypes and patient outcomes
Although recent work has described the microbiome in solid tumors, microbial content in hematological malignancies is not well-characterized. Here we analyze existing deep DNA sequence data from the blood and bone marrow of 1870 patients with myeloid malignancies, along with healthy controls, for bacterial, fungal, and viral content. After strict quality filtering, we find evidence for dysbiosis in disease cases, and distinct microbial signatures among disease subtypes. We also find that microbial content is associated with host gene mutations and with myeloblast cell percentages. In patients with low-risk myelodysplastic syndrome, we provide evidence that Epstein-Barr virus status refines risk stratification into more precise categories than the current standard. Motivated by these observations, we construct machine-learning classifiers that can discriminate among disease subtypes based solely on bacterial content. Our study highlights the association between the circulating microbiome and patient outcome, and its relationship with disease subtype.
Iterative-Improvement-Based Declustering Heuristics For Multi-Disk Databases
Data declustering is an important issue for reducing query response times in multi-disk database systems. In this paper, we propose a declustering method that utilizes the available information on query distribution, data distribution, data-item sizes, and disk capacity constraints. The proposed method exploits the natural correspondence between a data set with a given query distribution and a hypergraph. We define an objective function that exactly represents the aggregate parallel query-response time for the declustering problem and adapt the iterative-improvement-based heuristics successfully used in hypergraph partitioning to this objective function. We propose a two-phase algorithm that first obtains an initial K-way declustering by recursively bipartitioning the data set, then applies multi-way refinement on this declustering. We provide effective gain models and efficient implementation schemes for both phases. The experimental results on a wide range of realistic data sets show that the proposed method provides a significant performance improvement compared with the state-of-the-art declustering strategy based on similarity-graph partitioning. Author Keywords: Parallel database system
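The idea of iteratively improving a declustering against a query-aware objective can be sketched as below. This is a simplified illustration under stated assumptions, not the paper's two-phase hypergraph algorithm: the cost model (a query's response time equals the load on its busiest disk), the greedy single-item move rule, and all function names are assumptions made for the example. Recursive bipartitioning and the paper's gain models are not shown.

```python
from collections import Counter


def query_cost(query, assignment, sizes):
    """Assumed response time of one query: total item size on its busiest disk."""
    load = Counter()
    for item in query:
        load[assignment[item]] += sizes[item]
    return max(load.values())


def total_cost(queries, assignment, sizes):
    """Frequency-weighted sum of per-query response times."""
    return sum(freq * query_cost(q, assignment, sizes) for q, freq in queries)


def refine(queries, assignment, sizes, num_disks, capacity):
    """Greedy refinement: move one item at a time while the total cost drops."""
    assignment = dict(assignment)
    used = Counter()
    for item, disk in assignment.items():
        used[disk] += sizes[item]
    improved = True
    while improved:
        improved = False
        for item in sorted(assignment):
            current = assignment[item]
            best_disk = current
            best_cost = total_cost(queries, assignment, sizes)
            for disk in range(num_disks):
                # Skip the current disk and any disk the item would overfill.
                if disk == current or used[disk] + sizes[item] > capacity:
                    continue
                assignment[item] = disk  # tentative move
                cost = total_cost(queries, assignment, sizes)
                if cost < best_cost:
                    best_disk, best_cost = disk, cost
            assignment[item] = best_disk  # commit the best move (or revert)
            if best_disk != current:
                used[current] -= sizes[item]
                used[best_disk] += sizes[item]
                improved = True
    return assignment
```

With four unit-size items, two equally frequent queries `{"a", "b"}` and `{"c", "d"}`, and everything initially on disk 0, the total cost starts at 4 and `refine` with two disks reduces it to 2 by splitting each query across both disks. Recomputing the full cost for every tentative move is quadratic and only suitable for small examples; practical heuristics maintain incremental gains instead.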
Weighted Matrix Ordering And Parallel Banded Preconditioners For Iterative Linear System Solvers
The emergence of multicore architectures and highly scalable platforms motivates the development of novel algorithms and techniques that emphasize concurrency and are tolerant of deep memory hierarchies, as opposed to minimizing raw FLOP counts. While direct solvers are reliable, they are often slow and memory-intensive for large problems. Iterative solvers, on the other hand, are more efficient but, in the absence of robust preconditioners, lack reliability. While preconditioners based on incomplete factorizations (whenever they exist) are effective for many problems, their parallel scalability is generally limited. In this paper, we advocate the use of banded preconditioners instead and introduce a reordering strategy that enables their extraction. In contrast to traditional bandwidth reduction techniques, our reordering strategy takes into account the magnitude of the matrix entries, bringing the heaviest elements closer to the diagonal, thus enabling the use of banded preconditioners. When used with effective banded solvers (in our case, the Spike solver), we show that banded preconditioners (i) are more robust compared to the broad class of incomplete factorization-based preconditioners, (ii) deliver higher processor performance, resulting in faster time to solution, and (iii) scale to larger parallel configurations. We demonstrate these results experimentally on a large class of problems selected from diverse application domains.
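One way to see why entry magnitudes matter for banded preconditioning is to measure how much of a matrix's total weight falls inside a band under a given ordering. The sketch below is an illustrative metric only; it does not implement the paper's reordering strategy, and the function name `band_weight_fraction`, the dense list-of-lists matrix format, and the symmetric permutation convention are assumptions made for the example.

```python
def band_weight_fraction(matrix, perm, half_bandwidth):
    """Fraction of the total |a_ij| captured inside the band after reordering.

    `perm[k]` is the original index placed at position k; the same permutation
    is applied to rows and columns (a symmetric reordering).
    """
    n = len(matrix)
    total = 0.0
    inside = 0.0
    for new_i in range(n):
        for new_j in range(n):
            weight = abs(matrix[perm[new_i]][perm[new_j]])
            total += weight
            if abs(new_i - new_j) <= half_bandwidth:
                inside += weight
    return inside / total if total else 1.0
```

For a 4-by-4 matrix with unit diagonal and two heavy entries of magnitude 5 at positions (0, 3) and (3, 0), a tridiagonal band under the natural ordering holds only 4/14 of the weight, while the ordering `[0, 3, 1, 2]` places rows 0 and 3 next to each other and brings the fraction to 1.0. An ordering that ignores magnitudes has no reason to prefer the second arrangement.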