6 research outputs found

    Bayesian nonparametric clusterings in relational and high-dimensional settings with applications in bioinformatics.

    Recent advances in high throughput methodologies offer researchers the ability to understand complex systems via high dimensional and multi-relational data. One example is the realm of molecular biology where disparate data (such as gene sequence, gene expression, and interaction information) are available for various snapshots of biological systems. This type of high dimensional and multirelational data allows for unprecedented detailed analysis, but also presents challenges in accounting for all the variability. High dimensional data often has a multitude of underlying relationships, each represented by a separate clustering structure, where the number of structures is typically unknown a priori. To address the challenges faced by traditional clustering methods on high dimensional and multirelational data, we developed three feature selection and cross-clustering methods: 1) infinite relational model with feature selection (FIRM) which incorporates the rich information of multirelational data; 2) Bayesian Hierarchical Cross-Clustering (BHCC), a deterministic approximation to Cross Dirichlet Process mixture (CDPM) and to cross-clustering; and 3) randomized approximation (RBHCC), based on a truncated hierarchy. An extension of BHCC, Bayesian Congruence Measuring (BCM), is proposed to measure incongruence between genes and to identify sets of congruent loci with identical evolutionary histories. We adapt our BHCC algorithm to the inference of BCM, where the intended structure of each view (congruent loci) represents consistent evolutionary processes. We consider an application of FIRM on categorizing mRNA and microRNA. The model uses latent structures to encode the expression pattern and the gene ontology annotations. 
We also apply FIRM to recover the categories of ligands and proteins, and to predict unknown drug-target interactions, where latent categorization structure encodes drug-target interaction, chemical compound similarity, and amino acid sequence similarity. BHCC and RBHCC are shown to have improved predictive performance (both in terms of cluster membership and missing value prediction) compared to traditional clustering methods. Our results suggest that these novel approaches to integrating multi-relational information have a promising future in the biological sciences where incorporating data related to varying features is often regarded as a daunting task.
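
As a rough illustration of how a Bayesian nonparametric prior leaves the number of clusters open, the sketch below draws a partition from a Chinese restaurant process, the partition prior induced by a Dirichlet process. It is not the FIRM, BHCC, or RBHCC algorithm from the abstract; the function name `sample_crp_partition` and the concentration parameter `alpha` are assumptions made for this example.

```python
import random


def sample_crp_partition(n_items, alpha, seed=0):
    """Draw cluster labels for n_items from a Chinese restaurant process.

    Hypothetical helper for illustration only; alpha > 0 is the
    concentration parameter (larger alpha tends to give more clusters).
    """
    rng = random.Random(seed)
    labels = []
    counts = []  # counts[k] = number of items already assigned to cluster k
    for _ in range(n_items):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is opened with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1    # join an existing cluster
        labels.append(k)
    return labels


labels = sample_crp_partition(n_items=20, alpha=1.0, seed=1)
n_clusters = len(set(labels))
```

The number of clusters is not fixed in advance; it emerges from the draw and grows slowly with the number of items.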

    Targeting protein kinases to manage or prevent Alzheimer’s disease

    Due to the pressing need for new disease-modifying drugs for Alzheimer’s disease (AD), new treatment strategies and alternative drug targets are currently being heavily researched. One such strategy is to modulate protein kinases such as cyclin-dependent kinase 1 (CDK1), cyclin-dependent kinase 5 (CDK5), glycogen synthase kinase-3 (GSK-3α and GSK-3β), and the protein kinase RNA-like endoplasmic reticulum kinase (PERK). AD intervention by reduction of amyloid beta (Aβ) levels is also possible through development of protein kinase C-epsilon (PKC-ϵ) activators to recover α-secretase levels and decrease toxic Aβ levels, thereby restoring synaptogenesis and cognitive function. In this way, we aim to develop new AD drugs by targeting kinases that participate in AD pathophysiology. In our studies, comparative modeling was performed to construct 3D models for kinases whose crystal structures have not yet been identified. The information from structurally similar proteins was used to define the amino acid residues in the ATP binding site as well as other important sites and motifs. We searched for the common structural motifs and domains of GSK-3β, CDK5 and PERK. Further, we identified the conserved water molecules in GSK-3β, CDK5 and PERK through calculation of the degree of water conservation. We investigated the protein-ligand interaction profiles of CDK1, CDK5, GSK-3α, GSK-3β and PERK based on molecular dynamics (MD) simulations, which provided a time-dependent demonstration of the interactions and contacts for each ligand. In addition, we explored the protein-protein interactions between CDK5 and p25. Small molecules which target this interaction may offer a prospective therapeutic benefit for AD. In order to identify new modulators for protein kinase targets in AD, we implemented three virtual screening protocols. The first protocol was a combined ligand- and protein structure-based approach to find new PERK inhibitors. 
In the second protocol, protein structure-based virtual screening was applied to find multiple-kinase inhibitors through parallel docking simulations into validated models of CDK1, CDK5 and GSK-3 kinases. In the third protocol, we searched for potential activators of PKC-ϵ based on the structure of its C1B domain.

    Knowledge discovery on the integrative analysis of electrical and mechanical dyssynchrony to improve cardiac resynchronization therapy

    Cardiac resynchronization therapy (CRT) is a standard method of treating heart failure by coordinating the function of the left and right ventricles. However, up to 40% of CRT recipients do not experience clinical symptoms or cardiac function improvements. The main reasons for CRT non-response include: (1) suboptimal patient selection based on electrical dyssynchrony measured by electrocardiogram (ECG) in current guidelines; (2) mechanical dyssynchrony has been shown to be effective but has not been fully explored; and (3) inappropriate placement of the CRT left ventricular (LV) lead in a significant number of patients. In terms of mechanical dyssynchrony, we utilize an autoencoder to extract new predictive features from nuclear medicine images, characterizing local mechanical dyssynchrony and improving the CRT response rate. Although machine learning can identify complex patterns and make accurate predictions from large datasets, the low interpretability of these black box methods makes it difficult to integrate them with clinical decisions made by physicians in the healthcare setting. Therefore, we use visualization techniques to enable physicians to understand the physical meaning of new features and the reasoning behind the clinical decisions made by the artificial intelligence model. For electrical dyssynchrony, we use short-time Fourier transform (STFT) to transform one-dimensional waveforms into two-dimensional frequency-time spectra. Transfer learning is then used to leverage the knowledge learned from a large arrhythmia ECG dataset of related medical conditions to improve patient selection for CRT with limited data. This improves prediction accuracy, reduces the time and resources required, and potentially leads to better patient outcomes. 
Furthermore, an innovative approach is proposed for using three-dimensional spatial VCG information to describe the characteristics of electrical dyssynchrony, locate the latest activation site, and combine it with the latest mechanical contraction site to select the optimal LV lead position. In addition, we apply deep reinforcement learning to the decision-making problem of CRT patients. We investigate discrete state space/specific action space models to find the best treatment strategy, improve the reward equation based on the physician's experience, and learn the approximation of the best action-value function that can improve the treatment policy used by clinicians and provide interpretability.
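
To make the STFT step concrete, the following minimal sketch turns a one-dimensional waveform into a two-dimensional time-frequency magnitude array using only the Python standard library. It is an illustration under stated assumptions, not the thesis pipeline: the function name `stft_magnitude`, the Hann window, the frame length, the hop size, and the synthetic sine input are all choices made for this example, and a real ECG workflow would normally rely on an optimized FFT library.

```python
import cmath
import math


def stft_magnitude(signal, frame_len, hop):
    """Return one list of magnitudes (bins 0..frame_len//2) per frame.

    Hypothetical helper: slides a Hann window over the signal and takes
    a direct (slow) DFT of each windowed frame.
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    spectrogram = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            coeff = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n, x in enumerate(frame))
            mags.append(abs(coeff))
        spectrogram.append(mags)
    return spectrogram


# Synthetic stand-in for a waveform: a sine with 4 cycles per 32 samples.
signal = [math.sin(2 * math.pi * 4 * t / 32) for t in range(128)]
spec = stft_magnitude(signal, frame_len=32, hop=16)

# Index of the strongest frequency bin in each time frame.
peak_bins = [max(range(len(frame)), key=frame.__getitem__) for frame in spec]
```

Each row of `spec` is a time frame and each column a frequency bin, which is the image-like layout that lets a network pretrained on spectrogram-style inputs be reused.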