4,037 research outputs found

    Representation learning of drug and disease terms for drug repositioning

    Full text link
    Drug repositioning (DR) refers to identification of novel indications for the approved drugs. The requirement of huge investment of time as well as money and risk of failure in clinical trials have led to surge in interest in drug repositioning. DR exploits two major aspects associated with drugs and diseases: existence of similarity among drugs and among diseases due to their shared involved genes or pathways or common biological effects. Existing methods of identifying drug-disease association majorly rely on the information available in the structured databases only. On the other hand, abundant information available in form of free texts in biomedical research articles are not being fully exploited. Word-embedding or obtaining vector representation of words from a large corpora of free texts using neural network methods have been shown to give significant performance for several natural language processing tasks. In this work we propose a novel way of representation learning to obtain features of drugs and diseases by combining complementary information available in unstructured texts and structured datasets. Next we use matrix completion approach on these feature vectors to learn projection matrix between drug and disease vector spaces. The proposed method has shown competitive performance with state-of-the-art methods. Further, the case studies on Alzheimer's and Hypertension diseases have shown that the predicted associations are matching with the existing knowledge.Comment: Accepted to appear in 3rd IEEE International Conference on Cybernetics (Spl Session: Deep Learning for Prediction and Estimation

    Herb Target Prediction Based on Representation Learning of Symptom related Heterogeneous Network.

    Get PDF
    Traditional Chinese Medicine (TCM) has received increasing attention as a complementary approach or alternative to modern medicine. However, experimental methods for identifying novel targets of TCM herbs heavily relied on the current available herb-compound-target relationships. In this work, we present an Herb-Target Interaction Network (HTINet) approach, a novel network integration pipeline for herb-target prediction mainly relying on the symptom related associations. HTINet focuses on capturing the low-dimensional feature vectors for both herbs and proteins by network embedding, which incorporate the topological properties of nodes across multi-layered heterogeneous network, and then performs supervised learning based on these low-dimensional feature representations. HTINet obtains performance improvement over a well-established random walk based herb-target prediction method. Furthermore, we have manually validated several predicted herb-target interactions from independent literatures. These results indicate that HTINet can be used to integrate heterogeneous information to predict novel herb-target interactions

    Graph Representation Learning in Biomedicine

    Full text link
    Biomedical networks are universal descriptors of systems of interacting elements, from protein interactions to disease networks, all the way to healthcare systems and scientific knowledge. With the remarkable success of representation learning in providing powerful predictions and insights, we have witnessed a rapid expansion of representation learning techniques into modeling, analyzing, and learning with such networks. In this review, we put forward an observation that long-standing principles of networks in biology and medicine -- while often unspoken in machine learning research -- can provide the conceptual grounding for representation learning, explain its current successes and limitations, and inform future advances. We synthesize a spectrum of algorithmic approaches that, at their core, leverage graph topology to embed networks into compact vector spaces, and capture the breadth of ways in which representation learning is proving useful. Areas of profound impact include identifying variants underlying complex traits, disentangling behaviors of single cells and their effects on health, assisting in diagnosis and treatment of patients, and developing safe and effective medicines

    edge2vec: Representation learning using edge semantics for biomedical knowledge discovery

    Full text link
    Representation learning provides new and powerful graph analytical approaches and tools for the highly valued data science challenge of mining knowledge graphs. Since previous graph analytical methods have mostly focused on homogeneous graphs, an important current challenge is extending this methodology for richly heterogeneous graphs and knowledge domains. The biomedical sciences are such a domain, reflecting the complexity of biology, with entities such as genes, proteins, drugs, diseases, and phenotypes, and relationships such as gene co-expression, biochemical regulation, and biomolecular inhibition or activation. Therefore, the semantics of edges and nodes are critical for representation learning and knowledge discovery in real world biomedical problems. In this paper, we propose the edge2vec model, which represents graphs considering edge semantics. An edge-type transition matrix is trained by an Expectation-Maximization approach, and a stochastic gradient descent model is employed to learn node embedding on a heterogeneous graph via the trained transition matrix. edge2vec is validated on three biomedical domain tasks: biomedical entity classification, compound-gene bioactivity prediction, and biomedical information retrieval. Results show that by considering edge-types into node embedding learning in heterogeneous graphs, \textbf{edge2vec}\ significantly outperforms state-of-the-art models on all three tasks. We propose this method for its added value relative to existing graph analytical methodology, and in the real world context of biomedical knowledge discovery applicability.Comment: 10 page

    Knowledge-based Biomedical Data Science 2019

    Full text link
    Knowledge-based biomedical data science (KBDS) involves the design and implementation of computer systems that act as if they knew about biomedicine. Such systems depend on formally represented knowledge in computer systems, often in the form of knowledge graphs. Here we survey the progress in the last year in systems that use formally represented knowledge to address data science problems in both clinical and biological domains, as well as on approaches for creating knowledge graphs. Major themes include the relationships between knowledge graphs and machine learning, the use of natural language processing, and the expansion of knowledge-based approaches to novel domains, such as Chinese Traditional Medicine and biodiversity.Comment: Manuscript 43 pages with 3 tables; Supplemental material 43 pages with 3 table
    • …
    corecore