Representation learning of drug and disease terms for drug repositioning
Drug repositioning (DR) refers to the identification of novel indications for
approved drugs. The huge investment of time and money required, and the risk of
failure in clinical trials, have led to a surge of interest in drug
repositioning. DR exploits two major aspects associated with drugs and
diseases: the existence of similarity among drugs and among diseases due to
their shared genes or pathways or common biological effects. Existing
methods of identifying drug-disease associations rely mainly on the information
available in structured databases. On the other hand, the abundant
information available as free text in biomedical research articles is
not being fully exploited. Word embedding, that is, obtaining vector
representations of words from a large corpus of free text using neural network
methods, has been shown to perform well on several natural language processing
tasks. In this work we propose a novel way of representation learning to obtain
features of drugs and diseases by combining complementary information available
in unstructured texts and structured datasets. Next, we use a matrix completion
approach on these feature vectors to learn a projection matrix between the drug
and disease vector spaces. The proposed method has shown competitive performance
with state-of-the-art methods. Further, case studies on Alzheimer's disease and
hypertension have shown that the predicted associations match existing
knowledge.
Comment: Accepted to appear in 3rd IEEE International Conference on
Cybernetics (Spl Session: Deep Learning for Prediction and Estimation
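The idea of learning a projection matrix between drug and disease vector spaces can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the objective or optimizer, so the bilinear score, the squared-error loss, plain gradient descent, and every name and value below (`bilinear_score`, `fit_projection`, the 2-dimensional feature vectors) are assumptions made purely for illustration.

```python
import random


def bilinear_score(drug_vec, W, disease_vec):
    """Association score: drug_vec^T W disease_vec."""
    return sum(
        drug_vec[i] * W[i][j] * disease_vec[j]
        for i in range(len(drug_vec))
        for j in range(len(disease_vec))
    )


def fit_projection(drugs, diseases, known, epochs=200, lr=0.05, seed=0):
    """Fit W by gradient descent on squared error over observed pairs only.

    known maps (drug_index, disease_index) -> 1.0 (associated) or 0.0.
    Unobserved pairs are simply left out of the loss.
    """
    rng = random.Random(seed)
    d_dim, s_dim = len(drugs[0]), len(diseases[0])
    W = [[rng.uniform(-0.01, 0.01) for _ in range(s_dim)] for _ in range(d_dim)]
    for _ in range(epochs):
        for (a, b), label in known.items():
            err = bilinear_score(drugs[a], W, diseases[b]) - label
            for i in range(d_dim):
                for j in range(s_dim):
                    W[i][j] -= lr * err * drugs[a][i] * diseases[b][j]
    return W


# Hypothetical feature vectors: drug 1 is deliberately similar to drug 0.
drugs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
diseases = [[1.0, 0.0], [0.0, 1.0]]

# Observed associations for drugs 0 and 2 only; drug 1 is held out.
known = {(0, 0): 1.0, (0, 1): 0.0, (2, 0): 0.0, (2, 1): 1.0}

W = fit_projection(drugs, diseases, known)
held_out_scores = [bilinear_score(drugs[1], W, d) for d in diseases]
print(held_out_scores)
```

Because drug 1 has no observed associations, its scores come entirely from its feature similarity to drug 0, which is the intuition behind exploiting similarity among drugs that the abstract describes.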
Herb Target Prediction Based on Representation Learning of Symptom related Heterogeneous Network.
Traditional Chinese Medicine (TCM) has received increasing attention as a complementary approach or alternative to modern medicine. However, experimental methods for identifying novel targets of TCM herbs rely heavily on the currently available herb-compound-target relationships. In this work, we present the Herb-Target Interaction Network (HTINet) approach, a novel network integration pipeline for herb-target prediction that relies mainly on symptom-related associations. HTINet focuses on capturing low-dimensional feature vectors for both herbs and proteins by network embedding, which incorporate the topological properties of nodes across a multi-layered heterogeneous network, and then performs supervised learning based on these low-dimensional feature representations. HTINet obtains a performance improvement over a well-established random-walk-based herb-target prediction method. Furthermore, we have manually validated several predicted herb-target interactions against independent literature. These results indicate that HTINet can be used to integrate heterogeneous information to predict novel herb-target interactions.
Graph Representation Learning in Biomedicine
Biomedical networks are universal descriptors of systems of interacting
elements, from protein interactions to disease networks, all the way to
healthcare systems and scientific knowledge. With the remarkable success of
representation learning in providing powerful predictions and insights, we have
witnessed a rapid expansion of representation learning techniques into
modeling, analyzing, and learning with such networks. In this review, we put
forward an observation that long-standing principles of networks in biology and
medicine -- while often unspoken in machine learning research -- can provide
the conceptual grounding for representation learning, explain its current
successes and limitations, and inform future advances. We synthesize a spectrum
of algorithmic approaches that, at their core, leverage graph topology to embed
networks into compact vector spaces, and capture the breadth of ways in which
representation learning is proving useful. Areas of profound impact include
identifying variants underlying complex traits, disentangling behaviors of
single cells and their effects on health, assisting in diagnosis and treatment
of patients, and developing safe and effective medicines.
edge2vec: Representation learning using edge semantics for biomedical knowledge discovery
Representation learning provides new and powerful graph analytical approaches
and tools for the highly valued data science challenge of mining knowledge
graphs. Since previous graph analytical methods have mostly focused on
homogeneous graphs, an important current challenge is extending this
methodology for richly heterogeneous graphs and knowledge domains. The
biomedical sciences are such a domain, reflecting the complexity of biology,
with entities such as genes, proteins, drugs, diseases, and phenotypes, and
relationships such as gene co-expression, biochemical regulation, and
biomolecular inhibition or activation. Therefore, the semantics of edges and
nodes are critical for representation learning and knowledge discovery in real
world biomedical problems. In this paper, we propose the edge2vec model, which
represents graphs considering edge semantics. An edge-type transition matrix is
trained by an Expectation-Maximization approach, and a stochastic gradient
descent model is employed to learn node embedding on a heterogeneous graph via
the trained transition matrix. edge2vec is validated on three biomedical domain
tasks: biomedical entity classification, compound-gene bioactivity prediction,
and biomedical information retrieval. Results show that by incorporating
edge types into node embedding learning in heterogeneous graphs,
edge2vec significantly outperforms state-of-the-art models on all
three tasks. We propose this method for its added value relative to existing
graph analytical methodology, and for its applicability in the real-world
context of biomedical knowledge discovery.
Comment: 10 page
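One way an edge-type transition matrix could steer node sampling on a heterogeneous graph is sketched below. This is only an illustration under stated assumptions: the abstract does not say how the trained matrix is used, so the random-walk formulation, the hand-written weights (standing in for values that would come from the Expectation-Maximization step), the toy graph, and the function name `edge_type_biased_walk` are all hypothetical.

```python
import random


def edge_type_biased_walk(graph, transition, start, length, rng):
    """Random walk whose next edge is weighted by the previous edge's type.

    graph:      node -> list of (neighbor, edge_type)
    transition: transition[prev_type][next_type] -> non-negative weight
    """
    walk = [start]
    current, prev_type = start, None
    for _ in range(length - 1):
        edges = graph.get(current, [])
        if not edges:
            break
        if prev_type is None:
            # First step: no previous edge type, so choose uniformly.
            weights = [1.0] * len(edges)
        else:
            weights = [transition[prev_type][etype] for _, etype in edges]
        if sum(weights) <= 0:
            break  # every outgoing edge type is ruled out
        current, prev_type = rng.choices(edges, weights=weights, k=1)[0]
        walk.append(current)
    return walk


# Hypothetical toy graph with three edge types.
graph = {
    "drugA": [("geneX", "inhibits")],
    "geneX": [("drugA", "inhibits"), ("geneY", "coexpressed")],
    "geneY": [("geneX", "coexpressed"), ("diseaseZ", "associated")],
    "diseaseZ": [("geneY", "associated")],
}

# Hand-set weights that forbid immediately reusing the same edge type.
types = ["inhibits", "coexpressed", "associated"]
transition = {p: {n: 0.0 if p == n else 1.0 for n in types} for p in types}

walk = edge_type_biased_walk(graph, transition, "drugA", 4, random.Random(0))
print(walk)
```

Walks sampled this way could then be fed to a skip-gram style model trained with stochastic gradient descent to produce node embeddings; that step is omitted here.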
Knowledge-based Biomedical Data Science 2019
Knowledge-based biomedical data science (KBDS) involves the design and
implementation of computer systems that act as if they knew about biomedicine.
Such systems depend on formally represented knowledge in computer systems,
often in the form of knowledge graphs. Here we survey the progress in the last
year in systems that use formally represented knowledge to address data science
problems in both clinical and biological domains, as well as on approaches for
creating knowledge graphs. Major themes include the relationships between
knowledge graphs and machine learning, the use of natural language processing,
and the expansion of knowledge-based approaches to novel domains, such as
Chinese Traditional Medicine and biodiversity.
Comment: Manuscript 43 pages with 3 tables; Supplemental material 43 pages
with 3 table