2,975 research outputs found

    Representation learning of drug and disease terms for drug repositioning

    Full text link
    Drug repositioning (DR) refers to identification of novel indications for the approved drugs. The requirement of huge investment of time as well as money and risk of failure in clinical trials have led to surge in interest in drug repositioning. DR exploits two major aspects associated with drugs and diseases: existence of similarity among drugs and among diseases due to their shared involved genes or pathways or common biological effects. Existing methods of identifying drug-disease association majorly rely on the information available in the structured databases only. On the other hand, abundant information available in form of free texts in biomedical research articles are not being fully exploited. Word-embedding or obtaining vector representation of words from a large corpora of free texts using neural network methods have been shown to give significant performance for several natural language processing tasks. In this work we propose a novel way of representation learning to obtain features of drugs and diseases by combining complementary information available in unstructured texts and structured datasets. Next we use matrix completion approach on these feature vectors to learn projection matrix between drug and disease vector spaces. The proposed method has shown competitive performance with state-of-the-art methods. Further, the case studies on Alzheimer's and Hypertension diseases have shown that the predicted associations are matching with the existing knowledge.Comment: Accepted to appear in 3rd IEEE International Conference on Cybernetics (Spl Session: Deep Learning for Prediction and Estimation

    Repurposing drugs to target nonalcoholic steatohepatitis

    Get PDF
    Nonalcoholic fatty liver disease (NAFLD) is a complex disorder that has evolved in recent years as the leading global cause of chronic liver damage. The main obstacle to better disease management pertains to the lack of approved pharmacological interventions for the treatment of nonalcoholic steatohepatitis (NASH) and NASH-fibrosis-the severe histological forms. Over the past decade, tremendous advances have been made in NAFLD research, resulting in the discovery of disease mechanisms and novel therapeutic targets. Hence, a large number of pharmacological agents are currently being tested for safety and efficacy. These drugs are in the initial pharmacological phases (phase 1 and 2), which involve testing tolerability, therapeutic action, and pharmacological issues. It is thus reasonable to assume that the next generation of NASH drugs will not be available for clinical use for foreseeable future. The expected delay can be mitigated by drug repurposing or repositioning, which essentially relies on identifying and developing new uses for existing drugs. Here, we propose a drug candidate selection method based on the integration of molecular pathways of disease pathogenesis into network analysis tools that use OMICs data as well as multiples sources, including text mining from the medical literature.Fil: Sookoian, Silvia Cristina. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Houssay. Instituto de Investigaciones Médicas. Universidad de Buenos Aires. Facultad de Medicina. Instituto de Investigaciones Médicas; ArgentinaFil: Pirola, Carlos José. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Houssay. Instituto de Investigaciones Médicas. Universidad de Buenos Aires. Facultad de Medicina. Instituto de Investigaciones Médicas; Argentin

    Herb Target Prediction Based on Representation Learning of Symptom related Heterogeneous Network.

    Get PDF
    Traditional Chinese Medicine (TCM) has received increasing attention as a complementary approach or alternative to modern medicine. However, experimental methods for identifying novel targets of TCM herbs heavily relied on the current available herb-compound-target relationships. In this work, we present an Herb-Target Interaction Network (HTINet) approach, a novel network integration pipeline for herb-target prediction mainly relying on the symptom related associations. HTINet focuses on capturing the low-dimensional feature vectors for both herbs and proteins by network embedding, which incorporate the topological properties of nodes across multi-layered heterogeneous network, and then performs supervised learning based on these low-dimensional feature representations. HTINet obtains performance improvement over a well-established random walk based herb-target prediction method. Furthermore, we have manually validated several predicted herb-target interactions from independent literatures. These results indicate that HTINet can be used to integrate heterogeneous information to predict novel herb-target interactions

    Systems approaches to drug repositioning

    Get PDF
    PhD ThesisDrug discovery has overall become less fruitful and more costly, despite vastly increased biomedical knowledge and evolving approaches to Research and Development (R&D). One complementary approach to drug discovery is that of drug repositioning which focusses on identifying novel uses for existing drugs. By focussing on existing drugs that have already reached the market, drug repositioning has the potential to both reduce the timeframe and cost of getting a disease treatment to those that need it. Many marketed examples of repositioned drugs have been found via serendipitous or rational observations, highlighting the need for more systematic methodologies. Systems approaches have the potential to enable the development of novel methods to understand the action of therapeutic compounds, but require an integrative approach to biological data. Integrated networks can facilitate systems-level analyses by combining multiple sources of evidence to provide a rich description of drugs, their targets and their interactions. Classically, such networks can be mined manually where a skilled person can identify portions of the graph that are indicative of relationships between drugs and highlight possible repositioning opportunities. However, this approach is not scalable. Automated procedures are required to mine integrated networks systematically for these subgraphs and bring them to the attention of the user. The aim of this project was the development of novel computational methods to identify new therapeutic uses for existing drugs (with particular focus on active small molecules) using data integration. A framework for integrating disparate data relevant to drug repositioning, Drug Repositioning Network Integration Framework (DReNInF) was developed as part of this work. This framework includes a high-level ontology, Drug Repositioning Network Integration Ontology (DReNInO), to aid integration and subsequent mining; a suite of parsers; and a generic semantic graph integration platform. This framework enables the production of integrated networks maintaining strict semantics that are important in, but not exclusive to, drug repositioning. The DReNInF is then used to create Drug Repositioning Network Integration (DReNIn), a semantically-rich Resource Description Framework (RDF) dataset. A Web-based front end was developed, which includes a SPARQL Protocol and RDF Query Language (SPARQL) endpoint for querying this dataset. To automate the mining of drug repositioning datasets, a formal framework for the definition of semantic subgraphs was established and a method for Drug Repositioning Semantic Mining (DReSMin) was developed. DReSMin is an algorithm for mining semantically-rich networks for occurrences of a given semantic subgraph. This algorithm allows instances of complex semantic subgraphs that contain data about putative drug repositioning opportunities to be identified in a computationally tractable fashion, scaling close to linearly with network data. The ability of DReSMin to identify novel Drug-Target (D-T) associations was investigated. 9,643,061 putative D-T interactions were identified and ranked, with a strong correlation between highly scored associations and those supported by literature observed. The 20 top ranked associations were analysed in more detail with 14 found to be novel and six found to be supported by the literature. It was also shown that this approach better prioritises known D-T interactions, than other state-of-the-art methodologies. The ability of DReSMin to identify novel Drug-Disease (Dr-D) indications was also investigated. As target-based approaches are utilised heavily in the field of drug discovery, it is necessary to have a systematic method to rank Gene-Disease (G-D) associations. Although methods already exist to collect, integrate and score these associations, these scores are often not a reliable re flection of expert knowledge. Therefore, an integrated data-driven approach to drug repositioning was developed using a Bayesian statistics approach and applied to rank 309,885 G-D associations using existing knowledge. Ranked associations were then integrated with other biological data to produce a semantically-rich drug discovery network. Using this network it was shown that diseases of the central nervous system (CNS) provide an area of interest. The network was then systematically mined for semantic subgraphs that capture novel Dr-D relations. 275,934 Dr-D associations were identified and ranked, with those more likely to be side-effects filtered. Work presented here includes novel tools and algorithms to enable research within the field of drug repositioning. DReNIn, for example, includes data that previous comparable datasets relevant to drug repositioning have neglected, such as clinical trial data and drug indications. Furthermore, the dataset may be easily extended using DReNInF to include future data as and when it becomes available, such as G-D association directionality (i.e. is the mutation a loss-of-function or gain-of-function). Unlike other algorithms and approaches developed for drug repositioning, DReSMin can be used to infer any types of associations captured in the target semantic network. Moreover, the approaches presented here should be more generically applicable to other fields that require algorithms for the integration and mining of semantically rich networks.European and Physical Sciences Research Council (EPSRC) and GS

    Drug prescription support in dental clinics through drug corpus mining

    Get PDF
    The rapid increase in the volume and variety of data poses a challenge to safe drug prescription for the dentist. The increasing number of patients that take multiple drugs further exerts pressure on the dentist to make the right decision at point-of-care. Hence, a robust decision support system will enable dentists to make decisions on drug prescription quickly and accurately. Based on the assumption that similar drug pairs have a higher similarity ratio, this paper suggests an innovative approach to obtain the similarity ratio between the drug that the dentist is going to prescribe and the drug that the patient is currently taking. We conducted experiments to obtain the similarity ratios of both positive and negative drug pairs, by using feature vectors generated from term similarities and word embeddings of biomedical text corpus. This model can be easily adapted and implemented for use in a dental clinic to assist the dentist in deciding if a drug is suitable for prescription, taking into consideration the medical profile of the patients. Experimental evaluation of our model’s association of the similarity ratio between two drugs yielded a superior F score of 89%. Hence, such an approach, when integrated within the clinical work flow, will reduce prescription errors and thereby increase the health outcomes of patients

    Drug repurposing: cost effectiveness and impact on emerging and neglected diseases

    Get PDF
    Historically, pressure on nature brought about by ever-increasing human pollution and technological advancement culminate in emergence and re- emergence of infectious and non-infectious diseases; necessitating medications and drug discovery and development. The emergence of resistantmicroorganisms and the emergence of new infections disease conditions necessitate the production of entirely new drugs or modification of the existing ones to increase their efficacy. The development of novel medications is a very long and expensive process. There is a significant decrease observed in the number of new drugs approved for clinical use in recent years showing inconsistency in the face of scientific advances and research and development investment. Regardless of high investment and enormous contributions, very few molecules showed promising results. However, finding novel indications for existing drugs can be a useful method of reducing the developmental cycle of drugs. Repositioning (also called repurposing) has been described as the practice of developing new therapeutic uses for drugs, abandoned or drugs in development process, other than the initially intended or approved uses, except for the circumstances in which the novel use is comparable to the original indication with dissimilar pharmacological targets. This review aimed at looking into some of the available methods in drug repurposing. Key words: Drug discovery; Drug repurposing; Machine learning; Neglected Tropical diseases; Orphan drug

    A multilayer network approach for guiding drug repositioning in neglected diseases

    Get PDF
    Drug development for neglected diseases has been historically hampered due to lack of market incentives. The advent of public domain resources containing chemical information from high throughput screenings is changing the landscape of drug discovery for these diseases. In this work we took advantage of data from extensively studied organisms like human, mouse, E. coli and yeast, among others, to develop a novel integrative network model to prioritize and identify candidate drug targets in neglected pathogen proteomes, and bioactive drug-like molecules. We modeled genomic (proteins) and chemical (bioactive compounds) data as a multilayer weighted network graph that takes advantage of bioactivity data across 221 species, chemical similarities between 1.7 105 compounds and several functional relations among 1.67 105 proteins. These relations comprised orthology, sharing of protein domains, and shared participation in defined biochemical pathways. We showcase the application of this network graph to the problem of prioritization of new candidate targets, based on the information available in the graph for known compound-target associations. We validated this strategy by performing a cross validation procedure for known mouse and Trypanosoma cruzi targets and showed that our approach outperforms classic alignment-based approaches. Moreover, our model provides additional flexibility as two different network definitions could be considered, finding in both cases qualitatively different but sensible candidate targets. We also showcase the application of the network to suggest targets for orphan compounds that are active against Plasmodium falciparum in high-throughput screens. In this case our approach provided a reduced prioritization list of target proteins for the query molecules and showed the ability to propose new testable hypotheses for each compound. Moreover, we found that some predictions highlighted by our network model were supported by independent experimental validations as found post-facto in the literature.Fil: Berenstein, Ariel José. Fundación Instituto Leloir; Argentina. Universidad de Buenos Aires. Facultad de Ingeniería. Departamento de Física; ArgentinaFil: Magariños, María Paula. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto de Investigaciones Biotecnológicas. Instituto de Investigaciones Biotecnológicas "Dr. Raúl Alfonsín" (sede Chascomús). Universidad Nacional de San Martín. Instituto de Investigaciones Biotecnológicas. Instituto de Investigaciones Biotecnológicas "Dr. Raúl Alfonsín" (sede Chascomús); ArgentinaFil: Chernomoretz, Ariel. Fundación Instituto Leloir; Argentina. Universidad de Buenos Aires. Facultad de Ingeniería. Departamento de Física; ArgentinaFil: Fernandez Aguero, Maria Jose. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto de Investigaciones Biotecnológicas. Instituto de Investigaciones Biotecnológicas "Dr. Raúl Alfonsín" (sede Chascomús). Universidad Nacional de San Martín. Instituto de Investigaciones Biotecnológicas. Instituto de Investigaciones Biotecnológicas "Dr. Raúl Alfonsín" (sede Chascomús); Argentin

    Drug prescription support in dental clinics through drug corpus mining

    Get PDF
    The rapid increase in the volume and variety of data poses a challenge to safe drug prescription for the dentist. The increasing number of patients that take multiple drugs further exerts pressure on the dentist to make the right decision at point-of-care. Hence, a robust decision support system will enable dentists to make decisions on drug prescription quickly and accurately. Based on the assumption that similar drug pairs have a higher similarity ratio, this paper suggests an innovative approach to obtain the similarity ratio between the drug that the dentist is going to prescribe and the drug that the patient is currently taking. We conducted experiments to obtain the similarity ratios of both positive and negative drug pairs, by using feature vectors generated from term similarities and word embeddings of biomedical text corpus. This model can be easily adapted and implemented for use in a dental clinic to assist the dentist in deciding if a drug is suitable for prescription, taking into consideration the medical profile of the patients. Experimental evaluation of our model’s association of the similarity ratio between two drugs yielded a superior F score of 89%. Hence, such an approach, when integrated within the clinical work flow, will reduce prescription errors and thereby increase the health outcomes of patients

    Use of Text Data in Identifying and Prioritizing Potential Drug Repositioning Candidates

    Get PDF
    New drug development costs between 500 million and 2 billion dollars and takes 10-15 years, with a success rate of less than 10%. Drug repurposing (defined as discovering new indications for existing drugs) could play a significant role in drug development, especially considering the declining success rates of developing novel drugs. In the period 2007-2009, drug repurposing led to the launching of 30-40% of new drugs. Typically, new indications for existing medications are identified by accident. However, new technologies and a large number of available resources enable the development of systematic approaches to identify and validate drug-repurposing candidates with significantly lower cost. A variety of resources have been utilized to identify novel drug repurposing candidates such as biomedical literature, clinical notes, and genetic data. In this dissertation, we focused on using text data in identifying and prioritizing drug repositioning candidates and conducted five studies. In the first study, we aimed to assess the feasibility of using patient reviews from social media to identify potential candidates for drug repurposing. We retrieved patient reviews of 180 medications from an online forum, WebMD. Using dictionary-based and machine learning approaches, we identified disease names in the reviews. Several publicly available resources were used to exclude comments containing known indications and adverse drug effects. After manually reviewing some of the remaining comments, we implemented a rule-based system to identify beneficial effects. The dictionary-based system and machine learning system identified 2178 and 6171 disease names respectively in 64,616 patient comments. We provided a list of 10 common patterns that patients used to report any beneficial effects or uses of medication. After manually reviewing the comments tagged by our rule-based system, we identified five potential drug repurposing candidates. To our knowledge, this was the first study to consider using social media data to identify drug-repurposing candidates. We found that even a rule-based system, with a limited number of rules, could identify beneficial effect mentions in the comments of patients. Our preliminary study shows that social media has the potential to be used in drug repurposing. In the second study, we investigated the significance of extracting information from multiple sentences specifically in the context of drug-disease relation discovery. We used multiple resources such as Semantic Medline, a literature-based resource, and Medline search (for filtering spurious results) and inferred 8,772 potential drug-disease pairs. Our analysis revealed that 6,450 (73.5%) of the 8,772 potential drug-disease relations did not occur in a single sentence. Moreover, only 537 of the drug-disease pairs matched the curated gold standard in the Comparative Toxicogenomics Database (CTD), a trusted resource for drug-disease relations. Among the 537, nearly 75% (407) of the drug-disease pairs occur in multiple sentences. Our analysis revealed that the drug-disease pairs inferred from Semantic Medline or retrieved from CTD could be extracted from multiple sentences in the literature. This highlights the significance of the need for discourse-level analysis in extracting the relations from biomedical literature. In the third and fourth study, we focused on prioritizing drug repositioning candidates extracted from biomedical literature which we refer to as Literature-Based Discovery (LBD). In the third study, we used drug-gene and gene-disease semantic predications extracted from Medline abstracts to generate a list of potential drug-disease pairs. We further ranked the generated pairs, by assigning scores based on the predicates that qualify drug-gene and gene-disease relationships. On comparing the top-ranked drug-disease pairs against the Comparative Toxicogenomics Database, we found that a significant percentage of top-ranked pairs appeared in CTD. Co-occurrence of these high-ranked pairs in Medline abstracts is then used to improve the rankings of the inferred drug-disease relations. Finally, manual evaluation of the top-ten pairs ranked by our approach revealed that nine of them have good potential for biological significance based on expert judgment. In the fourth study, we proposed a method, utilizing information surrounding causal findings, to prioritize discoveries generated by LBD systems. We focused on discovering drug-disease relations, which have the potential to identify drug repositioning candidates or adverse drug reactions. Our LBD system used drug-gene and gene-disease semantic predication in SemMedDB as causal findings and Swanson’s ABC model to generate potential drug-disease relations. Using sentences, as a source of causal findings, our ranking method trained a binary classifier to classify generated drug-disease relations into desired classes. We trained and tested our classifier for three different purposes: a) drug repositioning b) adverse drug-event detection and c) drug-disease relation detection. The classifier obtained 0.78, 0.86, and 0.83 F-measures respectively for these tasks. The number of causal findings of each hypothesis, which were classified as positive by the classifier, is the main metric for ranking hypotheses in the proposed method. To evaluate the ranking method, we counted and compared the number of true relations in the top 100 pairs, ranked by our method and one of the previous methods. Out of 181 true relations in the test dataset, the proposed method ranked 20 of them in the top 100 relations while this number was 13 for the other method. In the last study, we used biomedical literature and clinical trials in ranking potential drug repositioning candidates identified by Phenome-Wide Association Studies (PheWAS). Unlike previous approaches, in this study, we did not limit our method to LBD. First, we generated a list of potential drug repositioning candidates using PheWAS. We retrieved 212,851 gene-disease associations from PheWAS catalog and 14,169 gene-drug relationships from DrugBank. Following Swanson’s model, we generated 52,966 potential drug repositioning candidates. Then, we developed an information retrieval system to retrieve any evidence of those candidates co-occurring in the biomedical literature and clinical trials. We identified nearly 14,800 drug-disease pairs with some evidence of support. In addition, we identified more than 38,000 novel candidates for re-purposing, encompassing hundreds of different disease states and over 1,000 individual medications. We anticipate that these results will be highly useful for hypothesis generation in the field of drug repurposing

    Identifying Novel Drug Indications through Automated Reasoning

    Get PDF
    abstract: Background With the large amount of pharmacological and biological knowledge available in literature, finding novel drug indications for existing drugs using in silico approaches has become increasingly feasible. Typical literature-based approaches generate new hypotheses in the form of protein-protein interactions networks by means of linking concepts based on their cooccurrences within abstracts. However, this kind of approaches tends to generate too many hypotheses, and identifying new drug indications from large networks can be a time-consuming process. Methodology In this work, we developed a method that acquires the necessary facts from literature and knowledge bases, and identifies new drug indications through automated reasoning. This is achieved by encoding the molecular effects caused by drug-target interactions and links to various diseases and drug mechanism as domain knowledge in AnsProlog, a declarative language that is useful for automated reasoning, including reasoning with incomplete information. Unlike other literature-based approaches, our approach is more fine-grained, especially in identifying indirect relationships for drug indications. Conclusion/Significance To evaluate the capability of our approach in inferring novel drug indications, we applied our method to 943 drugs from DrugBank and asked if any of these drugs have potential anti-cancer activities based on information on their targets and molecular interaction types alone. A total of 507 drugs were found to have the potential to be used for cancer treatments. Among the potential anti-cancer drugs, 67 out of 81 drugs (a recall of 82.7%) are indeed known cancer drugs. In addition, 144 out of 289 drugs (a recall of 49.8%) are non-cancer drugs that are currently tested in clinical trials for cancer treatments. These results suggest that our method is able to infer drug indications (original or alternative) based on their molecular targets and interactions alone and has the potential to discover novel drug indications for existing drugs.The article is published at http://journals.plos.org/plosone/article?id=10.1371/journal.pone.004094
    • …
    corecore