DHLP 1&2: Giraph based distributed label propagation algorithms on heterogeneous drug-related networks
Background and Objective: Heterogeneous complex networks are large graphs
consisting of different types of nodes and edges. The knowledge extraction from
these networks is complicated. Moreover, the scale of these networks is
steadily increasing. Thus, scalable methods are required. Methods: In this
paper, two distributed label propagation algorithms for heterogeneous networks,
namely DHLP-1 and DHLP-2, are introduced. Biological networks are one type
of the heterogeneous complex networks. As a case study, we have measured the
efficiency of our proposed DHLP-1 and DHLP-2 algorithms on a biological network
consisting of drugs, diseases, and targets. The subject we have studied in this
network is drug repositioning but our algorithms can be used as general methods
for heterogeneous networks other than the biological network. Results: We
compared the proposed algorithms with their non-distributed counterparts,
namely MINProp and Heter-LP. The experiments showed good performance of
the proposed algorithms in terms of running time and accuracy.
Comment: Source code available for Apache Giraph on Hadoo
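As a rough illustration of what label propagation on a heterogeneous network involves, the sketch below runs a generic, single-machine propagation on a toy graph of drugs, targets, and diseases. It is not the DHLP-1/DHLP-2, MINProp, or Heter-LP update rule, which the abstract does not spell out; the graph, the node names, the damping factor `alpha`, and the function names are all assumptions made for the example.

```python
# Generic label propagation sketch (NOT the DHLP-1/DHLP-2 or Heter-LP update rules).
# The toy graph, node names, and alpha are illustrative assumptions.

def normalize(adj):
    """Symmetrically normalize a weighted adjacency dict: w_ij / sqrt(d_i * d_j)."""
    deg = {u: sum(nbrs.values()) for u, nbrs in adj.items()}
    return {
        u: {v: w / (deg[u] * deg[v]) ** 0.5 for v, w in nbrs.items()}
        for u, nbrs in adj.items()
    }

def propagate(adj, seeds, alpha=0.5, tol=1e-9, max_iter=1000):
    """Iterate f <- alpha * S f + (1 - alpha) * y until the largest change < tol."""
    s = normalize(adj)
    y = {u: float(seeds.get(u, 0.0)) for u in adj}
    f = dict(y)
    for _ in range(max_iter):
        new_f = {
            u: alpha * sum(w * f[v] for v, w in s[u].items()) + (1 - alpha) * y[u]
            for u in adj
        }
        if max(abs(new_f[u] - f[u]) for u in adj) < tol:
            return new_f
        f = new_f
    return f

# Hypothetical toy network mixing drugs, targets, and diseases.
edges = [
    ("drug_A", "target_1"), ("drug_B", "target_1"),
    ("drug_B", "target_2"), ("target_1", "disease_X"),
    ("target_2", "disease_Y"),
]
adj = {}
for u, v in edges:
    adj.setdefault(u, {})[v] = 1.0
    adj.setdefault(v, {})[u] = 1.0

# Seed the label on one disease and rank every other node by its propagated score.
scores = propagate(adj, seeds={"disease_X": 1.0})
ranking = sorted(scores, key=scores.get, reverse=True)
```

Nodes closer to the seed in the graph receive higher scores, so the ranking can be read as a list of candidate associations for the seeded disease. A distributed version would partition the per-node update across workers, which is the kind of vertex-centric computation Apache Giraph is designed for.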
A multilayer network approach for guiding drug repositioning in neglected diseases
Drug development for neglected diseases has been historically hampered due to lack of market incentives. The advent of public domain resources containing chemical information from high throughput screenings is changing the landscape of drug discovery for these diseases. In this work we took advantage of data from extensively studied organisms like human, mouse, E. coli and yeast, among others, to develop a novel integrative network model to prioritize and identify candidate drug targets in neglected pathogen proteomes, and bioactive drug-like molecules. We modeled genomic (proteins) and chemical (bioactive compounds) data as a multilayer weighted network graph that takes advantage of bioactivity data across 221 species, chemical similarities between 1.7 × 10^5 compounds and several functional relations among 1.67 × 10^5 proteins. These relations comprised orthology, sharing of protein domains, and shared participation in defined biochemical pathways. We showcase the application of this network graph to the problem of prioritization of new candidate targets, based on the information available in the graph for known compound-target associations. We validated this strategy by performing a cross validation procedure for known mouse and Trypanosoma cruzi targets and showed that our approach outperforms classic alignment-based approaches. Moreover, our model provides additional flexibility as two different network definitions could be considered, finding in both cases qualitatively different but sensible candidate targets. We also showcase the application of the network to suggest targets for orphan compounds that are active against Plasmodium falciparum in high-throughput screens. In this case our approach provided a reduced prioritization list of target proteins for the query molecules and showed the ability to propose new testable hypotheses for each compound.
Moreover, we found that some predictions highlighted by our network model were supported by independent experimental validations as found post-facto in the literature.
Affiliation: Berenstein, Ariel José. Fundación Instituto Leloir; Argentina. Universidad de Buenos Aires. Facultad de Ingeniería. Departamento de Física; Argentina
Affiliation: Magariños, María Paula. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto de Investigaciones Biotecnológicas. Instituto de Investigaciones Biotecnológicas "Dr. Raúl Alfonsín" (sede Chascomús). Universidad Nacional de San Martín. Instituto de Investigaciones Biotecnológicas. Instituto de Investigaciones Biotecnológicas "Dr. Raúl Alfonsín" (sede Chascomús); Argentina
Affiliation: Chernomoretz, Ariel. Fundación Instituto Leloir; Argentina. Universidad de Buenos Aires. Facultad de Ingeniería. Departamento de Física; Argentina
Affiliation: Fernandez Aguero, Maria Jose. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto de Investigaciones Biotecnológicas. Instituto de Investigaciones Biotecnológicas "Dr. Raúl Alfonsín" (sede Chascomús). Universidad Nacional de San Martín. Instituto de Investigaciones Biotecnológicas. Instituto de Investigaciones Biotecnológicas "Dr. Raúl Alfonsín" (sede Chascomús); Argentin
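To make the idea of prioritizing targets for an orphan compound through a compound layer and a protein layer more concrete, here is a minimal neighbor-voting sketch. It is only an illustration under stated assumptions: the compound and protein identifiers, the edge weights, and the scoring rule (chemical similarity multiplied by protein-relation weight, summed over paths) are hypothetical and are not the model or data described in the abstract.

```python
# Neighbor-voting sketch over a two-layer (compound, protein) network.
# All identifiers, weights, and the scoring rule are illustrative assumptions,
# not the model described in the abstract.

# Compound layer: chemical similarity between the query and annotated compounds.
compound_similarity = {
    ("query", "cmpd_1"): 0.9,
    ("query", "cmpd_2"): 0.4,
}

# Cross-layer edges: known bioactivity of compounds against model-organism proteins.
known_bioactivity = {
    "cmpd_1": ["mouse_P1"],
    "cmpd_2": ["mouse_P2"],
}

# Protein layer: functional relations (for example orthology or shared domains).
protein_relations = {
    ("mouse_P1", "tcruzi_Q1"): 0.8,
    ("mouse_P2", "tcruzi_Q2"): 0.5,
    ("mouse_P1", "tcruzi_Q2"): 0.1,
}

def prioritize(query, species_prefix):
    """Rank proteins of one species by summed path weight from the query compound."""
    scores = {}
    for (source, neighbor), similarity in compound_similarity.items():
        if source != query:
            continue
        for known_target in known_bioactivity.get(neighbor, []):
            for (protein, candidate), weight in protein_relations.items():
                if protein == known_target and candidate.startswith(species_prefix):
                    scores[candidate] = scores.get(candidate, 0.0) + similarity * weight
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

ranked = prioritize("query", "tcruzi_")
```

Each candidate accumulates evidence from every path that connects it to the query compound, so a pathogen protein related to several known targets of similar compounds rises in the list.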
Otter-Knowledge: benchmarks of multimodal knowledge graph representation learning from different sources for drug discovery
Recent research in representation learning utilizes large databases of
proteins or molecules to acquire knowledge of drug and protein structures
through unsupervised learning techniques. These pre-trained representations
have proven to significantly enhance the accuracy of subsequent tasks, such as
predicting the affinity between drugs and target proteins. In this study, we
demonstrate that by incorporating knowledge graphs from diverse sources and
modalities into the sequences or SMILES representation, we can further enrich
the representation and achieve state-of-the-art results on established
benchmark datasets. We provide preprocessed and integrated data obtained from 7
public sources, which encompass over 30M triples. Additionally, we make
available the pre-trained models based on this data, along with the reported
outcomes of their performance on three widely-used benchmark datasets for
drug-target binding affinity prediction found in the Therapeutic Data Commons
(TDC) benchmarks. Additionally, we make the source code for training models on
benchmark datasets publicly available. Our objective in releasing these
pre-trained models, accompanied by clean data for model pretraining and
benchmark results, is to encourage research in knowledge-enhanced
representation learning
DDR: efficient computational method to predict drug-target interactions using graph mining and machine learning approaches.
Motivation: Computationally finding drug-target interactions (DTIs) is a convenient strategy to identify new DTIs at low cost with reasonable accuracy. However, current DTI prediction methods suffer from a high false-positive prediction rate.
Results: We developed DDR, a novel method that improves DTI prediction accuracy. DDR is based on a heterogeneous graph that contains known DTIs together with multiple similarities between drugs and multiple similarities between target proteins. DDR applies a non-linear similarity fusion method to combine the different similarities. Before fusion, DDR performs a pre-processing step in which a subset of similarities is selected heuristically to obtain an optimized combination of similarities. Then, DDR applies a random forest model using different graph-based features extracted from the DTI heterogeneous graph. Using 5 repeats of 10-fold cross-validation, three testing setups, and the weighted average of area under the precision-recall curve (AUPR) scores, we show that DDR significantly reduces the AUPR score error relative to the next best state-of-the-art method for predicting DTIs by 34% when the drugs are new, by 23% when targets are new and by 34% when the drugs and the targets are known but not all DTIs between them are known. Using independent sources of evidence, we verify as correct 22 out of the top 25 DDR novel predictions. This suggests that DDR can be used as an efficient method to identify correct DTIs.
Availability and implementation: The data and code are provided at https://bitbucket.org/RSO24/ddr/.
Supplementary information: Supplementary data are available at Bioinformatics online
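The reported reductions in "AUPR score error" can be made concrete with a small calculation. The sketch below assumes that the error is defined as 1 - AUPR, which the abstract does not state, and uses average precision as a common step-wise estimate of AUPR. The labels, scores, and AUPR values are hypothetical and are not results from the paper.

```python
# Assumption: "AUPR score error" is read here as (1 - AUPR); the abstract does not define it.

def average_precision(labels, scores):
    """Average precision, a common step-wise estimate of the area under the PR curve."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    if hits == 0:
        raise ValueError("need at least one positive label")
    return total / hits

def relative_error_reduction(aupr_baseline, aupr_new):
    """Fraction by which (1 - AUPR) shrinks when moving from baseline to new."""
    baseline_error = 1.0 - aupr_baseline
    return (baseline_error - (1.0 - aupr_new)) / baseline_error

# Hypothetical labels and scores for four drug-target pairs (1 = true interaction).
ap = average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])

# Hypothetical AUPR values chosen only to illustrate a 34% reduction.
reduction = relative_error_reduction(0.80, 0.868)
```

Under this reading, moving from an AUPR of 0.80 to 0.868 shrinks the error from 0.20 to 0.132, which is a 34% relative reduction even though the absolute AUPR gain is under seven points.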
Perturbation Detection Through Modeling of Gene Expression on a Latent Biological Pathway Network: A Bayesian hierarchical approach
Cellular response to a perturbation is the result of a dynamic system of
biological variables linked in a complex network. A major challenge in drug and
disease studies is identifying the key factors of a biological network that are
essential in determining the cell's fate.
Here our goal is the identification of perturbed pathways from
high-throughput gene expression data. We develop a three-level hierarchical
model, where (i) the first level captures the relationship between gene
expression and biological pathways using confirmatory factor analysis, (ii) the
second level models the behavior within an underlying network of pathways
induced by an unknown perturbation using a conditional autoregressive model,
and (iii) the third level is a spike-and-slab prior on the perturbations. We
then identify perturbations through posterior-based variable selection.
We illustrate our approach using gene transcription drug perturbation
profiles from the DREAM7 drug sensitivity prediction challenge data set. Our
proposed method identified regulatory pathways that are known to play a
causative role and that were not readily resolved using gene set enrichment
analysis or exploratory factor models. Simulation results are presented
assessing the performance of this model relative to a network-free variant and
its robustness to inaccuracies in biological databases
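One way to write down a three-level model of the kind described is sketched below. The notation is an assumption made for illustration: the abstract names the three components (confirmatory factor analysis, a conditional autoregressive model, and a spike-and-slab prior) but gives no formulas, so every symbol and the exact parameterization here are hypothetical. Here y is the expression vector, η_k the latent activity of pathway k, θ_k the perturbation of pathway k, w_kj the pathway-network edge weights with row sums w_k+, and δ_0 a point mass at zero.

```latex
% Illustrative notation only; every symbol below is an assumption.
\begin{align}
  % Level 1: confirmatory factor analysis linking genes to pathways
  \mathbf{y} &= \Lambda \boldsymbol{\eta} + \boldsymbol{\varepsilon},
  \qquad \boldsymbol{\varepsilon} \sim \mathcal{N}(\mathbf{0}, \Psi),
  \qquad \Lambda_{gk} = 0 \ \text{if gene } g \text{ is not in pathway } k \\
  % Level 2: conditional autoregressive model on the pathway network
  \eta_k \mid \boldsymbol{\eta}_{-k}, \boldsymbol{\theta} &\sim
  \mathcal{N}\!\left(\theta_k + \rho \sum_{j \neq k} \frac{w_{kj}}{w_{k+}}\,(\eta_j - \theta_j),\;
  \frac{\sigma^2}{w_{k+}}\right) \\
  % Level 3: spike-and-slab prior on the perturbations
  \theta_k \mid \gamma_k &\sim (1 - \gamma_k)\,\delta_0 + \gamma_k\,\mathcal{N}(0, \tau^2),
  \qquad \gamma_k \sim \mathrm{Bernoulli}(\pi)
\end{align}
```

With a prior of this form, posterior-based variable selection amounts to flagging pathway k as perturbed when the posterior inclusion probability P(γ_k = 1 | y) exceeds a chosen threshold; 0.5 is one common choice, though the abstract does not say which rule the authors use.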
A computational drug repositioning method applied to rare diseases: adrenocortical carcinoma
Rare or orphan diseases affect only small populations, thereby limiting the economic incentive for the drug development process, often resulting in a lack of progress towards treatment. Drug repositioning is a promising approach in these cases, due to its low cost. In this approach, one attempts to identify new purposes for existing drugs that have already been developed and approved for use. By applying the process of drug repositioning to identify novel treatments for rare diseases, we can overcome the lack of economic incentives and make concrete progress towards new therapies. Adrenocortical Carcinoma (ACC) is a rare disease with no practical and definitive therapeutic approach. We apply Heter-LP, a new method of drug repositioning, to suggest novel therapeutic avenues for ACC. Our analysis identifies innovative putative drug-disease, drug-target, and disease-target relationships for ACC, which include Cosyntropin (drug) and DHCR7, IGF1R, MC1R, MAP3K3, TOP2A (protein targets). When results are analyzed using all available information, a number of novel predicted associations related to ACC appear to be valid according to current knowledge. We expect the predicted relations will be useful for drug repositioning in ACC since the resulting ranked lists of drugs and protein targets can be used to expedite the necessary clinical processes