Drug Repurposing Using Deep Embeddings of Gene Expression Profiles
Computational drug repositioning requires assessment of the functional
similarities among compounds. Here, we report a new method for measuring
compound functional similarity based on gene expression data. This
approach takes advantage of deep neural networks to learn an embedding
that substantially denoises expression data, making replicates of
the same compound more similar. Our method uses unlabeled data in
the sense that it only requires compounds to be labeled by identity
rather than detailed pharmacological information, which is often unavailable
and costly to obtain. Similarity in the learned embedding space accurately
predicted pharmacological similarities despite the lack of any such
labels during training and achieved substantially improved performance
in comparison with previous similarity measures applied to gene expression
measurements. Our method could identify drugs with shared therapeutic
and biological targets even when the compounds were structurally dissimilar,
thereby revealing previously unreported functional relationships between
compounds. Thus, our approach provides an improved engine for drug
repurposing based on expression data, which we have made available
through the online tool DeepCodex (http://deepcodex.org).
Visual Data Mining of Biological Networks: One Size Does Not Fit All
<div><p>High-throughput technologies produce massive amounts of data. However, individual methods yield data specific to the technique used and biological setup. The integration of such diverse data is necessary for the qualitative analysis of information relevant to hypotheses or discoveries. It is often useful to integrate these datasets using pathways and protein interaction networks to get a broader view of the experiment. The resulting network needs to be able to focus on either the large-scale picture or on the more detailed small-scale subsets, depending on the research question and goals. In this tutorial, we illustrate a workflow useful to integrate, analyze, and visualize data from different sources, and highlight important features of tools to support such analyses.</p></div>
Significant drugs affect more genes than other Connectivity Map drugs.
<p>We used CMap data to calculate the number of genes that were significantly differentially regulated (P < 0.05) for each of 1,309 drugs. Drugs that we identified as reversing the gene changes seen with lung cancer affected significantly more genes than other drugs (median of 8.5 vs. 3 genes; Wilcoxon test P << 0.01).</p>
Drugs treat multiple subtypes of lung cancer.
<p>We ran CMapBatch on 10 adenocarcinoma signatures only, and on 6 squamous cell carcinoma signatures only. 79 drugs were common to the lists of top 100 drugs for both cancer subtypes.</p>
Significant drugs share many protein targets.
<p><b>A</b>. In the drug-target network for drug candidates, two drugs are connected by an edge if they have the same protein target. Shown in colour are the drugs that slow growth in 5 or more lung cancer cell lines (blue), their immediate neighbours (purple), and the drugs that are structurally similar to them (green). Green edges indicate drug pairs that, in addition to sharing a protein target, were also found to be highly structurally similar (see <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1004068#pcbi.1004068.g004" target="_blank">Fig. 4</a>). <b>B</b>. 83 significant drugs are represented in the drug-target network, and the largest connected component contains 72 drugs. 10,000 random draws of 83 drugs from the drug-target network resulted in smaller connected components (median size 42 drugs; P << 0.01).</p>
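The random-draw comparison in panel B can be illustrated with a minimal sketch. It is not the authors' code: it assumes that the component size is measured on the subgraph induced by the selected drugs, and that the P value is the fraction of random draws whose largest component is at least as large as the observed one. The function names and the toy drug and target labels are hypothetical.

```python
import random
from collections import defaultdict
from itertools import combinations


def build_drug_network(drug_targets):
    """Connect two drugs whenever they share at least one protein target."""
    by_target = defaultdict(set)
    for drug, targets in drug_targets.items():
        for target in targets:
            by_target[target].add(drug)
    neighbours = {drug: set() for drug in drug_targets}
    for drugs in by_target.values():
        for a, b in combinations(sorted(drugs), 2):
            neighbours[a].add(b)
            neighbours[b].add(a)
    return neighbours


def largest_component_size(neighbours, subset):
    """Size of the largest connected component of the subgraph induced by `subset`."""
    subset = set(subset)
    seen, best = set(), 0
    for start in subset:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # depth-first traversal restricted to `subset`
            node = stack.pop()
            size += 1
            for nxt in neighbours[node] & subset:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        best = max(best, size)
    return best


def empirical_p_value(neighbours, selected, n_draws=10_000, seed=0):
    """Compare the observed largest component with random draws of equal size."""
    rng = random.Random(seed)
    observed = largest_component_size(neighbours, selected)
    all_drugs = sorted(neighbours)
    hits = sum(
        largest_component_size(neighbours, rng.sample(all_drugs, len(selected))) >= observed
        for _ in range(n_draws)
    )
    # Add-one correction so the estimate is never exactly zero (an assumption).
    return observed, (hits + 1) / (n_draws + 1)


# Toy, made-up data: drug names and targets are placeholders, not CMap entries.
toy_targets = {
    "drugA": {"T1"}, "drugB": {"T1", "T2"}, "drugC": {"T2"},
    "drugD": {"T3"}, "drugE": {"T4"}, "drugF": {"T5"},
}
network = build_drug_network(toy_targets)
observed, p = empirical_p_value(network, ["drugA", "drugB", "drugC"], n_draws=1000)
```

With real data, `drug_targets` would map each drug in the network to its annotated protein targets, and `selected` would be the list of significant drugs.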
Network visualization of the query genes and their involvement in other tumor types.
<p>a) Network built on aging and cancer genes. Labeled nodes belong to both gene lists. Square nodes represent cancer genes while diamonds represent aging genes. b) Deregulation of the network genes in lung (represented by down arrows) and ovarian cancer (represented by up arrows). The height and width of the nodes are proportional to the number of studies where the genes are deregulated. Node transparency corresponds to overall number of studies where the gene is deregulated. c) Network integrating chemical compounds targeting the query genes. Hexagonal nodes represent drugs. The names of the drugs interacting with the shared genes are shown. C: cancer genes, A: aging genes, D: drugs. C1, A1: genes interacting with shared ones, C2, A2: genes not interacting with the shared ones. D1: drugs targeting only aging genes, D2: drugs targeting both aging and cancer genes, D3: drugs targeting only cancer genes. Node colors represent GO categories as per legend. Edges are colored to differentiate inter- and intra-group interactions.</p>
Twenty-one lung cancer gene signatures (tumour vs. normal comparisons).
Pimozide reduces viability in four lung cancer cell lines.
<p>Results of the MTT assay in A549, HCC4006, H1437, and H4006 cell lines. Bar height indicates the mean and error bars the standard deviation of 3 biological replicates. Y-axis shows percent viability relative to untreated cells. In each cancer cell line, pimozide shows a significant cytotoxic effect. Asterisk indicates P < 0.05 (t-test).</p>
CMapBatch produces more stable lists of significant drugs than individual gene signatures.
<p>Shown are boxplots of the number of conserved drug candidates when any two lists of top 50 drug candidates are intersected. Green: 21 gene signatures were split into two disjoint sets of 10 and 11 signatures, CMapBatch was run on both sets, and top drugs from each set were compared; this experiment was repeated 100 times. Blue: 21 gene signatures were used to retrieve 21 lists of drugs with the CMap online tool; top drugs from all pairs of signatures were compared. Grey: 10 gene signatures of the same lung cancer type (adenocarcinoma) were used to retrieve 10 lists of drugs with the CMap online tool; top drugs from all pairs of signatures were compared. CMapBatch results showed a significantly higher median overlap (Wilcoxon test P << 0.01).</p>
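The overlap measure behind these boxplots, the number of drugs shared by two top-k lists, can be sketched as follows. This covers only the intersection step for all pairs of ranked lists; it does not reproduce CMapBatch or the CMap online tool, and the function names and toy rankings are hypothetical.

```python
from itertools import combinations
from statistics import median


def top_k(ranked_drugs, k):
    """Return the first k entries of a ranked drug list as a set."""
    return set(ranked_drugs[:k])


def pairwise_overlaps(ranked_lists, k=50):
    """Number of shared drugs for every pair of top-k lists."""
    tops = [top_k(ranked, k) for ranked in ranked_lists]
    return [len(a & b) for a, b in combinations(tops, 2)]


# Toy rankings with placeholder drug names; real lists would hold 50 or more drugs each.
toy_lists = [
    ["d1", "d2", "d3", "d4"],
    ["d2", "d1", "d5", "d3"],
    ["d6", "d2", "d7", "d1"],
]
overlaps = pairwise_overlaps(toy_lists, k=3)
median_overlap = median(overlaps)
```

Each boxplot in the figure would then summarise one such list of pairwise overlaps, computed with k = 50.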