32 research outputs found

    Defining the Earliest Transcriptional Steps of Chondrogenic Progenitor Specification during the Formation of the Digits in the Embryonic Limb

    The characterization of genes involved in the formation of cartilage is of key importance to improve cell-based cartilage regenerative therapies. Here, we have developed a suitable experimental model to identify precocious chondrogenic events in vivo by inducing an ectopic digit in the developing embryo. In this model, only 12 hr after the implantation of a Tgfβ bead, and in the absence of increased cell proliferation, cartilage forms in undifferentiated interdigital mesoderm and, in the course of development, becomes a structurally and morphologically normal digit. Systematic quantitative PCR expression analysis, together with other experimental approaches, allowed us to establish 3 successive periods preceding the formation of cartilage. The “pre-condensation stage”, occurring within the first 3 hr of treatment, is characterized by the activation of connective tissue identity transcription factors (such as Sox9 and Scleraxis) and secreted factors (such as Activin A and the matricellular proteins CCN-1 and CCN-2) and by the downregulation of the galectin CG-8. Next, the “condensation stage” is characterized by intense activation of Smad1/5/8 BMP signaling and increased expression of extracellular matrix components. During this period, the CCN matricellular proteins promote the expression of extracellular matrix and cell adhesion components. The third period, designated the “pre-cartilage period”, precedes the formation of molecularly identifiable cartilage by 2–3 hr and is characterized by the intensification of Sox9 gene expression, along with the stimulation of other pro-chondrogenic transcription factors, such as Hif1a. In summary, this work establishes a temporal hierarchy in the regulation of pro-chondrogenic genes preceding cartilage differentiation and provides new insights into the relative roles of secreted factors and cytoskeletal regulators that direct the first steps of this process in vivo.

    Systematic interrogation of diverse Omic data reveals interpretable, robust, and generalizable transcriptomic features of clinically successful therapeutic targets.

    Target selection is the first and pivotal step in drug discovery. An incorrect choice may not manifest itself for many years after hundreds of millions of research dollars have been spent. We collected a set of 332 targets that succeeded or failed in phase III clinical trials, and explored whether Omic features describing the target genes could predict clinical success. We obtained features from the recently published comprehensive resource: Harmonizome. Nineteen features appeared to be significantly correlated with phase III clinical trial outcomes, but only 4 passed validation schemes that used bootstrapping or modified permutation tests to assess feature robustness and generalizability while accounting for target class selection bias. We also used classifiers to perform multivariate feature selection and found that classifiers with a single feature performed as well in cross-validation as classifiers with more features (AUROC = 0.57 and AUPR = 0.81). The two predominantly selected features were mean mRNA expression across tissues and standard deviation of expression across tissues, where successful targets tended to have lower mean expression and higher expression variance than failed targets. This finding supports the conventional wisdom that it is favorable for a target to be present in the tissue(s) affected by a disease and absent from other tissues. Overall, our results suggest that it is feasible to construct a model integrating interpretable target features to inform target selection. We anticipate deeper insights and better models in the future, as researchers can reuse the data we have provided to improve methods for handling sample biases and learn more informative features. Code, documentation, and data for this study have been deposited on GitHub at https://github.com/arouillard/omic-features-successful-targets
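The two expression features named in the abstract above, and the AUROC used to score them, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the gene names, tissue names, expression values, and outcome labels are invented, the population standard deviation is an arbitrary choice, and the code is not taken from the authors' repository. The toy data separate perfectly and so do not reflect the modest performance reported in the study.

```python
from statistics import mean, pstdev


def tissue_features(expression_by_tissue):
    """Return (mean, standard deviation) of one gene's expression across tissues."""
    values = list(expression_by_tissue.values())
    return mean(values), pstdev(values)


def auroc(scores, labels):
    """AUROC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: per-gene expression by tissue, and outcome (1 = success).
targets = {
    "GENE_A": ({"liver": 0.0, "brain": 0.0, "heart": 9.0}, 1),
    "GENE_B": ({"liver": 5.0, "brain": 5.0, "heart": 5.0}, 0),
    "GENE_C": ({"liver": 1.0, "brain": 7.0, "heart": 1.0}, 1),
    "GENE_D": ({"liver": 6.0, "brain": 4.0, "heart": 5.0}, 0),
}

features = {gene: tissue_features(expr) for gene, (expr, _) in targets.items()}
labels = [label for _, label in targets.values()]

# Higher variability and lower mean are each used as a "success" score.
std_scores = [features[gene][1] for gene in targets]
neg_mean_scores = [-features[gene][0] for gene in targets]

print("AUROC using std of expression:", auroc(std_scores, labels))
print("AUROC using negated mean expression:", auroc(neg_mean_scores, labels))
```

Negating the mean turns "lower mean expression" into a score where larger values indicate predicted success, so both features can be evaluated with the same function.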

    Predicting clinically promising therapeutic hypotheses using tensor factorization

    Background: Determining which target to pursue is a challenging and error-prone first step in developing a therapeutic treatment for a disease, where missteps are potentially very costly given the long time frames and high expenses of drug development. With current informatics technology and machine learning algorithms, it is now possible to computationally discover therapeutic hypotheses by predicting clinically promising drug targets based on the evidence associating drug targets with disease indications. We have collected this evidence from Open Targets and additional databases that cover 17 sources of evidence for target-indication association, and represented the data as a tensor of 21,437 × 2211 × 17.
    Results: As a proof of concept, we identified examples of successes and failures of target-indication pairs in clinical trials across 875 targets and 574 disease indications to build a gold-standard data set of 6140 known clinical outcomes. We designed and executed three benchmarking strategies to examine the performance of multiple machine learning models: Logistic Regression, LASSO, Random Forest, Tensor Factorization and Gradient Boosting Machine. With 10-fold cross-validation, tensor factorization achieved AUROC = 0.82 ± 0.02 and AUPRC = 0.71 ± 0.03. Across multiple validation schemes, this was comparable to or better than other methods.
    Conclusion: In this work, we benchmarked a machine learning technique called tensor factorization for the problem of predicting clinical outcomes of therapeutic hypotheses. Results have shown that this method can achieve equal or better prediction performance compared with a variety of baseline models. We demonstrate one application of the method to predict outcomes of trials on novel indications of approved drug targets. This work can be expanded to targets and indications that have never been clinically tested and to proposing novel target-indication hypotheses. Our proposed biologically motivated cross-validation schemes provide insight into the robustness of the prediction performance. This has significant implications for all future methods that try to address this seminal problem in drug discovery.
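To make the target × disease × evidence-source representation concrete, the following is a minimal sketch. The abstract does not say which factorization model or data layout was used, so the CP (sum of rank-one terms) form, the sparse dictionary layout, and all names and values below are assumptions chosen for illustration only.

```python
def build_sparse_tensor(records, targets, diseases, sources):
    """Index (target, disease, source, score) records into a sparse 3-way tensor.

    Returns a dict mapping (i, j, k) index triples to scores, plus the shape.
    """
    t_idx = {name: i for i, name in enumerate(targets)}
    d_idx = {name: j for j, name in enumerate(diseases)}
    s_idx = {name: k for k, name in enumerate(sources)}
    tensor = {}
    for target, disease, source, score in records:
        tensor[(t_idx[target], d_idx[disease], s_idx[source])] = score
    return tensor, (len(targets), len(diseases), len(sources))


def cp_entry(A, B, C, i, j, k):
    """Entry (i, j, k) of a rank-R CP model: sum over r of A[i][r] * B[j][r] * C[k][r]."""
    return sum(a * b * c for a, b, c in zip(A[i], B[j], C[k]))


# Hypothetical toy evidence records; names and scores are invented.
targets = ["TARGET_1", "TARGET_2"]
diseases = ["DISEASE_X", "DISEASE_Y"]
sources = ["genetics", "literature", "expression"]
records = [
    ("TARGET_1", "DISEASE_X", "genetics", 0.9),
    ("TARGET_2", "DISEASE_Y", "literature", 0.4),
]
tensor, shape = build_sparse_tensor(records, targets, diseases, sources)
print("shape:", shape, "stored entries:", len(tensor))

# Hypothetical rank-2 factor matrices (one row per target, disease, source).
A = [[1.0, 2.0], [0.5, 0.0]]
B = [[3.0, 0.5], [1.0, 1.0]]
C = [[1.0, 2.0], [0.0, 1.0], [1.0, 0.0]]
print("model value at (0, 0, 0):", cp_entry(A, B, C, 0, 0, 0))
```

In a factorization of this kind, the factor matrices would be fitted so that the model values approximate the observed entries; the fitting step is omitted here.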

    Examples of successful tissue specific targets.


    Modeling pipeline.

    We trained a classifier to predict phase III clinical trial outcomes, using 5-fold cross-validation repeated 200 times to assess the stability of the classifier and estimate its generalization performance. For each fold of cross-validation, modeling began with the non-redundant features for each dataset.
    Step 1: We split the targets with phase III outcomes into training and testing sets.
    Step 2: We performed univariate feature selection using permutation tests to quantify the significance of the difference between the means of the successful and failed targets in the training examples. We controlled for target class as a confounding factor by only shuffling outcomes within target classes. We accepted features with adjusted p-values less than 0.05 after correcting for multiple hypothesis testing using the Benjamini-Yekutieli method.
    Step 3: We aggregated significant features from all datasets into a single feature matrix.
    Step 4: We performed incremental feature elimination with an inner 5-fold cross-validation loop repeated 20 times to select the type of classifier (Random Forest or logistic regression) and smallest subset of features that had cross-validation area under the receiver operating characteristic curve (AUROC) and area under the precision-recall curve (AUPR) values within 95% of maximum.
    Step 5: We refit the selected model using all the training examples and evaluated its performance on the test examples.
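Step 2 of the pipeline above can be sketched as follows. This is an illustrative reading of the caption, not the authors' code: the function names, the two-sided test statistic, the number of permutations, and the toy data in the usage example are all assumptions.

```python
import random
from collections import defaultdict


def mean_diff(values, outcomes):
    """Mean feature value of successes (1) minus mean of failures (0)."""
    succ = [v for v, o in zip(values, outcomes) if o == 1]
    fail = [v for v, o in zip(values, outcomes) if o == 0]
    return sum(succ) / len(succ) - sum(fail) / len(fail)


def shuffle_within_classes(outcomes, classes, rng):
    """Permute outcome labels only among targets that share a target class."""
    by_class = defaultdict(list)
    for idx, cls in enumerate(classes):
        by_class[cls].append(idx)
    shuffled = list(outcomes)
    for indices in by_class.values():
        labels = [outcomes[i] for i in indices]
        rng.shuffle(labels)
        for i, label in zip(indices, labels):
            shuffled[i] = label
    return shuffled


def permutation_p_value(values, outcomes, classes, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the difference in means."""
    rng = random.Random(seed)
    observed = abs(mean_diff(values, outcomes))
    hits = 0
    for _ in range(n_perm):
        permuted = shuffle_within_classes(outcomes, classes, rng)
        if abs(mean_diff(values, permuted)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction avoids p = 0


def benjamini_yekutieli(p_values):
    """Benjamini-Yekutieli adjusted p-values, returned in the input order."""
    m = len(p_values)
    c_m = sum(1.0 / i for i in range(1, m + 1))
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m * c_m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical toy example: one feature, two target classes.
classes = ["kinase"] * 6 + ["gpcr"] * 6
outcomes = [1, 1, 1, 0, 0, 0] * 2
values = [10, 11, 12, 1, 2, 3] * 2
p = permutation_p_value(values, outcomes, classes)
print("permutation p-value:", p)
print("BY-adjusted:", benjamini_yekutieli([p, 0.04, 0.03]))
```

Shuffling within classes keeps the success rate of each target class fixed in every permutation, so a feature that merely tracks target class does not look significant on that basis alone.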

    Features significantly correlated with phase III outcome.


    Examples of failed tissue specific targets with plausible exceptions.


    Datasets tested for features significantly separating successful targets from failed targets.


    Classifier performance.

    (A) Receiver operating characteristic (ROC) curve. The solid black line indicates the median performance across 200 repetitions of 5-fold cross-validation and the gray area indicates the range between the 2.5 and 97.5 percentiles. The dotted black line indicates the performance of random rankings.
    (B) Distributions of the probability of success predicted by the classifier for the successful, failed, and unlabeled targets.
    (C) Precision-recall curve for success predictions.
    (D) Precision-recall curve for failure predictions.
    (E) Pairwise target comparisons. For each pair of targets, we computed the fraction of repetitions of cross-validation in which Target B had a higher predicted probability of success than Target A. The heatmap illustrates this fraction, thresholded at 0.95 or 0.99, plotted as a function of the median predicted probabilities of success of the two targets. The upper left region is where the classifier is 95% (above solid black line) or 99% (above dotted blue line) consistent in predicting greater probability of success of Target B than Target A.
    (F) Relationship between features and phase III outcomes. Heat map showing the projection of the predicted success probabilities onto the two dominant features selected for the classifier: mean expression across tissues and standard deviation of expression across tissues. Red, white, and blue background colors correspond to 1, 0.5, and 0 success probabilities. Red plusses and blue crosses mark the locations of the success and failure examples. It appears the model has learned that failures tend to have high mean expression and low standard deviation of expression across tissues, while successes tend to have low mean expression and high standard deviation of expression. The success and failure examples are not well separated, indicating that we did not discover enough features to fully explain why targets succeed or fail in phase III clinical trials.
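The pairwise comparison described in panel (E) can be sketched as below. The function names and the predicted probabilities are hypothetical, and the sketch assumes that each target has one predicted probability per cross-validation repetition, which the caption does not state explicitly.

```python
def fraction_b_above_a(pred_a, pred_b):
    """Fraction of repetitions in which Target B's predicted probability exceeds Target A's."""
    if len(pred_a) != len(pred_b):
        raise ValueError("both targets need one prediction per repetition")
    return sum(b > a for a, b in zip(pred_a, pred_b)) / len(pred_a)


def consistently_higher(pred_a, pred_b, threshold=0.95):
    """True if B outranks A in at least `threshold` of the repetitions."""
    return fraction_b_above_a(pred_a, pred_b) >= threshold


# Hypothetical predicted success probabilities over four repetitions.
pred_a = [0.4, 0.5, 0.6, 0.3]
pred_b = [0.7, 0.6, 0.5, 0.8]
print("fraction where B > A:", fraction_b_above_a(pred_a, pred_b))
print("consistent at 0.95:", consistently_higher(pred_a, pred_b))
```

With 200 repetitions, as in the caption, a 0.95 threshold would correspond to B outranking A in at least 190 of them.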

    Examples of failed ubiquitously expressed targets.
