31 research outputs found

    An eScience-Bayes strategy for analyzing omics data

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>The omics fields promise to revolutionize our understanding of biology and biomedicine. However, their potential is compromised by the challenge to analyze the huge datasets produced. Analysis of omics data is plagued by the curse of dimensionality, resulting in imprecise estimates of model parameters and performance. Moreover, the integration of omics data with other data sources is difficult to shoehorn into classical statistical models. This has resulted in <it>ad hoc </it>approaches to address specific problems.</p> <p>Results</p> <p>We present a general approach to omics data analysis that alleviates these problems. By combining eScience and Bayesian methods, we retrieve scientific information and data from multiple sources and coherently incorporate them into large models. These models improve the accuracy of predictions and offer new insights into the underlying mechanisms. This "eScience-Bayes" approach is demonstrated in two proof-of-principle applications, one for breast cancer prognosis prediction from transcriptomic data and one for protein-protein interaction studies based on proteomic data.</p> <p>Conclusions</p> <p>Bayesian statistics provide the flexibility to tailor statistical models to the complex data structures in omics biology as well as permitting coherent integration of multiple data sources. However, Bayesian methods are in general computationally demanding and require specification of possibly thousands of prior distributions. eScience can help us overcome these difficulties. The eScience-Bayes thus approach permits us to fully leverage on the advantages of Bayesian methods, resulting in models with improved predictive performance that gives more information about the underlying biological system.</p

    Polypharmacology modelling using proteochemometrics (PCM): recent methodological developments, applications to target families, and future prospects

    No full text
    Proteochemometric (PCM) modelling is a computational method to model the bioactivity of multiple ligands against multiple related protein targets simultaneously. Hence it has been found to be particularly useful when exploring the selectivity and promiscuity of ligands on different proteins. In this review, we will firstly provide a brief introduction to the main concepts of PCM for readers new to the field. The next part focuses on recent technical advances, including the application of support vector machines (SVMs) using different kernel functions, random forests, Gaussian processes and collaborative filtering. The subsequent section will then describe some novel practical applications of PCM in the medicinal chemistry field, including studies on GPCRs, kinases, viral proteins (e.g.from HIV) and epigenetic targets such as histone deacetylases. Finally, we will conclude by summarizing novel developments in PCM, which we expect to gain further importance in the future. These developments include adding three-dimensional protein target information, application of PCM to the prediction of binding energies, and application of the concept in the fields of pharmacogenomics and toxicogenomics. This review is an update to a related publication in 2011 and it mainly focuses on developments in the field since then
    corecore