
    An eScience-Bayes strategy for analyzing omics data

    Abstract

    Background: The omics fields promise to revolutionize our understanding of biology and biomedicine. However, their potential is compromised by the challenge of analyzing the huge datasets they produce. Analysis of omics data is plagued by the curse of dimensionality, resulting in imprecise estimates of model parameters and performance. Moreover, the integration of omics data with other data sources is difficult to shoehorn into classical statistical models. This has resulted in ad hoc approaches to address specific problems.

    Results: We present a general approach to omics data analysis that alleviates these problems. By combining eScience and Bayesian methods, we retrieve scientific information and data from multiple sources and coherently incorporate them into large models. These models improve the accuracy of predictions and offer new insights into the underlying mechanisms. This "eScience-Bayes" approach is demonstrated in two proof-of-principle applications: one for breast cancer prognosis prediction from transcriptomic data, and one for protein-protein interaction studies based on proteomic data.

    Conclusions: Bayesian statistics provides the flexibility to tailor statistical models to the complex data structures of omics biology, and it permits coherent integration of multiple data sources. However, Bayesian methods are in general computationally demanding and require the specification of possibly thousands of prior distributions. eScience can help us overcome these difficulties. The eScience-Bayes approach thus permits us to fully leverage the advantages of Bayesian methods, resulting in models with improved predictive performance that give more information about the underlying biological system.
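    The paper's actual models are not reproduced here, but the core idea of encoding externally retrieved knowledge as informative priors can be sketched in a few lines. Below is a minimal, hypothetical illustration using PyMC: a Bayesian logistic regression for prognosis prediction in which per-gene prior scales are derived from an invented external "relevance" score, so coefficients of genes with no supporting evidence are shrunk toward zero.

```python
# Hypothetical sketch (not the paper's model): informative per-gene priors
# retrieved from external sources regularize a high-dimensional classifier.
import numpy as np
import pymc as pm

rng = np.random.default_rng(0)
n_samples, n_genes = 60, 200
X = rng.normal(size=(n_samples, n_genes))   # toy expression matrix
y = rng.integers(0, 2, size=n_samples)      # toy relapse labels (0/1)

# Invented stand-in for knowledge retrieved from literature/databases,
# rescaled to per-gene prior standard deviations: genes with more external
# support are allowed larger effects, the rest are shrunk toward zero.
relevance = rng.uniform(0.0, 1.0, size=n_genes)
prior_sd = 0.01 + relevance

with pm.Model():
    beta = pm.Normal("beta", mu=0.0, sigma=prior_sd, shape=n_genes)
    intercept = pm.Normal("intercept", mu=0.0, sigma=1.0)
    p = pm.math.sigmoid(intercept + pm.math.dot(X, beta))
    pm.Bernoulli("y", p=p, observed=y)
    idata = pm.sample(500, tune=500, chains=2, progressbar=False)
```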

    Challenges in funding and developing genomic software: roots and remedies

    The computer software used for genomic analysis has become a crucial component of the infrastructure of the life sciences. However, genomic software is still typically developed in an ad hoc manner, with inadequate funding, and by academic researchers not trained in software development, at substantial cost to the research community. I examine the roots of the incongruity between the importance of genomic software and the degree of investment in it, and I suggest several potential remedies for current problems. As genomics continues to grow, new strategies for funding and developing the software that powers the field will become increasingly essential.

    Linking the Resource Description Framework to cheminformatics and proteochemometrics

    Abstract

    Background: Semantic web technologies are finding their way into the life sciences. Ontologies and semantic markup have been used in the molecular sciences for more than a decade, but have not yet found widespread use. The semantic web technology Resource Description Framework (RDF) and related methods appear sufficiently versatile to change that situation.

    Results: The work presented here focuses on linking RDF approaches to existing molecular chemometrics fields, including cheminformatics, QSAR modeling, and proteochemometrics. Applications are presented that link RDF technologies to methods from statistics and cheminformatics, including data aggregation, visualization, chemical identification, and property prediction, and that demonstrate how this can be done using various existing RDF standards and cheminformatics libraries. For example, we show how IC50 and Ki values are modeled for a number of biological targets using data from the ChEMBL database.

    Conclusions: We have shown that existing RDF standards can suitably be integrated into existing molecular chemometrics methods. Platforms that unite these technologies, like Bioclipse, make this even simpler and more transparent. Being able to create and share workflows that integrate data aggregation and analysis (visual and statistical) is beneficial to interoperability and reproducibility. The current work shows that RDF approaches are sufficiently powerful to support molecular chemometrics workflows.
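    As a rough illustration of what such a workflow looks like in practice, the sketch below queries an RDF store for IC50 activities with SPARQL and hands the results to ordinary Python code. The endpoint URL and the cco vocabulary terms are assumptions modeled on the public ChEMBL RDF distribution, not details taken from the paper.

```python
# Hedged sketch: pull bioactivity values from an RDF store via SPARQL,
# then process them with ordinary Python tooling. Endpoint and vocabulary
# terms below are illustrative assumptions.
from SPARQLWrapper import SPARQLWrapper, JSON

endpoint = SPARQLWrapper("https://www.ebi.ac.uk/rdf/services/sparql")  # assumed endpoint
endpoint.setQuery("""
PREFIX cco: <http://rdf.ebi.ac.uk/terms/chembl#>
SELECT ?activity ?value
WHERE {
  ?activity a cco:Activity ;
            cco:standardType "IC50" ;
            cco:standardValue ?value .
}
LIMIT 10
""")
endpoint.setReturnFormat(JSON)
results = endpoint.query().convert()

# Once the triples are in plain Python structures, any statistics or
# cheminformatics library can take over (data aggregation, modeling, ...).
for row in results["results"]["bindings"]:
    print(row["activity"]["value"], row["value"]["value"])
```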

    Statistical strategies for avoiding false discoveries in metabolomics and related experiments

    Computational methods to explore hierarchical and modular structure of biological networks

    Networks have been widely used to understand the structure of complex systems. From studying biological networks of protein-protein, genetic, and other types of interactions, we gain insights into the functional organization of static biological systems that could hardly be measured experimentally with current state-of-the-art technology. Biological networks also serve as a principled framework for integrating multiple sources of genome-wide data, such as gene expression arrays and sequencing. Yet a large-scale network is often intractable for intuitive visualization and computation. We developed novel network clustering algorithms to harness the power of genome-scale biological networks of all genes/proteins. In particular, our algorithms are capable of finding hidden modular structures under the hierarchical stochastic block model. Since the modules are organized hierarchically, our algorithms facilitate downstream analysis and the design of in-depth validation experiments in a "divide-and-conquer" strategy. Moreover, we present empirical evidence that a hierarchical and modular structure best explains observed biological networks. We used the static clustering methods in two ways. First, we extended the static methods to dynamic clustering problems and observed general patterns in the dynamics of network modules; for example, we demonstrate the dynamics of the yeast metabolic cycle and of the Arabidopsis root developmental process. Second, we propose a prioritization scheme that sorts identified network modules in order of discriminative power. In the course of this research we conclude that biological networks are best understood as hierarchically organized modules, and that the modules remain stable in unperturbed biological processes but can respond differently to abnormal or external perturbations such as the knock-down of key enzymes.
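    The thesis's own algorithms are not shown here; the toy sketch below only illustrates the divide-and-conquer idea using off-the-shelf modularity clustering from networkx: detect modules, then recurse into each sufficiently large module to expose a hierarchy of sub-modules.

```python
# Minimal illustration (not the thesis's algorithm) of hierarchical,
# divide-and-conquer module detection in a network.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def hierarchical_modules(graph, min_size=8, depth=0):
    """Recursively split a graph into modules and print the hierarchy."""
    communities = greedy_modularity_communities(graph)
    if len(communities) <= 1:  # no further modular structure found
        return
    for i, nodes in enumerate(communities):
        print("  " * depth + f"module {i}: {len(nodes)} nodes")
        if len(nodes) > min_size:
            hierarchical_modules(graph.subgraph(nodes), min_size, depth + 1)

# Toy stand-in for a genome-scale interaction network: 6 densely
# connected cliques of 10 nodes each, sparsely linked to one another.
g = nx.connected_caveman_graph(6, 10)
hierarchical_modules(g)
```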

    Scientific Workflows for Metabolic Flux Analysis

    Metabolic engineering is a highly interdisciplinary research domain that interfaces biology, mathematics, computer science, and engineering. Metabolic flux analysis with carbon tracer experiments (13C-MFA) is a particularly challenging metabolic engineering application that consists of several tightly interwoven building blocks such as modeling, simulation, and experimental design. While several general-purpose workflow solutions have emerged in recent years to support the realization of complex scientific applications, these approaches transfer only partially to 13C-MFA workflows. Whereas problems in other research fields (e.g., bioinformatics) are primarily centered around scientific data processing, 13C-MFA workflows have more in common with business workflows. For instance, many bioinformatics workflows are designed to identify, compare, and annotate genomic sequences by "pipelining" them through standard tools like BLAST; typically, the next workflow task in the pipeline can be determined automatically from the outcome of the previous step.

    Five computational challenges have been identified in the endeavor of conducting 13C-MFA studies: organization of heterogeneous data, standardization of processes and the unification of tools and data, interactive workflow steering, distributed computing, and service orientation. The outcome of this thesis is a scientific workflow framework (SWF) that is custom-tailored to the specific requirements of 13C-MFA applications. The proposed approach, namely designing the SWF as a collection of loosely coupled modules that are glued together with web services, eases the realization of 13C-MFA workflows by offering several features. By design, existing tools are integrated into the SWF using web service interfaces and foreign programming language bindings (e.g., Java or Python). Although the attributes "easy-to-use" and "general-purpose" are rarely associated with distributed computing software, the presented use cases show that the proposed Hadoop MapReduce framework eases the deployment of computationally demanding simulations on cloud and cluster computing resources.

    An important building block for enabling interactive, researcher-driven workflows is the ability to track all data needed to understand and reproduce a workflow. The standardization of 13C-MFA studies using a folder structure template and the corresponding services and web interfaces improves the exchange of information within a group of researchers. Finally, several auxiliary tools were developed in the course of this work to complement the SWF modules, ranging from simple helper scripts to visualization and data conversion programs. This solution distinguishes itself from other scientific workflow approaches by offering a system of loosely coupled components that can be flexibly arranged to match the typical requirements of the metabolic engineering domain. Because the framework is modern and service-oriented, new applications are easily composed by reusing existing components.
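    To give a flavor of the loose-coupling idea, the sketch below wraps a stub simulation step behind a small web service using Flask, so a workflow client can invoke it over HTTP instead of being hard-wired to the tool. The route name and payload fields are invented for illustration and do not reflect the thesis's actual interfaces.

```python
# Hedged sketch: a workflow module exposed as a web service, so steps can
# be composed over HTTP. The simulation itself is a stand-in stub.
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/simulate", methods=["POST"])
def simulate():
    """Accept a model description and return (stub) flux estimates."""
    model = request.get_json()
    # Stand-in for a real, computationally demanding 13C-MFA simulation.
    fluxes = {reaction: 1.0 for reaction in model.get("reactions", [])}
    return jsonify({"fluxes": fluxes})

if __name__ == "__main__":
    app.run(port=5000)
```

    A workflow driver could then compose such steps with plain HTTP calls, e.g. requests.post("http://localhost:5000/simulate", json={"reactions": ["v1", "v2"]}), which is what makes the modules loosely coupled and reusable across workflows.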