7 research outputs found

Inferring causal molecular networks: empirical assessment through a community-based effort.

Author: A de la Fuente
AA Margolin
Adrian Bivol
Alexander J Bisberg
Alexander V Favorov
Amina A Qutub
Artem Sokolov
Bahman Afsari
BT Hennessy
Byron L Long
C Olsen
Chenyue W Hu
Chris K Wong
CM Chresta
D Freedman
D Husmeier
D Marbach
Dane Taylor
Daniel E Carlin
David P Noren
EG Cerami
EJ Molinelli
Elana J Fertig
Evan O Paull
F Eduati
F Markowetz
Fan Zhu
G Stolovitzky
Gordon B Mills
Gustavo Stolovitzky
H Wang
Haizhou Wang
Heinz Koeppl
I Cantone
J Barretina
J Saez-Rodriguez
JC Costello
JMJ Derry
Joe W Gray
Joshua M Stuart
Julio Saez-Rodriguez
K Sachs
Kiley Graim
Laura M Heiser
Ludmila V Danilova
M Bansal
M Hecker
MH Maathuis
Michael Kellen
Michael Unger
Mingzhou Song
MJ Garnett
N Friedman
Nicole K Nesser
O Guitart-Pla
P Mertins
P Meyer
P Shannon
Paul T Spellman
R Akbani
R De Smet
R Tibes
RJ Prill
RM Neve
Sach Mukherjee
SM Hill
SR Maetschke
Stephen Friend
Steven M Hill
T Cokelaer
T Ideker
Thea Norman
Thomas Cokelaer
Wai Shing Lee
WW Chen
Y Benjamini
Yang Zhang
Yuanfang Guan
Publication venue: Nat Methods
Publication date: 01/01/2015

It remains unclear whether causal, rather than merely correlational, relationships in molecular networks can be inferred in complex biological settings. Here we describe the HPN-DREAM network inference challenge, which focused on learning causal influences in signaling networks. We used phosphoprotein data from cancer cell lines as well as in silico data from a nonlinear dynamical model. Using the phosphoprotein data, we scored more than 2,000 networks submitted by challenge participants. The networks spanned 32 biological contexts and were scored in terms of causal validity with respect to unseen interventional data. A number of approaches were effective, and incorporating known biology was generally advantageous. Additional sub-challenges considered time-course prediction and visualization. Our results suggest that learning causal relationships may be feasible in complex settings such as disease states. Furthermore, our scoring approach provides a practical way to empirically assess inferred molecular networks in a causal sense.

TUbiblio

Crossref

PubMed Central

eScholarship - University of California

Warwick Research Archives Portal Repository

Apollo (Cambridge)

DSpace at Rice University

Inferring causal molecular networks: empirical assessment through a community-based effort

Author: Afsari Bahman
Al-Ouran Rami
Anton Bernat
Arodz Tomasz
Bagheri Neda
Berlow Noah
Bisberg Alexander J.
Bivol Adrian
Bohler Anwesha
Bonet Jaume
Bonneau Richard
Budak Gungor
Bunescu Razvan
Caglar Mehmet
Cai Binghuang
Cai Chunhui
Carlin Daniel E.
Carlon Azzurra
Chen Lujia
Ciaccio Mark F.
Cokelaer Thomas
Cooper Gregory
Coort Susan
Creighton Chad J.
Daneshmand Seyed-Mohammad-Hadi
Danilova Ludmila V.
De La Fuente Alberto
Di Camillo Barbara
Dutta-Moscato Joyeeta
Emmett Kevin
Evelo Chris
Fassia Mohammad-Kasim H.
Favorov Alexander V.
Fertig Elana J.
Finkle Justin D.
Finotello Francesca
Friend Stephen
Gao Jean
Gao Xi
Ghosh Samik
Giaretta Alberto
Graim Kiley
Gray Joe W.
Großeholz Ruth
Guan Yuanfang
Guinney Justin
Hafemeister Christoph
Hahn Oliver
Haider Saad
Hase Takeshi
Heiser Laura M.
Hill Steven M.
Hodgson Jay
Hoff Bruce
Hsu Chih Hao
Hu Chenyue W.
Hu Ying
Huang Xun
Jalili Mahdi
Jiang Xia
Kacprowski Tim
Kaderali Lars
Kang Mingon
Kannan Venkateshan
Kellen Michael
Kikuchi Kaito
Kim Dong-Chul
Kitano Hiroaki
Knapp Bettina
Koeppl Heinz
Komatsoulis George
Krämer Andreas
Kursa Miron Bartosz
Kutmon Martina
Lee Wai Shing
Li Yichao
Liang Xiaoyu
Linger Michael
Liu Yu
Liu Zhaoqi
Long Byron L.
Lu Songjian
Lu Xinghua
Manfrini Marco
Matos Marta R. A.
Meerzaman Daoud
Mills Gordon B.
Min Wenwen
Mukherjee Sach
Müller Christian Lorenz
Neapolitan Richard E.
Nesser Nicole K.
Noren David P.
Norman Thea
Oliva Baldo
Opiyo Stephen Obol
Pal Ranadip
Palinkas Aljoscha
Paull Evan O.
Planas-Iglesias Joan
Poglayen Daniel
Qutub Amina A.
Saez-Rodriguez Julio
Sambo Francesco
Sanavia Tiziana
Sharifi-Zarchi Ali
Sichani Omid Askari
Slawek Janusz
Sokolov Artem
Song Mingzhou
Spellman Paul T.
Stolovitzky Gustavo
Streck Adam
Strunz Sonja
Stuart Joshua M.
Taylor Dane
Tegnér Jesper
Thobe Kirste
Toffolo Gianna Maria
Trifoglio Emanuele
Unger Michael
Wan Qian
Wang Haizhou
Welch Lonnie
Wong Chris K.
Wu Jia J.
Xue Albert Y.
Yamanaka Ryota
Yan Chunhua
Zairis Sakellarios
Zengerling Michael
Zenil Hector
Zhang Yang
Zhu Fan
Zi Zhike
Publication date: 01/01/2016

Inferring molecular networks is a central challenge in computational biology. However, it has remained unclear whether causal, rather than merely correlational, relationships can be effectively inferred in complex biological settings. Here we describe the HPN-DREAM network inference challenge, which focused on learning causal influences in signaling networks. We used phosphoprotein data from cancer cell lines as well as in silico data from a nonlinear dynamical model. Using the phosphoprotein data, we scored more than 2,000 networks submitted by challenge participants. The networks spanned 32 biological contexts and were scored in terms of causal validity with respect to unseen interventional data. A number of approaches were effective, and incorporating known biology was generally advantageous. Additional sub-challenges considered time-course prediction and visualization. Our results constitute the most comprehensive assessment of causal network inference in a mammalian setting carried out to date and suggest that learning causal relationships may be feasible in complex settings such as disease states. Furthermore, our scoring approach provides a practical way to empirically assess the causal validity of inferred molecular networks.

Carolina Digital Repository

Inferring causal molecular networks: empirical assessment through a community-based effort

Author: Carlin Daniel E.
Cokelaer Thomas
Heiser Laura M.
Hill Steven M.
Kim Dongchul
Nesser Nicole K.
Paull Evan O.
Sokolov Artem
Unger Michael
Zhang Yang
Publication venue: ScholarWorks @ UTRGV
Publication date: 22/02/2016

Scholarworks@UTRGV Univ. of Texas RioGrande Valley

Inferring causal molecular networks: empirical assessment through a community-based effort

Author: Adam Streck
Afsari Bahman
Albert Y. Xue
Alberto de la Fuente
Ali Sharifi Zarchi
Aljoscha Palinkas
Andreas Krämer
Anwesha Bohler
Azzurra Carlon
Baldo Oliva
Bernat Anton
Bettina Knapp
Binghuang Cai
Bisberg Alexander J
Bivol Adrian
Bruce Hoff
Carlin Daniel E
Chad J. Creighton
Chenyue W Hu
Chih Hao Hsu
Chris Evelo
Christian Lorenz Müller
Christoph Hafemeister
Chunhua Yan
Chunhui Cai
Cokelaer Thomas
Daniel Poglayen
Danilova Ludmila V
Daoud Meerzaman
Di Camillo Barbara
Dong Chul Kim
Dutta Moscato
Favorov Alexander V
Fertig Elana J
Finotello Francesca
Friend Stephen
Gao Xi
George Komatsoulis
Giaretta Alberto
Graim Kiley
Gray Joe W
Gregory Cooper
Guan Yuanfang
Gungor Budak
Hector Zenil
Heiser Laura M
Hill Steven M
Hiroaki Kitano
HPN-DREAM Consortium: Rami Al Ouran
Janusz Slawek
Jaume Bonet
Javier Garcia Garcia
Jay Hodgson
Jean Gao
Jesper Tegnér
Jia J. Wu
Joan Planas Iglesias
Justin D Finkle
Justin Guinney
Kaito Kikuchi
Kellen Michael
Kevin Emmett
Kirste Thobe
Koeppl Heinz
Lars Kaderali
Lee Wai Shing
Liu Yu
Long Byron L
Lonnie Welch
Lujia Chen
Mahdi Jalili
Manfrini Marco
Mark F. Ciaccio
Marta R. A Matos
Martina Kutmon
Mehmet Caglar
Michael Zengerling
Mills Gordon B
Mingon Kang
Miron Bartosz Kursa
Mohammad Kasim H. Fassia
Mukherjee Sach
Neda Bagheri
Nesser Nicole K
Noah Berlow
Noren David P
Norman Thea
Oliver Hahn
Omid Askari Sichani
Paull Evan O
Qian Wan
Qutub Amina A
Ranadip Pal
Razvan Bunescu
Richard E Neapolitan
Richard Bonneau
Ruth Großeholz
Ryota Yamanaka
Saad Haider
Saez Rodriguez Julio
Sakellarios Zairis
Sambo Francesco
Samik Ghosh
Sanavia Tiziana
Seyed Mohammad Hadi Daneshmand
Shihua Zhang
Sokolov Artem
Song Mingzhou
Songjian Lu
Sonja Strunz
Spellman Paul T
Stephen Obol Opiyo
Stolovitzky Gustavo
Stuart Joshua M
Susan Coort
Takeshi Hase
Taylor Dane
Tim Kacprowski
Toffolo Gianna Maria
Tomasz Arodz
Trifoglio Emanuele
Unger Michael
Venkateshan Kannan
Wang Haizhou
Wenwen Min
Wong Chris K
Xia Jiang
Xiaoyu Liang
Xinghua Lu
Xun Huang
Yichao Li
Ying Hu
Zhang Yang
Zhaoqi Liu
Zhike Zi
Zhu Fan
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016

Institutional Research Information System University of Turin

Archivio istituzionale della ricerca - Università di Padova

Enabling network inference methods to handle missing data and outliers

Author: Fernández Villaverde Alejandro
Ferrer Riquelme Alberto José
Folch-Fortuny Abel
Rodríguez Banga Julio
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2015

© 2015 Folch-Fortuny et al. Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Background: The inference of complex networks from data is a challenging problem in biological sciences, as well as in a wide range of disciplines such as chemistry, technology, economics, or sociology. The quantity and quality of the data greatly affect the results. While many methodologies have been developed for this task, they seldom take into account issues such as missing data or outlier detection and correction, which need to be properly addressed before network inference.
Results: Here we present an approach to (i) handle missing data and (ii) detect and correct outliers based on multivariate projection to latent structures. The method, called trimmed scores regression (TSR), enables network inference methods to analyse incomplete datasets by imputing the missing values coherently with the latent data structure. Furthermore, it substitutes the faulty values in a dataset by proper estimations. We provide an implementation of this approach, and show how it can be integrated with any network inference method as a preliminary data curation step. This functionality is demonstrated with a state-of-the-art network inference method based on mutual information distance and entropy reduction, MIDER.
Conclusion: The methodology presented here enables network inference methods to analyse a large number of incomplete and faulty datasets that could not be reliably analysed so far. Our comparative studies show the superiority of TSR over other missing data approaches used by practitioners. Furthermore, the method allows for outlier detection and correction.
Research in this study was partially supported by the European Union through project BioPreDyn (FP7-KBBE 289434), and the Spanish Ministry of Science and Innovation and FEDER funds from the European Union through grants MultiScales (DPI2011-28112-C04-02, DPI2011-28112-C04-03), and SynBioFactory (DPI2014-55276-C5-1-R, DPI2014-55276-C5-2-R). AF Villaverde also acknowledges funding from the Xunta de Galicia through an I2C postdoctoral fellowship (I2C ED481B 2014/133-0). We also gratefully acknowledge Associate Professor Francisco Arteaga for his help in the adaptation of TSR to the PCA model building context.
Folch-Fortuny, A.; Fernández Villaverde, A.; Ferrer Riquelme, AJ.; Rodríguez Banga, J. (2015). Enabling network inference methods to handle missing data and outliers. BMC Bioinformatics. 16(283):1-12. https://doi.org/10.1186/s12859-015-0717-7

Universidade do Minho: RepositoriUM

Crossref

Springer - Publisher Connector

PubMed Central

RiuNet

Digital.CSIC

Chemometric Approaches for Systems Biology

Author: Folch Fortuny Abel
Publication venue: Universitat Politecnica de Valencia
Publication date: 23/01/2017

The present Ph.D. thesis is devoted to studying, developing and applying approaches commonly used in chemometrics in the emerging field of systems biology. Existing procedures and new methods are applied to solve research and industrial questions in different multidisciplinary teams. The methodologies developed in this document will enrich the plethora of procedures employed within omic sciences to understand biological organisms and will improve processes in biotechnological industries by integrating biological knowledge at different levels and exploiting the software packages derived from the thesis.
This dissertation is structured in four parts. The first block describes the framework on which the contributions presented here are based. The objectives of the two research projects related to this thesis are highlighted, and the specific topics addressed in this document via conference presentations and research articles are introduced. A comprehensive description of omic sciences and their relationships within the systems biology paradigm is given in this part, together with a review of the most widely applied multivariate methods in chemometrics, on which the novel approaches proposed here are founded.
The second part addresses many problems of data understanding within metabolomics, fluxomics, proteomics and genomics. Different alternatives are proposed in this block to understand flux data in steady-state conditions. Some are based on applications of multivariate methods previously applied in other chemometrics areas. Others are novel approaches based on a bilinear decomposition using elemental metabolic pathways, for which a GNU-licensed toolbox is made freely available to the scientific community. In addition, a framework for metabolic data understanding is proposed for non-steady-state data, using the same bilinear decomposition proposed for steady-state data but modelling the dynamics of the experiments with novel two- and three-way data analysis procedures. The relationships between different omic levels are also assessed in this part by integrating different sources of information on plant viruses in data fusion models. Finally, an example of interaction between organisms, oranges and fungi, is studied via multivariate image analysis techniques, with future application in the food industry.
The third block of this thesis is a thorough study of different missing data problems related to chemometrics, systems biology and industrial bioprocesses. In the theoretical chapters of this part, new algorithms to obtain multivariate exploratory and regression models in the presence of missing data are proposed, which also serve as preprocessing steps for any other methodology used by practitioners. Regarding applications, this block explores the reconstruction of networks in omic sciences when missing and faulty measurements appear in databases, and how calibration models between near-infrared instruments can be transferred, avoiding costly and time-consuming full recalibrations in bioindustries and research laboratories. Finally, another software package, including a graphical user interface, is made freely available for missing data imputation purposes.
The last part discusses the relevance of this dissertation for research and biotechnology, including proposals deserving future research.
Folch Fortuny, A. (2016). Chemometric Approaches for Systems Biology [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/77148

Crossref

RiuNet

The Cyni framework for network inference in Cytoscape

Author: A. Califano
B. Schwikowski
Cline
F. Rugheimer
M. Kustagi
Marbach
O. Guitart-Pla
Oba
Poultney
Shannon
Publication venue: Oxford University Press (OUP)

Crossref