3 research outputs found

    Distributed Bayesian networks reconstruction on the whole genome scale

    Background: Bayesian networks are directed acyclic graphical models widely used to represent probabilistic relationships between random variables. They have been applied in various biological contexts, including the inference of gene regulatory networks and protein–protein interactions. In general, learning Bayesian networks from experimental data is NP-hard, which has led to widespread use of heuristic search methods that give suboptimal results. However, when the acyclicity of the graph can be ensured externally, the optimal network can be found in polynomial time. Although our previously developed tool BNFinder implements a polynomial-time algorithm, reconstructing networks from large amounts of experimental data still requires prohibitively long computations on a single CPU.
    Results: In the present paper we propose a parallelized algorithm designed for multi-core and distributed systems, together with its implementation in an improved version of BNFinder, a tool for learning optimal Bayesian networks. The new algorithm has been tested on different simulated and experimental datasets, showing that it parallelizes much more efficiently than the previous version. BNFinder gives accuracy comparable to current state-of-the-art inference methods, with a significant advantage when external information such as a list of regulators or prior edge probabilities can be introduced, particularly for datasets with static gene expression observations.
    Conclusions: We show that the new method can be used to reconstruct networks in the size range of thousands of genes, making it practically applicable to whole-genome datasets of prokaryotic systems and large components of eukaryotic genomes. Our benchmarking results on realistic datasets indicate that the tool should be useful to a wide audience of researchers interested in discovering dependencies in their large-scale transcriptomic datasets.

    Gene regulatory networks of the sucrose metabolism in sugarcane using bayesian networks

    Advisor: Renato Vicentini dos Santos. Dissertation (master's) - Universidade Estadual de Campinas, Instituto de Biologia.
    Abstract: Sugarcane is one of the most important crops cultivated in Brazil, which is the world's largest producer and exporter. Its economic value is mainly due to its capacity to store sucrose in the culms. Gene expression patterns may regulate plant development processes and influence the accumulation of sucrose in storage tissues. These patterns are regulated through complex systems of interactions between many genes and their products, resulting in a complex gene regulatory network. Probabilistic graphical models have been widely used for the inference and representation of these networks. Among them, Bayesian networks are the most prominent, as they are considered the most flexible method and require a reduced number of parameters to describe the model. This study therefore used the Bayesian network methodology to infer regulatory interactions between sucrose metabolism and signaling genes from microarray gene expression data available in the Gene Expression Omnibus (GEO). Networks were generated with network inference software and then analyzed with respect to their constituent genes and expression patterns. The genes were grouped into clusters according to their coexpression patterns. The most represented genes in the sucrose phosphate synthase (SPS) cluster in sugarcane are related to translation and DNA binding, along with genes of unknown function, while the least represented are involved in photosynthesis, hormone response, and other metabolic events. The SPS cluster network presented seven main genes (hubs) that appear to play an important role within the cluster. A network was also obtained considering genes selected from previously published studies with microarray experiments. One of these networks has 136 genes and presented 6 main genes, most of which are related to photosynthesis. In the network considering genes differentially expressed in these experiments (265 genes), genes belonging to the same functional category tended to be regulated by a single common gene, forming groups of similar function at each hub.
    Master's degree. Plant Genetics and Breeding. Master in Genetics and Molecular Biology.