
71 research outputs found

BNFinder: exact and efficient method for learning Bayesian networks

Author: B. Wilczynski
Beer
Dojer
Husmeier
N. Dojer
Needham
Smith
Publication venue: Oxford University Press

Motivation: Bayesian methods are widely used in many different areas of research. Recently, they have become a very popular tool for biological network reconstruction, owing to their ability to handle noisy data. Even though there are many software packages allowing for Bayesian network reconstruction, only a few of them are freely available to researchers. Moreover, they usually require at least basic programming abilities, which restricts their potential user base. Our goal was to provide software that would be freely available, efficient and usable by non-programmers.

Crossref

PubMed Central

Finding evolutionarily conserved cis-regulatory modules with a universal set of motifs

Author: Dojer Norbert
Patelak Mateusz
Tiuryn Jerzy
Wilczynski Bartek
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Finding functional regulatory elements in DNA sequences is a very important problem in computational biology, and providing a reliable algorithm for this task would be a major step towards understanding regulatory mechanisms on a genome-wide scale. Major obstacles in this respect are that the amount of non-coding DNA is vast, and that the methods for predicting functional transcription factor binding sites tend to produce results with a high percentage of false positives. This makes the problem of finding regions significantly enriched in binding sites difficult. Results: We develop a novel method for predicting regulatory regions in DNA sequences, which is designed to exploit the evolutionary conservation of regulatory elements between species without assuming that the order of motifs is preserved across species. We have implemented our method and tested its predictive abilities on various datasets from different organisms. Conclusion: We show that our approach enables us to find a majority of the known CRMs using only sequence information from different species together with currently publicly available motif data. Also, our method is robust enough to perform well in predicting CRMs, despite differences in tissue specificity and even across species, provided that the evolutionary distances between compared species do not change substantially. The complexity of the proposed algorithm is polynomial, and the observed running times show that it may be readily applied.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

RECORD: Reference-Assisted Genome Assembly for Closely Related Genomes

Author: Buza Krisztián Antal
Dojer Norbert
Wilczynski Bartek
Publication venue: Hindawi Limited
Publication date: 01/01/2015

Background. Next-generation sequencing technologies are now producing multiple times the genome size in total reads from a single experiment. This is enough information to reconstruct at least some of the differences between the individual genome studied in the experiment and the reference genome of the species. However, in most typical protocols, this information is disregarded and the reference genome is used. Results. We provide a new approach that allows researchers to reconstruct genomes very closely related to the reference genome (e.g., mutants of the same species) directly from the reads used in the experiment. Our approach applies de novo assembly software to experimental reads and so-called pseudoreads and uses the resulting contigs to generate a modified reference sequence. In this way, it can very quickly, and at no additional sequencing cost, generate a new, modified reference sequence that is closer to the actual sequenced genome and has full coverage. In this paper, we describe our approach and test its implementation, called RECORD. We evaluate RECORD on both simulated and real data. We have made our software publicly available on SourceForge. Conclusion. Our tests show that on closely related sequences RECORD outperforms more general assisted-assembly software.

Crossref

Directory of Open Access Journals

PubMed Central

Repository of the Academy's Library

Two new zinc(II) acetates with 3- and 4-aminopyridine

Author: Belaj Ferdinand
Dojer Brina
Kristl Matjaž
Pevec Andrej
Publication venue: Slovenian Chemical Society
Publication date: 25/05/2016

The synthesis and characterization of two new zinc(II) coordination compounds with 3- and 4-aminopyridine are reported. They were obtained after adding a water solution of Zn(CH_3COO)_2·2H_2O or dissolving solid Zn(CH_3COO)_2·2H_2O in methanol solutions of 3- and 4-aminopyridine. The products were characterized structurally by single-crystal X-ray diffraction analysis. Colourless crystals of the compound synthesized by the reaction of Zn(CH_3COO)_2·2H_2O and 3-aminopyridine (3-apy) are built of trinuclear complex molecules with the formula [Zn_3(O_2CCH_3)_6(3-apy)_2(H_2O)_2] (1). The molecules consist of two terminal Zn atoms, coordinated tetrahedrally, and one central Zn atom, coordinated octahedrally. Colourless crystals obtained by the reaction of Zn(CH_3COO)_2·2H_2O with 4-aminopyridine (4-apy) consist of a mononuclear complex [Zn(O_2CCH_3)_2(4-apy)_2] (2). Hydrogen-bonding interactions in the crystal structures of both complexes are reported.

Digital library of University of Maribor

Repository of the University of Ljubljana

Applying dynamic Bayesian networks to perturbed gene expression data

Author: Dojer Norbert
Gambin Anna
Mizera Andrzej
Tiuryn Jerzy
Wilczyński Bartek
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: A central goal of molecular biology is to understand the regulatory mechanisms of gene transcription and protein synthesis. Because of their solid basis in statistics, which allows them to deal naturally with the stochastic aspects of gene expression and with noisy measurements, Bayesian networks appear attractive for inferring the structure of gene interactions from microarray experiment data. However, the basic formalism has some disadvantages, e.g. it is sometimes hard to distinguish between the origin and the target of an interaction. Two kinds of microarray experiments yield data particularly rich in information regarding the direction of interactions: time series and perturbation experiments. In order to handle them correctly, the basic formalism must be modified. For example, dynamic Bayesian networks (DBN) apply to time series microarray data. To our knowledge, the DBN technique has not been applied in the context of perturbation experiments. RESULTS: We extend the framework of dynamic Bayesian networks in order to incorporate perturbations. Moreover, an exact algorithm for inferring an optimal network is proposed and a discretization method specialized for time series data from perturbation experiments is introduced. We apply our procedure to realistic simulated data. The results are compared with those obtained by standard DBN learning techniques. Moreover, the advantages of using an exact learning algorithm instead of heuristic methods are analyzed. CONCLUSION: We show that the quality of inferred networks dramatically improves when using data from perturbation experiments. We also conclude that the exact algorithm should be used when it is possible, i.e. when the considered set of genes is small enough.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Open Repository and Bibliography - Luxembourg

Comparison between Suitable Priors for Additive Bayesian Networks

Author: A Djebbari
A Gelman
AFY Poon
AP Hodges
C Zorn
D Firth
D Heckerman
DV Lindley
E Gutiérrez-Peña
EJG Pitman
FI Lewis
M Chen
M Koivisto
M Pittavino
MJ Sanchez-Vazquez
MP Ward
N Dojer
P Diaconis
R Jansen
RW Robinson
S Hartnack
Publication date: 18/09/2018

Additive Bayesian networks (ABNs) are graphical models that extend the usual Bayesian generalized linear model to multiple dependent variables through the factorisation of the joint probability distribution of the underlying variables. When fitting an ABN model, the choice of the prior of the parameters is of crucial importance. If an inadequate prior is used, such as one that is too weakly informative, data separation and data sparsity lead to issues in the model selection process. In this work, a simulation study comparing two weakly informative priors and one strongly informative prior is presented. As the first weakly informative prior we use a zero-mean Gaussian prior with a large variance, currently implemented in the R package abn. The second prior is based on the Student's t-distribution and is specifically designed for logistic regressions. Finally, the strongly informative prior is again Gaussian, with mean equal to the true parameter value and a small variance. We compare the impact of these priors on the accuracy of the learned additive Bayesian network as a function of different parameters. We create a simulation study to illustrate Lindley's paradox based on the prior choice. We then conclude by highlighting the good performance of the informative Student's t-prior and the limited impact of Lindley's paradox. Finally, suggestions for further developments are provided. Comment: 8 pages, 4 figures.

arXiv.org e-Print Archive

Crossref

ZORA


Nucleotide-resolution DNA double-strand breaks mapping by next-generation sequencing

Author: Bienko Magda
Chiarle Roberto
Crosetto Nicola
Dikic Ivan
Dojer Norbert
Ginalski Krzysztof
Karaca Elif
Mitra Abhishek
Pasero Philippe
Rowicka Maga
Silva Maria Joao
Skrzypczak Magdalena
Wang Qi
Publication venue: Springer Science and Business Media LLC
Publication date: 10/03/2014

We present a genome-wide method to map DNA double-strand breaks (DSBs) at nucleotide resolution by direct in situ breaks labeling, enrichment on streptavidin, and next-generation sequencing (BLESS). We comprehensively validated and tested BLESS using different human and mouse cells, DSB-inducing agents, and sequencing platforms. BLESS was able to detect telomere ends, Sce endonuclease-induced DSBs, and complex genome-wide DSB landscapes. As a proof of principle, we characterized the genomic landscape of sensitivity to replication stress in human cells, and identified over two thousand non-uniformly distributed aphidicolin-sensitive regions (ASRs) overrepresented in genes and enriched in satellite repeats. ASRs were also enriched in regions rearranged in human cancers, with many cancer-associated genes exhibiting high sensitivity to replication stress. Our method is suitable for genome-wide mapping of DSBs in various cells and experimental conditions with a specificity and resolution unachievable by current techniques.

Harvard University - DASH

Listen to genes: dealing with microarray data in the frequency domain

Author: A Claridge-Chang
AN Stepanova
B-R Kim
Diego Di Bernardo
Dongyun Yi
H Guo
H Ueda
HG McWatters
IP Androulakis
J Fan
J Qian
JCW Locke
JH Wu
Jianfeng Feng
MJ Yanovsky
MR Doyle
N Dojer
P D'haeseleer
PO Lim
PT Spellman
R Balasubramaniyan
R Cristi
Ritesh Krishna
S Kim
S Wichert
Shuixia Guo
SL Harmer
SX Guo
U Alon
Vicky Buchanan-Wollaston
W Pan
Publication venue: Public Library of Science (PLoS)
Publication date: 06/04/2009

Background: We present a novel and systematic approach to analyzing temporal microarray data. The approach includes normalization, clustering and network analysis of genes. Methodology: Genes are normalized using an error-model-based uniform normalization method aimed at identifying and estimating the sources of variation. The model minimizes the correlation among error terms across replicates. The normalized gene expressions are then clustered in terms of their power spectrum density. The method of complex Granger causality is introduced to reveal interactions between sets of genes. Complex Granger causality, along with partial Granger causality, is applied in both the time and frequency domains, to selected genes as well as to all genes, to reveal the interesting networks of interactions. The approach is successfully applied to Arabidopsis leaf microarray data generated from 31,000 genes observed at 22 time points over 22 days. Three circuits are analyzed in detail: a circadian gene circuit, an ethylene circuit, and a new global circuit showing a hierarchical structure to determine the initiators of leaf senescence. Conclusions: We use a totally data-driven approach to form biological hypotheses. Clustering using power-spectrum analysis helps us identify genes of potential interest. Their dynamics can be captured accurately in the time and frequency domains using the methods of complex and partial Granger causality. With the rise in availability of temporal microarray data, such methods can be useful tools in uncovering hidden biological interactions. We show our method in a step-by-step manner with the help of toy models as well as a real biological dataset. We also analyse three distinct gene circuits of potential interest to Arabidopsis researchers.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Warwick Research Archives Portal Repository

Bayesian approaches to reverse engineer cellular systems: a simulation study on nonlinear Gaussian networks

Author: A Bernard
AA Margolin
AJ Hartemink
BE Perrin
D Husmeier
DE Zak
Fulvia Ferrazzi
GF Cooper
H de Jong
IM Ong
J Yu
K Murphy
KC Chen
M Zou
Marco F Ramoni
N Dojer
N Friedman
N Nariai
P D'haeseleer
P Le Phillip
P Sebastiani
Paola Sebastiani
Riccardo Bellazzi
S Imoto
S Kim
VA Smith
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND. Reverse engineering cellular networks is currently one of the most challenging problems in systems biology. Dynamic Bayesian networks (DBNs) seem to be particularly suitable for inferring relationships between cellular variables from the analysis of time series measurements of mRNA or protein concentrations. As evaluating inference results on a real dataset is controversial, the use of simulated data has been proposed. However, DBN approaches that use continuous variables, thus avoiding the information loss associated with discretization, have not yet been extensively assessed, and most of the proposed approaches have dealt with linear Gaussian models. RESULTS. We propose a generalization of dynamic Gaussian networks to accommodate nonlinear dependencies between variables. As a benchmark dataset to test the new approach, we used data from a mathematical model of cell cycle control in budding yeast that realistically reproduces the complexity of a cellular system. We evaluated the ability of the networks to describe the dynamics of cellular systems and their precision in reconstructing the true underlying causal relationships between variables. We also tested the robustness of the results by analyzing the effect of noise on the data and the impact of a different sampling time. CONCLUSION. The results confirmed that DBNs with Gaussian models can be effectively exploited for a first-level analysis of data from complex cellular systems. The inferred models are parsimonious and have a satisfactory goodness of fit. Furthermore, the networks not only offer a phenomenological description of the dynamics of cellular systems, but are also able to suggest hypotheses concerning the causal interactions between variables. The proposed nonlinear generalization of Gaussian models yielded models characterized by a slightly lower goodness of fit than the linear model, but a better ability to recover the true underlying connections between variables.
Italian Ministry of University and Scientific Research; National Institutes of Health & National Human Genome Research Institute (HG003354-01A2); Collegio Ghislieri, Pavia, Italy fellowship

CiteSeerX

Crossref

Boston University Institutional Repository (OpenBU)

Archivio Istituzionale della Ricerca - Università degli Studi di Pavia

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central