14 research outputs found

    Statistical methods for the inference of interaction networks


    Exact likelihood computation in Boolean networks with probabilistic time delays, and its application in signal network reconstruction

    Motivation: For biological pathways, it is common to measure a gene expression time series after various knockdowns of genes that are putatively involved in the process of interest. These interventional time-resolved data are most suitable for the elucidation of dynamic causal relationships in signaling networks. Even with this kind of data it is still a major and largely unsolved challenge to infer the topology and interaction logic of the underlying regulatory network. Results: In this work, we present a novel model-based approach involving Boolean networks to reconstruct small to medium-sized regulatory networks. In particular, we solve the problem of exact likelihood computation in Boolean networks with probabilistic exponential time delays. Simulations demonstrate the high accuracy of our approach. We apply our method to data of Ivanova et al. (2006), where RNA interference knockdown experiments were used to build a network of the key regulatory genes governing mouse stem cell maintenance and differentiation. In contrast to previous analyses of that data set, our method can identify feedback loops and provides new insights into the interplay of some master regulators in embryonic stem cell development. Availability and implementation: The algorithm is implemented in the statistical language R. Code and documentation are available at Bioinformatics online. Contact: [email protected] or [email protected] Supplementary information: Supplementary Materials are available at Bioinformatics online.

    Simultaneous characterization of sense and antisense genomic processes by the double-stranded hidden Markov model

    Hidden Markov models (HMMs) have been extensively used to dissect the genome into functionally distinct regions using data such as RNA expression or DNA binding measurements. It is a challenge to disentangle processes occurring on complementary strands of the same genomic region. We present the double-stranded HMM (dsHMM), a model for the strand-specific analysis of genomic processes. We applied dsHMM to yeast using strand-specific transcription data, nucleosome data, and protein binding data for a set of 11 factors associated with the regulation of transcription. The resulting annotation recovers the mRNA transcription cycle (initiation, elongation, termination) while correctly predicting strand-specificity and directionality of the transcription process. We find that pre-initiation complex formation is an essentially undirected process, giving rise to a large number of bidirectional promoters and to pervasive antisense transcription. Notably, 12% of all transcriptionally active positions showed simultaneous activity on both strands. Furthermore, dsHMM reveals that antisense transcription is specifically suppressed by Nrd1, a yeast termination factor.

    Optimizing mycobacteria molecular diagnostics: No decontamination! Human DNA depletion? Greener storage at 4 °C!

    INTRODUCTION Tuberculosis (TB) is an infectious disease caused by the group of bacterial pathogens Mycobacterium tuberculosis complex (MTBC) and is one of the leading causes of death worldwide. Timely diagnosis and treatment of drug-resistant TB is a key pillar of WHO's strategy to combat global TB. The time required to carry out drug susceptibility testing (DST) for MTBC via the classic culture method is in the range of weeks, and such delays have a detrimental effect on treatment outcomes. Given that molecular testing takes from hours to 1 or 2 days, its value in treating drug-resistant TB cannot be overstated. When developing such tests, one wants to optimize each step so that tests are successful even when confronted with samples that have a low MTBC load or contain large amounts of host DNA. This could improve the performance of the popular rapid molecular tests, especially for samples with mycobacterial loads close to the limits of detection. Optimizations could have a more significant impact on tests based on targeted next-generation sequencing (tNGS), which typically require higher quantities of DNA. This would be significant as tNGS can provide more comprehensive drug resistance profiles than the relatively limited resistance information provided by rapid tests. In this work we endeavor to optimize pre-treatment and extraction steps for molecular testing. METHODS We begin by choosing the best DNA extraction device by comparing the amount of DNA extracted by five commonly used devices from identical samples. Following this, the effect that decontamination and human DNA depletion have on extraction efficiency is explored. RESULTS The best results (i.e., the lowest Ct values) were achieved when neither decontamination nor human DNA depletion was used. As expected, in all tested scenarios the addition of decontamination to our workflow substantially reduced the yield of DNA extracted. This illustrates that the standard TB laboratory practice of applying decontamination, although vital for culture-based testing, can negatively impact the performance of molecular testing. As a complement to the above experiments, we also considered the best Mycobacterium tuberculosis DNA storage method to optimize molecular testing carried out in the near- to medium-term. Comparing Ct values following three-month storage at 4 °C and at -20 °C showed little difference between the two. DISCUSSION In summary, for molecular diagnostics aimed at mycobacteria this work highlights the importance of choosing the right DNA extraction device, indicates that decontamination causes significant loss of mycobacterial DNA, and shows that samples preserved for further molecular testing can be stored at 4 °C just as well as at -20 °C. Under our experimental settings, human DNA depletion gave no significant improvement in Ct values for the detection of MTBC.

    Brownian motors: noisy transport far from equilibrium

    Transport phenomena in spatially periodic systems far from thermal equilibrium are considered. The main emphasis is put on directed transport in so-called Brownian motors (ratchets), i.e. a dissipative dynamics in the presence of thermal noise and some prototypical perturbation that drives the system out of equilibrium without introducing a priori an obvious bias into one or the other direction of motion. Symmetry conditions for the appearance (or not) of directed current, its inversion upon variation of certain parameters, and quantitative theoretical predictions for specific models are reviewed as well as a wide variety of experimental realizations and biological applications, especially the modeling of molecular motors. Extensions include quantum mechanical and collective effects, Hamiltonian ratchets, the influence of spatial disorder, and diffusive transport. Comment: Revised version (Aug. 2001), accepted for publication in Physics Reports.

    A novel test for independence derived from an exact distribution of ith nearest neighbours.

    Dependence measures and tests for independence have recently attracted a lot of attention, because they are the cornerstone of algorithms for network inference in probabilistic graphical models. Pearson's product moment correlation coefficient is still by far the most widely used statistic, yet it is largely constrained to detecting linear relationships. In this work we provide an exact formula for the $i$th nearest neighbor distance distribution of rank-transformed data. Based on that, we propose two novel tests for independence. An implementation of these tests, together with a general benchmark framework for independence testing, is freely available as a CRAN software package (http://cran.r-project.org/web/packages/knnIndep). In this paper we have benchmarked Pearson's correlation, Hoeffding's D, dcor, Kraskov's estimator for mutual information, the maximal information coefficient and our two tests. We conclude that no particular method is generally superior to all other methods. However, dcor and Hoeffding's D are the most powerful tests for many different types of dependence.
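The limitation of Pearson's coefficient mentioned in the abstract above can be illustrated with a small sketch. It is not the knnIndep implementation and not the paper's benchmark; it is a plain NumPy version of the standard sample distance correlation (dcor), applied to a hypothetical toy data set with `y = x**2`. The function name `distance_correlation` and the toy data are assumptions made for illustration only.

```python
import numpy as np


def distance_correlation(x, y):
    """Sample distance correlation of two 1-D samples of equal length."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Pairwise absolute-distance matrices for each sample.
    a = np.abs(x[:, None] - x[None, :])
    b = np.abs(y[:, None] - y[None, :])
    # Double-centre: subtract row and column means, add back the grand mean.
    A = a - a.mean(axis=0) - a.mean(axis=1)[:, None] + a.mean()
    B = b - b.mean(axis=0) - b.mean(axis=1)[:, None] + b.mean()
    dcov2 = (A * B).mean()    # squared sample distance covariance
    dvar_x = (A * A).mean()   # squared sample distance variance of x
    dvar_y = (B * B).mean()   # squared sample distance variance of y
    denom = np.sqrt(dvar_x * dvar_y)
    if denom == 0:
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))


# Toy data (hypothetical): a purely quadratic, symmetric dependence.
x = np.linspace(-1.0, 1.0, 201)
y = x ** 2

pearson_r = float(np.corrcoef(x, y)[0, 1])
dcor_xy = distance_correlation(x, y)
print(f"Pearson r = {pearson_r:.3g}, dcor = {dcor_xy:.3g}")
```

On this toy example Pearson's coefficient is numerically zero because the dependence is symmetric around zero, while dcor is clearly positive. A proper test for independence would additionally need a null distribution, for instance from permutations, which this sketch leaves out.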

    Performance on WHO data.

    novelTest.ext denotes our test based on extreme paths, dcor distance covariance and hoeffd Hoeffding's D. All methods were applied to all pairwise comparisons between variables which had Pearson's product moment correlation coefficient near zero, to exclude linear relationships. Only pairwise complete observations were used, as most methods cannot handle missing values. All comparisons include at least 81 data points. In total we compare all 3 methods on 2971 variable pairs.
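The pre-filtering described in this caption (pairwise complete observations, at least 81 data points, Pearson correlation near zero) could be sketched roughly as follows. The cut-off `R_THRESHOLD = 0.1` is an assumption, since the caption only says "near zero"; the function name `candidate_pairs` and the toy table are hypothetical and do not come from the paper or the WHO data.

```python
import numpy as np

MIN_POINTS = 81     # from the caption: at least 81 data points per pair
R_THRESHOLD = 0.1   # assumption: the caption only says "near zero"


def candidate_pairs(data, min_points=MIN_POINTS, r_threshold=R_THRESHOLD):
    """Return (i, j, n, r) for column pairs with near-zero Pearson correlation."""
    data = np.asarray(data, dtype=float)
    n_cols = data.shape[1]
    kept = []
    for i in range(n_cols):
        for j in range(i + 1, n_cols):
            # Keep only rows where both variables are observed.
            complete = ~np.isnan(data[:, i]) & ~np.isnan(data[:, j])
            n = int(complete.sum())
            if n < min_points:
                continue
            xi, xj = data[complete, i], data[complete, j]
            if xi.std() == 0 or xj.std() == 0:
                continue  # correlation is undefined for constant columns
            r = float(np.corrcoef(xi, xj)[0, 1])
            if abs(r) < r_threshold:
                kept.append((i, j, n, r))
    return kept


# Toy table (hypothetical): column 1 depends on column 0 non-linearly,
# column 2 copies column 0 but has too many missing values.
x = np.linspace(-1.0, 1.0, 100)
col2 = x.copy()
col2[:30] = np.nan
table = np.column_stack([x, x ** 2, col2])

pairs = candidate_pairs(table)
print(pairs)
```

Only the pair of columns 0 and 1 survives here: it has 100 complete rows and a Pearson correlation of numerically zero, whereas every pair involving column 2 has only 70 complete rows.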

    Benchmark of all methods.

    cor denotes Pearson's product moment correlation coefficient, dcor distance covariance, hoeffd Hoeffding's D, MIC the maximal information coefficient, novelTest.chisq is our test based on Pearson's test and novelTest.ext is our test based on extreme paths. Each plot shows the power (on the y-axis) against the MI (x-axis). We examine 8 different types of dependence: linear, quadratic, cubic, sine with period , , circle, step function and the dependence called "patchwork copula" (A–H).

    Diagrams explaining Equations 1 and 2 for , and (panel A) and (panel B) with the reference point at coordinates .

    A: We define 3 regions I, II and III (black, red and blue points respectively). Region I has the least number of constraints and the number of admissible configurations is the number of possibilities to draw points from positions without replacement or ordering: . The number of admissible configurations for region II is given by the number of rows available and the number of columns which remain to be filled according to . Region III has the remaining points freely distributed, yielding admissible configurations. B: In the case we add an additional region of points exactly at distance (green points). There can be such points. Region I has size and admissible configurations with the number of points strictly inside the square of distance . Region IIa and IIb are symmetric and handled analogously to region II in panel A with and . Region III has admissible configurations analogous to panel A.