84 research outputs found

Increasing stability with complexity in a system composed of unstable subsystems

Author: Spouge John L
Publication venue: Published by Elsevier Inc.
Publication date: 30/09/1986

Abstract: We examine stability of Hoffman's symmetric model of the immune system, $\dot{x}_i = S_i - x_i \sum_{j=1}^{n} K_{ji} x_j$; $x_i > 0$; $i = 1, 2, \ldots, n$ (1), where $S_i > 0$ and $K_{ij} = K_{ji} \geq 0$. This paper gives necessary and sufficient conditions on $\{S_i\}$ and $\{K_{ij}\}$ for Eq. (1) to have a unique, stable, steady-state solution. Determining existence of a steady-state solution requires a theorem delimiting the range $R$ of a function $F: D \subseteq \mathbb{R}^n \to R \subseteq \mathbb{R}^n$, where $D$ is a (possibly proper) subset of $\mathbb{R}^n$. This theorem may be new. If off-diagonal elements $\{K_{ij}: i \neq j\}$ are non-zero with probability $C$ and $0 < S_{\min} \leq S_i \leq \varrho S_{\min}$, $\varrho$ a fixed integer, we let $P(n, C)$ be the probability that Eq. (1) does not have a stable, steady-state solution. Let $T(n) = \frac{(\varrho + 1)^2}{\varrho} \cdot \frac{\ln n}{n}$ (2). As $n \to \infty$, $C/T(n) \to r > 1$ implies $P(n, C) \to 0$. If we set $\{K_{ii} = 0;\ i = 1, 2, \ldots, n\}$, this result shows that accumulating more unstable subsystems increases the probability of stability of this system
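
As a rough illustration of Eq. (1), the sketch below integrates the model numerically for a small system with zero diagonal couplings. It is not taken from the paper: the forward-Euler scheme, the step size, the function name `simulate`, and the three-component values of S and K are all assumptions chosen for the example.

```python
def simulate(S, K, x0, dt=1e-3, steps=50_000):
    """Forward-Euler integration of dx_i/dt = S_i - x_i * sum_j K[j][i] * x_j.

    S, K and x0 are plain Python lists; K is assumed symmetric and non-negative.
    The integrator and its parameters are illustrative choices only.
    """
    n = len(S)
    x = list(x0)
    for _ in range(steps):
        # coupling[i] = sum_j K_ji x_j, the factor multiplying x_i in Eq. (1)
        coupling = [sum(K[j][i] * x[j] for j in range(n)) for i in range(n)]
        x = [x[i] + dt * (S[i] - x[i] * coupling[i]) for i in range(n)]
    return x


# Hypothetical example: three components, K_ii = 0, every pair coupled.
S = [1.0, 1.0, 1.0]
K = [[0.0, 1.0, 1.0],
     [1.0, 0.0, 1.0],
     [1.0, 1.0, 0.0]]

x_star = simulate(S, K, x0=[0.5, 1.0, 1.5])

# At a steady state every right-hand side S_i - x_i * sum_j K_ji x_j vanishes.
residuals = [S[i] - x_star[i] * sum(K[j][i] * x_star[j] for j in range(3))
             for i in range(3)]
print(x_star, residuals)
```

With K_ii = 0, an isolated component obeys dx_i/dt = S_i > 0 and grows without bound, whereas in this fully coupled example the trajectory settles where each x_i satisfies 1 = 2 x_i^2.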

Elsevier - Publisher Connector

Finite-size corrections to Poisson approximations in general renewal-success processes

Author: Spouge John L.
Publication venue: Elsevier Inc.
Publication date: 15/01/2005

Abstract: Consider a renewal process, and let K ⩾ 0 denote the random duration of a typical renewal cycle. Assume that on any renewal cycle, a rare event called “success” can occur. Such successes lend themselves naturally to approximation by Poisson point processes. If each success occurs after a random delay, however, Poisson convergence can be relatively slow, because each success corresponds to a time interval, not a point. If K is an arithmetic variable, a “finite-size correction” (FSC) is known to speed Poisson convergence by providing a second, subdominant term in the appropriate asymptotic expansion. This paper generalizes the FSC from arithmetic K to general K. Genomics applications require this generalization, because they have already heuristically applied the FSC to p-values involving absolutely continuous distributions. The FSC also sharpens certain results in queuing theory, insurance risk, traffic flow, and reliability theory

Elsevier - Publisher Connector

NEXT-Peak: A Normal-Exponential Two-Peak Model for Peak-Calling in ChIP-seq Data

Author: Jayatillake Rasika V.
Kim Nak-Kyeong
Spouge John L.
Publication venue: ODU Digital Commons
Publication date: 01/01/2013

Background: Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) can locate transcription factor binding sites on a genomic scale. Although many models and programs are available to call peaks, none has dominated its competition in comparison studies. Results: We propose a rigorous statistical model, the normal-exponential two-peak (NEXT-peak) model, which parallels the physical processes generating the empirical data, and which can naturally incorporate mappability information. The model therefore estimates total strength of binding (even if some binding locations do not map uniquely into a reference genome, effectively censoring them); it also assigns an error to an estimated binding location. A comparison study with existing programs on real ChIP-seq datasets (STAT1, NRSF, and ZNF143) demonstrates that the NEXT-peak model performs well both in calling peaks and in locating them. The model also provides a goodness-of-fit test, to screen out spurious peaks and to infer multiple binding events in a region. Conclusions: The NEXT-peak program calls peaks on any test dataset about as accurately as any other, but provides unusual accuracy in the estimated location of the peaks it calls. NEXT-peak is based on rigorous statistics, so its model also provides a principled foundation for a more elaborate statistical analysis of ChIP-seq data

Old Dominion University

Finding sequence motifs with Bayesian models incorporating positional information: an application to transcription factor binding sites

Author: Kim Nak-Kyeong
Mariño-Ramírez Leonardo
Spouge John L
Tharakaraman Kannan
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Biologically active sequence motifs often have positional preferences with respect to a genomic landmark. For example, many known transcription factor binding sites (TFBSs) occur within an interval [-300, 0] bases upstream of a transcription start site (TSS). Although some programs for identifying sequence motifs exploit positional information, most of them model it only implicitly and with ad hoc methods, making them unsuitable for general motif searches. Results: A-GLAM, a user-friendly computer program for identifying sequence motifs, now incorporates a Bayesian model systematically combining sequence and positional information. A-GLAM's predictions with and without positional information were compared on two human TFBS datasets, each containing sequences corresponding to the interval [-2000, 0] bases upstream of a known TSS. A rigorous statistical analysis showed that positional information significantly improved the prediction of sequence motifs, and an extensive cross-validation study showed that A-GLAM's model was robust against mild misspecification of its parameters. As expected, when sequences in the datasets were successively truncated to the intervals [-1000, 0], [-500, 0] and [-250, 0], positional information aided motif prediction less and less, but never hurt it significantly. Conclusion: Although sequence truncation is a viable strategy when searching for biologically active motifs with a positional preference, a probabilistic model (used reasonably) generally provides a superior and more robust strategy, particularly when the sequence motifs' positional preferences are not well characterized.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Scanning sequences after Gibbs sampling to find multiple occurrences of functional elements

Author: Landsman David
Mariño-Ramírez Leonardo
Sheetlin Sergey L
Spouge John L
Tharakaraman Kannan
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Many DNA regulatory elements occur as multiple instances within a target promoter. Gibbs sampling programs for finding DNA regulatory elements de novo can be prohibitively slow in locating all instances of such an element in a sequence set. RESULTS: We describe an improvement to the A-GLAM computer program, which predicts regulatory elements within DNA sequences with Gibbs sampling. The improvement adds an optional "scanning step" after Gibbs sampling. Gibbs sampling produces a position specific scoring matrix (PSSM). The new scanning step resembles an iterative PSI-BLAST search based on the PSSM. First, it assigns an "individual score" to each subsequence of appropriate length within the input sequences using the initial PSSM. Second, it computes an E-value from each individual score, to assess the agreement between the corresponding subsequence and the PSSM. Third, it permits subsequences with E-values falling below a threshold to contribute to the underlying PSSM, which is then updated using the Bayesian calculus. A-GLAM iterates its scanning step to convergence, at which point no new subsequences contribute to the PSSM. After convergence, A-GLAM reports predicted regulatory elements within each sequence in order of increasing E-values, so users have a statistical evaluation of the predicted elements in a convenient presentation. Thus, although the Gibbs sampling step in A-GLAM finds at most one regulatory element per input sequence, the scanning step can now rapidly locate further instances of the element in each sequence. CONCLUSION: Datasets from experiments determining the binding sites of transcription factors were used to evaluate the improvement to A-GLAM. Typically, the datasets included several sequences containing multiple instances of a regulatory motif. The improvements to A-GLAM permitted it to predict the multiple instances
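
The iterate-to-convergence structure of the scanning step can be sketched as follows. This is not A-GLAM's code and it simplifies the abstract in two labeled ways: a fixed log-odds score threshold stands in for the E-value test, and the PSSM is rebuilt from pseudocounted base counts rather than updated with the Bayesian calculus. All names (`build_pssm`, `score`, `scan_to_convergence`), the uniform background, and the toy sequences are hypothetical.

```python
import math

ALPHABET = "ACGT"


def build_pssm(windows, pseudocount=1.0, background=0.25):
    """Log-odds PSSM (one dict per column) from equal-length windows.

    Assumption: uniform background and simple pseudocounts.
    """
    width = len(windows[0])
    pssm = []
    for pos in range(width):
        counts = {base: pseudocount for base in ALPHABET}
        for window in windows:
            counts[window[pos]] += 1
        total = sum(counts.values())
        pssm.append({base: math.log(counts[base] / total / background)
                     for base in ALPHABET})
    return pssm


def score(pssm, window):
    """Individual score of one subsequence: sum of column log-odds."""
    return sum(column[base] for column, base in zip(pssm, window))


def scan_to_convergence(sequences, seed_sites, width, threshold, max_rounds=100):
    """Repeat: rebuild the PSSM, scan every window, admit windows that pass.

    Sites are (sequence_index, start) pairs. A score threshold replaces the
    E-value threshold described in the abstract (a stand-in, not the method).
    Stops when a round admits no new site.
    """
    accepted = set(seed_sites)
    for _ in range(max_rounds):
        pssm = build_pssm([sequences[i][s:s + width] for i, s in sorted(accepted)])
        new_sites = {
            (i, s)
            for i, seq in enumerate(sequences)
            for s in range(len(seq) - width + 1)
            if (i, s) not in accepted and score(pssm, seq[s:s + width]) >= threshold
        }
        if not new_sites:
            break
        accepted |= new_sites
    return sorted(accepted)


# Toy input: one seed site per sequence, as a Gibbs sampling step would supply.
sequences = ["TTTACGTATTT", "GGGACGTAGGGACGTAGG"]
sites = scan_to_convergence(sequences, seed_sites=[(0, 3), (1, 3)],
                            width=5, threshold=3.0)
print(sites)
```

In this toy run the second copy of ACGTA in the longer sequence is admitted in the first round, and the second round admits nothing, so the loop stops. A real implementation would also need to handle overlapping windows and reverse strands, which this sketch ignores.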

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

False Discovery Rate Controlling Procedures with BLOSUM62 substitution matrix and their application to HIV Data

Author: Aldunate Muriel
Giraldo Mileiy
Kim Kyurhi
Park Dohwan
Park Junyong
Spouge John L.
Tachedjian Gilda
Publication date: 25/11/2023

Identifying significant sites in sequence data and analogous data is of fundamental importance in many biological fields. Fisher's exact test is a popular technique; however, it is not appropriate for sparse count data because its decisions are conservative. Since count data in HIV studies are typically very sparse, it is crucial to incorporate additional information into statistical models to improve testing power. To incorporate biological information into the false discovery rate controlling procedure, we propose two models: one based on an empirical Bayes model under independence of amino acids, and the other using pairwise associations of amino acids based on a Markov random field with the BLOSUM62 substitution matrix. We apply the proposed methods to HIV data and identify significant sites by incorporating the BLOSUM62 matrix, while the traditional method based on Fisher's test does not discover any site. These newly developed methods have the potential to handle many biological problems in vaccine and drug trials and in phenotype studies

arXiv.org e-Print Archive

Hemolysis Pathways during Storage of Erythrocytes and Inter-Donor Variability in Erythrocyte Morphology

Author: Bieback Karen
Boecker Clemens
Brenner-Weiss Gerald
Kirschhöfer Frank
Melzak Kathryn A.
Spouge John L.
Publication venue: Karger Publishers
Publication date: 19/08/2020

KITopen

Nuclear ribosomal internal transcribed spacer (ITS) region as a universal DNA barcode marker for Fungi

Author: C. André Levesque
Conrad L. Schoch
John L. Spouge
Keith A. Seifert
Sabine Huhndorf
Vincent Robert
Wen Chen
Publication venue: 'Proceedings of the National Academy of Sciences'
Publication date: 28/10/2022

UTUPub

Threshold Average Precision (TAP-k): a measure of retrieval designed for bioinformatics

Author: Hyrum D. Carroll
Maricel G. Kann
Sergey L. Sheetlin
John L. Spouge
Publication venue: Oxford University Press
Publication date: 01/01/2010

Motivation: Since database retrieval is a fundamental operation, the measurement of retrieval efficacy is critical to progress in bioinformatics. This article points out some issues with current methods of measuring retrieval efficacy and suggests some improvements. In particular, many studies have used the pooled receiver operating characteristic for n irrelevant records (ROCn) score, the area under the ROC curve (AUC) of a ‘pooled’ ROC curve, truncated at n irrelevant records. Unfortunately, the pooled ROCn score does not faithfully reflect actual usage of retrieval algorithms. Additionally, a pooled ROCn score can be very sensitive to retrieval results from as little as a single query
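
For readers unfamiliar with the ROCn score mentioned above, here is a minimal sketch of how it might be computed. The normalization by n times the number of relevant records is the commonly used convention and is assumed here, as is the handling of lists that end before n irrelevant records appear. The function names `roc_n` and `pooled_roc_n` and the example data are hypothetical.

```python
def roc_n(ranked_labels, n, total_relevant=None):
    """ROCn for one ranked retrieval list.

    ranked_labels: booleans, best-scoring record first; True means relevant.
    The area is accumulated as the number of relevant records ranked above
    each of the first n irrelevant records, then divided by n * total_relevant
    (assumed normalization, so a perfect ranking scores 1.0).
    """
    if total_relevant is None:
        total_relevant = sum(1 for label in ranked_labels if label)
    if n <= 0 or total_relevant <= 0:
        raise ValueError("n and the number of relevant records must be positive")

    relevant_seen = 0
    irrelevant_seen = 0
    area = 0
    for is_relevant in ranked_labels:
        if is_relevant:
            relevant_seen += 1
        else:
            irrelevant_seen += 1
            area += relevant_seen
            if irrelevant_seen == n:
                break
    # Assumption: if the list ends early, the missing irrelevant records are
    # treated as ranked after everything that was retrieved.
    area += (n - irrelevant_seen) * relevant_seen
    return area / (n * total_relevant)


def pooled_roc_n(results_per_query, n):
    """Pool (score, is_relevant) pairs from all queries, rank by score, apply ROCn."""
    pooled = [pair for results in results_per_query for pair in results]
    pooled.sort(key=lambda pair: pair[0], reverse=True)
    return roc_n([is_relevant for _, is_relevant in pooled], n)


single = roc_n([True, False, True, True, False, False], n=2)
pooled = pooled_roc_n(
    [[(9.0, True), (4.0, False)], [(7.0, True), (6.0, False)]],
    n=2,
)
print(single, pooled)
```

Because pooling ranks records from every query on one score scale, a single query whose irrelevant records score highly can use up the n allowed irrelevant records for the whole pool, which is one way to see the sensitivity the abstract describes.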

Crossref

Columbus State University

PubMed Central