346 research outputs found

    Stationary Distribution and Eigenvalues for a de Bruijn Process

    We define a de Bruijn process with parameters n and L as a certain continuous-time Markov chain on the de Bruijn graph with words of length L over an n-letter alphabet as vertices. We determine explicitly its steady state distribution and its characteristic polynomial, which turns out to decompose into linear factors. In addition, we examine the stationary state of two specializations in detail. In the first one, the de Bruijn-Bernoulli process, this is a product measure. In the second one, the Skin-deep de Bruijn process, the distribution has constant density but nontrivial correlation functions. The two-point correlation function is determined using generating function techniques. Comment: Dedicated to Herb Wilf on the occasion of his 80th birthday.

    Virtual target screening to rapidly identify potential protein targets of natural products in drug discovery

    The inherent biological viability and diversity of natural products make them a potentially rich source of new therapeutics. However, identifying bioactive compounds with desired therapeutic effects, and identifying their protein targets, is a laborious and expensive process. Extracts from organism samples may show the desired activity in phenotypic assays, but specific bioactive compounds must be isolated through further separation methods, and protein targets must be identified by more specific phenotypic and in vitro experimental assays. Still, questions remain as to whether all relevant protein targets for a compound have been identified. The goal is to understand the breadth of purposing for the compound in order to maximize its use and intellectual property, and to avoid further development of compounds with insurmountable adverse effects. Previously we developed a Virtual Target Screening system that computationally screens one or more compounds against a collection of virtual protein structures. By scoring each compound-protein interaction, we can compare against averaged scores of synthetic drug-like compounds to determine whether a particular protein would be a potential target of a compound of interest. Here we provide examples of natural products screened through our system as we assess the advantages and shortcomings of our current system with regard to natural product drug discovery.

    Efficacy and safety of left atrial appendage closure in patients with atrial fibrillation and high thromboembolic and bleeding risk

    Aim. To compare the incidence of thromboembolic and hemorrhagic events after left atrial appendage occlusion (LAAO) or without prevention of thromboembolic events (TEEs) during prospective follow-up of patients with atrial fibrillation (AF) and a high risk of ischemic stroke (IS) who have contraindications to long-term anticoagulant therapy. Material and methods. The study included 134 patients with AF, a high risk of IS, and contraindications to long-term anticoagulation. Patients were divided into 2 groups: the first group included patients who underwent LAAO (n=74), and the second group included those who did not receive any TEE prevention (n=60). The follow-up period was 3 years. The cumulative rate of all-cause mortality, IS, transient ischemic attacks (TIA), and systemic embolism (SE) was taken as the primary efficacy endpoint. The primary safety endpoint was major bleeding according to GARFIELD registry criteria. Results. The rate of the composite efficacy endpoint in the LAAO group was significantly lower than in the group without thromboembolic prophylaxis (5.2 vs 17.4 per 100 patient-years; adjusted odds ratio (OR), 4.08; 95% confidence interval (CI): 1.7-9.5; p=0.001). The rate of major bleeding was comparable in both groups (2.4 in the LAAO group vs 1.3 per 100 patient-years in the group without thromboembolic prophylaxis; adjusted OR, 0.55; 95% CI: 0.1-3.09; p=0.509). In addition, the event rate of net clinical benefit (all-cause mortality + ischemic stroke/TIA/SE + major bleeding) in the LAAO group was also significantly lower (5.9 vs 18.2 per 100 patient-years; adjusted OR, 3.0; 95% CI: 1.47-6.36; p=0.003). Conclusion. Among patients with AF and contraindications to long-term anticoagulation, after 3 years of follow-up LAAO demonstrated a significant reduction in the cumulative rate of all-cause mortality and non-fatal thromboembolic events. At the same time, the frequency of major bleeding was comparable between the groups, even taking into account access-site bleeding and postoperative antithrombotic therapy (ATT)-associated bleeding in the LAAO group. Further randomized clinical trials are required to confirm these data.

    Prevention of Cardioembolic Complications in Patients with Atrial Fibrillation: Efficacy and Safety of Left Atrial Appendage Isolation and Oral Anticoagulants

    Aim. To study the frequency and structure of outcomes in patients with atrial fibrillation (AF) depending on the method used to prevent cardioembolic events: left atrial appendage (LAA) isolation, direct oral anticoagulants (DOACs) or warfarin. Material and methods. A prospective observational study included patients with AF and a high risk of cardioembolic complications and without contraindications to anticoagulants. Patients who refused long-term oral anticoagulant therapy underwent LAA isolation; the rest of the patients received DOACs or warfarin. The observation period was 3 years. The cumulative incidence of mortality, cardioembolic complications and major bleeding (according to GARFIELD criteria) was assessed. Results. We included 245 patients: 46 patients were treated with LAA isolation, 100 with warfarin, and 99 with DOACs. Multivariate regression analysis demonstrated a statistically significant advantage of the LAA occluder in the frequency of the combined endpoint compared to warfarin (hazard ratio [HR] 3.10; 95% confidence interval [CI] 1.01-9.54; p=0.049) and to DOACs (HR 3.44; 95% CI 1.15-10.29; p=0.027). A similar result was obtained for all-cause mortality (HR 5.24; 95% CI 1.12-24.55; p=0.036 and HR 5.58; 95% CI 1.22-25.49; p=0.027, respectively). There were no significant differences in bleeding rates between the groups. Conclusion. This observational study demonstrates the superiority of LAA isolation as a first-line therapy over DOACs and warfarin in patients with AF and a high risk of cardioembolic complications. Randomized trials are required to confirm these observations.

    Viral population estimation using pyrosequencing

    The diversity of virus populations within single infected hosts presents a major difficulty for the natural immune response as well as for vaccine design and antiviral drug therapy. Recently developed pyrophosphate-based sequencing technologies (pyrosequencing) can be used for quantifying this diversity by ultra-deep sequencing of virus samples. We present computational methods for the analysis of such sequence data and apply these techniques to pyrosequencing data obtained from HIV populations within patients harboring drug-resistant virus strains. Our main result is the estimation of the population structure of the sample from the pyrosequencing reads. This inference is based on a statistical approach to error correction, followed by a combinatorial algorithm for constructing a minimal set of haplotypes that explain the data. Using this set of explaining haplotypes, we apply a statistical model to infer the frequencies of the haplotypes in the population via an EM algorithm. We demonstrate that pyrosequencing reads allow for effective population reconstruction by extensive simulations and by comparison to 165 sequences obtained directly from clonal sequencing of four independent, diverse HIV populations. Thus, pyrosequencing can be used for cost-effective estimation of the structure of virus populations, promising new insights into viral evolutionary dynamics and disease control strategies. Comment: 23 pages, 13 figures.

    Fast motif recognition via application of statistical thresholds

    Background: Improving the accuracy and efficiency of motif recognition is an important computational challenge that has application to detecting transcription factor binding sites in genomic data. Closely related to motif recognition is the Consensus String decision problem that asks, given a parameter d and a set of ℓ-length strings S = {s1,...,sn}, whether there exists a consensus string that has Hamming distance at most d from every string in S. A set of strings S is pairwise bounded if the Hamming distance between any pair of strings in S is at most 2d. It is trivial to determine whether a set is pairwise bounded, and a set cannot have a consensus string unless it is pairwise bounded. We use Consensus String to determine whether or not a pairwise bounded set has a consensus. Unfortunately, Consensus String is NP-complete. The lack of an efficient method to solve the Consensus String problem has caused it to become a computational bottleneck in MCL-WMR, a motif recognition program capable of solving difficult motif recognition problem instances. Results: We focus on the development of a method for solving Consensus String quickly with a small probability of error. We apply this heuristic to develop a new motif recognition program, sMCL-WMR, which has impressive accuracy and efficiency. We demonstrate the performance of sMCL-WMR in detecting weak motifs in large data sets and in real genomic data sets, and compare the performance to other leading motif recognition programs.
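
    The pairwise-bounded test described in the abstract above can be illustrated with a short sketch. This is not code from MCL-WMR or sMCL-WMR; the function names (hamming, is_pairwise_bounded, is_consensus) are hypothetical and chosen only for illustration. It shows the cheap necessary condition (every pair within 2d) and a verifier for a candidate consensus, not a solver for the NP-complete decision problem itself.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def is_pairwise_bounded(strings, d: int) -> bool:
    """True if every pair of strings is within Hamming distance 2*d.

    This is only a necessary condition for a consensus string to exist.
    """
    return all(hamming(a, b) <= 2 * d for a, b in combinations(strings, 2))


def is_consensus(candidate: str, strings, d: int) -> bool:
    """True if the candidate is within Hamming distance d of every string."""
    return all(hamming(candidate, s) <= d for s in strings)


sample = ["ACGT", "ACGA", "TCGT"]  # illustrative data, not from the paper
bounded = is_pairwise_bounded(sample, 1)
accepted = is_consensus("ACGT", sample, 1)
```

    The pairwise check takes time quadratic in the number of strings, which is why the abstract calls it trivial; the hard part is finding a candidate to hand to the verifier.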

    Assembly complexity of prokaryotic genomes using short reads

    Background: De Bruijn graphs are a theoretical framework underlying several modern genome assembly programs, especially those that deal with very short reads. We describe an application of de Bruijn graphs to analyze the global repeat structure of prokaryotic genomes. Results: We provide the first survey of the repeat structure of a large number of genomes. The analysis gives an upper-bound on the performance of genome assemblers for de novo reconstruction of genomes across a wide range of read lengths. Further, we demonstrate that the majority of genes in prokaryotic genomes can be reconstructed uniquely using very short reads even if the genomes themselves cannot. The non-reconstructible genes are overwhelmingly related to mobile elements (transposons, IS elements, and prophages). Conclusions: Our results improve upon previous studies on the feasibility of assembly with short reads and provide a comprehensive benchmark against which to compare the performance of the short-read assemblers currently being developed.
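
    As a rough illustration of the data structure named in the abstract above, the following sketch builds a de Bruijn graph from a single sequence and lists the nodes where the path branches. It is the generic textbook construction ((k-1)-mers as nodes, k-mers as edges), not the authors' analysis pipeline; the names de_bruijn_graph and branching_nodes are hypothetical.

```python
from collections import defaultdict


def de_bruijn_graph(sequence: str, k: int) -> dict:
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it.

    Every k-mer in the sequence contributes one edge from its prefix
    to its suffix.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    graph = defaultdict(set)
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


def branching_nodes(graph: dict) -> list:
    """Nodes with more than one successor, where a walk becomes ambiguous."""
    return sorted(node for node, successors in graph.items() if len(successors) > 1)


toy_graph = de_bruijn_graph("ACGTACGA", 3)  # illustrative sequence
ambiguous = branching_nodes(toy_graph)
```

    In the toy sequence, ACG occurs twice with different continuations, so the node CG has two successors. Repeats longer than the read length create this kind of ambiguity, which is what limits unique reconstruction.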

    In vitro modeling of tumor interclonal interactions using breast cancer cell lines

    In the setting of limited resources, natural selection begins to occur between tumor clones. An experimental model of in vitro tumor heterogeneity would allow us to evaluate various types of biological interactions arising from the joint cultivation of phenotypically different tumor clones. Aim: To study the peculiarities of ecological relationships of breast cancer (BC) cell lines MCF-7, BT-474 and MDA-MB-231 under co-culturing conditions. Materials and Methods: Three BC cell lines, luminal A (MCF-7), luminal B (BT-474) and triple-negative (MDA-MB-231), were co-cultured pairwise. Immunocytochemistry was used to differentiate the cell lines in the wells. The effect of the cell-free culture medium on the growth rate of the alternate cell line in the pair was also evaluated. Results: It was shown that when BT-474 cells were co-cultured with MCF-7 and when BT-474 cells were co-cultured with MDA-MB-231, two types of ecological interactions could be observed: commensalism and amensalism, respectively. While the cells do not interact with each other in contact, the supernatants of single cultures of MCF-7 and MDA-MB-231 exert the same effect on BT-474 as co-cultivation of BT-474 with these cells. Conclusions: The paracrine mechanism of intercellular interaction between different human BC cell lines has been demonstrated. The models used in population ecology can be applied to identify the types of interaction between cell lines.

    A fast algorithm for the multiple genome rearrangement problem with weighted reversals and transpositions

    Background: Due to recent progress in genome sequencing, more and more data for phylogenetic reconstruction based on rearrangement distances between genomes become available. However, this phylogenetic reconstruction is a very challenging task. For the simplest distance measures (the breakpoint distance and the reversal distance), the problem is NP-hard even if one considers only three genomes. Results: In this paper, we present a new heuristic algorithm that directly constructs a phylogenetic tree w.r.t. the weighted reversal and transposition distance. Experimental results on previously published datasets show that constructing phylogenetic trees in this way results in better trees than constructing the trees w.r.t. the reversal distance, and recalculating the weight of the trees with the weighted reversal and transposition distance. An implementation of the algorithm can be obtained from the authors. Conclusion: The possibility of creating phylogenetic trees directly w.r.t. the weighted reversal and transposition distance results in biologically more realistic scenarios. Our algorithm can solve today's most challenging biological datasets in a reasonable amount of time.
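
    To make the simplest of the distance measures mentioned above concrete, here is a minimal sketch that counts breakpoints between two signed linear gene orders. It is not the authors' heuristic and does not compute the weighted reversal and transposition distance. It assumes both genomes contain the same genes, represents each genome as a list of signed integers, and ignores the chromosome ends; the names adjacencies and breakpoint_count are hypothetical.

```python
def adjacencies(genome):
    """All adjacencies of a signed gene order, in both reading directions.

    The pair (a, b) read left to right is the same adjacency as
    (-b, -a) read from the opposite strand.
    """
    result = set()
    for a, b in zip(genome, genome[1:]):
        result.add((a, b))
        result.add((-b, -a))
    return result


def breakpoint_count(genome_a, genome_b) -> int:
    """Number of adjacencies of genome_a that are not present in genome_b."""
    present_in_b = adjacencies(genome_b)
    return sum((a, b) not in present_in_b for a, b in zip(genome_a, genome_a[1:]))


reference = [1, 2, 3, 4]      # illustrative gene order
rearranged = [1, -3, -2, 4]   # genes 2..3 reversed
breaks = breakpoint_count(reference, rearranged)
```

    A single reversal of an internal segment disrupts the two adjacencies at its boundaries and preserves the ones inside it, which is the intuition linking breakpoint counts to reversal distance.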

    A Methodological Framework for the Reconstruction of Contiguous Regions of Ancestral Genomes and Its Application to Mammalian Genomes

    The reconstruction of ancestral genome architectures and gene orders from homologies between extant species is a long-standing problem, considered by both cytogeneticists and bioinformaticians. A comparison of the two approaches was recently investigated and discussed in a series of papers, sometimes with diverging points of view regarding the performance of these two approaches. We describe a general methodological framework for reconstructing ancestral genome segments from conserved syntenies in extant genomes. We show that this problem, from a computational point of view, is naturally related to physical mapping of chromosomes and benefits from using combinatorial tools developed in this scope. We develop this framework into a new reconstruction method considering conserved gene clusters with similar gene content, mimicking principles used in most cytogenetic studies, although on a different kind of data. We implement and apply it to datasets of mammalian genomes. We perform intensive theoretical and experimental comparisons with other bioinformatics methods for the reconstruction of ancestral genome segments. We show that the method that we propose is stable and reliable: it gives convergent results using several kinds of data at different levels of resolution, and all predicted ancestral regions are well supported. The results ultimately come very close to those of cytogenetic studies. This suggests that the comparison of methods for ancestral genome reconstruction should include the algorithmic aspects of the methods as well as the disciplinary differences in data acquisition.