134 research outputs found

    Bayesian regression filter and the issue of priors

    We propose a Bayesian framework for regression problems, which covers areas that are usually dealt with by function approximation. An online learning algorithm is derived which solves regression problems with a Kalman filter. Its solution always improves with increasing model complexity, without the risk of over-fitting. In the infinite dimension limit it approaches the true Bayesian posterior. The issues of prior selection and over-fitting are also discussed, showing that some of the commonly held beliefs are misleading. The practical implementation is summarised. Simulations using 13 popular publicly available data sets are used to demonstrate the method and highlight important issues concerning the choice of priors.

    Evaluation of whole genome sequencing for outbreak detection of Salmonella enterica

    Salmonella enterica is a common cause of minor and large foodborne outbreaks. To achieve successful and nearly 'real-time' monitoring and identification of outbreaks, reliable sub-typing is essential. Whole genome sequencing (WGS) shows great promise as a routine epidemiological typing tool. Here we evaluate WGS for typing of S. Typhimurium, including different approaches for analyzing and comparing the data. A collection of 34 S. Typhimurium isolates was sequenced. This consisted of 18 isolates from six outbreaks and 16 epidemiologically unrelated background strains. In addition, 8 S. Enteritidis and 5 S. Derby isolates were sequenced and used for comparison. Four bioinformatics approaches were applied to the data: pan-genome tree, k-mer tree, nucleotide difference tree and SNP tree. The outcome of each approach was evaluated in relation to the association of the isolates with specific outbreaks. The pan-genome tree clustered 65% of the S. Typhimurium isolates according to the pre-defined epidemiology, the k-mer tree 88%, the nucleotide difference tree 100% and the SNP tree 100%. The outcomes of the four phylogenetic analyses were also compared to PFGE, revealing that WGS typing performed better than the traditional method. In conclusion, for S. Typhimurium, SNP analysis and the nucleotide difference approach seem to be superior for epidemiological typing compared to the other phylogenetic approaches that may be applied to WGS data. These approaches were also superior to the more classical typing method, PFGE. Our study also indicates that WGS alone is insufficient to determine whether strains are related or unrelated to outbreaks. This still requires the combination of epidemiological data and whole genome sequencing results.

    Genetic determinants of co-accessible chromatin regions in activated T cells across humans.

    Over 90% of genetic variants associated with complex human traits map to non-coding regions, but little is understood about how they modulate gene regulation in health and disease. One possible mechanism is that genetic variants affect the activity of one or more cis-regulatory elements, leading to gene expression variation in specific cell types. To identify such cases, we analyzed ATAC-seq and RNA-seq profiles from stimulated primary CD4+ T cells in up to 105 healthy donors. We found that regions of accessible chromatin (ATAC-peaks) are co-accessible at kilobase and megabase resolution, consistent with the three-dimensional chromatin organization measured by in situ Hi-C in T cells. Fifteen percent of genetic variants located within ATAC-peaks affected the accessibility of the corresponding peak (local-ATAC-QTLs). Local-ATAC-QTLs have the largest effects on co-accessible peaks, are associated with gene expression and are enriched for autoimmune disease variants. Our results provide insights into how natural genetic variants modulate cis-regulatory elements, in isolation or in concert, to influence gene expression.

    Reconstructing cancer genomes from paired-end sequencing data

    Background: A cancer genome is derived from the germline genome through a series of somatic mutations. Somatic structural variants - including duplications, deletions, inversions, translocations, and other rearrangements - result in a cancer genome that is a scrambling of intervals, or "blocks" of the germline genome sequence. We present an efficient algorithm for reconstructing the block organization of a cancer genome from paired-end DNA sequencing data. Results: By aligning paired reads from a cancer genome - and a matched germline genome, if available - to the human reference genome, we derive: (i) a partition of the reference genome into intervals; (ii) adjacencies between these intervals in the cancer genome; (iii) an estimated copy number for each interval. We formulate the Copy Number and Adjacency Genome Reconstruction Problem of determining the cancer genome as a sequence of the derived intervals that is consistent with the measured adjacencies and copy numbers. We design an efficient algorithm, called Paired-end Reconstruction of Genome Organization (PREGO), to solve this problem by reducing it to an optimization problem on an interval-adjacency graph constructed from the data. The solution to the optimization problem results in an Eulerian graph, containing an alternating Eulerian tour that corresponds to a cancer genome that is consistent with the sequencing data. We apply our algorithm to five ovarian cancer genomes that were sequenced as part of The Cancer Genome Atlas. We identify numerous rearrangements, or structural variants, in these genomes, analyze reciprocal vs. non-reciprocal rearrangements, and identify rearrangements consistent with known mechanisms of duplication such as tandem duplications and breakage/fusion/bridge (B/F/B) cycles. Conclusions: We demonstrate that PREGO efficiently identifies complex and biologically relevant rearrangements in cancer genome sequencing data. An implementation of the PREGO algorithm is available at http://compbio.cs.brown.edu/software/.

    Temporal Dissection of K-rasG12D Mutant In Vitro and In Vivo Using a Regulatable K-rasG12D Mouse Allele

    Animal models which allow the temporal regulation of gene activities are valuable for dissecting gene function in tumorigenesis. Here we have constructed a conditional, inducible estrogen receptor-K-rasG12D (ER-K-rasG12D) knock-in mouse allele that allows us to temporally switch the activity of the oncogenic K-ras mutant on or off through tamoxifen administration. In vitro studies using mouse embryonic fibroblasts (MEFs) showed that a tamoxifen dose of 0.05 µM works optimally for activation of ER-K-rasG12D, independent of gender. Furthermore, tamoxifen-inducible activation of K-rasG12D promotes cell proliferation, anchorage-independent growth, transformation and invasion, potentially via activation of the downstream MAPK pathway and cell cycle progression. Continuous activation of K-rasG12D in vivo by tamoxifen treatment is sufficient to drive the neoplastic transformation of normal lung epithelial cells in mice. Tamoxifen withdrawal after tumor formation results in apoptosis and tumor regression in mouse lungs. Taken together, these data demonstrate that the K-ras mutant is essential for neoplastic transformation, and this animal model may provide an ideal platform for further detailed characterization of the role of the oncogenic K-ras mutant during different stages of lung tumorigenesis.

    An integrated ChIP-seq analysis platform with customizable workflows

    Background: Chromatin immunoprecipitation followed by next generation sequencing (ChIP-seq) enables unbiased and genome-wide mapping of protein-DNA interactions and epigenetic marks. The first step in ChIP-seq data analysis involves the identification of peaks (i.e., genomic locations with high density of mapped sequence reads). The next step consists of interpreting the biological meaning of the peaks through their association with known genes, pathways, regulatory elements, and integration with other experiments. Although several programs have been published for the analysis of ChIP-seq data, they often focus on the peak detection step and are usually not well suited for thorough, integrative analysis of the detected peaks. Results: To address the peak interpretation challenge, we have developed ChIPseeqer, an integrative, comprehensive, fast and user-friendly computational framework for in-depth analysis of ChIP-seq datasets. The novelty of our approach is the capability to combine several computational tools in order to create easily customized workflows that can be adapted to the user's needs and objectives. In this paper, we describe the main components of the ChIPseeqer framework, and also demonstrate the utility and diversity of the analyses offered, by analyzing a published ChIP-seq dataset. Conclusions: ChIPseeqer facilitates ChIP-seq data analysis by offering a flexible and powerful set of computational tools that can be used in combination with one another. The framework is freely available as a user-friendly GUI application, but all programs are also executable from the command line, thus providing flexibility and automatability for advanced users.

    ZnuA and zinc homeostasis in Pseudomonas aeruginosa

    Pseudomonas aeruginosa is a ubiquitous environmental bacterium and a clinically significant opportunistic human pathogen. Central to the ability of P. aeruginosa to colonise both environmental and host niches is the acquisition of zinc. Here we show that P. aeruginosa PAO1 acquires zinc via an ATP-binding cassette (ABC) permease in which ZnuA is the high affinity, zinc-specific binding protein. Zinc uptake in Gram-negative organisms predominantly occurs via an ABC permease, and consistent with this expectation a P. aeruginosa ΔznuA mutant strain showed an ~60% reduction in cellular zinc accumulation, while other metal ions were essentially unaffected. Despite the major reduction in zinc accumulation, minimal phenotypic differences were observed between the wild-type and ΔznuA mutant strains. However, the effect of zinc limitation on the transcriptome of P. aeruginosa PAO1 revealed significant changes in gene expression that enable adaptation to low-zinc conditions. Genes significantly up-regulated included non-zinc-requiring paralogs of zinc-dependent proteins and a number of novel import pathways associated with zinc acquisition. Collectively, this study provides new insight into the acquisition of zinc by P. aeruginosa PAO1, revealing a hitherto unrecognized complexity in zinc homeostasis that enables the bacterium to survive under zinc limitation. Authors: Victoria G. Pederick, Bart A. Eijkelkamp, Stephanie L. Begg, Miranda P. Ween, Lauren J. McAllister, James C. Paton, Christopher A. McDevit

    Genetic variation and exercise-induced muscle damage: implications for athletic performance, injury and ageing.

    Prolonged unaccustomed exercise involving muscle lengthening (eccentric) actions can result in ultrastructural muscle disruption, impaired excitation-contraction coupling, inflammation and muscle protein degradation. This process is associated with delayed onset muscle soreness and is referred to as exercise-induced muscle damage. Although a certain amount of muscle damage may be necessary for adaptation to occur, excessive damage or inadequate recovery from exercise-induced muscle damage can increase injury risk, particularly in older individuals, who experience more damage and require longer to recover from muscle damaging exercise than younger adults. Furthermore, it is apparent that inter-individual variation exists in the response to exercise-induced muscle damage, and there is evidence that genetic variability may play a key role. Although this area of research is in its infancy, certain gene variations, or polymorphisms, have been associated with exercise-induced muscle damage (i.e. individuals with certain genotypes experience greater muscle damage, and require longer recovery, following strenuous exercise). These polymorphisms include ACTN3 (R577X, rs1815739), TNF (-308 G>A, rs1800629), IL6 (-174 G>C, rs1800795), and IGF2 (ApaI, 17200 G>A, rs680). Knowing how someone is likely to respond to a particular type of exercise could help coaches/practitioners individualise the exercise training of their athletes/patients, thus maximising recovery and adaptation, while reducing overload-associated injury risk. The purpose of this review is to provide a critical analysis of the literature concerning gene polymorphisms associated with exercise-induced muscle damage, both in young and older individuals, and to highlight the potential mechanisms underpinning these associations, thus providing a better understanding of exercise-induced muscle damage.

    Correlation analysis of the transcriptome of growing leaves with mature leaf parameters in a maize RIL population
