16 research outputs found

    Robust regression for periodicity detection in non-uniformly sampled time-course gene expression data

    Abstract
    Background: In practice, many biological time series measurements, including gene microarrays, are taken at time points the biologist considers interesting and not necessarily at fixed intervals. In many circumstances we are interested in finding targets that are expressed periodically. To tackle the problems of uneven sampling and an unknown type of noise in periodicity detection, we propose to use robust regression.
    Methods: The aim of this paper is to develop a general framework for robust periodicity detection and to review and rank different approaches by means of simulations. We also show results for some real measurement data.
    Results: The simulation results clearly show that as the sampling of a time series becomes more uneven, the methods that assume even sampling become unusable. We find that M-estimation provides a good compromise between robustness and computational efficiency.
    Conclusion: Since uneven sampling occurs often in biological measurements, the robust methods developed in this paper are expected to have many uses. The regression-based formulation of the periodicity detection problem adapts easily to non-uniform sampling. Using robust regression helps to reject inconsistently behaving data points.
    Availability: The implementations are currently available for Matlab and will be made available for users of R as well. More information can be found in the web supplement [1].
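    The regression-based formulation described in this abstract can be illustrated with a minimal sketch. It is not the authors' Matlab implementation: the abstract does not say which M-estimator, scale estimate, or test statistic the paper uses, so the Huber weights, the MAD scale, the tuning constant 1.345, and all function names below are assumptions chosen for illustration. The sketch fits a single sinusoid at a known candidate frequency to unevenly sampled data by iteratively reweighted least squares, so that inconsistent points are down-weighted.

```python
import math
from statistics import median


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def weighted_fit(X, y, w):
    """Weighted least squares via the normal equations (X'WX) beta = X'Wy."""
    p = len(X[0])
    A = [[sum(wi * xi[j] * xi[k] for wi, xi in zip(w, X)) for k in range(p)]
         for j in range(p)]
    b = [sum(wi * xi[j] * yi for wi, xi, yi in zip(w, X, y)) for j in range(p)]
    return solve_linear(A, b)


def huber_harmonic_fit(t, y, freq, k=1.345, n_iter=50):
    """Fit y ~ a*cos(2*pi*freq*t) + b*sin(2*pi*freq*t) + c with Huber weights.

    Assumed illustration only: Huber M-estimation with a MAD scale estimate.
    The time points t may be unevenly spaced. Returns (beta, amplitude).
    """
    X = [[math.cos(2 * math.pi * freq * ti),
          math.sin(2 * math.pi * freq * ti),
          1.0] for ti in t]
    beta = weighted_fit(X, y, [1.0] * len(t))  # ordinary least squares start
    for _ in range(n_iter):
        resid = [yi - sum(bj * xj for bj, xj in zip(beta, xi))
                 for xi, yi in zip(X, y)]
        med = median(resid)
        # Robust scale from the median absolute deviation (floored at 1e-12).
        scale = max(1.4826 * median([abs(r - med) for r in resid]), 1e-12)
        # Huber weights: 1 inside the threshold, decaying as 1/|r| outside.
        w = [1.0 if abs(r) <= k * scale else k * scale / abs(r) for r in resid]
        beta = weighted_fit(X, y, w)
    return beta, math.hypot(beta[0], beta[1])


# Made-up example: a 24-hour rhythm sampled at uneven times with two outliers.
times = [0, 1, 2.5, 3, 5, 8, 9.5, 12, 13, 16, 18.5, 20,
         21, 23, 26, 27.5, 30, 33, 34, 37, 40, 41.5, 44, 47]
values = [2.0 * math.sin(2 * math.pi * ti / 24) + 0.05 * math.sin(7.3 * i)
          for i, ti in enumerate(times)]
values[3] += 10.0   # inconsistent data point
values[15] -= 8.0   # inconsistent data point
beta, amp = huber_harmonic_fit(times, values, freq=1 / 24)
print(round(amp, 2))
```

    In a periodicity screen, a fit like this would be repeated over candidate frequencies and the fitted amplitude (or a derived statistic) compared against a null distribution; that step is left out here because the abstract does not describe it.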

    Disambiguate: An open-source application for disambiguating two species in next generation sequencing data from grafted samples [version 2; referees: 3 approved]

    Grafting of cell lines and primary tumours is a crucial step in the drug development process between cell line studies and clinical trials. Disambiguate is a program for computationally separating the sequencing reads of two species derived from grafted samples. Disambiguate operates on DNA or RNA-seq alignments to the two species and separates the components at very high sensitivity and specificity as illustrated in artificially mixed human-mouse samples. This allows for maximum recovery of data from target tumours for more accurate variant calling and gene expression quantification. Given that no general use open source algorithm accessible to the bioinformatics community exists for the purposes of separating the two species data, the proposed Disambiguate tool presents a novel approach and improvement to performing sequence analysis of grafted samples. Both Python and C++ implementations are available and they are integrated into several open and closed source pipelines. Disambiguate is open source and is freely available at https://github.com/AstraZeneca-NGS/disambiguate

    Prioritisation of structural variant calls in cancer genomes (distributed under Creative Commons CC-BY 4.0)

    The sensitivity of short-read DNA sequencing for gene fusion detection is improving, but it is hampered by a significant amount of noise in the data, composed of uninteresting or false positive hits. In this paper we describe a tiered prioritisation approach to extract high-impact gene fusion events from existing structural variant calls. Using cell line and patient DNA sequence data, we improve the annotation and interpretation of structural variant calls to best highlight likely cancer-driving fusions. We also considerably improve the automated visualisation of high-impact structural variants to highlight the effects of the variants on the resulting transcripts. The resulting framework makes clinically actionable structural variants considerably easier to detect.

    Categorisation of genes.

    (A) Venn diagram showing the four comparisons and the overlaps. Group 1 (red) represents the 212 genes induced by LPS at 6 hours. Group 2 (green) represents the 87 genes induced by LPS at 6 hours and inhibited following pan-BET inhibitor JQ1 treatment at 6 hours. Group 3 (blue) represents the 23 genes induced by LPS at both 6 and 24 hours and inhibited following JQ1 treatment at only 24 hours. (B) A table showing the identity (Gene Symbols) of genes in the three groups 1, 2 and 3.

    Quantitative PCR validation of selected genes identified by NGS.

    Effect of JQ1 treatment on LPS-induced expression of signature genes and cytokine control genes in AM from COPD patients. JQ1 treatment led to inhibited expression of 8 signature genes (CMPK2, EPSTI1, IFI44, IFI44L, IFIT1, IFIT3, MX1 and RSAD2) but had only a partial effect on cytokines (IL1a, IL1b, IL6 and IL8) that were not inhibited at 6 h in the NGS studies.

    M1/M2 polarisation of LPS stimulated genes.

    Volcano plot of M1 (red) and M2 (blue) genes at 6 hours (A) and 24 hours (B) after LPS induction. Differential expression of unstimulated vs. LPS-stimulated cells was tested using DESeq2, and the Log2 fold change (x-axis) is plotted against the -Log10 adjusted p-value (y-axis). Genes with LFC > 1.5 are labelled by name. The vertical grey lines denote no LPS-induced expression change. The horizontal grey lines show the 0.05 adjusted p-value cut-off.
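    The plotting rules in this caption can be sketched as follows. This is only an illustration of how the coordinates and cut-offs fit together: the function name, the record layout, and the example genes and values are hypothetical and are not taken from the study or from DESeq2 output.

```python
import math


def volcano_points(results, lfc_label=1.5, padj_cutoff=0.05):
    """Turn (gene, log2 fold change, adjusted p-value) records into plot points.

    x is the log2 fold change, y is -log10 of the adjusted p-value.
    Adjusted p-values must be greater than zero.
    """
    points = []
    for gene, lfc, padj in results:
        points.append({
            "gene": gene,
            "x": lfc,
            "y": -math.log10(padj),
            "below_cutoff": padj < padj_cutoff,  # above the horizontal line
            "labelled": lfc > lfc_label,         # named on the plot
        })
    return points


# Hypothetical records, for illustration only.
example = [("geneA", 2.0, 0.001), ("geneB", 0.5, 0.2)]
points = volcano_points(example)
for p in points:
    print(p["gene"], p["x"], round(p["y"], 2), p["below_cutoff"], p["labelled"])
```

    A point sits above the horizontal cut-off line exactly when its adjusted p-value is below 0.05, and x = 0 corresponds to the vertical line marking no expression change.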