84 research outputs found
CoaSim: A flexible environment for simulating genetic data under coalescent models
BACKGROUND: Coalescent simulations play a large role in interpreting large-scale intra-specific sequence or polymorphism surveys and in planning and evaluating association studies. Coalescent simulations of data sets under different models can be compared to the actual data to test the importance of different evolutionary factors and thus gain insight into them. RESULTS: We have created the CoaSim application as a flexible environment for Monte Carlo simulation of various types of genetic data under equilibrium and non-equilibrium coalescent processes for a variety of applications. Interaction with the tool is through the Guile version of the Scheme scripting language. Scheme scripts for many standard and advanced applications are provided, and these can easily be modified by the user for a much wider range of applications. A graphical user interface with less functionality and flexibility is also included. It is primarily intended as an exploratory and educational tool. CONCLUSION: CoaSim is a powerful tool because of its flexibility and ease of use. This is illustrated through very varied uses of the application, e.g. evaluation of association mapping methods, parametric bootstrapping, and design and choice of markers for specific questions.
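As a rough illustration of what a coalescent simulation involves, the following minimal Python sketch draws the waiting times between coalescence events under the standard (Kingman) coalescent with constant population size. It is not CoaSim's Scheme interface, which the abstract does not show; the function names and parameters here are hypothetical.

```python
import random


def simulate_coalescent_times(n, rng):
    """Draw waiting times between coalescence events for n sampled lineages.

    Assumption: standard Kingman coalescent, constant population size,
    time measured in coalescent units. While k lineages remain, the time
    to the next coalescence is exponential with rate k*(k-1)/2.
    """
    if n < 2:
        raise ValueError("need at least two lineages")
    times = []
    for k in range(n, 1, -1):
        rate = k * (k - 1) / 2
        times.append(rng.expovariate(rate))
    return times


def time_to_mrca(times):
    """Total tree height: the sum of all waiting times."""
    return sum(times)


if __name__ == "__main__":
    rng = random.Random(1)
    heights = [time_to_mrca(simulate_coalescent_times(10, rng))
               for _ in range(10000)]
    # Theory gives an expected height of 2 * (1 - 1/n) for n lineages.
    print(sum(heights) / len(heights))
```

Comparing a summary statistic of such simulated trees with the same statistic computed from real data is the basic idea behind the model comparisons the abstract mentions.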
Scalable Group Level Probabilistic Sparse Factor Analysis
Many data-driven approaches exist to extract neural representations of
functional magnetic resonance imaging (fMRI) data, but most of them lack a
proper probabilistic formulation. We propose a group level scalable
probabilistic sparse factor analysis (psFA) allowing spatially sparse maps,
component pruning using automatic relevance determination (ARD) and subject
specific heteroscedastic spatial noise modeling. For task-based and resting
state fMRI, we show that the sparsity constraint gives rise to components
similar to those obtained by group independent component analysis. The noise
modeling shows that noise is reduced in areas typically associated with
activation by the experimental design. The psFA model identifies sparse
components and the probabilistic setting provides a natural way to handle
parameter uncertainties. The variational Bayesian framework easily extends to
more complex noise models than the one presently considered. Comment: 10 pages plus 5 pages appendix, Submitted to ICASSP 1
Clean Colorectum at Diagnostic Colonoscopy: Subsequent Detection of Extracolonic Malignancies by Plasma Protein Biomarkers?
Introduction: Most of the subjects undergoing diagnostic colonoscopy do not have neoplastic bowel lesions. Potentially, some of the symptoms may therefore be caused by extracolonic malignancy, and subjects with persisting symptoms may need subsequent examinations. Blood-based, cancer-associated biomarkers may aid in directing the examinations for other specific malignant diseases. Methods: EDTA plasma samples available from a previous prospective study of subjects undergoing diagnostic colonoscopy were used for analysis of 18 protein biomarkers. The study population of 3732 subjects included 400 patients with colorectal cancer (CRC) and 177 patients with extracolonic malignancies. Univariable analysis of the association of specific biomarkers and extracolonic cancers included those with 10 or more cases. Subsequently, reduced models of 4 or 6 biomarkers, respectively, were established by choosing those with the highest likelihood; age and sex were included as well. Results: Univariable analyses showed that CyFra21-1 had an area under the curve (AUC) of 0.87 for lung cancer (n = 33), CA19-9 had an AUC of 0.85 for pancreatic cancer (n = 22), CA125 had an AUC of 0.95 for ovarian cancer (n = 16), B2M had an AUC of 0.81 for non-Hodgkin lymphoma (n = 12), and total prostate-specific antigen had an AUC of 0.99 for prostate cancer (n = 10). The multivariable analysis of 4 or 6 biomarkers plus age and sex as explanatory variables showed AUCs of 0.82 to 0.85 both for extracolonic cancers and CRC. The 4 biomarkers included in the model for detection of extracolonic cancers were CA125, hsCRP, CA19-9, and CyFra21-1; the 2 additional biomarkers in the 6-biomarker model were CEA and Galectin-3. Similarly, the 4 biomarkers included in the model for detection of CRC were CEA, CyFra21-1, Ferritin, and HE4; the 2 additional biomarkers in the 6-biomarker model were hsCRP and Pepsinogen 2.
Conclusions: Results of this study indicate that it may be possible to detect subjects who have an increased risk of extracolonic cancer following a colonoscopy without findings of neoplastic lesions. Combinations of various protein biomarkers may direct subsequent examination after colonoscopy with clean colorectum. The results, although preliminary, may form the basis for additional research directed at both primary examinations of subjects with symptoms of malignancy and subsequent examinations after colonoscopy.
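For readers unfamiliar with the AUC values quoted above, the sketch below shows one common way to compute an area under the ROC curve for a single biomarker: the fraction of case/control pairs in which the case has the higher value, with ties counted as one half. This is a generic illustration and not the study's analysis code; the function name is hypothetical and the numbers in the usage example are made up.

```python
def auc(case_values, control_values):
    """Area under the ROC curve via pairwise comparison.

    Returns the probability that a randomly chosen case has a higher
    biomarker value than a randomly chosen control (ties count 0.5).
    """
    if not case_values or not control_values:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


if __name__ == "__main__":
    # Toy values only, not data from the study.
    cases = [2.0, 4.0]
    controls = [1.0, 3.0]
    print(auc(cases, controls))  # 3 of 4 pairs favor the case: 0.75
```

An AUC of 0.5 corresponds to a marker that does not separate the groups at all, and 1.0 to perfect separation. With small case counts such as n = 10, estimates like these carry wide uncertainty.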
Retrospective evaluation of whole exome and genome mutation calls in 746 cancer samples
Funder: NCI U24CA211006. Abstract: The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) curated consensus somatic mutation calls using whole exome sequencing (WES) and whole genome sequencing (WGS), respectively. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2,658 cancers across 38 tumour types, we compare WES and WGS side-by-side from 746 TCGA samples, finding that ~80% of mutations overlap in covered exonic regions. We estimate that low variant allele fraction (VAF < 15%) and clonal heterogeneity contribute up to 68% of private WGS mutations and 71% of private WES mutations. We observe that ~30% of private WGS mutations trace to mutations identified by a single variant caller in WES consensus efforts. WGS captures both ~50% more variation in exonic regions and unobserved mutations in loci with variable GC-content. Together, our analysis highlights technological divergences between two reproducible somatic variant detection efforts.
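The comparison of shared and private calls described above can be pictured with a small set-based sketch. It assumes each mutation is keyed by a (chromosome, position, reference allele, alternate allele) tuple and that overlap is measured against the union of both call sets; the abstract does not state the exact definition used, so both choices, like the function name and the toy records, are illustrative only.

```python
def compare_call_sets(wes_calls, wgs_calls):
    """Count shared and private mutation calls between two call sets.

    Assumption: each call is a hashable key such as
    (chromosome, position, ref_allele, alt_allele).
    """
    wes = set(wes_calls)
    wgs = set(wgs_calls)
    shared = wes & wgs
    union = wes | wgs
    return {
        "shared": len(shared),
        "private_wes": len(wes - wgs),
        "private_wgs": len(wgs - wes),
        # Fraction of all distinct calls that both platforms report.
        "fraction_shared": len(shared) / len(union) if union else 0.0,
    }


if __name__ == "__main__":
    # Made-up records for illustration, not calls from the study.
    wes = [("1", 100, "A", "T"), ("1", 200, "G", "C"), ("2", 50, "C", "T")]
    wgs = [("1", 100, "A", "T"), ("2", 50, "C", "T"),
           ("3", 10, "G", "A"), ("3", 20, "T", "C")]
    print(compare_call_sets(wes, wgs))
```

A real comparison would first restrict both sets to regions covered by both assays, as the abstract's phrase "covered exonic regions" indicates.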