84 research outputs found

    CoaSim: A flexible environment for simulating genetic data under coalescent models

    BACKGROUND: Coalescent simulations are playing a large role in interpreting large scale intra-specific sequence or polymorphism surveys and for planning and evaluating association studies. Coalescent simulations of data sets under different models can be compared to the actual data to test the importance of different evolutionary factors and thus give insight into them. RESULTS: We have created the CoaSim application as a flexible environment for Monte Carlo simulation of various types of genetic data under equilibrium and non-equilibrium coalescent processes for a variety of applications. Interaction with the tool is through the Guile version of the Scheme scripting language. Scheme scripts for many standard and advanced applications are provided and these can easily be modified by the user for a much wider range of applications. A graphical user interface with less functionality and flexibility is also included. It is primarily intended as an exploratory and educational tool. CONCLUSION: CoaSim is a powerful tool because of its flexibility and ease of use. This is illustrated through very varied uses of the application, e.g. evaluation of association mapping methods, parametric bootstrapping, and design and choice of markers for specific question
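
As a rough illustration of the kind of Monte Carlo coalescent simulation the abstract refers to, the sketch below draws the time to the most recent common ancestor (TMRCA) under the basic neutral, constant-size Kingman coalescent. It is not CoaSim code and does not use CoaSim's Scheme interface; the function names `simulate_tmrca` and `mean_tmrca` are hypothetical, and time is measured in coalescent units.

```python
import random


def simulate_tmrca(n_samples, rng):
    """Simulate one TMRCA under the standard Kingman coalescent.

    Assumption: neutral, constant population size, no recombination.
    While k lineages remain, the waiting time to the next coalescence
    is exponential with rate k * (k - 1) / 2 (coalescent time units).
    """
    total_time = 0.0
    k = n_samples
    while k > 1:
        rate = k * (k - 1) / 2.0
        total_time += rng.expovariate(rate)
        k -= 1  # two lineages merge into one
    return total_time


def mean_tmrca(n_samples, replicates, seed=0):
    """Average the simulated TMRCA over many independent replicates."""
    rng = random.Random(seed)
    total = sum(simulate_tmrca(n_samples, rng) for _ in range(replicates))
    return total / replicates
```

Under these assumptions the expected TMRCA is 2(1 - 1/n), so `mean_tmrca(10, 20000)` should land close to 1.8. Comparing such simulated summaries with observed data is the general idea behind the model comparisons mentioned in the abstract.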

    Scalable Group Level Probabilistic Sparse Factor Analysis

    Many data-driven approaches exist to extract neural representations of functional magnetic resonance imaging (fMRI) data, but most of them lack a proper probabilistic formulation. We propose a group level scalable probabilistic sparse factor analysis (psFA) allowing spatially sparse maps, component pruning using automatic relevance determination (ARD) and subject specific heteroscedastic spatial noise modeling. For task-based and resting state fMRI, we show that the sparsity constraint gives rise to components similar to those obtained by group independent component analysis. The noise modeling shows that noise is reduced in areas typically associated with activation by the experimental design. The psFA model identifies sparse components and the probabilistic setting provides a natural way to handle parameter uncertainties. The variational Bayesian framework easily extends to more complex noise models than the one presently considered. Comment: 10 pages plus 5 pages appendix, Submitted to ICASSP 1

    Clean Colorectum at Diagnostic Colonoscopy: Subsequent Detection of Extracolonic Malignancies by Plasma Protein Biomarkers?

    Introduction: Most of the subjects undergoing diagnostic colonoscopy do not have neoplastic bowel lesions. Potentially, some of the symptoms may therefore be caused by extracolonic malignancy, and subjects with persisting symptoms may need subsequent examinations. Blood-based, cancer-associated biomarkers may aid in directing the examinations for other specific malignant diseases. Methods: EDTA plasma samples available from a previous prospective study of subjects undergoing diagnostic colonoscopy were used for analysis of 18 protein biomarkers. The study population of 3732 subjects included 400 patients with colorectal cancer (CRC) and 177 patients with extracolonic malignancies. Univariable analysis of the association of specific biomarkers and extracolonic cancers included those with 10 or more cases. Subsequently, reduced models of 4 or 6 biomarkers, respectively, were established by choosing those with the highest likelihood; age and sex were included as well. Results: Univariable analyses showed that CyFra21-1 had an area under the curve (AUC) of 0.87 for lung cancer (n = 33), CA19-9 had an AUC of 0.85 for pancreatic cancer (n = 22), CA125 had an AUC of 0.95 for ovarian cancer (n = 16), B2M had an AUC of 0.81 for non-Hodgkin lymphoma (n = 12), and total prostate-specific antigen had an AUC of 0.99 for prostate cancer (n = 10). The multivariable analysis of 4 or 6 biomarkers plus age and sex as explanatory variables showed AUCs of 0.82 to 0.85 for both extracolonic cancers and CRC. The 4 biomarkers included in the model for detection of extracolonic cancers were CA125, hsCRP, CA19-9, and CyFra21-1; the 2 additional biomarkers in the 6-biomarker model were CEA and Galectin-3. Similarly, the 4 biomarkers included in the model for detection of CRC were CEA, CyFra21-1, Ferritin, and HE4; the 2 additional biomarkers in the 6-biomarker model were hsCRP and Pepsinogen 2. Conclusions: Results of this study indicate that it may be possible to detect subjects that have an increased risk of extracolonic cancer following a colonoscopy without findings of neoplastic lesions. Combinations of various protein biomarkers may direct subsequent examination after colonoscopy with clean colorectum. The results, although preliminary, may form the basis for additional research directed both for primary examinations of subjects with symptoms of malignancy and subsequent examinations after colonoscopy
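
For readers unfamiliar with the AUC values quoted in this abstract, the following minimal sketch shows one common way to compute an AUC for a single biomarker: the probability that a randomly chosen case has a higher marker level than a randomly chosen control, with ties counted as one half. It is only an illustration, not the study's analysis code; the function name `auc` and the example values are hypothetical.

```python
def auc(case_values, control_values):
    """Area under the ROC curve via pairwise comparison.

    Equivalent to the normalized Mann-Whitney U statistic: the fraction
    of (case, control) pairs in which the case has the higher marker
    value, with tied pairs counted as half.
    """
    if not case_values or not control_values:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Hypothetical marker levels, for illustration only.
example_cases = [3, 4, 5]
example_controls = [1, 2, 3]
example_auc = auc(example_cases, example_controls)  # 8.5 / 9
```

An AUC of 0.5 corresponds to a marker that does not separate the groups at all, and 1.0 to perfect separation, which puts the reported range of 0.81 to 0.99 in context. With case counts as small as 10 to 33, such estimates carry considerable uncertainty.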

    Retrospective evaluation of whole exome and genome mutation calls in 746 cancer samples

    Funder: NCI U24CA211006. Abstract: The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) curated consensus somatic mutation calls using whole exome sequencing (WES) and whole genome sequencing (WGS), respectively. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2,658 cancers across 38 tumour types, we compare WES and WGS side-by-side from 746 TCGA samples, finding that ~80% of mutations overlap in covered exonic regions. We estimate that low variant allele fraction (VAF < 15%) and clonal heterogeneity contribute up to 68% of private WGS mutations and 71% of private WES mutations. We observe that ~30% of private WGS mutations trace to mutations identified by a single variant caller in WES consensus efforts. WGS captures both ~50% more variation in exonic regions and un-observed mutations in loci with variable GC-content. Together, our analysis highlights technological divergences between two reproducible somatic variant detection efforts