Integrative analysis identifies candidate tumor microenvironment and intracellular signaling pathways that define tumor heterogeneity in NF1
Neurofibromatosis type 1 (NF1) is a monogenic syndrome that gives rise to numerous symptoms including cognitive impairment, skeletal abnormalities, and growth of benign nerve sheath tumors. Nearly all NF1 patients develop cutaneous neurofibromas (cNFs), which occur on the skin surface, whereas 40-60% of patients develop plexiform neurofibromas (pNFs), which are deeply embedded in the peripheral nerves. Patients with pNFs have a ~10% lifetime chance of these tumors becoming malignant peripheral nerve sheath tumors (MPNSTs). These tumors have a severe prognosis and few treatment options other than surgery. Given the lack of therapeutic options available to patients with these tumors, identification of druggable pathways or other key molecular features could aid ongoing therapeutic discovery studies. In this work, we used statistical and machine learning methods to analyze 77 NF1 tumors with genomic data to characterize key signaling pathways that distinguish these tumors and identify candidates for drug development. We identified subsets of latent gene expression variables that may be important in the identification and etiology of cNFs, pNFs, other neurofibromas, and MPNSTs. Furthermore, we characterized the association between these latent variables and genetic variants, immune deconvolution predictions, and protein activity predictions
F-Seq: a feature density estimator for high-throughput sequence tags
Summary: Tag sequencing using high-throughput sequencing technologies is now regularly employed to identify specific sequence features, such as transcription factor binding sites (ChIP-seq) or regions of open chromatin (DNase-seq). To intuitively summarize and display individual sequence data as an accurate and interpretable signal, we developed F-Seq, a software package that generates a continuous tag sequence density estimate, allowing identification of biologically meaningful sites; its output can be displayed directly in the UCSC Genome Browser
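As a rough illustration of what a continuous tag density estimate looks like, the sketch below applies a generic Gaussian kernel density estimate to a handful of tag start positions. It is not F-Seq's implementation: the function name `tag_density`, the `bandwidth` parameter, and the toy coordinates are all assumptions made for this example.

```python
import numpy as np

def tag_density(tag_positions, region_start, region_end, bandwidth=50.0):
    """Gaussian kernel density of tag positions at each base in [region_start, region_end).

    Hypothetical helper for illustration only; not part of F-Seq.
    """
    positions = np.asarray(tag_positions, dtype=float)
    grid = np.arange(region_start, region_end, dtype=float)
    # Scaled distance between every grid base and every tag: shape (bases, tags)
    z = (grid[:, None] - positions[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    # Average the kernels so the curve integrates to roughly 1 over the real line
    density = kernel.sum(axis=1) / (len(positions) * bandwidth)
    return grid, density

# Toy data: four tags clustered near base 107 and one isolated tag at base 400
tags = [100, 105, 110, 112, 400]
grid, density = tag_density(tags, 0, 500, bandwidth=20.0)
peak = int(grid[np.argmax(density)])  # base with the highest estimated density
```

A wider bandwidth smooths neighbouring tags into broader peaks, while a narrower one keeps them separate. Choosing it is a modelling decision that the abstract does not specify.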
Modeling Cancer Progression via Pathway Dependencies
Cancer is a heterogeneous disease often requiring a complex set of alterations to drive a normal cell to a malignancy and ultimately to a metastatic state. Certain genetic perturbations have been implicated in initiation and progression. However, to a great extent, underlying mechanisms often remain elusive. These genetic perturbations are most likely reflected by the altered expression of sets of genes or pathways, rather than individual genes, thus creating a need for models of deregulation of pathways to help provide an understanding of the mechanisms of tumorigenesis. We introduce an integrative hierarchical analysis of tumor progression that discovers which a priori defined pathways are relevant either throughout or in particular steps of progression. Pathway interaction networks are inferred for these relevant pathways over the steps in progression. This is followed by the refinement of the relevant pathways to those genes most differentially expressed in particular disease stages. The final analysis infers a gene interaction network for these refined pathways. We apply this approach to model progression in prostate cancer and melanoma, resulting in a deeper understanding of the mechanisms of tumorigenesis. Our analysis supports previous findings for the deregulation of several pathways involved in cell cycle control and proliferation in both cancer types. A novel finding of our analysis is a connection between ErbB4 and primary prostate cancer
Correlation set analysis: detecting active regulators in disease populations using prior causal knowledge
Background: Identification of active causal regulators is a crucial problem in understanding mechanisms of disease or finding drug targets. Methods that infer causal regulators directly from primary data have been proposed and successfully validated in some cases. These methods necessarily require very large sample sizes or a mix of different data types. Recent studies have shown that prior biological knowledge can successfully boost a method's ability to find regulators. Results: We present a simple data-driven method, Correlation Set Analysis (CSA), for comprehensively detecting active regulators in disease populations by integrating co-expression analysis and a specific type of literature-derived causal relationships. Instead of investigating the co-expression level between regulators and their regulatees, we focus on the coherence of the regulatees of a regulator. Using simulated datasets we show that our method performs very well at recovering even weak regulatory relationships with a low false discovery rate. Using three separate real biological datasets we were able to recover both well-known and as yet undescribed active regulators for each disease population. The results are represented as a rank-ordered list of regulators and reveal both single and higher-order regulatory relationships. Conclusions: CSA is an intuitive data-driven way of selecting directed perturbation experiments that are relevant to a disease population of interest and represent a starting point for further investigation. Our findings demonstrate that combining co-expression analysis on regulatee sets with a literature-derived network can successfully identify causal regulators and help develop possible hypotheses to explain disease progression
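The regulatee-coherence idea can be sketched in a few lines. The abstract does not state the exact statistic, so this example assumes one plausible choice: the mean absolute pairwise Pearson correlation among a regulator's regulatees, compared against random gene sets of the same size. The function names, the permutation scheme, and the simulated data are illustrative assumptions, not the published CSA procedure.

```python
import numpy as np

def coherence(expr):
    """Mean absolute pairwise Pearson correlation; expr has shape (genes, samples)."""
    corr = np.corrcoef(expr)
    upper = corr[np.triu_indices_from(corr, k=1)]  # each gene pair counted once
    return float(np.mean(np.abs(upper)))

def regulator_pvalue(expr, regulatee_idx, n_perm=1000, seed=0):
    """Empirical p-value of regulatee coherence against random same-size gene sets.

    Assumed scoring scheme for illustration; not taken from the CSA paper.
    """
    rng = np.random.default_rng(seed)
    observed = coherence(expr[regulatee_idx])
    k = len(regulatee_idx)
    null = np.array([
        coherence(expr[rng.choice(expr.shape[0], size=k, replace=False)])
        for _ in range(n_perm)
    ])
    # Add-one correction keeps the empirical p-value strictly positive
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p_value

# Simulated data: 200 genes x 30 samples; the first 10 genes share a common signal
rng = np.random.default_rng(1)
expr = rng.normal(size=(200, 30))
expr[:10] += 2.0 * rng.normal(size=30)
observed, p = regulator_pvalue(expr, np.arange(10), n_perm=200)
```

Ranking regulators by such a p-value would give a rank-ordered list of the kind the abstract describes. With many regulators, a multiple-testing correction would also be needed.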
A Multifaceted Benchmarking of Synthetic Electronic Health Record Generation Models
Synthetic health data have the potential to mitigate privacy concerns when sharing data to support biomedical research and the development of innovative healthcare applications. Modern approaches for data generation based on machine learning, generative adversarial network (GAN) methods in particular, continue to evolve and demonstrate remarkable potential. Yet there is a lack of a systematic assessment framework to benchmark methods as they emerge and determine which methods are most appropriate for which use cases. In this work, we introduce a generalizable benchmarking framework to appraise key characteristics of synthetic health data with respect to utility and privacy metrics. We apply the framework to evaluate synthetic data generation methods for electronic health record (EHR) data from two large academic medical centers with respect to several use cases. The results illustrate that there is a utility-privacy tradeoff for sharing synthetic EHR data. The results further indicate that no method is unequivocally the best on all criteria in each use case, which makes it evident why synthetic data generation methods need to be assessed in context
Relative contribution of clinicopathological variables, genomic markers, transcriptomic subtyping and microenvironment features for outcome prediction in stage II/III colorectal cancer
Background: It remains unknown to what extent consensus molecular subtype (CMS) groups and immune-stromal infiltration patterns improve our ability to predict outcomes over tumor-node-metastasis (TNM) staging and microsatellite instability (MSI) status in early-stage colorectal cancer (CRC). Patients and methods: We carried out a comprehensive retrospective biomarker analysis of prognostic markers in adjuvant chemotherapy-untreated (N = 1656) and treated (N = 980), stage II (N = 1799) and III (N = 837) CRCs. We defined CMS scores and estimated CD8+ cytotoxic lymphocytes (CytoLym) and cancer-associated fibroblasts (CAF) infiltration scores from bulk tumor tissue transcriptomes (CMSclassifier and MCPcounter R packages); constructed a stratified multivariable Cox model for disease-free survival (DFS); and calculated the relative proportion of explained variation by each marker (clinicopathological [ClinPath], genomics [Gen: MSI, BRAF and KRAS mutations], CMS scores [CMS] and microenvironment cells [MicroCells: CytoLym+CAF]). Results: In multivariable models, only ClinPath and MicroCells remained significant prognostic factors, with both CytoLym and CAF infiltration scores improving survival prediction beyond other markers. The explained variation for DFS models of ClinPath, MicroCells, Gen markers and CMS4 scores was 77%, 14%, 5.3% and 3.7%, respectively, in stage II; and 55.9%, 35.1%, 4.1% and 0.9%, respectively, in stage III. Patients whose tumors were CytoLym high/CAF low had better DFS than other strata [HR = 0.71 (0.6-0.9); P = 0.004]. Microsatellite stable tumors had the strongest signal for improved outcomes with CytoLym high scores (interaction P = 0.04) and the poor prognosis linked to high CAF scores was limited to stage III disease (interaction P = 0.04). Conclusions: Our results confirm that tumor microenvironment infiltration patterns represent potent determinants of the risk for distant dissemination in early-stage CRC. Multivariable models suggest that the prognostic value of MSI and CMS groups is largely explained by CytoLym and CAF infiltration patterns
CRI iAtlas: an interactive portal for immuno-oncology research
The Cancer Research Institute (CRI) iAtlas is an interactive web platform for data exploration and discovery in the context of tumors and their interactions with the immune microenvironment. iAtlas allows researchers to study immune response characterizations and patterns for individual tumor types, tumor subtypes, and immune subtypes. iAtlas supports computation and visualization of correlations and statistics among features related to the tumor microenvironment, cell composition, immune expression signatures, tumor mutation burden, cancer driver mutations, adaptive cell clonality, patient survival, expression of key immunomodulators, and tumor infiltrating lymphocyte (TIL) spatial maps. iAtlas was launched to accompany the release of the TCGA PanCancer Atlas and has since been expanded to include new capabilities such as (1) user-defined loading of sample cohorts, (2) a tool for classifying expression data into immune subtypes, and (3) integration of TIL mapping from digital pathology images. We expect that the CRI iAtlas will accelerate discovery and improve patient outcomes by providing researchers access to standardized immunogenomics data to better understand the tumor immune microenvironment and its impact on patient responses to immunotherapy